Slots: a little-known optimization

22 July 2019 | Python

As a former C developer working in high-performance computing, I wondered very early on how compact the objects we routinely create in Python really are. It quickly became clear that in most cases they are far from optimal, a consequence of original design choices that were deliberate and fully accepted.

Nevertheless, I discovered along the way that there has been a way to partially address this problem ever since new-style classes appeared (in Python 2.2; they are the only kind of class in Python 3): declaring attributes through what are called slots.

1. A few reminders and observations

When we write a class in Python, we do not declare its attributes statically. At best, we initialize a number of instance attributes in the object's __init__ method. But nothing prevents us from adding more on the fly afterwards. Let's illustrate this:

>>> class MaClasse:
...     def __init__(self, a, b):
...         self.a = a
...         self.b = b
...
>>> mon_objet = MaClasse(1, 2)
>>> mon_objet.c = 3

We see that the interpreter lets me add the attribute c. How does it work? Without dwelling on the subtleties of attribute access, it is a special attribute called __dict__ that makes this possible. Indeed, the last line of code is equivalent to:

>>> mon_objet.__dict__['c'] = 3

All the instance attributes added through a statement of the form self.x = ..., whether in the __init__ method or elsewhere, end up in this dictionary.
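As a small sketch of that claim, the snippet below rebuilds the same class and inspects the dictionary with the built-in vars(), which returns an object's __dict__. The attribute d is added purely for illustration.

```python
class MaClasse:
    def __init__(self, a, b):
        self.a = a
        self.b = b


mon_objet = MaClasse(1, 2)
mon_objet.c = 3  # added on the fly, outside __init__

# vars(obj) returns obj.__dict__: a, b and c are all stored there.
attributs = vars(mon_objet)
print(attributs)

# Writing to the dictionary directly is visible through normal attribute access.
mon_objet.__dict__['d'] = 4
print(mon_objet.d)  # 4
```

Attributes set in __init__ and attributes added later are not distinguished in any way: they are all plain entries of the same dictionary.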

It was from this simple observation that I began very early to wonder about the performance and memory compactness of objects built this way. Since they are extensible, their implementation cannot be optimal on either count. Consider CPython, the main implementation of Python and the one we are interested in here. It has to allocate memory blindly a first time, then again later, according to a heuristic that we apparently do not need to know. And in terms of performance, having to look up an attribute in a Python dictionary is probably not the fastest approach either.

It is actually mainly to address this performance issue that Guido van Rossum created slots.

2. Presentation

Slots are a way of declaring the instance attributes that make up our objects. From the start, I have been careful to say instance attributes, to distinguish them from the other attributes accessible from an object, such as the methods that can be applied to it, which are class attributes.

In the previous example, I defined three attributes over the life of my object. Let's see how to redefine the class with slots.

>>> class MaClasseSlottee:
...     __slots__ = ('a', 'b', 'c')
...     def __init__(self, a, b):
...         self.a = a
...         self.b = b
...
>>> mon_objet_slotte = MaClasseSlottee(1, 2)
>>> mon_objet_slotte.c = 3

That's all! Just add a class attribute named __slots__, set to an iterable containing the names of the instance attributes you want to manipulate. No other attribute will be accepted:

>>> mon_objet_slotte.d = 4
Traceback (most recent call last):
    File "<input>", line 1, in <module>
AttributeError: 'MaClasseSlottee' object has no attribute 'd'

Indeed, the attribute d does not appear in the list of those we have declared. What are the other consequences of adding __slots__? First, the complete disappearance of __dict__:

>>> mon_objet_slotte.__dict__['c']
Traceback (most recent call last):
    File "<input>", line 1, in <module>
AttributeError: 'MaClasseSlottee' object has no attribute '__dict__'

Next comes the gain in memory footprint and performance. The first aspect is difficult to measure in Python because the getsizeof function of the sys module, provided for this purpose, relies on the magic method __sizeof__, whose default behavior does not inspect the object in depth. Let's see what it gives:

>>> from sys import getsizeof
>>> getsizeof(mon_objet)
56
>>> getsizeof(mon_objet_slotte)
64

At first glance, adding slots results in a heavier object in memory. But only at first glance. We have simply overlooked a small detail. The first object keeps its attributes in its __dict__, whereas the second does not, since its __dict__ has disappeared. However, getsizeof applied to the object as a whole does not count the memory occupied by the contents of __dict__. Let's check for ourselves:

>>> getsizeof(mon_objet.__dict__)
112

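The size of __dict__ was therefore not counted by getsizeof. The same is true of __slots__ for the second object, so to be fair we must take it into account as well (even though that tuple is stored on the class and shared by all its instances, which makes this count pessimistic for slots):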

>>> getsizeof(mon_objet_slotte.__slots__)
72

The comparison that really makes sense is therefore 56 + 112 versus 64 + 72 (168 versus 136 bytes)! As promised, we observe a gain.

Nevertheless, it is essential to take a step back: the more attributes we add, the more anecdotal the difference between the two approaches becomes relative to the total size of the object. This is why one often reads and hears that slots are useful when working with many instances of the same, relatively small class. That is true, but it is not the whole story!
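Since getsizeof only looks at one object at a time, another way to compare footprints is to build many instances and ask the standard tracemalloc module how much memory was allocated. The sketch below is only an illustration: the helper memoire_allouee and the instance count are assumptions of mine, and the exact figures depend on the interpreter version and platform.

```python
import tracemalloc


class MaClasse:
    def __init__(self, a, b):
        self.a = a
        self.b = b


class MaClasseSlottee:
    __slots__ = ('a', 'b')

    def __init__(self, a, b):
        self.a = a
        self.b = b


def memoire_allouee(cls, n=50_000):
    """Return the number of bytes allocated while building n instances of cls."""
    tracemalloc.start()
    avant, _ = tracemalloc.get_traced_memory()
    instances = [cls(i, i) for i in range(n)]  # kept alive until measured
    apres, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return apres - avant


sans_slots = memoire_allouee(MaClasse)
avec_slots = memoire_allouee(MaClasseSlottee)
print(f'without slots: {sans_slots / 1e6:.1f} MB, with slots: {avec_slots / 1e6:.1f} MB')
```

Both measurements include the list and the integers, which cost the same in each case, so the difference between the two figures comes from the instances themselves.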

In fact, as I mentioned earlier, the main reason this mechanism was created is performance. Replacing a dictionary lookup with a call to a descriptor makes attribute access faster. This effect can be measured fairly simply by timing a piece of code that repeats read and write operations:

>>> import timeit
>>> def test(avec_slots=False):
...     instance = MaClasse(1, 2) if not avec_slots else MaClasseSlottee(1, 2)
...     def repeat_basic_operations(instance=instance):
...         for _ in range(10):
...             instance.a, instance.b = 3, 4
...             instance.a, instance.b
...     return repeat_basic_operations
...
>>> for _ in range(10):
...     avec_slots = min(timeit.repeat(test(avec_slots=True)))
...     sans_slots = min(timeit.repeat(test()))
...     print(f'The test without slots takes {(sans_slots/avec_slots - 1)*100:.4}% more time')
...
The test without slots takes 15.21% more time
The test without slots takes 14.62% more time
The test without slots takes 14.76% more time
The test without slots takes 16.23% more time
The test without slots takes 15.6% more time
The test without slots takes 14.8% more time
The test without slots takes 18.42% more time
The test without slots takes 15.78% more time
The test without slots takes 15.76% more time
The test without slots takes 15.27% more time

These times were obtained with Python 3.6.7 on a MacBook Pro.
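The descriptor mentioned before the benchmark can be observed directly. In the sketch below, which assumes CPython, each name listed in __slots__ shows up in the class namespace as a small descriptor object, and reading or writing the attribute goes through its __get__ and __set__ methods.

```python
class MaClasseSlottee:
    __slots__ = ('a', 'b', 'c')

    def __init__(self, a, b):
        self.a = a
        self.b = b


obj = MaClasseSlottee(1, 2)

# Each slot name becomes a descriptor stored on the class, not on the instance.
descripteur = MaClasseSlottee.__dict__['a']
print(type(descripteur).__name__)  # member_descriptor on CPython

# obj.a is resolved through the descriptor protocol.
print(descripteur.__get__(obj, MaClasseSlottee))  # 1

# Assignment goes through the same descriptor.
descripteur.__set__(obj, 10)
print(obj.a)  # 10

# A slot that is declared but never assigned raises AttributeError on read.
print(hasattr(obj, 'c'))  # False
```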

3. Disadvantages

Unfortunately, not everything is perfect, and I will now discuss what I consider the major problems with this technique.

Let's immediately set aside the fact that attributes can no longer be added dynamically to objects via __dict__, because that restriction is precisely what produces the observed gain. And above all, because it is entirely possible to list __dict__ among the __slots__! We then end up with a kind of partial optimization that applies only to the attributes declared as slots, while we remain free to add any other attributes dynamically.
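Here is a minimal sketch of that hybrid arrangement. The class name MaClasseHybride is hypothetical; the only point is that '__dict__' appears in the slots declaration.

```python
class MaClasseHybride:
    # 'a' and 'b' are stored in slots; '__dict__' re-enables dynamic attributes.
    __slots__ = ('a', 'b', '__dict__')

    def __init__(self, a, b):
        self.a = a
        self.b = b


obj = MaClasseHybride(1, 2)
obj.c = 3  # not declared as a slot, so it lands in the instance dictionary

print(obj.__dict__)  # {'c': 3}
print(obj.a, obj.b, obj.c)  # 1 2 3
```

Only c ends up in the dictionary: a and b stay in their slots and keep the faster access path.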

The major problem concerns inheritance and code reuse. There are three rules to keep in mind when trying to add slots to an inheritance tree:

  • The slots of a parent class are added to those of the child class
  • At most one parent class may have a non-empty sequence of slots
  • It takes only one class in the inheritance tree that does not declare slots, even empty ones, for the resulting instances to have a __dict__
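The three rules above can be sketched as follows. All class names here are hypothetical, and the error message printed for the second rule is the one CPython produces.

```python
class Base:
    __slots__ = ('a',)


class Enfant(Base):
    # Rule 1: only the new name is declared; 'a' is inherited from Base.
    __slots__ = ('b',)


enfant = Enfant()
enfant.a, enfant.b = 1, 2
print(hasattr(enfant, '__dict__'))  # False: every class in the chain declares slots


class AutreBase:
    __slots__ = ('x',)


# Rule 2: two parents with non-empty slots cannot be combined.
erreur_conflit = None
try:
    class Conflit(Base, AutreBase):
        __slots__ = ()
except TypeError as exc:
    erreur_conflit = exc
print(erreur_conflit)


# Rule 3: a single class that omits __slots__ brings __dict__ back.
class SansSlots(Base):
    pass


sans_slots = SansSlots()
sans_slots.z = 3  # accepted, stored in the instance dictionary
print(sans_slots.__dict__)  # {'z': 3}
```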

So we can factor a number of slots into a parent class, which can be very useful. The last two points, on the other hand, are problematic. Defining slots in a class means going back through all of its parent classes to add slots to each of them, whether relevant ones or empty ones, even if that means introducing a __dict__ when it is absolutely necessary to be able to add attributes dynamically (despite the performance loss we saw earlier).

With multiple inheritance, the second point puts us in a difficult situation. Either we can declare an empty sequence of slots because the class contributes no attributes (only methods), or we have no way of preventing the creation of a __dict__. The same drawback arises when the classes we inherit from live in a library that did not anticipate this use case and that we cannot edit. For those who have already browsed the code of standard library modules that define new types of objects (I am thinking for example of typing and collections), this is why you will find many slot definitions there, most of them trivial (typically empty).
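The two outcomes described above can be compared side by side. In this sketch all class names are hypothetical: one mixin declares an empty sequence of slots, the other declares nothing, and both are combined with the same slotted class.

```python
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


class AffichageMixin:
    # Methods only: an empty declaration keeps the mixin compatible with slots.
    __slots__ = ()

    def decrire(self):
        return f'({self.x}, {self.y})'


class MixinSansSlots:
    # Same method, but no __slots__ declaration at all.
    def decrire(self):
        return f'({self.x}, {self.y})'


class PointAffichable(AffichageMixin, Point):
    __slots__ = ()


class PointAvecDict(MixinSansSlots, Point):
    __slots__ = ()


p1 = PointAffichable(1, 2)
p2 = PointAvecDict(1, 2)

print(p1.decrire(), hasattr(p1, '__dict__'))  # (1, 2) False
print(p2.decrire(), hasattr(p2, '__dict__'))  # (1, 2) True
```

The second class still works, but its instances carry a __dict__ again, and nothing in PointAvecDict itself can prevent that.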

CONCLUSION

We have seen that slots are an optimization technique that can be very simple to put in place. From experience, I know it is not at all uncommon to have objects whose structure never varies and whose class derives either directly from object or from one or two parent classes that we can edit freely. In these cases, I see no reason not to take advantage of slots. Any child classes can choose whether or not to reuse this mechanism, at their developer's discretion. The only potentially awkward case is when the class we are writing ends up being used as a mixin alongside other classes that declare non-empty slots, but this is far from common.

Be that as it may, keep in mind that premature optimization is the source of many ills. The best approach is certainly to add this optimization after the fact and gradually, profiling the code again and again.