2014-03-24
Python has an intriguing feature called "slots".
Ordinarily, every instance of a user-defined Python class has a __dict__ attribute. This is a dictionary used to dynamically store attributes as key-value pairs.
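As a quick illustration, here is a minimal sketch using a hypothetical class named Point (the class and attribute names are assumptions, not from the original post). Attributes assigned on an instance end up in its __dict__:

```python
class Point(object):
    """Hypothetical example class with no __slots__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
p.label = "near origin"  # new attributes can be added at any time

# Every attribute set on the instance lives in its per-instance dict.
print(p.__dict__)
```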
The __slots__ class attribute can be used to statically define available instance attributes, which by default removes the instance __dict__ altogether.
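A minimal sketch of that behaviour, again with a hypothetical class (SlottedPoint and its attribute names are assumptions for illustration). The instance has no __dict__, and assigning a name that is not listed in __slots__ fails:

```python
class SlottedPoint(object):
    """Hypothetical example class; only 'x' and 'y' may be set on instances."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


sp = SlottedPoint(1, 2)
has_dict = hasattr(sp, "__dict__")  # no per-instance dict is created

try:
    sp.label = "nope"  # 'label' is not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(has_dict, rejected)
```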
The main reason to define __slots__ is to save memory. Every instance __dict__ takes up space; if all instance attribute names are known at class creation time, then there's no need to have an instance __dict__.
One twist on __slots__ is that it's possible to explicitly define a __dict__ slot, which will lead to the instance dict being created after all.
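A minimal sketch of this, assuming a hypothetical class C whose slots are "c" and "__dict__":

```python
class C(object):
    """Hypothetical class with one named slot plus an explicit __dict__ slot."""

    __slots__ = ("c", "__dict__")


obj = C()
obj.c = 1  # stored in the dedicated slot
obj.d = 2  # not a slot, so it falls through to the instance __dict__

print(obj.__dict__)
```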
Note that the "d" attribute appears in the instance __dict__, but the "c" attribute does not.
A final piece of background information is that slots accumulate through inheritance. Each slot name should be defined only once in a class hierarchy, so a subclass lists just the names it adds.
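A short sketch of slot inheritance, using hypothetical classes Base and Derived (the names are assumptions). The subclass declares only its own new slot and still gets the one from its parent:

```python
class Base(object):
    __slots__ = ("a",)


class Derived(Base):
    # Only the new name is listed; "a" is inherited from Base.
    __slots__ = ("b",)


d = Derived()
d.a = 1
d.b = 2

# Every class in the hierarchy defines __slots__, so no instance dict exists.
print(hasattr(d, "__dict__"))
```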
Don't believe everything you read
The Python docs say, "When inheriting from a class without __slots__, the __dict__ attribute of that class will always be accessible, so a __slots__ definition in the subclass is meaningless." The closing claim, that the subclass's __slots__ definition is meaningless, is actually incorrect.
I noticed this when using the Pympler memory measurement tool. Python's standard library includes sys.getsizeof, but that doesn't count nested objects. We can see that Pympler provides more intuitive numbers (measurements taken on Python 2.7.5).
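To make the "doesn't count nested objects" point concrete, here is a standard-library-only sketch. The helper naive_deep_size is a hypothetical stand-in written for this example; it is not Pympler's algorithm and only looks one level deep. Exact byte counts depend on the Python version and platform, so none are shown.

```python
import sys

small_text = "x"
large_text = "x" * 10000

small = [small_text]
large = [large_text]

# sys.getsizeof reports only the list object itself, not the string it holds,
# so both one-element lists report the same size.
shallow_small = sys.getsizeof(small)
shallow_large = sys.getsizeof(large)


def naive_deep_size(container):
    """Hypothetical helper: container size plus the sizes of its direct items."""
    return sys.getsizeof(container) + sum(sys.getsizeof(item) for item in container)


print(shallow_small == shallow_large)
print(naive_deep_size(large) - naive_deep_size(small))
```

A recursive tool such as Pympler reports figures closer to the second number, which is why its measurements feel more intuitive.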
Now, the real test. A is a class without __slots__, and B is a subclass with a __slots__ definition. Do the __slots__ matter?
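A sketch of that setup follows. The slot name "a", the six attribute names, and the fill helper are assumptions chosen for illustration. Rather than measuring bytes, which vary across Python versions, it shows where each attribute ends up:

```python
class A(object):
    pass


class B(A):
    # B inherits an instance __dict__ from A, but "a" still gets a real slot.
    __slots__ = ("a",)


def fill(obj):
    """Set six attributes, a through f, on the given instance."""
    for name in "abcdef":
        setattr(obj, name, 0)
    return obj


a_inst = fill(A())
b_inst = fill(B())

print(sorted(vars(a_inst)))  # keys held in A's instance dict
print(sorted(vars(b_inst)))  # keys held in B's instance dict
```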
An instance of the class with slots defined starts out slightly larger, presumably because it has storage allocated for attributes defined as slots. But after we set six attributes on each instance, the instance without __slots__ uses substantially more memory. Why's that?
Python's dict objects resize after a certain number of key-value pairs are added. On the Python 2.7 used here, the first five resizes occur at 6, 22, 86, 342, and 1366 elements. The gaps between them are every other power of 2 (+16, +64, +256, +1024), so each gap is four times the previous one.
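The resize points can be observed directly with sys.getsizeof. The sketch below uses a hypothetical helper, resize_points, that records the item counts at which a growing dict reports a larger size. The thresholds are an implementation detail of CPython and differ between versions, so the output on a modern interpreter will not match the 2.7 figures above.

```python
import sys


def resize_points(max_items=2000):
    """Return the item counts at which the dict's reported size grows."""
    d = {}
    last_size = sys.getsizeof(d)
    points = []
    for i in range(max_items):
        d[i] = None
        size = sys.getsizeof(d)
        if size > last_size:
            points.append(len(d))
            last_size = size
    return points


print(resize_points())
```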
An instance of class B has only five elements in its dict, with one in special slots storage. The instance of A has six key-value pairs in its __dict__, which is enough to trigger the first resize and explains the large leap in memory consumption.
Conclusion
So, perhaps there's a minor inaccuracy in the stellar Python documentation. In any case, I learned a lot in the lead-up to this post.