
Introduction to metaclasses

5 February 2019 | Python

Python is a strongly object-oriented language whose popularity has kept growing over the years. New projects keep emerging whose size and complexity make them much more than just a bunch of scripts. Such projects make extensive use of OOP, defining their own classes and potentially dealing with multiple inheritance, mixins, ABCs, etc. At some point, the need to customize these classes arises by design.

Imagine, for example, that you want to write a library, either because it simplifies development in one of your own projects or because it would provide rich features to any external user who imports it into their project. Libraries relying on OOP tend to expose their APIs through classes. Very well, but as the developer of such a library, how can you take the code written by a user as input data for your own processing? Several answers may be appropriate depending on what exactly needs to be done, but we present here the most flexible solution, which is also the most difficult to implement: metaclasses.

1. One class to create them all

Like any Python developer, I was told early on that “in Python, everything is an object”. This is also true for classes. But since objects are instances of classes, if classes are objects themselves, they must be instances of something else, right? This other thing is what is called a metaclass. Literally, a metaclass is the class of a class.

Then what is the class used when we create a class? Remember that the class of an object is its type (we can use it as a type hint for instance), and we can ask for the class of any object with the built-in type() function. Now let’s ask a Python 3 interpreter to tell us what the default class of a class is:

>>> class MyClass:
...     pass
...
>>> my_object = MyClass()
>>> type(my_object)
<class '__main__.MyClass'>
>>> type(MyClass)
<class 'type'>

And the answer is… type. Yes, in Python type has two meanings: the one we already knew, and a lesser-known one, the name of the default metaclass.
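As a quick check, here is a minimal sketch (the class name Example is only illustrative) suggesting that the same relationship holds for built-in classes, and that type is its own class:

```python
class Example:
    pass


# Built-in and user-defined classes are all instances of `type`.
print(type(int) is type)          # True
print(type(Example) is type)      # True
print(isinstance(Example, type))  # True

# `type` is an instance of itself, which closes the chain.
print(type(type) is type)         # True
```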

2. Dynamic class creation

This default metaclass can be used to create a class on the fly, simply by instantiating it. Its constructor then takes three mandatory arguments:

MyClass = type(class_name, class_bases, attributes)

Where:

  • class_name refers to the name of the class (it could differ from 'MyClass' here, but that would only be confusing);
  • class_bases must be a tuple containing all the other classes or mixins that we want MyClass to derive from;
  • attributes must be a dictionary containing all the attributes and methods of the class (a method being simply a callable attribute accepting self as its first parameter), with their names as keys.

The previous example of creating an empty class could thus have been written as:

MyClass = type('MyClass', (), {})
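To make the three arguments more concrete, here is a minimal sketch of a non-empty class built the same way. The names Base, Dynamic and describe are hypothetical and only serve the illustration:

```python
class Base:
    def greet(self):
        return "hello from " + type(self).__name__


def describe(self):
    # A plain function becomes a method once stored in the attributes dict.
    return "x = {}".format(self.x)


# Roughly equivalent to writing `class Dynamic(Base): ...` with an
# attribute x and a method describe in its body.
Dynamic = type('Dynamic', (Base,), {'x': 42, 'describe': describe})

obj = Dynamic()
print(obj.greet())        # hello from Dynamic
print(obj.describe())     # x = 42
print(Dynamic.__bases__)  # the tuple passed as class_bases
```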

3. Custom metaclasses

Now, what if you want to intercept this class creation in order to add behaviors or data to the classes you create? One solution, if not the only one depending on the problem you want to solve, is to build a custom metaclass. This is done simply by inheriting from type.

A custom metaclass created in this way typically overrides the __new__ method so that it returns the class being built. Here is a sample showing how it works:

>>> class MyMeta(type):
...     def __new__(metacls, class_name: str, class_bases: tuple, class_attrs: dict):
...         # Do stuff before creating the class
...         cls = type.__new__(metacls, class_name, class_bases, class_attrs)
...         # Do stuff after it
...         return cls

As you can see, the arguments are pretty much the same as those we saw in the previous section. They are passed automatically, with the relevant values, whenever the definition of a class using this metaclass is executed.

type.__new__ is the factory we need in this context to actually build the class object, but any modification of its arguments can be done before calling it. Similarly, it is possible to post-process the class object before returning it.

Now, on the caller side, how do we make use of such a metaclass? In Python 3, the metaclass is a keyword argument that can be passed in the class declaration, after the list of base classes to inherit from.

>>> class MyClass(metaclass=MyMeta):
...     pass
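To illustrate the "before" and "after" hooks around type.__new__, here is a minimal sketch. TaggingMeta, Point, created_by and public_names are hypothetical names chosen for this example only:

```python
class TaggingMeta(type):
    def __new__(metacls, class_name, class_bases, class_attrs):
        # Before creation: add an attribute to a copy of the namespace.
        class_attrs = dict(class_attrs)
        class_attrs.setdefault('created_by', metacls.__name__)

        cls = type.__new__(metacls, class_name, class_bases, class_attrs)

        # After creation: the class object itself can be modified.
        cls.public_names = sorted(
            name for name in class_attrs if not name.startswith('_'))
        return cls


class Point(metaclass=TaggingMeta):
    x = 0
    y = 0


print(type(Point) is TaggingMeta)  # True
print(Point.created_by)            # TaggingMeta
print(Point.public_names)          # ['created_by', 'x', 'y']
```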

According to the official documentation, when the Python interpreter executes this class definition, the following steps occur:

  1. MRO entries are resolved (the inheritance tree)
  2. The appropriate metaclass is determined (here MyMeta)
  3. The class namespace is prepared (here the namespace is class_attrs; this step is explained below)
  4. The class body is executed (well, it is empty in MyClass…)
  5. The class object is created (through MyMeta instead of following the usual path)
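The order of the last three steps can be observed with a small tracing sketch. The names TracingMeta, Traced and events are hypothetical, and the @classmethod on __prepare__ follows the recommendation of the official documentation:

```python
events = []


class TracingMeta(type):
    @classmethod
    def __prepare__(metacls, class_name, class_bases):
        events.append('prepare namespace')
        return {}

    def __new__(metacls, class_name, class_bases, class_attrs):
        events.append('create class object')
        return type.__new__(metacls, class_name, class_bases, class_attrs)


class Traced(metaclass=TracingMeta):
    # This statement runs while the class body is being executed.
    events.append('execute class body')


print(events)
# ['prepare namespace', 'execute class body', 'create class object']
```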

The third step is something we haven’t mentioned yet. It is about preparing the class_attrs argument, which will contain everything collected by the interpreter while executing the class body. If nothing is done, it will just be a regular dictionary. However, we can change this and provide a custom container for class_attrs, which just has to offer the same interface as a dict. A typical use is to turn class_attrs into an OrderedDict in order to remember the order in which attributes are declared (since Python 3.6 the default namespace already preserves this order, so this matters mostly for older versions). The method allowing this alteration is called __prepare__; it is called on the metaclass itself and is therefore normally declared as a classmethod. If we want to use it in our previous metaclass example, it reads like this:

>>> from collections import OrderedDict
>>> class MyMeta(type):
...     @classmethod
...     def __prepare__(metacls, class_name: str, class_bases: tuple):
...         return OrderedDict()
...     def __new__(metacls, class_name: str, class_bases: tuple, class_attrs: OrderedDict):
...         return type.__new__(metacls, class_name, class_bases, class_attrs)
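Beyond ordering, __prepare__ can return a container with its own rules. The following is a minimal sketch, not a recommended implementation: NoDuplicatesDict and StrictMeta are hypothetical names, and the namespace simply refuses to bind the same name twice in a class body.

```python
class NoDuplicatesDict(dict):
    """Hypothetical namespace that refuses to bind the same name twice."""

    def __setitem__(self, key, value):
        if key in self:
            raise TypeError("duplicate definition of {!r}".format(key))
        super().__setitem__(key, value)


class StrictMeta(type):
    @classmethod
    def __prepare__(metacls, class_name, class_bases):
        return NoDuplicatesDict()

    def __new__(metacls, class_name, class_bases, class_attrs):
        # Hand a plain dict to type.__new__ once the body has been checked.
        return type.__new__(metacls, class_name, class_bases,
                            dict(class_attrs))


class Fine(metaclass=StrictMeta):
    a = 1
    b = 2


try:
    class Broken(metaclass=StrictMeta):
        a = 1
        a = 2  # rebinding the same name is rejected by the namespace
except TypeError as exc:
    error_message = str(exc)
    print("rejected:", error_message)
```

The error is raised while the class body is executed, that is, before MyMeta-style __new__ code is even reached.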

4. Usage of custom metaclasses

Now that the fundamentals have been introduced, it’s time to illustrate what can be done with metaclasses. The most common reasons (that I’ve found so far) why developers end up creating their own metaclasses are:

  • modifying or adding class attributes,
  • automatic (hidden) registration to other namespaces or subscription to services.

For instance, they come in handy when producing nice APIs and can be found at the core of most ORMs, like SQLAlchemy or Django, since these shape their APIs in a declarative style. However, they are far from being limited to these two categories of uses. Again, the official documentation gives an idea of what the core Python developers have in mind: “The potential uses for metaclasses are boundless. Some ideas that have been explored include enum, logging, interface checking, automatic delegation, automatic property creation, proxies, frameworks, and automatic resource locking/synchronization.”
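The second category, automatic registration, can be sketched in a few lines. This is only an illustration under assumed names (registry, PluginMeta, Plugin and the two exporter classes are hypothetical), not how any particular ORM or framework does it:

```python
registry = {}  # hypothetical module-level registry


class PluginMeta(type):
    def __new__(metacls, class_name, class_bases, class_attrs):
        cls = type.__new__(metacls, class_name, class_bases, class_attrs)
        # Skip the base class itself; register only its subclasses.
        if class_bases:
            registry[class_name.lower()] = cls
        return cls


class Plugin(metaclass=PluginMeta):
    pass


class CsvExporter(Plugin):
    pass


class JsonExporter(Plugin):
    pass


print(sorted(registry))  # ['csvexporter', 'jsonexporter']
```

The client only writes subclasses of Plugin; the registration happens as a side effect of defining them.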

Because code can be much clearer than a thousand words, let’s analyze a demonstration:

from collections import OrderedDict


class MyStore:
    """Store keeping track of singleton instances."""

    def __init__(self):
        self.store = {}

    def __str__(self):
        return str({key: str(value) for key, value in self.store.items()})

    def register(self, name, obj):
        self.store[name] = obj


my_store = MyStore()


class MyField(str):
    pass


class MyMeta(type):
    """
    Example of metaclass demonstrating some of the classical features that such
    a construct can provide: class alteration and registration.
    """

    @staticmethod
    def __prepare__(class_name: str, class_bases: tuple):
        return OrderedDict()

    def __new__(metacls, class_name: str, class_bases: tuple,
                class_attrs: OrderedDict):

        # Reorganizing attributes:
        reorganized_attrs = OrderedDict([('_fields', OrderedDict()),
                                         ('_constants', OrderedDict())])
        for name, attr in class_attrs.items():
            if isinstance(attr, MyField):
                reorganized_attrs['_fields'][name] = attr
            elif not name.startswith('__') and not callable(attr):
                reorganized_attrs['_constants'][name] = attr
            else:
                reorganized_attrs[name] = attr

        # Creating the class:
        cls = type.__new__(metacls, class_name, class_bases, reorganized_attrs)

        # Initializing the singleton pattern:
        obj = cls()
        cls._obj = obj

        # Registering the new object:
        my_store.register(class_name, obj)

        # Displaying the results of the application of the metaclass:
        print("Here is what {} contains:".format(cls.__name__))
        for name, attr in cls.__dict__.items():
            print("    . {}: {}".format(name, attr))
        print("")

        return cls

    def __call__(cls, *args, **kwargs):
        """Implementing the singleton pattern at class call."""
        if not hasattr(cls, '_obj'):
            # First call (made from MyMeta.__new__): build a real instance.
            obj = cls.__new__(cls, *args, **kwargs)
            obj.__init__(*args, **kwargs)
            return obj
        else:
            return cls._obj


class MyClass(metaclass=MyMeta):
    """Example of a user-defined (client) class that makes use of MyMeta."""

    # Demo attributes (mixed fields and constants)
    a = 42
    b = MyField('foo')
    c = MyField('bar')
    d = 'MyField' in globals()

    def __str__(self):
        """Showing the memory address of self (proving it is a singleton)."""
        return "I'm located at: {}".format(id(self))


test_instance = MyClass()
print(test_instance)
other_instance = MyClass()
print("Once again", other_instance)
print("The store is:", my_store)

This produces the following output with a Python 3.6 interpreter:

Here is what MyClass contains:
    . _fields: OrderedDict([('b', 'foo'), ('c', 'bar')])
    . _constants: OrderedDict([('a', 42), ('d', True)])
    . __module__: __main__
    . __doc__: Example of a user-defined (client) class that makes use of MyMeta.
    . __str__: <function MyClass.__str__ at 0x1024c5a60>
    . __dict__: <attribute '__dict__' of 'MyClass' objects>
    . __weakref__: <attribute '__weakref__' of 'MyClass' objects>
    . _obj: I'm located at: 4333605272

I'm located at: 4333605272
Once again I'm located at: 4333605272
The store is: {'MyClass': "I'm located at: 4333605272"}


Coming back to the analogy of the library writer, the roles are as follows. As the library author, I am responsible for everything written before the declaration of MyClass. As the library consumer (client), I just write classes like MyClass, making use of all the rest.

Several things could be achieved in this example thanks to the metaclass MyMeta:

  • Some class attributes are reorganized (nothing is added or altered in this example) into custom collections according to custom criteria. In the final object (MyClass._obj), the attributes a, b, c and d are no longer directly accessible through self, only through these collections.
  • The original order of attribute declarations in MyClass is preserved in the class_attrs argument of MyMeta.__new__ thanks to MyMeta.__prepare__. This allows the _fields and _constants collections to be built in order. Note that cls.__dict__ and obj.__dict__ remain ordinary dictionaries, whose ordering is not guaranteed on Python versions older than 3.6.
  • A singleton pattern is implemented entirely in the metaclass: in MyMeta.__new__, an instance of the class is created at the same time as the class itself; then in MyMeta.__call__, this instance is always returned instead of a new one (except for the very first time, i.e. for the call in MyMeta.__new__).
  • Each singleton instance is registered in a global store of client instances, which could be consumed by any third-party service for example.

5. Additional notes

A metaclass is essentially a class like any other. This means that you can define in it all the methods you would ordinarily implement: __init__, __call__, __repr__, operator overloads, etc. What each of them is good for is not always obvious. Take __init__ and __call__ (and perhaps a few others, such as __repr__ or __str__, which could be used for logging): the latter is already used in our example to intercept the creation of a new initialized instance and return a singleton instead, while the former could hold anything that needs to be done once the class is created (everything after type.__new__ could have been put in a method MyMeta.__init__). But these are only examples, and one could imagine many other uses.
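As a minimal sketch of this idea (InitMeta, Record and field_names are hypothetical names), post-processing can live in the metaclass's __init__, and __repr__ on the metaclass changes how the class itself is displayed:

```python
class InitMeta(type):
    def __init__(cls, class_name, class_bases, class_attrs):
        # Runs once, right after the class object has been created.
        super().__init__(class_name, class_bases, class_attrs)
        cls.field_names = [
            name for name in class_attrs if not name.startswith('_')]

    def __repr__(cls):
        # Customizes how the class itself (not its instances) is displayed.
        return "<InitMeta class {}>".format(cls.__name__)


class Record(metaclass=InitMeta):
    title = 'untitled'
    pages = 0


print(sorted(Record.field_names))  # ['pages', 'title']
print(repr(Record))                # <InitMeta class Record>
```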

Also, metaclasses are inherited. This means that if any class derives from MyClass, its creation path will go through MyMeta as well. More generally, Python applies some specific rules in order to determine the appropriate metaclass in an inheritance tree. These can be found in the official documentation but won’t be repeated here.
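The inheritance of metaclasses can be checked with a short sketch (TrackingMeta, Parent and Child are hypothetical names used only here):

```python
class TrackingMeta(type):
    created = []

    def __new__(metacls, class_name, class_bases, class_attrs):
        metacls.created.append(class_name)
        return type.__new__(metacls, class_name, class_bases, class_attrs)


class Parent(metaclass=TrackingMeta):
    pass


class Child(Parent):  # no metaclass keyword here
    pass


print(type(Child) is TrackingMeta)  # True
print(TrackingMeta.created)         # ['Parent', 'Child']
```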

Finally, we should note that some alternative techniques make it possible to avoid metaclasses altogether:

  • Class decorators
    • Since classes are objects just like functions, the decorator mechanism we know for functions applies to them in the same way. As a result, a class decorator is simply a function or a class that takes a class as input and returns it, possibly modified, as output, for instance by applying some monkey patching to it.
  • __init_subclass__
    • This one is a feature added in Python 3.6 (see PEP 487). It is a hook that can be defined in any class from which other classes inherit. When a child class is created (defined, not instantiated), this method is called on the parent with the child class as argument. We can then modify its content at will, just as we would with a class decorator or a metaclass. This solution is simply much easier to implement.
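Both alternatives can be sketched side by side. The names below (registry, register, Decorated, Base, ChildA, ChildB) are hypothetical and only serve the illustration:

```python
registry = []


def register(cls):
    """Class decorator: receives the finished class and returns it."""
    registry.append(cls.__name__)
    cls.registered = True  # simple monkey patching
    return cls


@register
class Decorated:
    pass


class Base:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        # Called when a subclass of Base is defined (Python 3.6+).
        super().__init_subclass__(**kwargs)
        Base.subclasses.append(cls.__name__)


class ChildA(Base):
    pass


class ChildB(Base):
    pass


print(registry)             # ['Decorated']
print(Decorated.registered) # True
print(Base.subclasses)      # ['ChildA', 'ChildB']
```

Note that the decorator has to be applied explicitly to each class, whereas __init_subclass__ is triggered implicitly for every subclass, which is closer to what a metaclass offers.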

6. Conclusion

Metaclasses are a very advanced and powerful tool, especially for creating nice APIs. On the other hand, implementing them is not easy and requires some experience, as well as a clear understanding of Python’s OOP mechanisms. This is why the following quotation became famous and is regularly cited whenever the topic comes up:

Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).

From Tim Peters, Python guru and author of PEP 20 (The Zen of Python).

This complexity should indeed encourage developers to favor alternative solutions, like those briefly mentioned above. But if you cannot think of any other way to achieve something, I hope these lines will help you get off to a good start!