| <HTML> | |
| <HEAD> | |
| <TITLE>Metaclasses in Python 1.5</TITLE> | |
| </HEAD> | |
| <BODY BGCOLOR="FFFFFF"> | |
| <H1>Metaclasses in Python 1.5</H1> | |
| <H2>(A.k.a. The Killer Joke :-)</H2> | |
| <HR> | |
| (<i>Postscript:</i> reading this essay is probably not the best way to | |
| understand the metaclass hook described here. See a <A | |
| HREF="meta-vladimir.txt">message posted by Vladimir Marangozov</A> | |
| which may give a gentler introduction to the matter. You may also | |
| want to search Deja News for messages with "metaclass" in the subject | |
| posted to comp.lang.python in July and August 1998.) | |
| <HR> | |
| <P>In previous Python releases (and still in 1.5), there is something | |
| called the ``Don Beaudry hook'', after its inventor and champion. | |
| This allows C extensions to provide alternate class behavior, thereby | |
| allowing the Python class syntax to be used to define other class-like | |
| entities. Don Beaudry has used this in his infamous <A | |
| HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> package; Jim | |
| Fulton has used it in his <A | |
| HREF="http://www.digicool.com/releases/ExtensionClass/">Extension | |
| Classes</A> package. (It has also been referred to as the ``Don | |
| Beaudry <i>hack</i>,'' but that's a misnomer. There's nothing hackish | |
| about it -- in fact, it is rather elegant and deep, even though | |
| there's something dark to it.) | |
| <P>(On first reading, you may want to skip directly to the examples in | |
| the section "Writing Metaclasses in Python" below, unless you want | |
| your head to explode.) | |
| <P> | |
| <HR> | |
| <P>Documentation of the Don Beaudry hook has purposefully been kept | |
| minimal, since it is a feature of incredible power, and is easily | |
| abused. Basically, it checks whether the <b>type of the base | |
| class</b> is callable, and if so, it is called to create the new | |
| class. | |
| <P>Note the two indirection levels. Take a simple example: | |
| <PRE> | |
| class B: | |
|     pass | |
| class C(B): | |
|     pass | |
| </PRE> | |
| Take a look at the second class definition, and try to fathom ``the | |
| type of the base class is callable.'' | |
| <P>(Types are not classes, by the way. See questions 4.2, 4.19 and in | |
| particular 6.22 in the <A | |
| HREF="http://www.python.org/cgi-bin/faqw.py" >Python FAQ</A> | |
| for more on this topic.) | |
| <P> | |
| <UL> | |
| <LI>The <b>base class</b> is B; this one's easy.<P> | |
| <LI>Since B is a class, its type is ``class''; so the <b>type of the | |
| base class</b> is the type ``class''. This is also known as | |
| types.ClassType, assuming the standard module <code>types</code> has | |
| been imported.<P> | |
| <LI>Now is the type ``class'' <b>callable</b>? No, because types (in | |
| core Python) are never callable. Classes are callable (calling a | |
| class creates a new instance) but types aren't.<P> | |
| </UL> | |
| <P>So our conclusion is that in our example, the type of the base | |
| class (of C) is not callable. So the Don Beaudry hook does not apply, | |
| and the default class creation mechanism is used (which is also used | |
| when there is no base class). In fact, the Don Beaudry hook never | |
| applies when using only core Python, since the type of a core object | |
| is never callable. | |
| <P>So what do Don and Jim do in order to use Don's hook? Write an | |
| extension that defines at least two new Python object types. The | |
| first would be the type for ``class-like'' objects usable as a base | |
| class, to trigger Don's hook. This type must be made callable. | |
| That's why we need a second type. Whether an object is callable | |
| depends on its type. So whether a type object is callable depends on | |
| <i>its</i> type, which is a <i>meta-type</i>. (In core Python there | |
| is only one meta-type, the type ``type'' (types.TypeType), which is | |
| the type of all type objects, even itself.) A new meta-type must | |
| be defined that makes the type of the class-like objects callable. | |
| (Normally, a third type would also be needed, the new ``instance'' | |
| type, but this is not an absolute requirement -- the new class type | |
| could return an object of some existing type when invoked to create an | |
| instance.) | |
| <P>Still confused? Here's a simple device due to Don himself to | |
| explain metaclasses. Take a simple class definition; assume B is a | |
| special class that triggers Don's hook: | |
| <PRE> | |
| class C(B): | |
|     a = 1 | |
|     b = 2 | |
| </PRE> | |
| This can be thought of as equivalent to: | |
| <PRE> | |
| C = type(B)('C', (B,), {'a': 1, 'b': 2}) | |
| </PRE> | |
| If that's too dense for you, here's the same thing written out using | |
| temporary variables: | |
| <PRE> | |
| creator = type(B) # The type of the base class | |
| name = 'C' # The name of the new class | |
| bases = (B,) # A tuple containing the base class(es) | |
| namespace = {'a': 1, 'b': 2} # The namespace of the class statement | |
| C = creator(name, bases, namespace) | |
| </PRE> | |
| This is analogous to what happens without the Don Beaudry hook, except | |
| that in that case the creator function is set to the default class | |
| creator. | |
| <P>In either case, the creator is called with three arguments. The | |
| first one, <i>name</i>, is the name of the new class (as given at the | |
| top of the class statement). The <i>bases</i> argument is a tuple of | |
| base classes (a singleton tuple if there's only one base class, like | |
| the example). Finally, <i>namespace</i> is a dictionary containing | |
| the local variables collected during execution of the class statement. | |
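<P>The equivalence can be tried out directly. The following is a minimal sketch in present-day Python 3 (an assumption of this example; in Python 1.5 the type ``class'' is not callable, as explained above), where the type of an ordinary class is callable and accepts the same three arguments. The names C1 and C2 are hypothetical.

```python
# Assumption: Python 3, where type(B) is the callable built-in `type`.
class B:
    pass

# The class statement form.
class C1(B):
    a = 1
    b = 2

# The same thing spelled as an explicit call to the creator.
creator = type(B)                # the type of the base class
name = 'C2'                      # the name of the new class
bases = (B,)                     # a tuple containing the base class(es)
namespace = {'a': 1, 'b': 2}     # the namespace of the class statement
C2 = creator(name, bases, namespace)

print(C2.__name__, C2.__bases__ == C1.__bases__, C2.a, C2.b)
```

Both C1 and C2 end up with the same base class and the same class variables; only the spelling differs.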
| <P>Note that the contents of the namespace dictionary are simply | |
| whatever names were defined in the class statement. A little-known | |
| fact is that when Python executes a class statement, it enters a new | |
| local namespace, and all assignments and function definitions take | |
| place in this namespace. Thus, after executing the following class | |
| statement: | |
| <PRE> | |
| class C: | |
|     a = 1 | |
|     def f(s): pass | |
| </PRE> | |
| the class namespace's contents would be {'a': 1, 'f': &lt;function f | |
| ...&gt;}. | |
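<P>One way to look at this namespace is to supply a creator that records what it receives. The sketch below assumes Python 3, whose <code>metaclass=</code> keyword (not available in Python 1.5) accepts any callable; the function name recording_creator is made up for illustration. Python 3 also adds a few bookkeeping entries such as __module__ to the dictionary, which the last line filters out.

```python
seen = {}

def recording_creator(name, bases, namespace):
    """Hypothetical creator: remember the namespace, then build a normal class."""
    seen.update(namespace)
    return type(name, bases, dict(namespace))

# Assumption: Python 3 syntax for naming the creator explicitly.
class C(metaclass=recording_creator):
    a = 1
    def f(s): pass

# Show only the names the class body itself defined.
print(sorted(k for k in seen if not k.startswith('__')))
```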
| <P>But enough already about writing Python metaclasses in C; read the | |
| documentation of <A | |
| HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> or <A | |
| HREF="http://www.digicool.com/papers/ExtensionClass.html" >Extension | |
| Classes</A> for more information. | |
| <P> | |
| <HR> | |
| <H2>Writing Metaclasses in Python</H2> | |
| <P>In Python 1.5, the requirement to write a C extension in order to | |
| write metaclasses has been dropped (though you can still do | |
| it, of course). In addition to the check ``is the type of the base | |
| class callable,'' there's a check ``does the base class have a | |
| __class__ attribute.'' If so, it is assumed that the __class__ | |
| attribute refers to a class. | |
| <P>Let's repeat our simple example from above: | |
| <PRE> | |
| class C(B): | |
|     a = 1 | |
|     b = 2 | |
| </PRE> | |
| Assuming B has a __class__ attribute, this translates into: | |
| <PRE> | |
| C = B.__class__('C', (B,), {'a': 1, 'b': 2}) | |
| </PRE> | |
| This is exactly the same as before except that instead of type(B), | |
| B.__class__ is invoked. If you have read <A HREF= | |
| "http://www.python.org/cgi-bin/faqw.py?req=show&file=faq06.022.htp" | |
| >FAQ question 6.22</A> you will understand that while there is a big | |
| technical difference between type(B) and B.__class__, they play the | |
| same role at different abstraction levels. And perhaps at some point | |
| in the future they will really be the same thing (at which point you | |
| would be able to derive subclasses from built-in types). | |
| <P>At this point it may be worth mentioning that C.__class__ is the | |
| same object as B.__class__, i.e., C's metaclass is the same as B's | |
| metaclass. In other words, subclassing an existing class creates a | |
| new (meta)instance of the base class's metaclass. | |
| <P>Going back to the example, the class B.__class__ is instantiated, | |
| passing its constructor the same three arguments that are passed to | |
| the default class constructor or to an extension's metaclass: | |
| <i>name</i>, <i>bases</i>, and <i>namespace</i>. | |
| <P>It is easy to be confused by what exactly happens when using a | |
| metaclass, because we lose the absolute distinction between classes | |
| and instances: a class is an instance of a metaclass (a | |
| ``metainstance''), but technically (i.e. in the eyes of the Python | |
| runtime system), the metaclass is just a class, and the metainstance | |
| is just an instance. At the end of the class statement, the metaclass | |
| whose metainstance is used as a base class is instantiated, yielding a | |
| second metainstance (of the same metaclass). This metainstance is | |
| then used as a (normal, non-meta) class; instantiation of the class | |
| means calling the metainstance, and this will return a real instance. | |
| And what class is that an instance of? Conceptually, it is of course | |
| an instance of our metainstance; but in most cases the Python runtime | |
| system will see it as an instance of a helper class used by the | |
| metaclass to implement its (non-meta) instances... | |
| <P>Hopefully an example will make things clearer. Let's presume we | |
| have a metaclass MetaClass1. Its helper class (for non-meta | |
| instances) is called HelperClass1. We now (manually) instantiate | |
| MetaClass1 once to get an empty special base class: | |
| <PRE> | |
| BaseClass1 = MetaClass1("BaseClass1", (), {}) | |
| </PRE> | |
| We can now use BaseClass1 as a base class in a class statement: | |
| <PRE> | |
| class MySpecialClass(BaseClass1): | |
|     i = 1 | |
|     def f(s): pass | |
| </PRE> | |
| At this point, MySpecialClass is defined; it is a metainstance of | |
| MetaClass1 just like BaseClass1, and in fact the expression | |
| ``BaseClass1.__class__ == MySpecialClass.__class__ == MetaClass1'' | |
| yields true. | |
| <P>We are now ready to create instances of MySpecialClass. Let's | |
| assume that no constructor arguments are required: | |
| <PRE> | |
| x = MySpecialClass() | |
| y = MySpecialClass() | |
| print x.__class__, y.__class__ | |
| </PRE> | |
| The print statement shows that x and y are instances of HelperClass1. | |
| How did this happen? MySpecialClass is an instance of MetaClass1 | |
| (``meta'' is irrelevant here); when an instance is called, its | |
| __call__ method is invoked, and presumably the __call__ method defined | |
| by MetaClass1 returns an instance of HelperClass1. | |
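<P>The essay never shows MetaClass1 and HelperClass1, so here is one way they might look. This is a sketch under assumptions: the bodies of both classes are invented for illustration, and it is written for Python 3, which (like the hook described here) ends up calling the type of the base class with <i>name</i>, <i>bases</i> and <i>namespace</i>.

```python
class HelperClass1:
    """Hypothetical helper: the class the runtime sees for real instances."""
    def __init__(self, metainstance):
        self.metainstance = metainstance   # the "conceptual" class

class MetaClass1:
    """Hypothetical metaclass; its instances are used as classes."""
    def __init__(self, name, bases, namespace):
        self.name = name
        self.bases = bases
        self.namespace = namespace
    def __call__(self):
        # Calling a metainstance creates a (non-meta) instance.
        return HelperClass1(self)

# Manually instantiate the metaclass to get an empty special base class.
BaseClass1 = MetaClass1("BaseClass1", (), {})

# The class statement instantiates MetaClass1 a second time.
class MySpecialClass(BaseClass1):
    i = 1
    def f(s): pass

x = MySpecialClass()
y = MySpecialClass()
print(BaseClass1.__class__ is MySpecialClass.__class__ is MetaClass1)
print(x.__class__ is HelperClass1, x.metainstance is MySpecialClass)
```

The extra metainstance attribute is a design choice of this sketch: it lets an instance find its conceptual class even though the runtime reports HelperClass1.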
| <P>Now let's see how we could use metaclasses -- what can we do | |
| with metaclasses that we can't easily do without them? Here's one | |
| idea: a metaclass could automatically insert trace calls for all | |
| method calls. Let's first develop a simplified example, without | |
| support for inheritance or other ``advanced'' Python features (we'll | |
| add those later). | |
| <PRE> | |
| import types | |
| class Tracing: | |
|     def __init__(self, name, bases, namespace): | |
|         """Create a new class.""" | |
|         self.__name__ = name | |
|         self.__bases__ = bases | |
|         self.__namespace__ = namespace | |
|     def __call__(self): | |
|         """Create a new instance.""" | |
|         return Instance(self) | |
| class Instance: | |
|     def __init__(self, klass): | |
|         self.__klass__ = klass | |
|     def __getattr__(self, name): | |
|         try: | |
|             value = self.__klass__.__namespace__[name] | |
|         except KeyError: | |
|             raise AttributeError, name | |
|         if type(value) is not types.FunctionType: | |
|             return value | |
|         return BoundMethod(value, self) | |
| class BoundMethod: | |
|     def __init__(self, function, instance): | |
|         self.function = function | |
|         self.instance = instance | |
|     def __call__(self, *args): | |
|         print "calling", self.function, "for", self.instance, "with", args | |
|         return apply(self.function, (self.instance,) + args) | |
| Trace = Tracing('Trace', (), {}) | |
| class MyTracedClass(Trace): | |
|     def method1(self, a): | |
|         self.a = a | |
|     def method2(self): | |
|         return self.a | |
| aninstance = MyTracedClass() | |
| aninstance.method1(10) | |
| print "the answer is %d" % aninstance.method2() | |
| </PRE> | |
| Confused already? The intention is to read this from top down. The | |
| Tracing class is the metaclass we're defining. Its structure is | |
| really simple. | |
| <P> | |
| <UL> | |
| <LI>The __init__ method is invoked when a new Tracing instance is | |
| created, e.g. by the definition of class MyTracedClass later in the | |
| example. It simply saves the class name, base classes and namespace | |
| as instance variables.<P> | |
| <LI>The __call__ method is invoked when a Tracing instance is called, | |
| e.g. for the creation of aninstance later in the example. It returns an | |
| instance of the class Instance, which is defined next.<P> | |
| </UL> | |
| <P>The class Instance is the class used for all instances of classes | |
| built using the Tracing metaclass, e.g. aninstance. It has two | |
| methods: | |
| <P> | |
| <UL> | |
| <LI>The __init__ method is invoked from the Tracing.__call__ method | |
| above to initialize a new instance. It saves the class reference as | |
| an instance variable. It uses a funny name because the user's | |
| instance variables (e.g. self.a later in the example) live in the same | |
| namespace.<P> | |
| <LI>The __getattr__ method is invoked whenever the user code | |
| references an attribute of the instance that is not an instance | |
| variable (nor a class variable; but except for __init__ and | |
| __getattr__ there are no class variables). It will be called, for | |
| example, when aninstance.method1 is referenced in the example, with | |
| self set to aninstance and name set to the string "method1".<P> | |
| </UL> | |
| <P>The __getattr__ method looks the name up in the __namespace__ | |
| dictionary. If it isn't found, it raises an AttributeError exception. | |
| (In a more realistic example, it would first have to look through the | |
| base classes as well.) If it is found, there are two possibilities: | |
| it's either a function or it isn't. If it's not a function, it is | |
| assumed to be a class variable, and its value is returned. If it's a | |
| function, we have to ``wrap'' it in an instance of yet another helper | |
| class, BoundMethod. | |
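<P>The base class search mentioned in parentheses could look roughly like this. It is a sketch, not the code of Trace.py: the helper name lookup is hypothetical, the search is a simple depth-first, left-to-right walk, and the syntax is Python 3 rather than 1.5. To keep it short, the bound method here is a plain closure with the tracing output left out.

```python
import types

class Tracing:
    def __init__(self, name, bases, namespace):
        self.__name__ = name
        self.__bases__ = bases
        self.__namespace__ = namespace
    def __call__(self):
        return Instance(self)
    def lookup(self, name):
        """Hypothetical helper: search this class, then its bases, depth first."""
        if name in self.__namespace__:
            return self.__namespace__[name]
        for base in self.__bases__:
            try:
                return base.lookup(name)
            except AttributeError:
                pass
        raise AttributeError(name)

class Instance:
    def __init__(self, klass):
        self.__klass__ = klass
    def __getattr__(self, name):
        value = self.__klass__.lookup(name)   # raises AttributeError if absent
        if not isinstance(value, types.FunctionType):
            return value                      # a class variable
        return lambda *args: value(self, *args)   # bind self, no tracing here

Trace = Tracing('Trace', (), {})

class Base(Trace):
    greeting = "hello"
    def describe(self):
        return "described by Base"

class Derived(Base):
    def shout(self):
        return self.greeting.upper()

d = Derived()
print(d.describe(), d.shout())
```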
| <P>The BoundMethod class is needed to implement a familiar feature: | |
| when a method is defined, it has an initial argument, self, which is | |
| automatically bound to the relevant instance when it is called. For | |
| example, aninstance.method1(10) is equivalent to method1(aninstance, | |
| 10). In the example, for this call, first a temporary BoundMethod | |
| instance is created with the following constructor call: temp = | |
| BoundMethod(method1, aninstance); then this instance is called as | |
| temp(10). After the call, the temporary instance is discarded. | |
| <P> | |
| <UL> | |
| <LI>The __init__ method is invoked for the constructor call | |
| BoundMethod(method1, aninstance). It simply saves away its | |
| arguments.<P> | |
| <LI>The __call__ method is invoked when the bound method instance is | |
| called, as in temp(10). It needs to call method1(aninstance, 10). | |
| However, even though self.function is now method1 and self.instance is | |
| aninstance, it can't call self.function(self.instance, args) directly, | |
| because it should work regardless of the number of arguments passed. | |
| (For simplicity, support for keyword arguments has been omitted.)<P> | |
| </UL> | |
| <P>In order to be able to support arbitrary argument lists, the | |
| __call__ method first constructs a new argument tuple. Conveniently, | |
| because of the notation *args in __call__'s own argument list, the | |
| arguments to __call__ (except for self) are placed in the tuple args. | |
| To construct the desired argument list, we concatenate a singleton | |
| tuple containing the instance with the args tuple: (self.instance,) + | |
| args. (Note the trailing comma used to construct the singleton | |
| tuple.) In our example, the resulting argument tuple is (aninstance, | |
| 10). | |
| <P>The intrinsic function apply() takes a function and an argument | |
| tuple and calls the function with those arguments. In our example, we are calling | |
| apply(method1, (aninstance, 10)) which is equivalent to calling | |
| method1(aninstance, 10). | |
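<P>For readers on later versions of Python: apply() was eventually removed from the language, and the same call is written with the *args notation at the call site. A small sketch, with a stand-in method1 and a string standing in for aninstance (both purely illustrative):

```python
def method1(self, a):
    """Stand-in for the method1 of the example."""
    return (self, a)

instance = "aninstance"          # placeholder for the real instance
args = (10,)                     # what __call__ receives via *args

call_args = (instance,) + args   # the singleton-tuple concatenation from the text
result = method1(*call_args)     # later spelling of apply(method1, call_args)
print(result)
```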
| <P>From here on, things should come together quite easily. The output | |
| of the example code is something like this: | |
| <PRE> | |
| calling &lt;function method1 at ae8d8&gt; for &lt;Instance instance at 95ab0&gt; with (10,) | |
| calling &lt;function method2 at ae900&gt; for &lt;Instance instance at 95ab0&gt; with () | |
| the answer is 10 | |
| </PRE> | |
| <P>That was about the shortest meaningful example that I could come up | |
| with. A real tracing metaclass (for example, <A | |
| HREF="#Trace">Trace.py</A> discussed below) needs to be more | |
| complicated in two dimensions. | |
| <P>First, it needs to support more advanced Python features such as | |
| class variables, inheritance, __init__ methods, and keyword arguments. | |
| <P>Second, it needs to provide a more flexible way to handle the | |
| actual tracing information; perhaps it should be possible to write | |
| your own tracing function that gets called, perhaps it should be | |
| possible to enable and disable tracing on a per-class or per-instance | |
| basis, and perhaps a filter so that only interesting calls are traced; | |
| it should also be able to trace the return value of the call (or the | |
| exception it raised if an error occurs). Even the Trace.py example | |
| doesn't support all these features yet. | |
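<P>To make the second point concrete, here is a rough sketch of a bound-method wrapper with a pluggable trace function that also reports return values and exceptions and accepts keyword arguments. It is not taken from Trace.py; the class name TracingBoundMethod and the three-argument trace callback are assumptions of this sketch, and the syntax is Python 3.

```python
class TracingBoundMethod:
    def __init__(self, function, instance, trace=print):
        self.function = function
        self.instance = instance
        self.trace = trace          # any callable taking (event, name, payload)

    def __call__(self, *args, **kwargs):
        name = self.function.__name__
        self.trace("call", name, (args, kwargs))
        try:
            result = self.function(self.instance, *args, **kwargs)
        except Exception as exc:
            self.trace("raise", name, exc)   # report, then let it propagate
            raise
        self.trace("return", name, result)
        return result

# A user-supplied trace function that collects events instead of printing.
events = []
def record(event, name, payload):
    events.append((event, name, payload))

def divide(self, a, b=1):
    return a / b

bound = TracingBoundMethod(divide, instance=None, trace=record)
print(bound(6, b=3))
print(events)
```

Passing the trace function in as a parameter is one way to get per-instance control; a filter could be layered on by wrapping the callback.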
| <P> | |
| <HR> | |
| <H2>Real-life Examples</H2> | |
| <P>Have a look at some very preliminary examples that I coded up to | |
| teach myself how to write metaclasses: | |
| <DL> | |
| <DT><A HREF="Enum.py">Enum.py</A> | |
| <DD>This (ab)uses the class syntax as an elegant way to define | |
| enumerated types. The resulting classes are never instantiated -- | |
| rather, their class attributes are the enumerated values. For | |
| example: | |
| <PRE> | |
| class Color(Enum): | |
|     red = 1 | |
|     green = 2 | |
|     blue = 3 | |
| print Color.red | |
| </PRE> | |
| will print the string ``Color.red'', while ``Color.red==1'' is true, | |
| and ``Color.red + 1'' raises a TypeError exception. | |
| <P> | |
| <DT><A NAME=Trace></A><A HREF="Trace.py">Trace.py</A> | |
| <DD>The resulting classes work much like standard | |
| classes, but by setting a special class or instance attribute | |
| __trace_output__ to point to a file, all calls to the class's methods | |
| are traced. It was a bit of a struggle to get this right. This | |
| should probably be redone using the generic metaclass below. | |
| <P> | |
| <DT><A HREF="Meta.py">Meta.py</A> | |
| <DD>A generic metaclass. This is an attempt at finding out how much | |
| standard class behavior can be mimicked by a metaclass. The | |
| preliminary answer appears to be that everything's fine as long as the | |
| class (or its clients) don't look at the instance's __class__ | |
| attribute, nor at the class's __dict__ attribute. The use of | |
| __getattr__ internally makes the classic implementation of __getattr__ | |
| hooks tough; we provide a similar hook _getattr_ instead. | |
| (__setattr__ and __delattr__ are not affected.) | |
| (XXX Hm. Could detect presence of __getattr__ and rename it.) | |
| <P> | |
| <DT><A HREF="Eiffel.py">Eiffel.py</A> | |
| <DD>Uses the above generic metaclass to implement Eiffel style | |
| pre-conditions and post-conditions. | |
| <P> | |
| <DT><A HREF="Synch.py">Synch.py</A> | |
| <DD>Uses the above generic metaclass to implement synchronized | |
| methods. | |
| <P> | |
| <DT><A HREF="Simple.py">Simple.py</A> | |
| <DD>The example module used above. | |
| <P> | |
| </DL> | |
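<P>To give a flavor of the first of these, the behavior described for Enum.py can be approximated as follows. This is emphatically not the contents of Enum.py: the names EnumValue and EnumMetaClass are invented here, the metaclass is derived from the built-in type (possible in Python 3, not in 1.5), and only the three properties mentioned above are reproduced.

```python
class EnumValue:
    """Hypothetical wrapper for one enumerated value."""
    def __init__(self, classname, name, value):
        self.classname = classname
        self.name = name
        self.value = value
    def __repr__(self):
        return "%s.%s" % (self.classname, self.name)
    def __eq__(self, other):
        if isinstance(other, EnumValue):
            other = other.value
        return self.value == other
    def __hash__(self):
        return hash(self.value)
    # No __add__ and friends, so Color.red + 1 raises TypeError.

class EnumMetaClass(type):
    """Assumption: a Python 3 metaclass derived from type."""
    def __new__(meta, name, bases, namespace):
        wrapped = {}
        for key, value in namespace.items():
            # Wrap plain integer class attributes; leave everything else alone.
            if isinstance(value, int) and not key.startswith('_'):
                value = EnumValue(name, key, value)
            wrapped[key] = value
        return super().__new__(meta, name, bases, wrapped)

Enum = EnumMetaClass('Enum', (), {})

class Color(Enum):
    red = 1
    green = 2
    blue = 3

print(Color.red)
print(Color.red == 1)
```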
| <P>A pattern seems to be emerging: almost all these uses of | |
| metaclasses (except for Enum, which is probably more cute than useful) | |
| work by placing wrappers around method calls. An obvious | |
| problem with that is that it's not easy to combine the features of | |
| different metaclasses, while this would actually be quite useful: for | |
| example, I wouldn't mind getting a trace from the test run of the | |
| Synch module, and it would be interesting to add preconditions to it | |
| as well. This needs more research. Perhaps a metaclass could be | |
| provided that allows stackable wrappers... | |
| <P> | |
| <HR> | |
| <H2>Things You Could Do With Metaclasses</H2> | |
| <P>There are lots of things you could do with metaclasses. Most of | |
| these can also be done with creative use of __getattr__, but | |
| metaclasses make it easier to modify the attribute lookup behavior of | |
| classes. Here's a partial list. | |
| <P> | |
| <UL> | |
| <LI>Enforce different inheritance semantics, e.g. automatically call | |
| base class methods when a derived class overrides<P> | |
| <LI>Implement class methods (e.g. if the first argument is not named | |
| 'self')<P> | |
| <LI>Implement that each instance is initialized with <b>copies</b> of | |
| all class variables<P> | |
| <LI>Implement a different way to store instance variables (e.g. in a | |
| list kept outside the instance but indexed by the instance's id())<P> | |
| <LI>Automatically wrap or trap all or certain methods | |
| <UL> | |
| <LI>for tracing | |
| <LI>for precondition and postcondition checking | |
| <LI>for synchronized methods | |
| <LI>for automatic value caching | |
| </UL> | |
| <P> | |
| <LI>When an attribute is a parameterless function, call it on | |
| reference (to mimic it being an instance variable); same on assignment<P> | |
| <LI>Instrumentation: see how many times various attributes are used<P> | |
| <LI>Different semantics for __setattr__ and __getattr__ (e.g. disable | |
| them when they are being used recursively)<P> | |
| <LI>Abuse class syntax for other things<P> | |
| <LI>Experiment with automatic type checking<P> | |
| <LI>Delegation (or acquisition)<P> | |
| <LI>Dynamic inheritance patterns<P> | |
| <LI>Automatic caching of methods<P> | |
| </UL> | |
| <P> | |
| <HR> | |
| <H4>Credits</H4> | |
| <P>Many thanks to David Ascher and Donald Beaudry for their comments | |
| on an earlier draft of this paper. Also thanks to Matt Conway and Tommy | |
| Burnette for putting a seed for the idea of metaclasses in my | |
| mind, nearly three years ago, even though at the time my response was | |
| ``you can do that with __getattr__ hooks...'' :-) | |
| <P> | |
| <HR> | |
| </BODY> | |
| </HTML> |