This guide defines descriptors, outlines the descriptor protocol, and shows how descriptors are called. It examines a custom descriptor and several built-in Python descriptors, including functions, properties, static methods, and class methods. Each one is illustrated with a pure Python equivalent and a sample application.

Learning about descriptors not only gives you a larger toolset, it also gives a deeper understanding of how Python works and an appreciation for the elegance of its design.

Definition and Introduction

In general, a descriptor is an object attribute with “binding behavior”: one whose attribute access is overridden by the methods of the descriptor protocol. Those methods are __get__(), __set__(), and __delete__(). If any of these methods is defined for an object, that object is said to be a descriptor.

The default behavior for attribute access is to get, set, or delete the attribute in an object's dictionary. For example, a.x has a lookup chain that starts with a.__dict__['x'], then type(a).__dict__['x'], and continues through the base classes of type(a), excluding metaclasses. If the value found is an object that defines one of the descriptor methods, Python may override the default behavior and call the descriptor method instead. Where this happens in the precedence chain depends on which descriptor methods are defined.
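The default lookup chain can be observed directly. The following is a minimal sketch, assuming Python 3 and using hypothetical classes Base and A that do not appear in the original text. It places one attribute at each step of the chain and checks where each one actually lives.

```python
class Base:
    z = 'from Base'          # found last: dictionary of a base class


class A(Base):
    y = 'from A'             # found second: type(a).__dict__


a = A()
a.x = 'from instance'        # found first: a.__dict__

# Each name is stored at a different step of the lookup chain.
found_in_instance = 'x' in a.__dict__
found_in_class = 'y' in type(a).__dict__ and 'y' not in a.__dict__
found_in_base = 'z' in Base.__dict__ and 'z' not in type(a).__dict__

print(a.x, a.y, a.z)
```

None of these attributes is a descriptor, so every lookup simply returns the stored value.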

Descriptors are a powerful, general-purpose protocol. They are the mechanism behind properties, methods, static methods, class methods, and super(). Python itself uses them to implement the new-style classes introduced in version 2.2. Descriptors simplify the underlying C code and offer a flexible set of new tools for everyday Python programs.

Descriptor protocol

descr.__get__(self, obj, type=None) --> value

descr.__set__(self, obj, value) --> None

descr.__delete__(self, obj) --> None

That is all there is to the protocol. An object that defines any of these methods is considered a descriptor and can override the default behavior when it is looked up as an attribute.
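To make the three methods concrete, here is a minimal sketch of a descriptor that implements all of them. The names Stored and Point are hypothetical, and storing the value in the instance dictionary is just one possible design, assumed here for illustration.

```python
class Stored:
    """Hypothetical descriptor implementing all three protocol methods."""

    def __init__(self, name):
        self.name = name                      # key used in the instance dictionary

    def __get__(self, obj, objtype=None):
        if obj is None:                       # accessed on the class itself
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

    def __delete__(self, obj):
        del obj.__dict__[self.name]


class Point:
    x = Stored('x')


p = Point()
p.x = 3            # calls Stored.__set__(descriptor, p, 3)
value = p.x        # calls Stored.__get__(descriptor, p, Point)
del p.x            # calls Stored.__delete__(descriptor, p)
```

After the del statement the instance dictionary is empty again, while Point.x still returns the descriptor object itself.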

An object that defines both __get__() and __set__() is called a data descriptor. A descriptor that defines only __get__() is called a non-data descriptor (these are typically used for methods, though other uses are possible).

Data and non-data descriptors differ in how overrides are calculated with respect to entries in an instance's dictionary. If the instance dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence. If the instance dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.

(The instance dictionary is __dict__. For code examples of this precedence, see https://gist.github.com/icejoywoo/0f19fa8575ac664140fc)
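The precedence rule can be checked with a short experiment. This is a sketch under the assumption of Python 3; the classes NonData, Data, and C are hypothetical. Both names are written straight into the instance dictionary so that the descriptor and the dictionary entry compete for the same lookup.

```python
class NonData:
    def __get__(self, obj, objtype=None):
        return 'non-data descriptor'


class Data:
    def __get__(self, obj, objtype=None):
        return 'data descriptor'

    def __set__(self, obj, value):
        pass  # the mere presence of __set__() is what matters here


class C:
    nd = NonData()
    d = Data()


c = C()
# Write directly into the instance dictionary, bypassing __set__().
c.__dict__['nd'] = 'instance value'
c.__dict__['d'] = 'instance value'

print(c.nd)   # the instance dictionary wins over the non-data descriptor
print(c.d)    # the data descriptor wins over the instance dictionary
```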

To make a read-only data descriptor, define both __get__() and __set__(), with __set__() raising an AttributeError when called. Defining __set__() with an exception-raising placeholder is enough to make the object a data descriptor.
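A read-only data descriptor might look like the following sketch. The names ReadOnly and Config are hypothetical and the stored value is arbitrary; the only point is that assignment fails while reading still works.

```python
class ReadOnly:
    """Hypothetical read-only data descriptor."""

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, objtype=None):
        return self.value

    def __set__(self, obj, value):
        # Defining __set__() at all makes this a data descriptor.
        raise AttributeError('attribute is read-only')


class Config:
    version = ReadOnly('1.0')


cfg = Config()
try:
    cfg.version = '2.0'
    assignment_failed = False
except AttributeError:
    assignment_failed = True

print(cfg.version, assignment_failed)
```

Because the assignment is intercepted by __set__(), nothing is ever written to the instance dictionary.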

Invoking descriptors

A descriptor can be called directly by its method name. For example, d.__get__(obj).

More commonly, a descriptor is invoked automatically on attribute access. For example, obj.d looks up d in the dictionary of obj. If d defines the method __get__(), then d.__get__(obj) is invoked according to the precedence rules listed below.

The details of the invocation depend on whether obj is an object or a class.

For objects, the machinery is in object.__getattribute__(), which transforms b.x into type(b).__dict__['x'].__get__(b, type(b)). The implementation works through a precedence chain: data descriptors take priority over instance variables, instance variables take priority over non-data descriptors, and __getattr__(), if provided, has the lowest priority. The full C implementation is in PyObject_GenericGetAttr() in Objects/object.c.

For classes, the machinery is in type.__getattribute__(), which transforms B.x into B.__dict__['x'].__get__(None, B). In pure Python, it looks like this:

def __getattribute__(self, key):
    "Emulate type_getattro() in Objects/typeobject.c"
    v = object.__getattribute__(self, key)
    if hasattr(v, '__get__'):
        return v.__get__(None, self)
    return v
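Both transformations can be verified from ordinary code. The sketch below uses a hypothetical non-data descriptor, Reveal, whose __get__() simply returns the arguments it received, so the result shows exactly how it was called.

```python
class Reveal:
    """Hypothetical non-data descriptor that reports its arguments."""

    def __get__(self, obj, objtype=None):
        return (obj, objtype)


class B:
    x = Reveal()


b = B()

# Instance access: b.x -> type(b).__dict__['x'].__get__(b, type(b))
instance_matches = b.x == type(b).__dict__['x'].__get__(b, type(b))

# Class access: B.x -> B.__dict__['x'].__get__(None, B)
class_matches = B.x == B.__dict__['x'].__get__(None, B)
```

Access through the instance passes the instance and its type, while access through the class passes None in place of the instance.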

Important points to keep in mind:

  • Descriptors are invoked by the __getattribute__() method
  • Overriding __getattribute__() prevents automatic descriptor calls
  • object.__getattribute__() and type.__getattribute__() call __get__() in different ways
  • Data descriptors always override instance dictionaries
  • Non-data descriptors may be overridden by instance dictionaries
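The second point is easy to overlook, so here is a small sketch of it. The classes Answer, Normal, and Bypassed are hypothetical; Bypassed replaces __getattribute__() with a version that returns the raw class attribute and never calls __get__().

```python
class Answer:
    """Hypothetical non-data descriptor."""

    def __get__(self, obj, objtype=None):
        return 42


class Normal:
    x = Answer()


class Bypassed:
    x = Answer()

    def __getattribute__(self, name):
        # Return the raw class attribute without invoking __get__().
        return type(self).__dict__[name]


normal_value = Normal().x        # descriptor invoked, result is 42
raw_value = Bypassed().x         # descriptor object returned as-is
```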

The object returned by super() also has a custom __getattribute__() method for invoking descriptors. The call super(B, obj).m() searches obj.__class__.__mro__ for the base class A that immediately follows B and then returns A.__dict__['m'].__get__(obj, B). If m is not a descriptor, it is returned unchanged. If m is not in the dictionary, the lookup falls back to a search using object.__getattribute__().
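The equivalence described for super() can be checked with two small classes. This is only a sketch: A and B are hypothetical, and it assumes Python 3, where __self__ and __func__ expose the two halves of a bound method.

```python
class A:
    def m(self):
        return 'A.m'


class B(A):
    def m(self):
        return 'B.m'


obj = B()

via_super = super(B, obj).m                      # lookup starts after B in the MRO
via_protocol = A.__dict__['m'].__get__(obj, B)   # the explicit equivalent

print(via_super(), via_protocol())
```

Both expressions produce a method bound to obj that wraps the function stored in A's dictionary, so B's own m() is skipped.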

The implementation details are in super_getattro() in Objects/typeobject.c. Guido's tutorial contains an equivalent pure Python implementation.

The details above show that the descriptor mechanism is embedded in the __getattribute__() methods of object, type, and super(). Classes inherit this machinery when they derive from object or when their metaclass provides similar functionality. Likewise, a class can turn off descriptor invocation by overriding __getattribute__().

Descriptor example

The following code creates a class whose objects are data descriptors that print a message on each get or set. Overriding __getattribute__() is an alternative that would do this for every attribute. A descriptor, however, is the better fit for monitoring just a few chosen attributes:

class RevealAccess(object):
    """A data descriptor that sets and returns values
       normally and prints a message logging their access.
    """

    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype):
        print('Retrieving', self.name)
        return self.val

    def __set__(self, obj, val):
        print('Updating', self.name)
        self.val = val

>>> class MyClass(object):
...     x = RevealAccess(10, 'var "x"')
...     y = 5
...
>>> m = MyClass()
>>> m.x
Retrieving var "x"
10
>>> m.x = 20
Updating var "x"
>>> m.x
Retrieving var "x"
20
>>> m.y
5

The protocol is simple and offers exciting possibilities. Several use cases are so common that they have been packaged into individual function calls. Properties, bound and unbound methods, static methods, and class methods are all based on the descriptor protocol.

Properties

Calling property() is a succinct way of building a data descriptor that triggers function calls when an attribute is accessed. Its signature is:

property(fget=None, fset=None, fdel=None, doc=None) -> property attribute

The documentation shows a typical use that defines a managed attribute x:

class C(object):
    def getx(self): return self.__x
    def setx(self, value): self.__x = value
    def delx(self): del self.__x
    x = property(getx, setx, delx, "I'm the 'x' property.")

To see how property() is implemented in terms of the descriptor protocol, here is a pure Python equivalent:

class Property(object):
    "Emulate PyProperty_Type() in Objects/descrobject.c"

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)

    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)

The property() builtin helps whenever a user interface has granted direct attribute access and later changes require a method to intervene.

For example, a spreadsheet class may grant access to a cell value through Cell('b10').value. A later improvement to the program requires the cell to be recalculated on every access; however, the programmer does not want to affect existing client code that accesses the attribute directly. The solution is to wrap access to the value attribute in a property data descriptor:
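The getter(), setter(), and deleter() methods are what make the decorator spelling of properties work. The sketch below uses the built-in property rather than the emulation, with a hypothetical class C, and exercises the same three branches: access on the class, a supplied setter, and a missing deleter.

```python
class C:
    def __init__(self):
        self._x = 0

    @property
    def x(self):
        "I'm the 'x' property."
        return self._x

    @x.setter
    def x(self, value):
        self._x = value


c = C()
c.x = 5                      # routed through the setter
on_class = C.x               # accessed on the class: the property object itself
doc = C.x.__doc__            # docstring taken from the getter

try:
    del c.x                  # no deleter was supplied
    delete_failed = False
except AttributeError:
    delete_failed = True
```

Each decorator call returns a new property that carries over the accessors already defined, which is why the final x has both a getter and a setter.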

class Cell(object):
    . . .
    def getvalue(self):
        "Recalculate the cell before returning value"
        self.recalc()
        return self._value
    value = property(getvalue)
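A runnable version of the same idea might look like the sketch below. Everything beyond the value property is an assumption made for illustration: the formula argument (a zero-argument callable), the recalc() body, and the recalc_count counter are all hypothetical.

```python
class Cell:
    def __init__(self, formula):
        self.formula = formula       # hypothetical: a zero-argument callable
        self._value = None
        self.recalc_count = 0        # hypothetical counter, for demonstration only

    def recalc(self):
        self._value = self.formula()
        self.recalc_count += 1

    @property
    def value(self):
        "Recalculate the cell before returning value"
        self.recalc()
        return self._value


cell = Cell(lambda: 2 + 3)
first = cell.value       # plain attribute syntax, but recalc() runs
second = cell.value      # and runs again on the next access
```

Client code keeps writing cell.value exactly as before, yet each access now triggers a recalculation.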

Functions and methods

Python's object-oriented features are built on a function-based environment. Non-data descriptors merge the two seamlessly.

Class dictionaries store methods as functions. In a class definition, methods are written with def and lambda, the usual tools for creating functions. The only difference from regular functions is that the first argument is reserved for the object instance. By Python convention, the instance reference is called self, but it may be called this or any other variable name.

To support method calls, functions include a __get__() method that binds them during attribute access. This means that all functions are non-data descriptors that return bound or unbound methods depending on whether they are invoked from an object or a class. In pure Python, it works like this:

class Function(object):
    . . .
    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        return types.MethodType(self, obj, objtype)

Running the interpreter shows how the function descriptor works in practice:

>>> class D(object):
...     def f(self, x):
...         return x
...
>>> d = D()
>>> D.__dict__['f']  # Stored internally as a function
<function f at 0x...>
>>> D.f              # Get from a class becomes an unbound method
<unbound method D.f>
>>> d.f              # Get from an instance becomes a bound method
<bound method D.f of <__main__.D object at 0x...>>

The output suggests that bound and unbound methods are two different types. While they could have been implemented that way, the actual C implementation of PyMethod_Type in Objects/classobject.c is a single object with two different representations, depending on whether the im_self field is set or is NULL (the C equivalent of None).

Likewise, the effect of calling a method object depends on the im_self field. If it is set (meaning bound), the original function (stored in the im_func field) is called with the first argument set to the instance. If unbound, all of the arguments are passed unchanged to the original function. The actual C implementation of instancemethod_call() is only slightly more complex because it includes some type checking.
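The binding step can also be performed by hand. The sketch below assumes Python 3, where the fields are spelled __self__ and __func__ instead of im_self and im_func, and where access through the class returns the plain function rather than an unbound method. It therefore demonstrates only the bound case.

```python
class D:
    def f(self, x):
        return x


d = D()

raw = D.__dict__['f']            # plain function stored in the class dictionary
bound = raw.__get__(d, D)        # what attribute access on an instance does

# __self__ and __func__ are the Python 3 spellings of im_self and im_func.
same_instance = bound.__self__ is d
same_function = bound.__func__ is raw
result = bound(7)                # equivalent to raw(d, 7)
```

Calling the bound method inserts the stored instance as the first argument, which is all that d.f(7) does behind the scenes.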

Static methods and class methods

Non-data descriptors provide a simple mechanism for variations on the usual pattern of binding functions into methods.

To recap, functions have a __get__() method so that they can be converted to methods when accessed as attributes. The non-data descriptor transforms an obj.f(*args) call into f(obj, *args). Calling klass.f(*args) becomes f(*args).

The following table summarizes the binding and its two most useful variants:

Transformation    Called from an Object    Called from a Class
function          f(obj, *args)            f(*args)
staticmethod      f(*args)                 f(*args)
classmethod       f(type(obj), *args)      f(klass, *args)
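The table can be reproduced with one class that defines all three kinds of callable. This is a sketch assuming Python 3 (in Python 2, calling the plain function through the class without an instance raises a TypeError). The class K and its method names are hypothetical, and each method just returns the arguments it received.

```python
class K:
    def plain(*args):
        return args

    @staticmethod
    def static(*args):
        return args

    @classmethod
    def klass(*args):
        return args


obj = K()

# One row per kind of callable, first from an object and then from the class.
from_obj = (obj.plain(1), obj.static(1), obj.klass(1))
from_cls = (K.plain(1), K.static(1), K.klass(1))
```

Only the plain function behaves differently between the two columns: it receives the instance when called from an object and nothing extra when called from the class.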

Static methods return the underlying function unchanged. Calling either c.f or C.f is equivalent to a direct lookup with object.__getattribute__(c, "f") or object.__getattribute__(C, "f"). As a result, the function is accessible in the same way from either an object or a class.

Good candidates for static methods are methods that do not reference the self variable.

For example, a statistics package may include a container class for experimental data. The class provides normal methods for computing the average, mean, median, and other descriptive statistics that depend on the data. However, there may be useful functions that are conceptually related but do not depend on the data. For instance, erf(x) is a handy conversion routine that comes up in statistical work but does not depend directly on a particular dataset. It can be called from either an object or the class: s.erf(1.5) --> .9661 or Sample.erf(1.5) --> .9661.

Because static methods return the underlying function unchanged, the example calls are unexciting:
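A container along those lines could be sketched as follows. The class layout is an assumption: Sample, its data argument, and the choice to delegate to the standard library's math.erf() are all hypothetical details added for illustration.

```python
import math


class Sample:
    """Hypothetical container for experimental data."""

    def __init__(self, data):
        self.data = list(data)

    def mean(self):
        # Depends on the instance's data, so it is a regular method.
        return sum(self.data) / len(self.data)

    @staticmethod
    def erf(x):
        # Independent of any dataset, so it is a static method.
        return math.erf(x)


s = Sample([1.0, 2.0, 3.0])
same_result = s.erf(1.5) == Sample.erf(1.5)
```

The static method keeps erf() grouped with the class it conceptually belongs to without pretending that it needs an instance.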

>>> class E(object):
...     def f(x):
...         print(x)
...     f = staticmethod(f)
...
>>> E.f(3)
3
>>> E().f(3)
3

Using the non-data descriptor protocol, a pure Python version of staticmethod() would look like this:

class StaticMethod(object):
    "Emulate PyStaticMethod_Type() in Objects/funcobject.c"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        return self.f

Unlike static methods, class methods prepend the class reference to the argument list before calling the function. The format is the same whether the caller is an object or a class:

>>> class E(object):
...     def f(klass, x):
...         return klass.__name__, x
...     f = classmethod(f)
...
>>> print(E.f(3))
('E', 3)
>>> print(E().f(3))
('E', 3)

This behavior is useful whenever the function only needs a class reference and does not care about any underlying data. One use for class methods is to create alternate class constructors. In Python 2.3, the class method dict.fromkeys() creates a new dictionary from a list of keys. The pure Python equivalent is:

class Dict(object):
    . . .
    def fromkeys(klass, iterable, value=None):
        "Emulate dict_fromkeys() in Objects/dictobject.c"
        d = klass()
        for key in iterable:
            d[key] = value
        return d
    fromkeys = classmethod(fromkeys)

Now a new dictionary of unique keys can be constructed like this:

>>> Dict.fromkeys('abracadabra')
{'a': None, 'r': None, 'b': None, 'c': None, 'd': None}

Using the non-data descriptor protocol, a pure Python version of classmethod() would look like this:
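The alternate-constructor pattern pays off with subclasses, because klass is whichever class the call was made on. The following is a runnable sketch, assuming that Dict derives from the built-in dict (the original elides the rest of the class) and adding a hypothetical subclass, Counted.

```python
class Dict(dict):
    @classmethod
    def fromkeys(klass, iterable, value=None):
        d = klass()                  # builds an instance of the calling class
        for key in iterable:
            d[key] = value
        return d


class Counted(Dict):
    """Hypothetical subclass; it inherits the alternate constructor."""


plain = Dict.fromkeys('abracadabra')
sub = Counted.fromkeys('ab', 0)
```

The same fromkeys() builds a Dict in the first call and a Counted in the second, with no extra code in the subclass.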

class ClassMethod(object):
    "Emulate PyClassMethod_Type() in Objects/funcobject.c"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)
        def newfunc(*args):
            return self.f(klass, *args)
        return newfunc
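The emulation can be exercised in place of the built-in classmethod(). In the sketch below the ClassMethod class is repeated so that the example runs on its own, and the subclass Sub is a hypothetical addition that shows which class gets passed in.

```python
class ClassMethod:
    "Emulation from the text, repeated so the example is self-contained."

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)        # called as descr.__get__(obj) only
        def newfunc(*args):
            return self.f(klass, *args)
        return newfunc


class E:
    def f(klass, x):
        return klass.__name__, x
    f = ClassMethod(f)


class Sub(E):
    pass


results = (E.f(3), E().f(3), Sub.f(3), Sub().f(3))
direct = E.__dict__['f'].__get__(E())(3)   # klass falls back to type(obj)
```

Access through Sub passes Sub rather than E, matching the behavior of the built-in for ordinary functions.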