The original | Unravelling binary arithmetic operations in Python

The author | Brett Cannon

Translator | Cat Under the Pea Flower (author of the "Python Cat" public account)

Notice | This translation is for learning and exchange purposes only and is released under the CC BY-NC-SA 4.0 license. The content has been lightly edited for readability.

The overwhelming response to my blog post explaining attribute access inspired me to write another article about how much of Python's syntax is really just syntactic sugar. In this article, I want to talk about binary arithmetic operations.

Specifically, I want to explain how subtraction works: a - b. I deliberately chose subtraction because it is not commutative. This underscores the importance of operand order: with addition, you could mistakenly flip a and b in the implementation and still get the same result, but with subtraction you could not.

Look at the C code

As usual, we start by looking at the bytecode compiled by the CPython interpreter.

>>> def sub(): a - b
...
>>> import dis
>>> dis.dis(sub)
  1           0 LOAD_GLOBAL              0 (a)
              2 LOAD_GLOBAL              1 (b)
              4 BINARY_SUBTRACT
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

Looks like we need to dig into the BINARY_SUBTRACT opcode. Looking through the Python/ceval.c file, you can see the C code that implements the opcode as follows:

case TARGET(BINARY_SUBTRACT): {
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *diff = PyNumber_Subtract(left, right);
    Py_DECREF(right);
    Py_DECREF(left);
    SET_TOP(diff);
    if (diff == NULL)
        goto error;
    DISPATCH();
}

Source: github.com/python/cpyt…

The key code here is PyNumber_Subtract(), which implements the actual semantics of subtraction. Following that function through a few macros leads you to the binary_op1() function, which provides a generic way to manage binary operations.

Instead of using it as a reference for implementation, however, we’ll use Python’s data model, which is well documented and clearly explains the semantics used for subtraction.

Learning from the data model

Reading through the documentation of the data model, you will see that two methods play a key role in implementing subtraction: __sub__ and __rsub__.

1. The __sub__() method

When a - b is executed, __sub__() is looked up on the type of a, and b is passed as its argument. Much like __getattribute__() in the article I wrote about attribute access, special/magic methods are resolved on the object's type rather than on the object itself, for performance reasons. In the sample code below, I use _mro_getattr() to represent this process.

Therefore, if the type of a defines __sub__(), then type(a).__sub__(a, b) is used to perform the subtraction. (Magic methods belong to the object's type, not to the object itself.)

This means that, in essence, subtraction is just a method call! You can also think of it as the operator.sub() function from the standard library.
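As a quick illustration of that equivalence, the sketch below computes the same difference three ways. Fraction is used only as a convenient example type; any type that defines __sub__() would do.

```python
import operator
from fractions import Fraction

a, b = Fraction(3, 4), Fraction(1, 4)

via_syntax = a - b                  # the operator itself
via_method = type(a).__sub__(a, b)  # explicit call on the type
via_operator = operator.sub(a, b)   # standard library function

# Operand order matters because subtraction is not commutative.
flipped = b - a
```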

We will model our own implementation on that function, using the names lhs and rhs for the left-hand and right-hand sides of a - b to make the sample code easier to follow.

# Subtraction implemented by calling __sub__()
from typing import Any


def sub(lhs: Any, rhs: Any, /) -> Any:
    """Implement the binary operation `a - b`."""
    lhs_type = type(lhs)
    try:
        # _mro_getattr() looks the method up on the type, not the instance.
        subtract = _mro_getattr(lhs_type, "__sub__")
    except AttributeError:
        msg = f"unsupported operand type(s) for -: {lhs_type!r} and {type(rhs)!r}"
        raise TypeError(msg)
    else:
        return subtract(lhs, rhs)

2. Let the right-hand side use __rsub__()

But what if a doesn't implement __sub__()? If a and b are of different types, we then try calling b's __rsub__() (the "r" in __rsub__ stands for "right", as in the right-hand side of the operator).

When the two sides of an operation are of different types, this ensures that both get a chance to make the expression work. When they are of the same type, we assume __sub__() can handle it. However, even when both sides share the same implementation, you still want to call __rsub__() in case one object's class is a subclass of the other's.
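The fallback to the right-hand side can be observed with two small classes. Plain and Tracker are hypothetical names made up for this sketch.

```python
class Plain:
    """A left operand whose type defines no __sub__() at all."""


class Tracker:
    """A right operand that only knows how to be subtracted *from*."""

    def __rsub__(self, other):
        # Called for `other - self`: `self` is the right-hand operand.
        return ("Tracker.__rsub__", type(other).__name__)


# Plain has no __sub__(), so Python falls back to Tracker.__rsub__().
result = Plain() - Tracker()

# The reverse order fails: Tracker has no __sub__(), Plain has no __rsub__().
try:
    Tracker() - Plain()
except TypeError:
    reverse_failed = True
else:
    reverse_failed = False
```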

3. Letting a type opt out

Now both sides of the expression can participate! But what if, for some reason, an object's type does not support subtraction with the other operand (for example, 4 - "stuff" is not supported)? In that case, all __sub__() or __rsub__() can do is return NotImplemented.

This signals to Python that it should move on to the next option to try to make the code work. For our code, this means we need to check a method's return value before assuming it worked.
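Here is a small sketch of a type that returns NotImplemented for operands it does not understand. The Meters class is a hypothetical example, not something from the original article.

```python
class Meters:
    """Illustrative length type that only subtracts compatible operands."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __sub__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented  # let Python try the other operand
        return Meters(self.value - other.value)

    def __rsub__(self, other):
        if not isinstance(other, (int, float)):
            return NotImplemented
        return Meters(other - self.value)


# Meters.__sub__() handles this directly.
same = (Meters(5) - Meters(2)).value

# int.__sub__() returns NotImplemented, so Meters.__rsub__() gets a turn.
reflected = (10 - Meters(4)).value

# Every candidate method declines, so Python raises TypeError.
try:
    Meters(1) - "stuff"
except TypeError as exc:
    error = str(exc)
```

Note that the methods return NotImplemented rather than raising an exception themselves; raising would stop Python from trying the remaining option.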

# An implementation of subtraction in which both the left and right sides of an expression can participate
from typing import Any

_MISSING = object()


def sub(lhs: Any, rhs: Any, /) -> Any:
    # debuiltins._mro_getattr() is the type-based lookup helper described above.
    # lhs.__sub__
    lhs_type = type(lhs)
    try:
        lhs_method = debuiltins._mro_getattr(lhs_type, "__sub__")
    except AttributeError:
        lhs_method = _MISSING

    # lhs.__rsub__ (for knowing if rhs.__rsub__ should be called first)
    try:
        lhs_rmethod = debuiltins._mro_getattr(lhs_type, "__rsub__")
    except AttributeError:
        lhs_rmethod = _MISSING

    # rhs.__rsub__
    rhs_type = type(rhs)
    try:
        rhs_method = debuiltins._mro_getattr(rhs_type, "__rsub__")
    except AttributeError:
        rhs_method = _MISSING

    call_lhs = lhs, lhs_method, rhs
    call_rhs = rhs, rhs_method, lhs

    # Only give the right-hand side a turn when the types differ.
    if lhs_type is not rhs_type:
        calls = call_lhs, call_rhs
    else:
        calls = (call_lhs,)

    for first_obj, meth, second_obj in calls:
        if meth is _MISSING:
            continue
        value = meth(first_obj, second_obj)
        if value is not NotImplemented:
            return value
    else:
        raise TypeError(
            f"unsupported operand type(s) for -: {lhs_type!r} and {rhs_type!r}"
        )

4. Subclasses take precedence over superclasses

If you look at the __rsub__() documentation, you'll notice a note. It says that if the right-hand side of a subtraction expression is an instance of a subclass of the left-hand side's type (a true subclass; the same class does not count), and the two objects' __rsub__() methods differ, then __rsub__() is called before __sub__(). In other words, if the type of b is a subclass of the type of a, the order of the calls is reversed.

This may seem like a strange exception, but there's a reason for it. When you create a subclass, you are injecting new logic into the operations provided by the parent class. The parent class knows nothing about that logic, so without this rule the parent could easily override what the subclass wants to do whenever it operates on an instance of the subclass.

Specifically, suppose you have a class named Spam, and evaluating Spam() - Spam() gives you an instance of LessSpam. Then you create a subclass of Spam named Bacon, such that subtracting a Bacon from a Spam gives you VeggieSpam.

Without this rule, Spam() - Bacon() would yield LessSpam, because Spam doesn't know that subtracting a Bacon should yield VeggieSpam.

With the rule, however, you get VeggieSpam as expected, because Bacon.__rsub__() is called first in the expression. (You also get the right result for Bacon() - Spam(), since Bacon.__sub__() is called first. That is why the rule requires the two classes' methods to differ, rather than merely requiring that one be a subclass of the other as determined by issubclass().)
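The Spam and Bacon scenario can be tried directly. The class bodies below are a minimal sketch of that example; the article only names the classes, so the method implementations here are assumptions made for illustration.

```python
class LessSpam:
    """Result of subtracting Spam from Spam."""


class VeggieSpam:
    """Result whenever a Bacon is involved in the subtraction."""


class Spam:
    def __sub__(self, other):
        return LessSpam()

    def __rsub__(self, other):
        return LessSpam()


class Bacon(Spam):
    # Both methods are overridden, so they differ from Spam's versions.
    def __sub__(self, other):
        return VeggieSpam()

    def __rsub__(self, other):
        return VeggieSpam()


# The right operand is a proper subclass with a different __rsub__(),
# so Bacon.__rsub__() runs before Spam.__sub__().
spam_minus_bacon = Spam() - Bacon()

# Bacon is on the left, so Bacon.__sub__() runs first as usual.
bacon_minus_spam = Bacon() - Spam()

# Same type on both sides: only Spam.__sub__() is consulted.
spam_minus_spam = Spam() - Spam()
```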

# Complete implementation of subtraction in Python
from typing import Any

_MISSING = object()


def sub(lhs: Any, rhs: Any, /) -> Any:
    # debuiltins._mro_getattr() is the type-based lookup helper described above.
    # lhs.__sub__
    lhs_type = type(lhs)
    try:
        lhs_method = debuiltins._mro_getattr(lhs_type, "__sub__")
    except AttributeError:
        lhs_method = _MISSING

    # lhs.__rsub__ (for knowing if rhs.__rsub__ should be called first)
    try:
        lhs_rmethod = debuiltins._mro_getattr(lhs_type, "__rsub__")
    except AttributeError:
        lhs_rmethod = _MISSING

    # rhs.__rsub__
    rhs_type = type(rhs)
    try:
        rhs_method = debuiltins._mro_getattr(rhs_type, "__rsub__")
    except AttributeError:
        rhs_method = _MISSING

    call_lhs = lhs, lhs_method, rhs
    call_rhs = rhs, rhs_method, lhs

    if (
        rhs_method is not _MISSING  # Do we care? (the original checked rhs_type, which is never _MISSING)
        and rhs_type is not lhs_type  # Could RHS be a subclass?
        and issubclass(rhs_type, lhs_type)  # RHS is a subclass!
        and lhs_rmethod is not rhs_method  # Is __r*__ actually different?
    ):
        calls = call_rhs, call_lhs
    elif lhs_type is not rhs_type:
        calls = call_lhs, call_rhs
    else:
        calls = (call_lhs,)

    for first_obj, meth, second_obj in calls:
        if meth is _MISSING:
            continue
        value = meth(first_obj, second_obj)
        if value is not NotImplemented:
            return value
    else:
        raise TypeError(
            f"unsupported operand type(s) for -: {lhs_type!r} and {rhs_type!r}"
        )

Generalize to other binary operations

Now that we've handled subtraction, what about the other binary operations? Well, it turns out they work the same way and just happen to use different special/magic method names.

So if we can generalize this approach, we get the semantics of 13 operations: +, -, *, @, /, //, %, **, <<, >>, &, ^, and |.
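For reference, the 13 operators and the special method names they map to can be written down as a small table. The dictionary name _BINARY_OPS is a hypothetical choice for this sketch; the method names themselves come from Python's data model.

```python
# Maps the base name of each special method to its operator token.
_BINARY_OPS = {
    "add": "+",
    "sub": "-",
    "mul": "*",
    "matmul": "@",
    "truediv": "/",
    "floordiv": "//",
    "mod": "%",
    "pow": "**",
    "lshift": "<<",
    "rshift": ">>",
    "and": "&",
    "xor": "^",
    "or": "|",
}

# For each operator: (left-hand method, reflected right-hand method).
method_names = {
    token: (f"__{name}__", f"__r{name}__") for name, token in _BINARY_OPS.items()
}
```

Every operator follows the same pattern, with the reflected method named by inserting an "r" in front of the base name.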

Thanks to closures and Python's flexibility with object introspection, we can factor out the creation of the operator functions.

# A function to create closures that implement binary operation logic
from typing import Any

_MISSING = object()


def _create_binary_op(name: str, operator: str) -> Any:
    """Create a binary operation function.

    The `name` parameter specifies the name of the special method used for the
    binary operation (e.g. `sub` for `__sub__`). The `operator` name is the
    token representing the binary operation (e.g. `-` for subtraction).

    """

    lhs_method_name = f"__{name}__"

    def binary_op(lhs: Any, rhs: Any, /) -> Any:
        """A closure implementing a binary operation in Python."""
        rhs_method_name = f"__r{name}__"

        # lhs.__*__
        lhs_type = type(lhs)
        try:
            lhs_method = debuiltins._mro_getattr(lhs_type, lhs_method_name)
        except AttributeError:
            lhs_method = _MISSING

        # lhs.__r*__ (for knowing if rhs.__r*__ should be called first)
        try:
            lhs_rmethod = debuiltins._mro_getattr(lhs_type, rhs_method_name)
        except AttributeError:
            lhs_rmethod = _MISSING

        # rhs.__r*__
        rhs_type = type(rhs)
        try:
            rhs_method = debuiltins._mro_getattr(rhs_type, rhs_method_name)
        except AttributeError:
            rhs_method = _MISSING

        call_lhs = lhs, lhs_method, rhs
        call_rhs = rhs, rhs_method, lhs

        if (
            rhs_method is not _MISSING  # Do we care? (the original checked rhs_type, which is never _MISSING)
            and rhs_type is not lhs_type  # Could RHS be a subclass?
            and issubclass(rhs_type, lhs_type)  # RHS is a subclass!
            and lhs_rmethod is not rhs_method  # Is __r*__ actually different?
        ):
            calls = call_rhs, call_lhs
        elif lhs_type is not rhs_type:
            calls = call_lhs, call_rhs
        else:
            calls = (call_lhs,)

        for first_obj, meth, second_obj in calls:
            if meth is _MISSING:
                continue
            value = meth(first_obj, second_obj)
            if value is not NotImplemented:
                return value
        else:
            exc = TypeError(
                f"unsupported operand type(s) for {operator}: {lhs_type!r} and {rhs_type!r}"
            )
            exc._binary_op = operator
            raise exc

    # Hand the closure back to the caller; the listing as given omitted this.
    return binary_op

With this code, you can define the subtraction operation as _create_binary_op("sub", "-") and then repeat the same for the other operations as needed.

Further reading

You can find more articles that unravel Python's syntax under the "syntactic sugar" tag of this blog. The source code is available at github.com/brettcannon…

Corrections

  • 2020-08-19: Fixed the rule for when __rsub__() is called before __sub__().
  • 2020-08-22: Fixed the code so that __rsub__() is not called when the types are the same; I also trimmed the transitional code, keeping only the opening and final examples, which was easier for me.
  • 2020-08-23: Added content to most of the examples.