The Python open source library that everyone should use

The One Python Library Everyone Needs

This article has been authorized by the original author Glyph

The Nuggets translation Project

Translator: Gran

Proofreader: Siegen, Chen Chaobang

Why, you ask? Don’t ask. Just use it.

All right, all right, let me go over this.

I love Python; it has been my primary programming language for over a decade, and although some interesting languages have emerged and grown in that time, I have no plans to switch to another one.

But Python isn’t perfect. In some cases, it pushes you to do the wrong thing. You can see it in how widespread class inheritance and the god-object antipattern are in many libraries.

One possible reason is that Python is a very easy language to learn, so less experienced programmers make mistakes, and those mistakes persist.

But I think perhaps the more important reason is that Python sometimes punishes you for trying to do the “right thing.”

Doing the “right thing” in the context of object design means having many small, independent classes that each do one thing and do it well. For example, if you find that your object has accumulated a large number of private methods, perhaps you should turn those “private” methods into public methods of private attributes. But if doing that is tedious, you probably won’t bother.

Another place you should probably define an object is when you have a bag of related data whose relationships, invariants, and behavior need explaining. Python makes it very easy to just define a tuple or a list instead. The first few times you type host, port = … instead of address = …, it doesn’t seem like a big problem. But soon you are typing [(family, socktype, proto, canonname, sockaddr)] = … everywhere, and your life is filled with regret. That is, if you’re lucky. If you’re not lucky, you are maintaining code that does things like values[0][7][4][HOSTNAME]["canonical"], and your life is filled with garbled pain rather than regret.
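To make the contrast concrete, here is a small sketch. The record is hand-written in the 5-tuple shape that `socket.getaddrinfo` returns, so nothing touches the network; the `Address` class and its `as_url` method are hypothetical names chosen only for illustration.

```python
import socket

# A hand-written record in the 5-tuple shape returned by socket.getaddrinfo.
records = [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("127.0.0.1", 8080))]

# Unlabelled access: the reader has to remember what each position means.
port_by_index = records[0][4][1]

# Unpacking names the pieces, but every call site must repeat the full shape.
[(family, socktype, proto, canonname, sockaddr)] = records
host, port = sockaddr


class Address(object):
    """Hypothetical small class: gives the pair a name and a home for behavior."""

    def __init__(self, host, port):
        self.host = host
        self.port = port

    def as_url(self):
        return "tcp://{}:{}".format(self.host, self.port)


address = Address(host, port)
print(port_by_index, address.as_url())  # 8080 tcp://127.0.0.1:8080
```

Even this tiny class needs a hand-written `__init__`, which is the cost the rest of the article is about.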

This raises the question: is creating a class in Python a pain in the neck? Let’s look at a simple data structure: a three-dimensional Cartesian coordinate. It should be simple enough to start with.

class Point3D(object):

So far so good. So we have a three-dimensional point, what’s next?

class Point3D(object):
    def __init__(self, x, y, z):

Wait, I just want a container for a small amount of data, and I already have to override a special method of the Python runtime that follows an internal naming convention? Well, I suppose that’s not too bad; after a while, all programming is just strange symbols anyway.

At least I can see my attribute names, which makes sense.

class Point3D(object):
    def __init__(self, x, y, z):
        self.x

I’ve already said I want an x, but now I have to assign it as an attribute…

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x

… to x? Well, apparently…

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

… and now I have to do that once for each attribute, so this actually scales badly? I have to type every attribute name three times?!

Oh well. At least now I’ve done it.

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):

Wait, what do you mean I’m not finished yet?

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))

Oh, come on. So I have to type the name of each attribute five times if I want to be able to see what’s in this thing when I’m debugging, something a tuple would have given me for free?!

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)

Seven times?!

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)
    def __lt__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) < (other.x, other.y, other.z)

Nine times?!

from functools import total_ordering
@total_ordering
class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)
    def __lt__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) < (other.x, other.y, other.z)

Okay, phew: two more lines of code isn’t great, but at least now we don’t have to define all the other comparison methods. But now we’re done, right?

from unittest import TestCase
class Point3DTests(TestCase):

You know what? I’ve had enough. Twenty lines of code so far, and this class doesn’t even do anything yet; the hard part of this problem was supposed to be the quaternion solver, not “making a data structure that can be printed and compared.” Piles of undocumented tuples, lists, and dictionaries it is, then; defining proper data structures in Python is just too much work.
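For reference, here is a quick check of what those twenty-odd lines actually buy. The class is repeated so the snippet runs on its own; the sample values are arbitrary.

```python
from functools import total_ordering


@total_ordering
class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return (self.__class__.__name__ +
                "(x={}, y={}, z={})".format(self.x, self.y, self.z))

    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)

    def __lt__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) < (other.x, other.y, other.z)


a = Point3D(1, 2, 3)
b = Point3D(3, 2, 1)

print(repr(a))                 # Point3D(x=1, y=2, z=3)
print(a == Point3D(1, 2, 3))   # True
# total_ordering derives >, >= and <= from __eq__ and __lt__.
print(b > a, a <= b, a >= b)   # True True False
```

One more wrinkle: on Python 3, defining `__eq__` without `__hash__` makes instances unhashable, so using these points as dictionary keys would take yet another method.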

`namedtuple` to the rescue (not really)

The standard library’s answer to this conundrum is namedtuple. While it is a valiant first draft (it bears many similarities to my own somewhat awkward and outdated entry in this genre), namedtuple is unfortunately beyond saving. It exports a huge amount of undesirable public functionality that would be a compatibility nightmare to maintain, and it doesn’t address half the problems you run into. A full enumeration of its shortcomings would be tedious, but here are a few of the highlights:

Its fields are accessible as numbered indexes whether you want them to be or not. Among other things, this means you can’t have private attributes, because they are exposed via the apparently public __getitem__ interface.
It compares equal to a raw tuple with the same values, so it’s easy to get into bizarre type confusion, especially if you’re trying to use it to migrate away from tuples and lists.
It’s a tuple, so it’s always immutable. Sort of.

As for that last point, you can either use it like this:

from collections import namedtuple
Point3D = namedtuple('Point3D', ['x', 'y', 'z'])

In this case, it doesn’t look like a type in your code; simple syntax-analysis tools without special cases won’t recognize it as one. You can’t give it any other behavior this way, since there’s nowhere to put a method. Not to mention the fact that you had to type the class’s name twice.

Or you can do this using inheritance:

class Point3D(namedtuple('_Point3DBase', 'x y z'.split())):
    pass

This gives you a place to put methods and a docstring, and it generally looks like a class, which it is… but in return you now have a strange internal name (which, at least on older Python versions, is what shows up in the repr, rather than the actual name of the class). You have also silently made attributes not listed here mutable, a strange side effect of adding the class declaration; that is, unless you add __slots__ = 'x y z'.split() to the class body, and then we’re just back to typing every attribute name twice.

And that’s without even mentioning that it has been shown you shouldn’t use inheritance.

So namedtuple can be an improvement if it’s all you’ve got, but only in some cases, and it has its own weird baggage.
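A short sketch of those quirks, using only the standard library. The name `Point3DSub` is hypothetical; it is there only to keep the plain and the subclassed variants apart in one snippet.

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', ['x', 'y', 'z'])
p = Point3D(1, 2, 3)

# Fields are reachable by index, wanted or not.
print(p[0])              # 1

# It compares equal to a plain tuple holding the same values.
print(p == (1, 2, 3))    # True

# Declared fields are read-only: assignment raises AttributeError.
try:
    p.x = 10
except AttributeError:
    print("cannot assign to x")


class Point3DSub(namedtuple('_Point3DBase', 'x y z'.split())):
    pass


# The subclass gains an instance __dict__, so attributes that were never
# declared can be added silently.
q = Point3DSub(1, 2, 3)
q.w = 4
print(q.w)               # 4
```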

Enter `attr`

So this is where my favorite mandatory Python library comes in.

Let’s revisit the problem above. How would I write Point3D with attrs?

import attr
@attr.s

Since this isn’t built into the language, we do need two lines of boilerplate to get started: the import, and the decorator saying we are about to use it.

import attr
@attr.s
class Point3D(object):

Look, no inheritance! By using a class decorator, Point3D remains a plain old Python class.

import attr
@attr.s
class Point3D(object):
    x = attr.ib()

It has an attribute called x.

import attr
@attr.s
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()

And one called y and one called z, and we’re done.

We’re done? Wait. What about a nice string representation?

>>> Point3D(1, 2, 3)
Point3D(x=1, y=2, z=3)

Comparison?

>>> Point3D(1, 2, 3) == Point3D(1, 2, 3)
True
>>> Point3D(3, 2, 1) == Point3D(1, 2, 3)
False
>>> Point3D(3, 2, 3) > Point3D(1, 2, 3)
True

Okay, but what if I want to extract the data defined in explicit attributes in a format suitable for JSON serialization?

>>> attr.asdict(Point3D(1, 2, 3))
{'y': 2, 'x': 1, 'z': 3}

Maybe that last one was a little on the nose. Nevertheless, it is one of many things that become easier because attrs lets you declare the fields on your class, along with lots of potentially interesting metadata about them, and then get that metadata back out.

>>> import pprint
>>> pprint.pprint(attr.fields(Point3D))
(Attribute(name='x', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None),
 Attribute(name='y', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None),
 Attribute(name='z', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None))
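As a minimal sketch of putting both of these to use, assuming the `attrs` package is installed: `attr.asdict` returns a plain dict that the standard `json` module can serialize, and `attr.fields` can be walked to recover the declared names. The variable name `payload` is just for illustration.

```python
import json

import attr


@attr.s
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()


# attr.asdict returns a plain dict, which json.dumps can serialize directly.
payload = json.dumps(attr.asdict(Point3D(1, 2, 3)), sort_keys=True)
print(payload)  # {"x": 1, "y": 2, "z": 3}

# attr.fields exposes the declared attributes in declaration order.
print([field.name for field in attr.fields(Point3D)])  # ['x', 'y', 'z']
```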

I’m not going to dive into every interesting feature of attrs here; you can read the documentation for that. Plus, it’s well maintained, so new goodies show up every so often and I might miss something important. But attrs does a few key things that, once you have them, you realize Python was sorely lacking:

It lets you define types concisely, as opposed to the normally quite verbose manual def __init__.... Types without typing.
It lets you say what you mean directly with a declaration, rather than expressing it in a roundabout imperative recipe. Instead of “I have a type, it’s called MyType, it has a constructor, and in the constructor I assign the attribute ‘A’ to the parameter ‘A’ (and so on)”, you say “I have a type, it’s called MyType, and it has an attribute called a”, and behavior is derived from that fact, rather than having to guess at the fact later by reverse-engineering it from behavior (for example, by running dir on an instance, or looking in self.__class__.__dict__).
It provides useful default behavior, as opposed to Python’s sometimes useful but often backwards defaults.
It adds a place for you to put a more rigorous implementation later, while starting out simple.

Let’s explore the last point.

Progressive enhancement

While I’m not going to talk about every feature, I’d be remiss if I didn’t mention a few of them. As you can see from those long repr()s of Attribute above, there are a number of interesting ones.

For example, you can validate attributes when they are passed into an @attr.s-decorated class. Our Point3D, for example, should probably contain numbers. For simplicity’s sake, we could say that means instances of float, like so:

import attr
from attr.validators import instance_of
@attr.s
class Point3D(object):
    x = attr.ib(validator=instance_of(float))
    y = attr.ib(validator=instance_of(float))
    z = attr.ib(validator=instance_of(float))
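A quick sketch of what the validator does at construction time, assuming the `attrs` package is installed. The class is repeated so the snippet is self-contained, and `rejected` is a hypothetical variable used only to record the outcome.

```python
import attr
from attr.validators import instance_of


@attr.s
class Point3D(object):
    x = attr.ib(validator=instance_of(float))
    y = attr.ib(validator=instance_of(float))
    z = attr.ib(validator=instance_of(float))


print(Point3D(1.0, 2.0, 3.0))  # Point3D(x=1.0, y=2.0, z=3.0)

# An int is not a float, so the validator rejects it when the object is built.
rejected = None
try:
    Point3D(1, 2.0, 3.0)
except TypeError as error:
    rejected = type(error).__name__
print("rejected with", rejected)  # rejected with TypeError
```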

The fact that we are using attrs means we have a place to put this extra validation: we can just add type information to each attribute as we need it. Some of these facilities let us avoid other common mistakes. For example, this is a popular “spot the bug” Python interview question:

class Bag:
    def __init__(self, contents=[]):
        self._contents = contents
    def add(self, something):
        self._contents.append(something)
    def get(self):
        return self._contents[:]

Fixing it, of course, turns it into this:

class Bag:
    def __init__(self, contents=None):
        if contents is None:
            contents = []
        self._contents = contents

That adds two extra lines of code.

contents inadvertently becomes a global variable here, so every Bag object that isn’t given a list of its own shares the same list. With attrs, this instead becomes:

@attr.s
class Bag:
    _contents = attr.ib(default=attr.Factory(list))
    def add(self, something):
        self._contents.append(something)
    def get(self):
        return self._contents[:]
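To see the difference in behavior, here is a side-by-side sketch, assuming the `attrs` package is installed. `SharedBag` is a hypothetical name for the buggy version so that both can live in one snippet; `leaked` and `fresh` are likewise illustrative.

```python
import attr


class SharedBag:
    # Bug: the default list is created once, when the function is defined.
    def __init__(self, contents=[]):
        self._contents = contents

    def add(self, something):
        self._contents.append(something)

    def get(self):
        return self._contents[:]


@attr.s
class Bag:
    # attr.Factory(list) calls list() again for each new instance.
    _contents = attr.ib(default=attr.Factory(list))

    def add(self, something):
        self._contents.append(something)

    def get(self):
        return self._contents[:]


first, second = SharedBag(), SharedBag()
first.add("apple")
leaked = second.get()
print(leaked)  # ['apple'], leaked over from the first bag

first, second = Bag(), Bag()
first.add("apple")
fresh = second.get()
print(fresh)   # []
```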

attrs provides several other features that give you opportunities to make your classes both more convenient and more correct. Another example? If you want to be strict about extraneous attributes on your objects (or more memory-efficient on CPython), you can pass slots=True at the class level, e.g. @attr.s(slots=True), to automatically turn your existing attrs declarations into a matching __slots__ attribute. All of these handy features let you make better and more powerful use of your attr.ib() declarations.

The Python of the future

Some people are excited about eventually being able to program in Python 3 everywhere. What I’m looking forward to is being able to program in Python-with-attrs everywhere. It has had a subtle but beneficial design influence on every code base I’ve seen it used in.

Give it a try: you may be surprised at the places where you now use a tidily explained class, where previously you might have used a sparsely documented tuple, list, or dict and endured the occasional confusion from co-maintainers. Now that it’s so easy to have structured types that clearly point in the direction of their purpose (in their __repr__, in their __doc__, or even just in the names of their attributes), you may find that you use a lot more of them. Your code will be better for it; I know mine has been.
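A minimal sketch of the strictness that slots=True buys, assuming the `attrs` package is installed; the attribute name `w` stands in for a typo.

```python
import attr


@attr.s(slots=True)
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()


p = Point3D(1, 2, 3)
p.x = 10  # declared attributes can still be assigned

# A stray name such as "w" is not silently stored as a new attribute.
try:
    p.w = 4
except AttributeError:
    print("no attribute w")

print(p)  # Point3D(x=10, y=2, z=3)
```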

Such methods are public in name only, because the attributes they live on aren’t meaningfully exposed to the caller. This pattern, getting rid of private methods entirely and having only private attributes, probably deserves a post of its own.
And we haven’t even gotten to the really exciting stuff yet: type validation on construction, default mutable arguments…
