Foreword

I recently read Effective Python for work. It is the first book in the Effective series that I have finished (I started Effective Java earlier but never read it all the way through), and I took a lot away from it. Writing these notes also gives me a chance to review the material.

The books in the Effective series follow a fixed narrative structure. Each is named after a programming language, such as C++, Java, or Python. Unlike most language books, they do not start from the most basic concepts, and they rarely cover things you can look up directly in the documentation. Instead they share problems encountered in real development and the corresponding solutions. A book contains dozens of items, and each item is one piece of advice.

Every item has the same structure. First, the author presents a common development problem and spends some time explaining it, mainly so the reader understands the problem and how serious it is. He then proposes a solution, usually the first one that comes to mind, and goes on to question it, listing its flaws and drawbacks. From there he proposes a new solution, which may have problems of its own, and questions that one too. Only at the end does he give the advice he considers reasonable.

He repeatedly looks for solutions and repeatedly questions his own answers. Following along step by step, readers pick up not just dry facts but a way of thinking about a problem and working toward the best solution. That thought process is extremely valuable and gives you much more than knowledge alone.

Of course, I do not have the author's skill, and words cannot fully convey what I felt and learned. Here I will briefly record some of what I took away. For a deeper understanding, I recommend reading the original book.


Code style

In team development, clean, consistent code greatly improves efficiency. PEP 8 describes what clean code looks like and lays out some good conventions. Here are a few that I find useful but that are often overlooked:

  • Blank lines: top-level functions and classes in a file are separated by two blank lines; methods inside a class are separated by one blank line.

  • Imports should be grouped in the order standard library modules, third-party modules, local modules. Within each group, sort alphabetically by module name.

  • Don't test for emptiness by length, as in if len(somelist) == 0. It is visually noisy. Instead, rely on an empty sequence evaluating to False and write if not somelist.

  • Get into the habit of indenting with spaces instead of tabs.

It may be hard for newcomers to keep all of these conventions in mind at first, but style-checking tools such as Pylint are available for Python. They flag problems as you go, which helps you build the habit faster.
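A small sketch of two of the conventions above: the emptiness test and the blank-line spacing. The names summarize and Basket are made up for illustration.

```python
def summarize(somelist):
    """Describe a list, testing emptiness the idiomatic way."""
    if not somelist:  # preferred over: if len(somelist) == 0
        return "empty"
    return "{} item(s)".format(len(somelist))


class Basket(object):
    """Top-level definitions get two blank lines, methods get one."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)
```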


Basic data structures and syntax

Characters and Strings

Strings are the most common way to represent data in a program. A string is made up of individual characters, but because there are different encodings, each language represents characters differently.

In Python 3, both bytes and str can represent character sequences: a bytes instance holds raw 8-bit values, and a str instance holds Unicode characters.

Python 2 uses a different pair of types, str and unicode: a str instance holds raw 8-bit values, and a unicode instance holds Unicode characters.

Since each version has two types for representing character data, mixing them without discipline causes confusion. Encoding and decoding functions are generally needed to keep the types consistent:

  • Encode: converts Unicode characters into binary data
  • Decode: converts binary data into Unicode characters

In general, most of your program should use Unicode characters (str in Python 3, unicode in Python 2) to represent strings, and encode only when data leaves the program, using the encoding you need. For example, in Python 3 the following helper functions convert to and from UTF-8:

# Decoding: bytes -> str
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str

# Encoding: str -> bytes
def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes

It is similar in Python 2:

# Decoding: str -> unicode
def to_unicode(unicode_or_str):
    if isinstance(unicode_or_str, str):
        value = unicode_or_str.decode('utf-8')
    else:
        value = unicode_or_str
    return value  # Instance of unicode

# Encoding: unicode -> str
def to_str(unicode_or_str):
    if isinstance(unicode_or_str, unicode):
        value = unicode_or_str.encode('utf-8')
    else:
        value = unicode_or_str
    return value  # Instance of str

In addition, Python 2 and Python 3 differ in how the two types interact. In Python 2, a unicode instance and a str instance are often interchangeable when the str contains only ASCII: the two can be concatenated with +, and == returns True if they represent the same string. In Python 3, bytes and str are completely separate types: they cannot be concatenated with +, and an == comparison between them is always False, even when they represent the same characters.
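The Python 3 behaviour can be checked with a short sketch like the one below. The variable names are arbitrary, and the snippet assumes the interpreter runs without the -b flag (which turns such comparisons into warnings).

```python
text = "hello"
data = b"hello"

# == between str and bytes is False even though the characters match.
same = (text == data)

# Concatenating the two types raises TypeError.
try:
    combined = text + data
except TypeError:
    combined = None

# Decoding first makes the two values comparable and joinable.
decoded = data.decode("utf-8")
matches_after_decode = (text == decoded)
joined = text + decoded
```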


Lists

Lists are dynamic arrays and are the most frequently used data structure in Python. They are also very flexible to work with and often make code more concise. I will skip the basics and mention a few of the more important points:

  • Slicing is very flexible, and for brevity we often omit the start or end of a range. For example, we write a[:5] instead of a[0:5], and a[1:] instead of a[1:len(a)]. This reduces visual noise and makes the code easier to read.

  • Note the difference between b = a[:] and b = a. The latter only copies the reference, so a and b name the same list. The former assigns b a shallow copy of a: a new list whose elements are the same objects as the elements of a. After b = a[:], a and b refer to two different lists.

  • Note the difference between a == b and a is b. The former compares values, while the latter compares identity, that is, whether both names refer to the same object. The following example shows the difference:

    a = [1, 2]
    b = [1, 2]
    print(a == b)  # True
    print(a is b)  # False
    a.append(3)
    print(a == b)  # False
  • List comprehensions provide a convenient way to build lists, as in the following example:

    a = [1, 2, 3, 4, 5, 6]
    even_square = [x ** 2 for x in a if x % 2 == 0]  # [4, 16, 36]

    This way of building data structures is not limited to lists; it also works for dictionaries and sets:

    name_id = {"peter": 3, "Alex": 2, "Bob": 4}
    m = {id: name for name, id in name_id.items()}
    s = {id for id in name_id.values()}

    A word of caution: comprehensions are handy, but they should not be abused. If you find that you need more than two expressions (loops or conditions) in one comprehension, write it another way. In complex cases, comprehensions reduce the readability of the code.
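The difference between a plain assignment and a shallow copy, mentioned in the list above, can be seen in a small sketch. The nested list is made-up sample data chosen so that the shared inner object is visible.

```python
a = [[1, 2], 3]

alias = a        # same list object as a
shallow = a[:]   # new outer list, same element objects

a.append(4)        # visible through alias, not through shallow
a[0].append(99)    # the inner list is shared, so shallow sees this change
```

After these statements, alias and a are the same object, while shallow is a separate list that still shares the inner list [1, 2, 99] with a.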


Exception handling mechanism

A quick note on exception handling. The full structure in Python is try/except/else/finally. Unlike many other languages, Python adds an else block, which handles the case where the try block completes successfully.

Note that the else block runs only if the try block raises no exception. I find this design reasonable: it gives exception handling one more branch, for the success path.
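The order in which the four blocks run can be traced with a sketch like this one. The function load_count and the events list are hypothetical names used only to record which blocks executed.

```python
import json


def load_count(raw, events):
    """Parse a JSON object and return its "count" field."""
    try:
        data = json.loads(raw)      # raises ValueError on malformed JSON
    except ValueError:
        events.append("except")     # runs only when the try block failed
        raise
    else:
        events.append("else")       # runs only when the try block succeeded
        return data["count"]
    finally:
        events.append("finally")    # always runs, even after return or raise
```

Calling load_count('{"count": 3}', events) records "else" then "finally"; calling it with malformed input records "except" then "finally" before the exception propagates.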


Functions

Use exceptions, not return values, to report a function's status

It is tempting to give a function's return value a special meaning, so that the caller can use it to determine whether the function succeeded.

But this is a mistake. The return value then carries two meanings at once, the status of the operation and the result of the operation, and the two are bound to conflict. Suppose None indicates failure, but None can also be a legitimate result. When the function returns None, how does the caller tell an error from a normal exit? You might suggest picking a value that can never be a normal result. That may keep the program from failing in specific situations, but it still mixes two concepts.

A function ends in only one of two ways: with an exception or with a normal exit. Only after a normal exit should we look at the return value. When something goes wrong, what matters is where and why the code failed, and raising an exception is the way to convey that to the caller. Otherwise the caller will assume that the returned result is normal and reliable.
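A minimal sketch of this advice in Python 3, using a hypothetical divide helper: failure is reported by raising, so every value that comes back (including 0) is a real result.

```python
def divide(a, b):
    """Return a / b; raise ValueError for invalid input instead of returning None."""
    try:
        return a / b
    except ZeroDivisionError as e:
        raise ValueError("Invalid inputs") from e


try:
    result = divide(0, 5)
except ValueError:
    outcome = "invalid inputs"
else:
    outcome = "result is {}".format(result)   # a zero result is not mistaken for failure
```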


Passing arguments to a function

Passing arguments to a function ought to be a routine operation. However, because requirements are complex and people always try to simplify complex things, many different ways of passing arguments have grown up. Let's go through them step by step:

  • Ordinary positional arguments

    This needs little explanation: pass the arguments in the order the function defines them, as follows:

    def log(message, value1, value2):
        ...

    log("log message", 1, 2)

    This is the most basic way to pass arguments, but on reflection it has the following problems:

    1. The number of parameters is fixed. Once a function is in use, adding a parameter means changing not only the function itself but also every call to it. When a function is widely used, such changes become tricky.

    2. At the call site it is hard to know what each argument means. For example, from log("log message", 1, 2) we can tell only that the arguments are a string and two numbers. What they mean is unclear.

    3. Functions often have optional parameters, that is, parameters that can be omitted without affecting normal operation. Ordinary positional passing cannot express this.

  • Variable positional arguments

    A starred parameter lets the caller pass a variable number of positional arguments, as in the following example:

    def log(message, *values):
        print(values)

    log("log message", 1, 2)  # (1, 2)
    log("log message")  # ()

    The * in front of the parameter means that it collects, as a tuple, all positional arguments from that position onward. A function declaration can contain only one such parameter, and it must come after the ordinary positional parameters.

    Collecting several arguments into one parameter reduces visual noise and makes the function more flexible to call. However, it can lead to bugs that are hard to track down, so use it with care.

  • Keyword arguments

    Keyword arguments make function calls more readable and free the caller from a fixed argument order, for example:

    def log(message, value):
        ...

    log("log message", 1)
    log("log message", value=1)
    log(message="log message", value=1)
    log(value=1, message="log message")

    Note the following two points when using them:

    1. Positional arguments must come before keyword arguments.
    2. Each parameter can be given only once.

    In addition, parameters can have default values, which makes them optional:

    def log(message, value=1):
        print(value)

    log("log message")  # 1
    log("log message", value=2)  # 2

    For optional parameters with defaults, it is best to pass them by keyword, so that the meaning of each argument is visible at the call site.

  • Enforcing keyword arguments

    The keyword arguments described above make calls more readable, but they are optional, so callers can still mix positional and keyword styles in confusing ways. Requiring keyword arguments resolves this:

    Python 3:

    def log(message, *, value1, value2):
        print(value1, value2)

    log("msg", value1=1, value2=2)  # 1 2
    log("msg", value2=2, value1=1)  # 1 2
    # log("msg", value2=2, 1)  # SyntaxError: positional argument follows keyword argument
    # log("msg", 1, 2)  # TypeError: value1 and value2 must be passed by keyword

    The * separates the positional parameters from the keyword-only parameters. More precisely, the * marks the end of the positional parameters. The parameters after it must be passed by keyword; otherwise an error is raised.

    Python 2 has no * separator, but a similar effect can be approximated by collecting the keyword arguments with **kwargs:

    def log(*args, **kwargs):
        print args, kwargs

    log("msg", value1=1, value2=2)  # ('msg',) {'value2': 2, 'value1': 1}
    log("msg", value2=2, value1=1)  # ('msg',) {'value2': 2, 'value1': 1}
    # log("msg", value2=2, 1)  # SyntaxError: non-keyword arg after keyword arg
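One way a hard-to-find bug can arise with starred parameters, and how keyword-only parameters avoid it, is sketched below in Python 3. The three log_v* functions and the sequence parameter are hypothetical; they stand for successive versions of the same function.

```python
def log_v1(message, *values):
    """Original signature: a message followed by any number of values."""
    return "{}: {}".format(message, ", ".join(str(v) for v in values))


def log_v2(sequence, message, *values):
    """A new positional parameter was added in front of the old ones."""
    return "{} - {}: {}".format(sequence, message, ", ".join(str(v) for v in values))


def log_v3(message, *values, sequence=0):
    """Keyword-only alternative: old positional calls keep their meaning."""
    return "{} - {}: {}".format(sequence, message, ", ".join(str(v) for v in values))


old_call = log_v2("Favorites", 7, 33)   # written for log_v1; runs without error but is wrong
safe_call = log_v3("Favorites", 7, 33)  # the old call still means the same thing
new_call = log_v3("Favorites", 7, 33, sequence=1)
```

The call written for log_v1 raises no error against log_v2; it silently treats "Favorites" as the sequence and 7 as the message. With the keyword-only version, existing calls are unaffected and the new parameter has to be named explicitly.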

The main purpose of the argument-passing styles above is to make code more concise and clear. One thing not yet mentioned is a pitfall with default arguments: they give function parameters more flexibility, but they cause unexpected behavior when the default value is a mutable data structure, as in the following code:

def log(msg, msgs=[]):
    msgs.append(msg)
    print(msgs)
    return msgs

log("log1")  # ['log1']
log("log2")  # ['log1', 'log2']
log("log3")  # ['log1', 'log2', 'log3']

As the sample shows, the default value is not re-created on each call. It is evaluated only once, when the function is defined, so the same list is shared by every call that omits msgs. When you need a mutable default, set the default to None and create the object inside the function.
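A sketch of the None-default version of the same function, with the list created inside the function body:

```python
def log(msg, msgs=None):
    """Append msg to msgs, creating a fresh list when none is supplied."""
    if msgs is None:
        msgs = []          # a new list for every call that omits msgs
    msgs.append(msg)
    return msgs


first = log("log1")
second = log("log2")       # does not see "log1"

shared = []
log("a", shared)
log("b", shared)           # an explicitly passed list still accumulates
```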


Object-oriented programming

Use functions as interfaces

In traditional compiled languages such as C++ or Java, the word interface immediately brings to mind classes or similar structures. Python has classes too, but they are heavier than functions: they usually bundle attributes and methods, and a class must be instantiated before it can be used. Functions, by contrast, are very flexible in Python. For example, a function can be passed as an argument or callback to many of Python's built-in APIs, and compact forms such as anonymous functions (lambda) can make certain operations, such as sorting and filtering, clearer.

Of course, a plain function cannot clearly express attributes and methods. Can objects created from a class be as flexible as functions? Yes, thanks to the special __call__ method, as in this example:

class CountMissing(object):
    def __init__(self):
        self.added = 0

    def __call__(self):
        self.added += 1
        return 0

counter = CountMissing()
counter()
assert callable(counter)

We define __call__ in the CountMissing class so that objects instantiated from it can be called directly, like functions. Each call to such an object actually calls the object's __call__ method.

The nice thing is that we can treat these objects as functions. APIs that expect a callable, such as the key argument of sort or the default factory of defaultdict, also accept them. This makes objects more flexible while keeping the code simple.
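As a sketch of how such an object can stand in for a function, the snippet below passes a CountMissing instance to collections.defaultdict as its default factory. The class is repeated so the snippet runs on its own, and current and increments are made-up sample data.

```python
from collections import defaultdict


class CountMissing(object):
    """Callable default factory that counts how many keys were missing."""

    def __init__(self):
        self.added = 0

    def __call__(self):
        self.added += 1
        return 0


current = {"green": 12, "blue": 3}
increments = [("red", 5), ("blue", 17), ("orange", 9)]

counter = CountMissing()
result = defaultdict(counter, current)   # the instance itself is the factory
for key, amount in increments:
    result[key] += amount                # missing keys trigger counter()
```

Here the object keeps state (the added count) that a bare function could not hold as cleanly: after the loop, counter.added is 2, one for each key that was not already in current.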


Classes

Class members in Python are either public or private. A member becomes private when its name starts with two underscores (__). Note, however, that Python implements private members by simply transforming the member name, as in the following example:

class MyParentObject(object):
    def __init__(self):
        self.__private_field = 10

class MyChildObject(MyParentObject):
    def get_private_field(self):
        return self.__private_field

baz = MyChildObject()
baz.get_private_field()
# AttributeError: 'MyChildObject' object has no attribute '_MyChildObject__private_field'

As you can see, a private member's name is simply rewritten to _ClassName__member. Knowing this transformation, we can still reach private members from outside the class, as in the following example:

baz._MyParentObject__private_field  # 10

We can also see all of an object's instance attributes through the special __dict__ attribute:

baz.__dict__  # {'_MyParentObject__private_field': 10}

At this point, you might ask how to control access to class members in a better way. The best practice is to use protected members together with documentation. A member is made protected by prefixing its name with a single underscore (_). This leading underscore is only a signal to users of the class and is not enforced by the language, so we need to document clearly which members should not be used from outside and why.
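A sketch of the protected convention, reusing the class names from the earlier example with a hypothetical _protected_field attribute. Because a single underscore triggers no name rewriting, the subclass can read the attribute directly.

```python
class MyParentObject(object):
    def __init__(self):
        # Single leading underscore: internal by convention, not enforced.
        # Documented rule (assumed here): subclasses may read it but should
        # not reassign it.
        self._protected_field = 10


class MyChildObject(MyParentObject):
    def get_protected_field(self):
        # No name mangling happens, so the subclass can reach the attribute.
        return self._protected_field


baz = MyChildObject()
value = baz.get_protected_field()
```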