PEP8 coding specification

This article presents the coding conventions for the Python code that makes up the standard library of the main Python distribution. For the style of the C code in CPython, see PEP 7.

This article and the PEP 257 docstring standard are adapted from Guido's original Python Style Guide, with additions from Barry's GNU Mailman Coding Style Guide.

Many projects have their own coding style guidelines, which prevail in case of conflict.

Consistency consideration

One of Guido's key insights is that code is read much more often than it is written. This guide aims to improve the readability of Python code and to make it consistent. As PEP 20 says, "Readability counts."

A style guide is about consistency. Consistency with this guide is important, consistency within a project is more important, and consistency within a single module or function matters most.

Most importantly, know when to be inconsistent: sometimes the style guide simply does not apply. When in doubt, use your best judgment, look at other examples, and do not hesitate to ask!

Special note: Don’t break backward compatibility by adhering to this PEP!

It is reasonable to deviate from a guideline in cases such as these:

  • Following the guideline would make the code less readable.
  • Following it would make the code inconsistent with the surrounding code.
  • The code predates the guideline and there is no other reason to change it.
  • The code must stay compatible with older Python versions that do not support the recommended style.

Code layout

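Indentation

Use 4 spaces per indentation level.

Continuation lines should align wrapped elements either vertically, using the implicit line joining inside parentheses, brackets, and braces, or with a hanging indent. With a hanging indent, put no arguments on the first line and indent the following lines so that they stand out clearly as continuation lines.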

  • Yes
# Aligned with the opening delimiter.
foo = long_function_name(var_one, var_two,
                         var_three, var_four)

# Not aligned with the opening delimiter: add an extra level of indentation to distinguish the arguments from the body.
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    print(var_one)

# Hanging indents should add a level.
foo = long_function_name(
    var_one, var_two,
    var_three, var_four)
  • No
# Arguments on the first line are not allowed when vertical alignment is not used.
foo = long_function_name(var_one, var_two,
    var_three, var_four)

# Further indentation is required here, because the parameters are indistinguishable from the body.
def long_function_name(
    var_one, var_two, var_three,
    var_four):
    print(var_one)

The 4-space rule is optional for continuation lines.

# Hanging indents may use an indentation other than 4 spaces.
foo = long_function_name(
  var_one, var_two,
  var_three, var_four)

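When the condition of an if statement is long enough to span several lines, note that a two-character keyword (such as if), plus a single space, plus an opening parenthesis creates a natural 4-space indent for the following lines of the condition. This can visually clash with the body of the if statement, which is also indented by 4 spaces. The guide takes no explicit position on how to distinguish the two. At least three formats are acceptable; the third is the one recommended here.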

# No extra indentation, not very nice, personally not recommended.
if (this_is_one_thing and
    that_is_another_thing):
    do_something()

# Add a comment, which gives some distinction in editors that support syntax highlighting.
if (this_is_one_thing and
    that_is_another_thing):
    # Since both conditions are true, we can frobnicate.
    do_something()

# Add additional indentation, recommended.
# Add some extra indentation on the conditional continuation line.
if (this_is_one_thing
        and that_is_another_thing):
    do_something()

The closing bracket, brace, or parenthesis of a multi-line construct can also go on a line of its own. There are two formats; the second is the one recommended here.

# Closing bracket indented like the items, personally not recommended.
my_list = [
    1, 2, 3,
    4, 5, 6,
    ]
result = some_function_that_takes_arguments(
    'a', 'b', 'c',
    'd', 'e', 'f',
    )

# Closing bracket lined up with the start of the line that opens the construct.
my_list = [
    1, 2, 3,
    4, 5, 6,
]
result = some_function_that_takes_arguments(
    'a', 'b', 'c',
    'd', 'e', 'f',
)

Spaces or Tab?

  • Spaces are the preferred indentation method.
  • Tabs should be used only to remain consistent with code that is already indented with tabs.
  • Python 3 does not allow mixing tabs and spaces for indentation.
  • Python 2 code indented with a mixture of tabs and spaces should be converted to use spaces exclusively.

When the Python 2 command-line interpreter is invoked with the -t option, it issues warnings about code that illegally mixes tabs and spaces. With -tt, these warnings become errors. These options are highly recommended! I also recommend the pep8 and autopep8 tools.

Maximum line width

Limit the maximum line width of all lines to 79 characters.

Long blocks of text, such as docstrings or comments, should be limited to 72 characters.

The default wrapping in most tools disrupts the visual structure of the code and makes it harder to understand, so relying on it is not recommended. The limits are chosen so that nothing wraps in an editor window that is 80 characters wide, even when the tool places a marker in the final column of a wrapped line. Some tools offer no dynamic line wrapping at all.

Some teams strongly prefer a longer line length. If the team agrees, the nominal limit may be raised from 80 to 100 characters (effectively a maximum of 99 characters per line), provided that docstrings and comments are still wrapped at 72 characters.

The Python standard library is conservative and limits lines to 79 characters (72 for docstrings and comments).
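
As a rough illustration of the two limits, the sketch below flags lines that are too long. The function name find_long_lines and its parameters are made up for this example. It treats any line that starts with # as a comment and does not try to recognize docstrings, so it is far simpler than what real checkers do.

```python
def find_long_lines(source, code_limit=79, comment_limit=72):
    """Return (line_number, length) pairs for lines that exceed a limit."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Simplification: only lines starting with '#' count as comments.
        is_comment = line.lstrip().startswith('#')
        limit = comment_limit if is_comment else code_limit
        if len(line) > limit:
            problems.append((number, len(line)))
    return problems


sample = "x = 1\n" + "# " + "c" * 80 + "\n" + "y = '" + "a" * 80 + "'\n"
print(find_long_lines(sample))  # [(2, 82), (3, 86)]
```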

The preferred way to wrap long lines is Python's implied line continuation inside parentheses, brackets, and braces. The second choice is a backslash, which may still be appropriate at times. For example, in a with statement:

with open('/path/to/some/file/you/want/to/read') as file_1, \
     open('/path/to/some/file/being/written', 'w') as file_2:
    file_2.write(file_1.read())

Another such case is the assert statement.

Make sure continued lines are indented so that readability suffers as little as possible. In the following example, each line break comes after a binary operator:

class Rectangle(Blob):

    def __init__(self, width, height,
                 color='black', emphasis=None, highlight=0):
        if (width == 0 and height == 0 and
                color == 'red' and emphasis == 'strong' or
                highlight > 100):
            raise ValueError("sorry, you lose")
        if width == 0 and height == 0 and (color == 'red' or
                                           emphasis is None):
            raise ValueError("I don't think so -- values are %s, %s" %
                             (width, height))
        Blob.__init__(self, width, height,
                      color, emphasis, highlight)

Blank lines

  • Separate top-level function and class definitions with two blank lines.
  • Separate method definitions inside a class with a single blank line.
  • Extra blank lines may be used (sparingly) to separate groups of related functions.
  • Blank lines may be used (sparingly) inside functions to separate logical sections.
  • Python accepts the Control-L (^L) form feed character as whitespace. Many tools treat this character as a page separator, but support for it varies between editors.

Source file encoding

Code in the core Python distribution should always use UTF-8 (or ASCII in Python 2).

Files using ASCII (in Python 2) or UTF-8 (in Python 3) should not have an encoding declaration.

In the standard library, non-default encodings should be used only for test purposes, or when a comment or docstring needs to mention an author name that contains non-ASCII characters. Otherwise, the \x, \u, \U, and \N escapes are the preferred way to include non-ASCII data in string literals.

For Python 3.0 and later, the following policy applies to the standard library (see PEP 3131): all identifiers must be ASCII-only and should use English words wherever feasible. String literals and comments must also be ASCII. The only exceptions are (a) test cases that exercise non-ASCII features and (b) names of authors whose names are not based on the Latin alphabet.
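
To make the escape forms concrete, here is a small sketch (Python 3, variable names chosen only for this example) that spells the same non-ASCII character in three ASCII-only ways, plus one \U escape for a character outside the 16-bit range.

```python
# Three ASCII-only spellings of the same character (e with acute accent).
by_hex = '\xe9'
by_codepoint = '\u00e9'
by_name = '\N{LATIN SMALL LETTER E WITH ACUTE}'

# \U takes eight hex digits and reaches characters beyond U+FFFF.
snake = '\U0001F40D'

print(by_hex == by_codepoint == by_name)  # True
print(by_name.encode('utf-8'))            # b'\xc3\xa9'
print(len(snake))                         # 1
```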

Imports

  • Imports should usually be on separate lines.

Yes:

import os
import sys
from subprocess import Popen, PIPE

No:

import sys, os
  • Imports are always at the top of the file, after module comments and docstrings, and before module global variables and constants.

Imports should be grouped in the following order: standard library imports, related third-party imports, local application or library imports. Put a blank line between each group of imports.

Any relevant __all__ specification should go after the imports.

  • Absolute imports are recommended, because they are usually more readable and tend to be better behaved (or at least give better error messages).
import mypkg.sibling
from mypkg import sibling
from mypkg.sibling import example

Explicit relative imports are an acceptable alternative, especially when the absolute import would be unnecessarily long:

from . import sibling
from .sibling import example

Implicit relative imports have been banned in Python 3.

  • When importing a class from the module that contains it, it is usually fine to write:
from myclass import MyClass
from foo.bar.yourclass import YourClass

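If this causes a clash with a local name, import the module instead and refer to the classes as myclass.MyClass and foo.bar.yourclass.YourClass: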

import myclass
import foo.bar.yourclass
  • Do not use wildcard imports.

Wildcard imports (from <module> import *) should be avoided because they make it unclear which names are present in the namespace, confusing both readers and many automated tools. The one defensible exception is republishing an internal interface as part of a public API.

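String quotes

In Python, single-quoted strings and double-quoted strings are the same. When a string contains single or double quote characters, use the other kind of quote so that no backslashes are needed; this improves readability.

For triple-quoted strings, always use double quote characters, to be consistent with the docstring convention in PEP 257.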
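
A tiny sketch of the quoting advice, with names invented for this example: both literals below produce the same string, but the second needs no backslash. The function shows a triple-quoted string written with double quotes.

```python
with_backslash = 'don\'t panic'    # works, but harder to read
without_backslash = "don't panic"  # same value, no escaping needed

print(with_backslash == without_backslash)  # True


def greet():
    """Return a greeting (triple-quoted strings use double quotes)."""
    return 'hello'
```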

Whitespace in expressions and statements

Mandatory

  • Avoid spaces immediately inside parentheses, brackets, or braces
# Yes
spam(ham[1], {eggs: 2})
# No
spam( ham[ 1 ], { eggs: 2 } )
  • Avoid spaces immediately before commas, colons, and semicolons
# Yes
if x == 4: print(x, y); x, y = y, x
# No
if x == 4 : print(x , y) ; x , y = y , x
  • In a slice, the colon acts like a binary operator and should have the same amount of space on either side (one space or none). When a slice parameter is omitted, the space is omitted too. My personal recommendation is no spaces.
# Yes
ham[1:9], ham[1:9:3], ham[:9:3], ham[1::3], ham[1:9:]
ham[lower:upper], ham[lower:upper:], ham[lower::step]
ham[lower+offset : upper+offset]
ham[: upper_fn(x) : step_fn(x)], ham[:: step_fn(x)]
ham[lower + offset : upper + offset]
# No
ham[lower + offset:upper + offset]
ham[1: 9], ham[1 :9], ham[1:9 :3]
ham[lower : : upper]
ham[ : upper]

  • Do not put a space before the opening parenthesis of a function call, or before the opening bracket of an index or slice

# Yes
spam(1)
dct['key'] = lst[index]

# No
spam (1)
dct ['key'] = lst [index]
  • Do not use more than one space around an assignment (or other) operator to align it with other lines
# Yes
x = 1
y = 2
long_variable = 3

# No
x             = 1
y             = 2
long_variable = 3

Other Suggestions

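  • Always surround these binary operators with a single space on either side: assignment (=), augmented assignment (+=, -=, etc.), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), and Booleans (and, or, not).

  • When operators with different priorities are combined, consider putting spaces only around the operators with the lowest priority and leaving them out around the higher-priority ones. Always use the same amount of space on both sides of a binary operator.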
# Yes
i = i + 1
submitted += 1
x = x*2 - 1
hypot2 = x*x + y*y
c = (a+b) * (a-b)

# No
i=i+1
submitted +=1
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)
  • Do not put spaces around the = sign of a keyword argument or a default parameter value
# Yes
def complex(real, imag=0.0):
    return magic(r=real, i=imag)

# No
def complex(real, imag = 0.0):
    return magic(r = real, i = imag)
  • In function annotations, follow the normal rules for colons (no space before, one space after) and always put spaces around the "->" arrow. When an annotated parameter also has a default value, put spaces around the = sign.
# Yes
def munge(input: AnyStr): ...
def munge(sep: AnyStr = None): ...
def munge() -> AnyStr: ...
def munge(input: AnyStr, sep: AnyStr = None, limit=1000): ...

# No
def munge(input: AnyStr=None): ...
def munge(input:AnyStr): ...
def munge(input: AnyStr)->PosInt: ...
  • Compound statements (multiple statements on the same line) are generally not recommended.
# Yes
if foo == 'blah':
    do_blah_thing()
do_one()
do_two()
do_three()

# No
if foo == 'blah': do_blah_thing()
do_one(); do_two(); do_three()
  • While it is sometimes acceptable to put a short body on the same line as if/for/while, never do this for multi-clause statements, and avoid folding such long lines.
# Rather not
if foo == 'blah': do_blah_thing()
for x in lst: total += x
while t < 10: t = delay()

Definitely not:

# No
if foo == 'blah': do_blah_thing()
else: do_non_blah_thing()

try: something()
finally: cleanup()

do_one(); do_two(); do_three(long, argument,
                             list, like, this)

if foo == 'blah': one(); two(); three()

Comments

Comments that contradict the code are worse than no comments at all. Always make it a priority to keep comments up to date when the code changes!

Comments should be complete sentences. If a comment is a phrase or a sentence, its first word should be capitalized, unless it is an identifier that begins with a lowercase letter (never alter the case of identifiers).

If a comment is short, the period at the end can be omitted. Block comments generally consist of one or more paragraphs built out of complete sentences, and each sentence should end with a period. Use two spaces after a sentence-ending period.

Python coders from non-English-speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who do not speak your language.

Block comments

Block comments generally apply to the code that follows them and are indented to the same level as that code. Each line of a block comment starts with a # and a single space (unless it is indented text inside the comment).

Paragraphs inside a block comment are separated by a line containing a single #.

Inline comments

Use inline comments sparingly. An inline comment is a comment on the same line as a statement. It should be separated from the statement by at least two spaces and should start with a # and a single space. Inline comments are unnecessary, and in fact distracting, when they state the obvious. Don't do this:

x = x + 1  # Increment x

But sometimes an inline comment is useful:

x = x + 1  # Compensate for border

Docstring

For documentation strings, see PEP 257.

  • Write docstrings for all public modules, functions, classes, and methods. Docstrings are not necessary for non-public methods, but such a method should have a comment, placed after the def line, that describes what it does.
  • PEP 257 describes good docstring conventions. Note that the """ that ends a multi-line docstring should be on a line by itself, for example:
"""Return a foobang

Optional plotz says to frobnicate the bizbaz first.
"""
  • For a one-line docstring, keep the closing """ on the same line.

Version bookkeeping

If you must include Git, Subversion, CVS, or RCS version information in a source file, place it after the module's docstring and before any other code, separated by a blank line above and below:

__version__ = "$Revision$"
# $Source$

Naming conventions

The naming conventions of the Python library are somewhat inconsistent, and they will never become completely consistent. Nevertheless, there are commonly recommended naming standards. New modules and packages, including third-party frameworks, should follow them. Where an existing library uses a different style, internal consistency is preferred.

The most important principle

Names that are visible to the user as public parts of an API should follow conventions that reflect usage rather than implementation.

Descriptive: naming styles

There are several naming styles:

  • b (single lowercase letter)
  • B (single uppercase letter)
  • lowercase
  • lower_case_with_underscores
  • UPPERCASE
  • UPPER_CASE_WITH_UNDERSCORES
  • CapitalizedWords (also known as CapWords or CamelCase)

Note: When an acronym appears in a CapWords name, capitalize all of its letters. So HTTPServerError is better than HttpServerError.

  • mixedCase (differs from CapitalizedWords by the initial lowercase letter)
  • Capitalized_Words_With_Underscores (ugly!)

Another style uses a short, unique prefix to group related names together. This is not used much in Python, but it is mentioned for completeness. For example, os.stat() returns a tuple whose items have names like st_mode, st_size, st_mtime, and so on (matching the fields of the POSIX system call struct).

The X11 library uses a leading X for all its public functions. In Python this style is generally considered unnecessary, because attribute and method names are prefixed with an object and function names are prefixed with a module name.

In addition, the following special forms with leading or trailing underscores are recognized:

  • _single_leading_underscore: weak "internal use" indicator. For example, "from M import *" does not import objects whose names start with an underscore.
  • single_trailing_underscore_: used by convention to avoid conflicts with Python keywords. For example:
Tkinter.Toplevel(master, class_='ClassName')
  • __double_leading_underscore: when used to name a class attribute, triggers name mangling. Inside the class FooBar, __boo becomes _FooBar__boo.
  • __double_leading_and_trailing_underscore__: "magic" objects or attributes that live in user-controlled namespaces, for example __init__, __import__, or __file__. Never invent such names yourself; only use them as documented.
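
The underscore forms can be tried out with a minimal sketch like the one below. The class FooBar matches the name used in the text; its attributes are invented for the example.

```python
class FooBar:
    def __init__(self):
        self._internal = 1  # single leading underscore: internal by convention
        self.__boo = 2      # double leading underscore: stored as _FooBar__boo
        self.class_ = 'x'   # trailing underscore avoids the keyword "class"


obj = FooBar()
print(obj._FooBar__boo)       # 2
print(hasattr(obj, '__boo'))  # False: the unmangled name does not exist
print(sorted(vars(obj)))      # ['_FooBar__boo', '_internal', 'class_']
```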

Prescriptive: naming conventions

  • Names to avoid

Never use the characters 'l' (lowercase letter el), 'O' (uppercase letter oh), or 'I' (uppercase letter eye) as single-character variable names. In some fonts, these characters are indistinguishable from the numerals 1 and 0. When tempted to use 'l', use 'L' instead.

  • Package and module names

Modules should have short, all-lowercase names. Underscores can be used in a module name if they improve readability. Package names follow the same rules, although underscores are discouraged.

Module names map to file names, and some file systems are case-insensitive or truncate long names. This is not a problem on Unix, but it can be one when code is moved to Mac, Windows, or DOS. As systems evolve, the problem becomes less common.

When an extension module written in C or C++ has an accompanying Python module that provides a higher-level interface, the name of the C/C++ module starts with an underscore (e.g. _socket).

  • Class names

Class names should use the CapWords convention.

The naming convention for functions may be used instead where the interface is documented and used primarily as a callable.

Note that builtin names follow a separate convention: most builtin names are single words (or two words run together), and CapWords is used only for exception names and builtin constants.

  • Exception names

Exceptions are classes, so the class naming convention applies. If the exception really is an error, add the suffix "Error" to the class name.

  • Global variable names

Global variables should, as far as possible, be used only inside their own module. The conventions are about the same as those for functions.

Modules that are designed for use via "from M import *" should use the __all__ mechanism to prevent exporting globals, or prefix such globals with an underscore.
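
The effect of __all__ on a wildcard import can be observed with the sketch below. To keep it self-contained, it builds a throwaway module in memory; the module name demo_mod and everything inside it are hypothetical, and in real code the module would simply be a file on disk.

```python
import sys
import types

MODULE_SOURCE = """
__all__ = ['public_function']

def public_function():
    return 'public'

def helper():
    return 'not exported by import *'

_cache = {}
"""

# Assumption for illustration: create the module "demo_mod" in memory.
demo_mod = types.ModuleType('demo_mod')
exec(MODULE_SOURCE, demo_mod.__dict__)
sys.modules['demo_mod'] = demo_mod

# Run the wildcard import in a fresh namespace and see which names arrive.
namespace = {}
exec('from demo_mod import *', namespace)
exported = sorted(name for name in namespace if not name.startswith('__'))
print(exported)  # ['public_function']
```

Names left out of __all__ stay reachable through the module object (demo_mod.helper here); they are only skipped by the wildcard import.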

  • Function names

Function names should be lowercase, with words separated by underscores as necessary to improve readability. mixedCase is allowed only where it is already the prevailing style (e.g. threading.py), to retain backward compatibility.

  • Arguments to functions and methods

The first argument to the instance method is ‘self’.

The first argument to a class method is 'cls'.

If the name of a function argument clashes with a reserved keyword, it is generally better to append a single trailing underscore to the name (for example, class_).

  • Method name and instance variable

Use the function naming rules.

Use one leading underscore only for non-public methods and instance variables.

To avoid name clashes with subclasses, use two leading underscores to invoke Python's name mangling rules. If class Foo has an attribute named __a, it cannot be accessed as Foo.__a. (An insistent user can still gain access through Foo._Foo__a.) In general, double leading underscores should be used only to avoid name conflicts with attributes in classes designed to be subclassed.

  • Constants

Constants are usually defined at the module level and consist of uppercase letters separated by underscores. Examples include MAX_OVERFLOW and TOTAL.

  • Designing for inheritance

Always decide whether a class's methods and instance variables (collectively: "attributes") should be public or non-public. If in doubt, choose non-public; it is easier to make a non-public attribute public later than to make a public attribute non-public.

Public attributes are those that unrelated clients of the class are expected to use, with a commitment to avoid backward-incompatible changes. Non-public attributes are not intended for third parties; there is no guarantee that they will not change or even be removed.

The term "private" is not used here, since no attribute is really private in Python.

Another category of attributes is the "subclass API" (often called "protected" in other languages). Some classes are designed to be inherited from, either to extend or to modify aspects of their behavior.

Keep these Pythonic guidelines in mind:

  1. Public attributes should have no leading underscores.
  2. If a public attribute name collides with a reserved keyword, append a single trailing underscore to it.
  3. For simple public data attributes, it is best to expose just the attribute name, without complicated accessor/mutator methods. If the attribute later needs functional behavior, Python's property provides a way to add it behind the same attribute syntax.
  4. If you do not want subclasses to use an attribute, consider naming it with two leading underscores and no trailing underscores.
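
As a sketch of the property advice, the hypothetical Temperature class below keeps the plain attribute syntax for callers while adding validation behind it. The class, the attribute name celsius, and the validation rule are all assumptions made for this example.

```python
class Temperature:
    """Hypothetical class used only to illustrate property."""

    def __init__(self, celsius):
        self.celsius = celsius  # goes through the property setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError('temperature below absolute zero')
        self._celsius = value


reading = Temperature(20)
reading.celsius = 25    # callers still use ordinary attribute assignment
print(reading.celsius)  # 25
```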

Public and internal interfaces

Any guarantee of backward compatibility applies only to the public interface.

Documented interfaces are considered public, unless the documentation explicitly declares them to be provisional or internal. All undocumented interfaces should be assumed to be internal.

To better support introspection, modules should explicitly declare the names in their public API using the __all__ attribute.

Internal interfaces should have leading underscores.

If a namespace (package, module, or class) is internal, so is the interface inside it.

Imported names should always be treated as implementation details. Other modules must not rely on indirect access to such imported names unless they are an explicitly documented part of the containing module's API, as with os.path or a package's __init__ module that exposes functionality from submodules.

Programming advice

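  • Write code in a way that does not disadvantage other implementations of Python (PyPy, Jython, IronPython, Pyrex, Psyco, etc.).

For example, do not rely on CPython's efficient implementation of in-place string concatenation for statements such as a += b or a = a + b, because these can be slow in Jython. In performance-sensitive code, use ''.join() instead.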
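
A minimal sketch of the two approaches, with made-up data: both build the same string, but the join version does not depend on how a particular interpreter optimizes repeated concatenation.

```python
words = ['spam', 'ham', 'eggs']

# Repeated concatenation: may copy the growing string on every iteration.
concatenated = ''
for word in words:
    concatenated += word

# Collect the pieces and join them once.
joined = ''.join(words)

print(joined)                  # spamhameggs
print(concatenated == joined)  # True
```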

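  • Comparisons to None should always be done with 'is' or 'is not', never with the equality operators.

Also note the difference between "if x is not None" and "if x": the second is also false for other values, such as 0 or an empty container.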
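
The difference shows up as soon as a legitimate value happens to be falsy. In the sketch below, describe is a hypothetical function whose optional argument defaults to None.

```python
def describe(value=None):
    # "if value:" would wrongly treat 0, '' and [] as "not given".
    if value is not None:
        return 'given: %r' % (value,)
    return 'not given'


print(describe())    # not given
print(describe(0))   # given: 0
print(describe([]))  # given: []
```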

  • Use "is not" rather than "not ... is". The former is more readable.
# Yes
if foo is not None:

# No
if not foo is None:
  • Use class-based exceptions.

  • When implementing ordering operations with rich comparisons, it is better to implement all six operations than to rely on other code only ever using a particular comparison. The functools.total_ordering() decorator can generate the missing comparison methods. The six operations are:

__eq__, __ne__, __lt__, __le__, __gt__, __ge__

PEP 207 states that Python assumes reflexivity rules. The interpreter may therefore swap the arguments of a comparison, for example replacing y > x with x < y, so it is best to implement all six methods.
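
Here is a sketch of the decorator in use. The Version class is hypothetical; it defines only __eq__ and __lt__ and lets functools.total_ordering derive the remaining ordering methods.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical version number used only for illustration."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


print(Version(1, 2) < Version(1, 3))   # True
print(Version(1, 3) >= Version(1, 2))  # True, derived by the decorator
```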

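  • Use a def statement instead of an assignment statement that binds a lambda expression directly to an identifier:
# Yes
def f(x):
    return 2*x

# No
f = lambda x: 2*x

With the def form, the resulting function object is named 'f' rather than the generic '<lambda>', which is more useful for tracebacks and string representations.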
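
The naming difference is easy to check. In this sketch both names are invented, and the lambda assignment appears only for comparison; it is the form the guideline advises against.

```python
def double(x):
    return 2 * x


double_lambda = lambda x: 2 * x  # shown only for contrast

print(double.__name__)         # double
print(double_lambda.__name__)  # <lambda>
```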

  • Derive exceptions from Exception rather than BaseException.

Direct inheritance from BaseException is reserved for exceptions where catching them is almost always the wrong thing to do.

Design exception hierarchies around the distinctions that code catching the exceptions is likely to need, rather than around the locations where the exceptions are raised. Aim to answer the question "What went wrong?" programmatically, rather than only stating that "a problem occurred" (see PEP 3151, which reworked the OS and IO exception hierarchy, for an example).
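
One possible shape for such a hierarchy is sketched below. All class and function names (ConfigError, MissingOptionError, InvalidValueError, read_port) are hypothetical; the point is that callers can catch the base class or react to one specific problem.

```python
class ConfigError(Exception):
    """Base class for errors raised by a hypothetical config loader."""


class MissingOptionError(ConfigError):
    """A required option was not provided."""


class InvalidValueError(ConfigError):
    """An option was present but had an unusable value."""


def read_port(options):
    if 'port' not in options:
        raise MissingOptionError('port')
    try:
        return int(options['port'])
    except ValueError as exc:
        raise InvalidValueError('port must be an integer') from exc


try:
    read_port({'port': 'eighty'})
except ConfigError as error:
    print(type(error).__name__)  # InvalidValueError
```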

  • Use exception chaining appropriately. In Python 3, "raise X from Y" should be used to indicate explicit replacement without losing the original traceback.

When deliberately replacing an inner exception (with "raise X" in Python 2 or "raise X from None" in Python 3), make sure the relevant details are transferred to the new exception (for example, preserving the attribute name when converting KeyError to AttributeError, or embedding the text of the original exception in the new exception message).
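
Both forms of replacement can be sketched in Python 3 as follows. LookupFailedError, find_user, and Record are hypothetical names used only for this illustration.

```python
class LookupFailedError(Exception):
    """Hypothetical application-level error used for illustration."""


def find_user(users, name):
    try:
        return users[name]
    except KeyError as exc:
        # "from exc" stores the KeyError in __cause__, so its traceback is kept.
        raise LookupFailedError('no such user: %s' % name) from exc


class Record:
    """Hypothetical wrapper that exposes dictionary keys as attributes."""

    def __init__(self, fields):
        self._fields = fields

    def __getattr__(self, name):
        try:
            return self._fields[name]
        except KeyError:
            # Deliberate replacement: the attribute name moves to the new error.
            raise AttributeError(name) from None


try:
    find_user({}, 'guido')
except LookupFailedError as error:
    print(type(error.__cause__).__name__)  # KeyError
```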

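  • When raising an exception in Python 2, use "raise ValueError('message')" instead of the older form "raise ValueError, 'message'".

The older form is not legal Python 3 syntax. The form with parentheses also needs no line continuation characters when the exception arguments are long or include string formatting.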

  • When catching exceptions, try to specify specific exceptions instead of empty “except:” clauses. Such as:
# Yes
try:
    import platform_specific_module
except ImportError:
    platform_specific_module = None

A bare "except:" clause (equivalent to "except BaseException:") catches SystemExit and KeyboardInterrupt exceptions, which makes it harder to interrupt a program with Control-C, and can disguise other problems. If you want to catch all exceptions that signal program errors, use "except Exception:".

As a rule of thumb, limit bare "except:" clauses to two cases:

  A. The exception handler prints or logs the traceback, so that the user will at least know that an error has occurred.
  B. The code needs to do some cleanup work, but then lets the exception propagate upwards with raise. In this case, try...finally can be a better way to handle it.

  • When binding a caught exception to a name, prefer the explicit "as" syntax added in Python 2.6:
# Yes
try:
    process_data()
except Exception as exc:
    raise DataProcessingFailedError(str(exc))

This is the only syntax supported in Python 3, and it avoids the ambiguity of the older comma-based syntax.

  • When catching operating system errors, prefer the explicit exception hierarchy introduced in Python 3.3 over introspection of errno values.
  • Additionally, for all try/except clauses, limit the try clause to the minimum amount of code necessary. This avoids masking other bugs.
# Yes
try:
    value = collection[key]
except KeyError:
    return key_not_found(key)
else:
    return handle_value(value)

# No
try:
    # Too broad!
    return handle_value(collection[key])
except KeyError:
    # Will also catch a KeyError raised by handle_value()
    return key_not_found(key)
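
Returning to the point about operating system errors: the sketch below contrasts catching a specific OSError subclass (available since Python 3.3) with inspecting errno by hand. Both function names are hypothetical, and the temporary directory is only there to guarantee a path that does not exist.

```python
import errno
import os
import tempfile


def read_or_none(path):
    """Return the file's text, or None if the file does not exist."""
    try:
        handle = open(path)
    except FileNotFoundError:
        return None
    with handle:
        return handle.read()


def read_or_none_errno(path):
    """Older style that inspects errno by hand (shown only for contrast)."""
    try:
        handle = open(path)
    except OSError as exc:
        if exc.errno != errno.ENOENT:
            raise
        return None
    with handle:
        return handle.read()


with tempfile.TemporaryDirectory() as directory:
    missing = os.path.join(directory, 'missing.txt')
    print(read_or_none(missing), read_or_none_errno(missing))  # None None
```

Note that the try clause contains only the open() call, in line with the advice to keep try clauses as small as possible.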
  • When a resource is local to a particular section of code, use a with statement to ensure it is cleaned up promptly and reliably after use. A try/finally statement is also acceptable.
  • Context managers should be invoked through separate functions or methods whenever they do something other than acquire and release resources. For example:
# Yes
with conn.begin_transaction():
    do_stuff_in_transaction(conn)

# No
with conn:
    do_stuff_in_transaction(conn)

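The latter example gives no indication that the __enter__ and __exit__ methods do anything other than close the connection after the transaction. Being explicit is important in this case.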
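
One way to provide such an explicitly named context manager is sketched below with contextlib. The Connection class is hypothetical and only records what happens; it is not a real database API.

```python
from contextlib import contextmanager


class Connection:
    """Hypothetical connection object that logs transaction events."""

    def __init__(self):
        self.log = []

    @contextmanager
    def begin_transaction(self):
        self.log.append('begin')
        try:
            yield self
        except Exception:
            self.log.append('rollback')
            raise
        else:
            self.log.append('commit')


conn = Connection()
with conn.begin_transaction():
    conn.log.append('work')
print(conn.log)  # ['begin', 'work', 'commit']
```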

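  • Be consistent in return statements. Either all return statements in a function return a value, or none of them do. If any return statement returns a value, the others should say "return None" explicitly, and an explicit return statement should be present at the end of the function if it is reachable.
# Yes
def foo(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return None

def bar(x):
    if x < 0:
        return None
    return math.sqrt(x)

# No
def foo(x):
    if x >= 0:
        return math.sqrt(x)

def bar(x):
    if x < 0:
        return
    return math.sqrt(x)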
  • Use string methods instead of the string module.

String methods, available since Python 2.0, are always much faster and share the same API with unicode strings.

  • Use ''.startswith() and ''.endswith() instead of string slicing to check for prefixes or suffixes.

startswith() and endswith() are cleaner and less error-prone. For example:

# Yes
if foo.startswith('bar'):

# No
if foo[:3] == 'bar':
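
A short sketch of why the methods are less error-prone, using an invented file name: they accept a tuple of alternatives, and they do not depend on a hand-counted slice length.

```python
name = 'report_2024.csv'

# A tuple checks several alternatives in one call.
is_report = name.startswith(('report_', 'summary_'))
is_table = name.endswith(('.csv', '.tsv'))

# Slicing needs a hand-counted length: 'report_' has 7 characters, not 6.
sliced_wrong = name[:6] == 'report_'

print(is_report, is_table, sliced_wrong)  # True True False
```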
  • Use isinstance() instead of comparing object types directly:
# Yes
if isinstance(obj, int):

# No
if type(obj) is type(1):

When checking whether an object is a string, keep in mind that it might be a unicode string too. In Python 2, str and unicode share a common base class, basestring, so you can write:

# Yes
if isinstance(obj, basestring):

In Python 2.2, the types module defines the StringTypes type for the same purpose.

In Python 3, unicode and basestring no longer exist (there is only str), and a bytes object is no longer a kind of string (it is a sequence of integers instead).
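
In Python 3 the check therefore reduces to str, as in this small sketch (is_text is a made-up helper name). It also shows that a bytes object yields integers when iterated.

```python
def is_text(obj):
    # In Python 3, str is the only text type; bytes is a separate type.
    return isinstance(obj, str)


print(is_text('spam'), is_text(b'spam'))  # True False
print(list(b'spam'))                      # [115, 112, 97, 109]
```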

  • For sequences (strings, lists, tuples), use the fact that empty sequences are false:
# Yes
if not seq:
    pass
if seq:
    pass

# No
if len(seq):
    pass
if not len(seq):
    pass
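  • Do not write string literals that rely on significant trailing whitespace. Such whitespace is visually indistinguishable, and some editors will trim it.
  • Do not compare Boolean values to True or False using ==
# Yes
if greeting:
    pass

# No
if greeting == True:
    pass
if greeting is True:  # Worse
    pass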
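  • The Python standard library does not use function annotations, since that would amount to a premature commitment to a particular annotation style. Instead, annotations are left for users to discover and experiment with useful annotation styles. Some third-party suggestions exist (omitted here).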