Make your Python code elegant and native

Translator's preface

If elegance has a downside, it is that you need hard work to achieve it and a good education to appreciate it.

– Edsger Wybe Dijkstra

The Python community has evolved a distinctive coding style to guide the proper use of Python, usually called Pythonic. Idiomatic Python code is said to be Pythonic, and the design of Python's syntax and standard library is Pythonic throughout. The community also cares a great deal about consistent coding style and promotes and practices Pythonic code everywhere, so it is common to see "P vs NP" (Pythonic vs non-Pythonic) discussions about a piece of code. Pythonic code is concise, unambiguous, elegant and, for the most part, efficient. Reading it is a pleasant experience: "Code is written for people to read, and only incidentally for machines to run."

But what counts as Pythonic, like what counts as idiomatic Chinese, is real yet hard to define. Running import this prints The Zen of Python by Tim Peters, which offers guidance. Many beginners have read it and agree with its ideas, but are at a loss when it comes to putting them into practice. PEP 8 is only a coding convention and is not enough for practicing Pythonic code. If you are having trouble writing Pythonic code, perhaps these notes will help.

Raymond Hettinger is a Python core developer who built many of the features mentioned in this article. He is also an enthusiastic evangelist in the Python community, devoted to teaching Pythonic style. This article was compiled by Jeff Paine from Raymond's talk at PyCon 2013.

Terminology note: wherever this article says "set" or "collection", it means a collection in the general sense (a list, tuple, and so on), not the set type.

The main text follows.

Here are notes (video, slides) from Raymond Hettinger's 2013 PyCon talk.

The sample code and quotes are from Raymond's talk. I have organized them according to my own understanding, and I hope you find them as useful as I have!

Iterate over a range of numbers

for i in [0, 1, 2, 3, 4, 5]:

    print i ** 2

for i in range(6):

    print i ** 2

A better way

for i in xrange(6):

    print i ** 2

xrange returns a lazy object that produces the values of the range one at a time, so it uses less memory than range, which builds the whole list up front. In Python 3, xrange was renamed range.
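
As a small sketch of the Python 3 behaviour described above (the snippets in this article are Python 2; this one assumes Python 3), range can describe a huge span without building a list:

```python
# Python 3: range() is lazy, like Python 2's xrange().
small = range(6)
huge = range(10**9)   # no billion-element list is created

# Values are produced on demand as the loop asks for them.
squares = [i ** 2 for i in small]
print(squares)
print(len(huge))
```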

Iterate over a collection

colors = ['red', 'green', 'blue', 'yellow']

for i in range(len(colors)):

    print colors[i]

A better way

for color in colors:

    print color

Reverse traversal

colors = ['red', 'green', 'blue', 'yellow']

for i in range(len(colors)-1, -1, -1):

    print colors[i]

A better way

for color in reversed(colors):

    print color

Iterate over a collection and its indices

colors = ['red', 'green', 'blue', 'yellow']

for i in range(len(colors)):

    print i, '-->', colors[i]

A better way

for i, color in enumerate(colors):

    print i, '-->', color

This is efficient, elegant, and saves you from creating and incrementing the index yourself.

When you find yourself manipulating indices in a collection, you are probably doing something wrong.

Iterate over two collections

names = ['raymond', 'rachel', 'matthew']

colors = ['red', 'green', 'blue', 'yellow']

n = min(len(names), len(colors))

for i in range(n):

    print names[i], '-->', colors[i]

for name, color in zip(names, colors):

    print name, '-->', color

A better way

# izip comes from itertools: from itertools import izip

for name, color in izip(names, colors):

    print name, '-->', color

zip builds a new list in memory, which costs extra memory. izip is more efficient than zip.

Note: In Python 3, izip was renamed zip and replaced the original zip as the built-in function.
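
A minimal Python 3 sketch of the same loop (an assumption on my part, since the article's snippets are Python 2). It also shows that zip stops at the shorter input, so no manual min(len(...)) is needed:

```python
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

# Python 3: zip() is lazy and stops at the shorter input,
# so 'yellow' is never paired with a name.
pairs = list(zip(names, colors))
for name, color in pairs:
    print(name, '-->', color)
```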

Iterate in sorted order

colors = ['red', 'green', 'blue', 'yellow']

# forward order

for color in sorted(colors):

    print color

# reverse order

for color in sorted(colors, reverse=True):

    print color

Custom sort order

colors = ['red', 'green', 'blue', 'yellow']

def compare_length(c1, c2):

    if len(c1) < len(c2): return -1

    if len(c1) > len(c2): return 1

    return 0

print sorted(colors, cmp=compare_length)

A better way

print sorted(colors, key=len)

The first method is slow and awkward to write. In addition, Python 3 no longer accepts comparison functions in sorted.
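
For code that still has an old comparison function, a possible bridge in Python 3 is functools.cmp_to_key. The sketch below assumes Python 3 and reuses the compare_length idea from above; both calls should produce the same order:

```python
from functools import cmp_to_key

colors = ['red', 'green', 'blue', 'yellow']

def compare_length(c1, c2):
    # Old-style comparison: negative, zero, or positive.
    return (len(c1) > len(c2)) - (len(c1) < len(c2))

by_key = sorted(colors, key=len)                          # preferred
by_cmp = sorted(colors, key=cmp_to_key(compare_length))   # legacy bridge
print(by_key)
```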

Call a function until a sentinel value is encountered

blocks = []

while True:

    block = f.read(32)

    if block == '':

        break

    blocks.append(block)

A better way

from functools import partial

blocks = []

for block in iter(partial(f.read, 32), ''):

    blocks.append(block)

iter takes two arguments. The first is the function you call over and over again, and the second is the sentinel value.

In this example the benefit is not obvious, and partial arguably makes the code less readable. The advantage of the second method is that iter returns an iterator, and iterators can be used in many places: set, sorted, min, max, heapq, sum, ...
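
Here is a runnable Python 3 sketch of the two-argument form of iter. The in-memory io.BytesIO object is my stand-in for the file f, and because it is opened in binary mode the sentinel is b'' rather than '':

```python
import io
from functools import partial

# An in-memory file stands in for a real one (assumption for the demo).
f = io.BytesIO(b'x' * 70)

# iter(callable, sentinel) keeps calling f.read(32) until it returns b''.
blocks = list(iter(partial(f.read, 32), b''))
sizes = [len(block) for block in blocks]
print(sizes)
```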

Distinguish multiple exit points in a loop

def find(seq, target):

    found = False

    for i, value in enumerate(seq):

        if value == target:

            found = True

            break

    if not found:

        return -1

    return i

A better way

def find(seq, target):

    for i, value in enumerate(seq):

        if value == target:

            break

    else:

        return -1

    return i

The else clause runs once the for loop has gone through all of its iterations without hitting break.

If you are new to the for-else syntax, it can be unclear when the else runs. There are two ways to think about it. The traditional way is to read for as an if, with the else taken when the loop condition becomes false: at that point the loop was not broken and every iteration has completed. The other way is to read else as "nobreak": when the for loop finishes without a break, the else clause runs.
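
To make the "nobreak" reading concrete, here is the same find function exercised on three inputs (written for Python 3, which is my assumption; the sample lists are made up for illustration):

```python
def find(seq, target):
    for i, value in enumerate(seq):
        if value == target:
            break
    else:
        # Reached only when the loop finished without break,
        # including when seq is empty.
        return -1
    return i

print(find(['a', 'b', 'c'], 'b'))
print(find(['a', 'b', 'c'], 'z'))
print(find([], 'a'))
```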

Iterate over dictionary keys

d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

for k in d:

    print k

for k in d.keys():

    if k.startswith('r'):

        del d[k]

When should you use the second method rather than the first? When you need to modify the dictionary.

If you mutate something while iterating over it, you are asking for trouble and deserve whatever happens to you.

d.keys() copies all the keys in the dictionary into a list. Then you can modify the dictionary.

Note: To modify a dictionary while iterating over it in Python 3, you have to write list(d.keys()) explicitly, because d.keys() returns a "dictionary view" (an iterable that provides a dynamic view of the dictionary's keys). See the documentation for details.
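
A short Python 3 sketch of both cases (my own illustration, not from the talk): taking a snapshot of the keys first, and what happens in CPython when the snapshot is skipped:

```python
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

# list(d) takes a snapshot of the keys, so deleting inside the loop is safe.
for k in list(d):
    if k.startswith('r'):
        del d[k]
print(d)

# Without the snapshot, Python 3 detects the size change and raises.
e = {'rachel': 'green', 'raymond': 'red'}
try:
    for k in e:
        del e[k]
    raised = False
except RuntimeError:
    raised = True
print(raised)
```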

Iterate over the keys and values of a dictionary

# Not very fast: every iteration has to rehash the key and do a lookup

for k in d:

    print k, '-->', d[k]

# Builds a large list

for k, v in d.items():

    print k, '-->', v

A better way

for k, v in d.iteritems():

    print k, '-->', v

iteritems() is better because it returns an iterator.

Note: Python 3 no longer has iteritems(); its items() behaves much like iteritems(). See the documentation for details.

Build a dictionary with key-value pairs

names = ['raymond', 'rachel', 'matthew']

colors = ['red', 'green', 'blue']

d = dict(izip(names, colors))

# {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

Python 3: d = dict(zip(names, colors))

Count with a dictionary

colors = ['red', 'green', 'red', 'blue', 'green', 'red']

# Simple, basic counting method. Suitable for beginners to start learning.

d = {}

for color in colors:

    if color not in d:

        d[color] = 0

    d[color] += 1

# {'blue': 1, 'green': 2, 'red': 3}

A better way

d = {}

for color in colors:

    d[color] = d.get(color, 0) + 1

# A slightly more modern approach, but it has some caveats; better suited to experienced users.

from collections import defaultdict

d = defaultdict(int)

for color in colors:

    d[color] += 1
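
The notes do not spell out the caveats of defaultdict. One that I am aware of, sketched below in Python 3, is that merely reading a missing key inserts it; the 'purple' key is a made-up example:

```python
from collections import defaultdict

colors = ['red', 'green', 'red', 'blue', 'green', 'red']

counts = defaultdict(int)
for color in colors:
    counts[color] += 1

# Caveat: merely reading a missing key inserts it with the default value.
before = len(counts)
_ = counts['purple']
after = len(counts)

# Converting to a plain dict restores the usual KeyError on missing keys.
plain = dict(counts)
print(before, after, plain['red'])
```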

Grouping with dictionaries: Part I and Part II

names = ['raymond', 'rachel', 'matthew', 'roger',

         'betty', 'melissa', 'judith', 'charlie']

# In this example, we group by the length of name

d = {}

for name in names:

    key = len(name)

    if key not in d:

        d[key] = []

    d[key].append(name)

# {5: ['roger', 'betty'], 6: ['rachel', 'judith'], 7: ['raymond', 'matthew', 'melissa', 'charlie']}

d = {}

for name in names:

    key = len(name)

    d.setdefault(key, []).append(name)

A better way

d = defaultdict(list)

for name in names:

    key = len(name)

    d[key].append(name)

Is the dictionary popitem() atomic?

d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

while d:

    key, value = d.popitem()

    print key, '-->', value

popitem is atomic, so there is no need to wrap a lock around it when using it between threads.

Linking dictionaries

defaults = {'color': 'red', 'user': 'guest'}

parser = argparse.ArgumentParser()

parser.add_argument('-u', '--user')

parser.add_argument('-c', '--color')

namespace = parser.parse_args([])

command_line_args = {k: v for k, v in vars(namespace).items() if v}

The following is the usual approach: start from the first dictionary as the defaults, override it with environment variables, and finally override it with command-line arguments.

# Unfortunately, this approach copies data like crazy.

d = defaults.copy()

d.update(os.environ)

d.update(command_line_args)

A better way

d = ChainMap(command_line_args, os.environ, defaults)

ChainMap was added to the collections module in Python 3.3. It is efficient and elegant.
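
As a rough Python 3 illustration of why no copying is needed, the sketch below uses two small hand-written dictionaries as stand-ins for os.environ and the parsed command-line arguments (both are assumptions for the demo):

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
environ = {'user': 'alice'}            # stand-in for os.environ
command_line_args = {'color': 'blue'}  # stand-in for parsed arguments

# Lookups search the mappings from left to right; nothing is copied.
d = ChainMap(command_line_args, environ, defaults)
print(d['color'], d['user'])

# Because ChainMap keeps references, later changes show through.
defaults['theme'] = 'dark'
print(d['theme'])
```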

Improve readability

Positional parameters and subscripts are nice
But keywords and names are better
The first method is convenient for computers
The second approach is consistent with the way humans think

Use keyword arguments to improve readability of function calls

twitter_search('@obama', False, 20, True)

A better way

twitter_search('@obama', retweets=False, numtweets=20, popular=True)

The second method is slightly slower (on the order of microseconds), but it is worth it for the readability of the code and the development time it saves.

Use namedtuple to improve readability of multiple return values

# Old testmod return value

doctest.testmod()

# (0, 4)

Is the test result good or bad? You can’t tell because the return value is not clear.

A better way

# New testmod return value, a namedtuple

doctest.testmod()

# TestResults(failed=0, attempted=4)

A namedtuple is a subclass of tuple, so it still works with normal tuple operations, but it is friendlier.

Create a namedtuple

TestResults = namedtuple('TestResults', ['failed', 'attempted'])
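
A brief Python 3 sketch of how such a value might be used; the result values mirror the (0, 4) example above:

```python
from collections import namedtuple

TestResults = namedtuple('TestResults', ['failed', 'attempted'])
result = TestResults(failed=0, attempted=4)

# Fields are readable by name...
print(result.failed, result.attempted)

# ...and ordinary tuple operations still work.
failed, attempted = result
is_tuple = isinstance(result, tuple)
print(result)
```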

Unpacking sequences

p = 'Raymond', 'Hettinger', 0x30, '[email protected]'

# A common approach, often a habit carried over from other languages

fname = p[0]

lname = p[1]

age = p[2]

email = p[3]

A better way

fname, lname, age, email = p

The second method uses tuple unpacking, which is faster and more readable.

Update the state of multiple variables

def fibonacci(n):

    x = 0

    y = 1

    for i in range(n):

        print x

        t = y

        y = x + y

        x = t

A better way

def fibonacci(n):

    x, y = 0, 1

    for i in range(n):

        print x

        x, y = y, x + y

Problems with the first method

x and y are state, and state should be updated all at once; spread over several lines, the state can be out of sync, which is a common source of bugs.
The operations depend on their order
It is too low-level and too detailed

The second method works at a higher level of abstraction, carries no risk of getting the order wrong, and is more efficient.
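
A Python 3 variant of the second version, changed to return a list instead of printing (my adaptation, so that the result can be checked):

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers as a list."""
    x, y = 0, 1
    result = []
    for _ in range(n):
        result.append(x)
        # The right-hand side is evaluated in full before assignment,
        # so x and y are never out of sync.
        x, y = y, x + y
    return result

print(fibonacci(10))
```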

Simultaneous state updates

tmp_x = x + dx * t

tmp_y = y + dy * t

tmp_dx = influence(m, x, y, dx, dy, partial='x')

tmp_dy = influence(m, x, y, dx, dy, partial='y')

x = tmp_x

y = tmp_y

dx = tmp_dx

dy = tmp_dy

A better way

x, y, dx, dy = (x + dx * t,

                y + dy * t,

                influence(m, x, y, dx, dy, partial='x'),

                influence(m, x, y, dx, dy, partial='y'))

Efficiency

Basic principles of optimization
Don't move data around unless you have to
It takes only a little care to replace O(n**2) operations with linear ones

In general, just don't move data without a reason

Concatenating strings

names = ['raymond', 'rachel', 'matthew', 'roger',

         'betty', 'melissa', 'judith', 'charlie']

s = names[0]

for name in names[1:]:

    s += ', ' + name

print s

A better way

print ', '.join(names)

Updating sequences

names = ['raymond', 'rachel', 'matthew', 'roger',

         'betty', 'melissa', 'judith', 'charlie']

del names[0]

# The following code is a sign that you are using the wrong data structure

names.pop(0)

names.insert(0, 'mark')

A better way

from collections import deque

names = deque(['raymond', 'rachel', 'matthew', 'roger',

               'betty', 'melissa', 'judith', 'charlie'])

# deque is more efficient

del names[0]

names.popleft()

names.appendleft('mark')
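
A small Python 3 sketch of the deque operations, using a shortened list of names for brevity:

```python
from collections import deque

names = deque(['raymond', 'rachel', 'matthew'])

first = names.popleft()      # O(1) removal from the left end
names.appendleft('mark')     # O(1) insertion at the left end
print(first, list(names))
```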

Decorators and context managers

They help separate business logic from administrative logic
They are clean, elegant tools for factoring code and improving code reuse
Good naming is essential
Remember Spider-Man's motto: with great power comes great responsibility

Use decorators to factor out administrative logic

# Business logic and administrative logic mixed together; not reusable

def web_lookup(url, saved={}):

    if url in saved:

        return saved[url]

    page = urllib.urlopen(url).read()

    saved[url] = page

    return page

A better way

@cache

def web_lookup(url):

    return urllib.urlopen(url).read()

Note: functools.lru_cache, added in Python 3.2, solves this problem.
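
To show the caching effect without touching the network, the Python 3 sketch below swaps the web lookup for a hypothetical slow_square function and records each real call in a list:

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=None)
def slow_square(n):
    # Hypothetical stand-in for an expensive call such as a web lookup.
    calls.append(n)
    return n * n

first = slow_square(12)
second = slow_square(12)   # served from the cache; the body is not rerun
print(first, second, len(calls))
```

The function body now contains only the business logic; the bookkeeping lives entirely in the decorator.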

Factor out temporary contexts

# Save the old, create the new

old_context = getcontext().copy()

getcontext().prec = 50

print Decimal(355) / Decimal(113)

setcontext(old_context)

A better way

with localcontext(Context(prec=50)):

    print Decimal(355) / Decimal(113)

The sample code uses the standard library's decimal module, which already provides localcontext.
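
A runnable Python 3 sketch that also checks the precision is restored after the block (the variable names are mine):

```python
from decimal import Decimal, Context, getcontext, localcontext

default_prec = getcontext().prec

with localcontext(Context(prec=50)):
    inside_prec = getcontext().prec
    approx_pi = Decimal(355) / Decimal(113)

# The previous context is restored when the block exits.
restored = getcontext().prec == default_prec
print(approx_pi)
```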

How to open and close files

f = open('data.txt')

try:

    data = f.read()

finally:

    f.close()

A better way

with open('data.txt') as f:

    data = f.read()

How to use locks

# Create a lock

lock = threading.Lock()

# The old way to use a lock

lock.acquire()

try:

    print 'Critical section 1'

    print 'Critical section 2'

finally:

    lock.release()

A better way

# The new way to use a lock

with lock:

    print 'Critical section 1'

    print 'Critical section 2'

Factor out temporary contexts

try:

    os.remove('somefile.tmp')

except OSError:

    pass

A better way

with ignored(OSError):

    os.remove('somefile.tmp')

ignored was added in Python 3.4; see the documentation.

Note: in the standard library, ignored is actually called suppress.

Try creating your own ignored context manager.

@contextmanager

def ignored(*exceptions):

    try:

        yield

    except exceptions:

        pass

Put it in your utilities directory and you too can ignore exceptions

The contextmanager decorator in contextlib turns a generator function into a context manager, supplying __enter__ and __exit__ for you. See the documentation for details.
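
The following Python 3 sketch puts the hand-written ignored next to the standard library's contextlib.suppress. The file name is hypothetical and assumed not to exist:

```python
import os
from contextlib import contextmanager, suppress

@contextmanager
def ignored(*exceptions):
    try:
        yield
    except exceptions:
        pass

missing = 'no-such-file-for-this-demo.tmp'  # hypothetical file name

# Both forms swallow the OSError raised for a missing file.
with ignored(OSError):
    os.remove(missing)

with suppress(OSError):
    os.remove(missing)

survived = True
print(survived)
```

Exceptions that are not listed still propagate, so only the errors you name are silenced.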

Factor out temporary contexts

# Temporarily redirect standard output to a file and then restore it

with open('help.txt', 'w') as f:

    oldstdout = sys.stdout

    sys.stdout = f

    try:

        help(pow)

    finally:

        sys.stdout = oldstdout

A better way

with open('help.txt', 'w') as f:

    with redirect_stdout(f):

        help(pow)

redirect_stdout was added in Python 3.4 (see the bug report).

Implement your own redirect_stdout context manager.

@contextmanager

def redirect_stdout(fileobj):

    oldstdout = sys.stdout

    sys.stdout = fileobj

    try:

        yield fileobj

    finally:

        sys.stdout = oldstdout
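
In Python 3.4 and later the standard library version can be used directly. This sketch captures output in an in-memory io.StringIO buffer instead of help.txt (my substitution, to keep the example self-contained):

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()

with redirect_stdout(buffer):
    print('captured')

# Output goes to the real stdout again once the block exits.
print('back on the real stdout')
captured = buffer.getvalue()
```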

Concise, expressive one-liners

Two conflicting principles:

Don’t have too much logic on one line
Don’t break a single idea into multiple parts

Raymond's rule:

One logical line of code should equal one sentence in natural language

List comprehensions and generator expressions

result = []

for i in range(10):

    s = i ** 2

    result.append(s)

print sum(result)

A better way

print sum(i**2 for i in xrange(10))

The first way is about what you are doing, and the second way is about what you want.
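
The same comparison in Python 3 (assumed here, since xrange no longer exists there), with the loop version kept alongside to confirm that both give the same total:

```python
# Loop version: describes the steps.
result = []
for i in range(10):
    result.append(i ** 2)
loop_total = sum(result)

# Generator expression: states the desired result, with no temporary list.
total = sum(i ** 2 for i in range(10))
print(total)
```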
