Follow the “water drop and silver bullet” public account to get high-quality technical articles as soon as they are published. Written by a senior back-end developer with 7 years of experience, explaining technology in a simple way.

This article takes about 10 minutes to read.

The yield keyword is used quite often in Python development, for example to generate large collections, to simplify code structure, and to implement coroutines and concurrency.

But do you really know how yield works?

In this article, we’ll walk through how yield executes and look at the development scenarios where it is a good fit.

The generator

If a function contains the yield keyword, it is a generator function, and calling it returns a “generator.”

A generator is simply a special iterator: like any iterator, it can be iterated to obtain, one by one, each element that the function yields.

If you’re not sure what an “iterator” is, you can refer to this article I wrote: Advanced Python Technology — Iterators, iterables, Generators.
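As a quick check of the claim that a generator is an iterator, here is a minimal sketch. The function name count_up is hypothetical and exists only for this illustration.

```python
from collections.abc import Iterator

def count_up(n):
    """Hypothetical generator function used only for this check."""
    for i in range(n):
        yield i

g = count_up(3)

# A generator object satisfies the iterator protocol.
is_iterator = isinstance(g, Iterator)
# Calling iter() on an iterator returns the very same object.
same_object = iter(g) is g
# The built-in next() calls g.__next__() under the hood.
first = next(g)

print(is_iterator, same_object, first)  # True True 0
```

Because the generator is its own iterator, it can be passed to anything that accepts an iterable, such as a for loop or list().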

Let’s look at a method that contains the yield keyword:

# coding: utf8

# generator
def gen(n):
    for i in range(n):
        yield i

g = gen(5)      # Create a generator
print(g)        # <generator object gen at 0x10bb46f50>
print(type(g))  # <class 'generator'>

# Iterate over the data in the generator
for i in g:
    print(i)

# Output of the loop:
# 0
# 1
# 2
# 3
# 4

Note that in this example, executing g = gen(5) does not run any of the code in gen. It only creates a “generator object” whose type is generator.

Then, when we execute for i in g, each pass of the loop runs the function up to the next yield and returns the value that follows yield.

This lazy, step-by-step execution is the biggest difference from iterating over an ordinary collection such as a list.

In other words, the five elements we want to output do not exist yet when the generator is created. So when are they produced? Each element is generated in turn only when the for loop reaches yield.
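The lazy behavior described here can be observed directly. The sketch below is only an illustration: the calls list is a hypothetical helper that records when the body of the generator function actually runs.

```python
calls = []  # hypothetical log: records when the generator body actually runs

def gen(n):
    calls.append('body started')
    for i in range(n):
        calls.append('producing %s' % i)
        yield i

g = gen(2)
created_log = list(calls)   # snapshot right after creation: still empty

first = next(g)             # runs the body up to the first yield
after_first = list(calls)

print(created_log)   # []
print(after_first)   # ['body started', 'producing 0']
```

Nothing is recorded when gen(2) is called. The body starts only when the first element is requested.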

Besides the behavior they share with iterators, generators provide several methods:

  • generator.__next__(): called on each pass of a for loop. Execution runs to the next yield, pauses there, and returns the value after yield. If there is no more data to iterate over, a StopIteration exception is raised and the for loop ends
  • generator.send(value): passes a value into the generator from outside. The value becomes the result of the yield expression at which the generator is paused
  • generator.throw(type[, value[, traceback]]): throws an exception into the generator from outside
  • generator.close(): closes the generator

By using these methods of generators, we can do a lot of interesting things.

next

Starting with the generator’s __next__ method, let’s look at the following example.

# coding: utf8

def gen(n):
    for i in range(n):
        print('yield before')
        yield i
        print('yield after')

g = gen(3)      # Create a generator
print(g.__next__())  # 0
print('-')
print(g.__next__())  # 1
print('-')
print(g.__next__())  # 2
print('-')
print(g.__next__())  # StopIteration

# Output:
# yield before
# 0
# -
# yield after
# yield before
# 1
# -
# yield after
# yield before
# 2
# -
# yield after
# Traceback (most recent call last):
#   File "gen.py", line 16, in <module>
#     print(g.__next__())  # StopIteration
# StopIteration

In this example, we define the gen function, which contains the yield keyword. We then call g = gen(3) to create a generator, but instead of iterating over it with for, we call g.__next__() several times to print the elements of the generator.

We see that when g.__next__() is executed, the code runs up to yield and returns the value after yield. If we call g.__next__() again, execution resumes from the point where the last yield paused. The context of the previous execution is retained, and iteration continues from there.

This is what yield does. Each time a generator is resumed, it keeps the state of the previous run, whereas an ordinary function returns and starts from the beginning again on the next call.
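To connect repeated next calls with the for loop, here is a rough sketch of what a for loop does with a generator. The helper manual_for is hypothetical and collects the items in a list instead of running a loop body.

```python
def gen(n):
    for i in range(n):
        yield i

def manual_for(iterable):
    """Roughly what `for item in iterable` does; collects items in a list."""
    items = []
    it = iter(iterable)          # obtain the iterator (a generator returns itself)
    while True:
        try:
            item = next(it)      # resume the generator until its next yield
        except StopIteration:    # the generator function has returned
            break
        items.append(item)
    return items

print(manual_for(gen(3)))  # [0, 1, 2]
```

The for loop catches StopIteration for us, which is why the exception is only visible when the next method is called by hand.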

In addition to saving state, a generator can have its internal state changed from outside, through the send and throw methods.

send

In the examples above, we only used the value after yield. The syntax j = yield i can also be used. Let’s look at the following code:

# coding: utf8

def gen():
    i = 1
    while True:
        j = yield i
        i *= 2
        if j == -1:
            break

At this point, if we execute the following code:

import time

for i in gen():
    print(i)
    time.sleep(1)

The output will be 1, 2, 4, 8, 16, 32, 64, and so on. The loop continues until we kill the process.

The code keeps looping because the for loop never passes a value in, so j is always None and the j == -1 branch that breaks out of the loop is never reached. What if we wanted execution to reach that branch?

This is where we use the generator’s send method, which changes the state of the generator by passing in values from outside.

The code could be written like this:

g = gen()   # Create a generator
print(g.__next__())  # 1
print(g.__next__())  # 2
print(g.__next__())  # 4
# send passes -1 into the generator so that the j == -1 branch is taken
print(g.send(-1))   # Raises StopIteration: the iteration stops

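When we execute g.send(-1), -1 is passed into the generator and becomes the result of the paused yield expression, so it is assigned to j. Since j == -1 now holds, the loop breaks, the function returns, and the generator does not iterate any further. The caller sees this as a StopIteration exception.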
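The following sketch replays the same gen function step by step and catches the final StopIteration instead of letting it propagate. The values list and the stopped flag are hypothetical names used only to record what happens.

```python
def gen():
    i = 1
    while True:
        j = yield i
        i *= 2
        if j == -1:
            break

g = gen()
values = [next(g)]            # 1: start the generator, run to the first yield
values.append(g.send(None))   # 2: send(None) behaves like next(g); j is None
values.append(g.send('x'))    # 4: j is 'x', which is not -1, so the loop goes on

try:
    g.send(-1)                # j is -1: the loop breaks and the function returns
    stopped = False
except StopIteration:
    stopped = True

print(values, stopped)  # [1, 2, 4] True
```

Note that a freshly created generator must first be advanced to a yield with next (or send(None)) before a real value can be sent into it.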

throw

In addition to passing a value inside the generator, we can also pass an exception by calling the throw method:

# coding: utf8

def gen():
    try:
        yield 1
    except ValueError:
        yield 'ValueError'
    finally:
        print('finally')

g = gen()   # Create a generator
print(g.__next__()) # 1
# Throw an exception into the generator; the generator yields 'ValueError'
print(g.throw(ValueError))

# Output:
# 1
# ValueError
# finally  (printed when the generator is finalized at the end of the script)

After the generator is created and advanced to its first yield, this example throws an exception into it with g.throw(ValueError). The exception is raised at the paused yield, so execution enters the generator’s exception-handling branch.
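A slightly extended sketch shows both outcomes of throw: an exception the generator handles and one it does not. The loop-based gen below is a hypothetical variant written for this illustration, not the function from the example above.

```python
def gen():
    while True:
        try:
            yield 'ok'
        except ValueError:
            # The exception thrown by the caller is raised at the paused yield.
            yield 'recovered from ValueError'

g = gen()
first = next(g)                  # 'ok'
second = g.throw(ValueError)     # handled inside; returns the next yielded value
third = next(g)                  # the loop continues: 'ok'

try:
    g.throw(KeyError)            # not handled inside, so it propagates back out
    propagated = False
except KeyError:
    propagated = True

print(first, second, third, propagated)
```

When the generator handles the exception, throw returns the next yielded value, just like next would. When it does not, the exception comes back to the caller and the generator is finished.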

close

The generator’s close method is relatively simple: it closes the generator manually, and a closed generator can no longer be iterated.

>>> g = gen()
>>> g.close()     # Close the generator
>>> g.__next__()  # Unable to iterate over data
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

The close method is used less often in development, so it is enough to know that it exists.
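One case where close matters is cleanup. As a minimal sketch: when a generator paused at a yield is closed, Python raises GeneratorExit at that yield, so a finally block inside the generator still runs. The events list is a hypothetical log used only to make this visible.

```python
events = []  # hypothetical log of what happens inside the generator

def gen():
    try:
        yield 1
        yield 2
    except GeneratorExit:
        events.append('GeneratorExit received')
        raise                      # re-raise so the generator really closes
    finally:
        events.append('cleanup')   # e.g. release a file or connection here

g = gen()
first = next(g)    # 1: the generator is now paused at the first yield
g.close()          # raises GeneratorExit at that yield
rest = list(g)     # a closed generator yields nothing more

print(first, events, rest)  # 1 ['GeneratorExit received', 'cleanup'] []
```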

Usage scenarios

Now that you know how yield and generators work, in which business scenarios are they commonly used?

Below are a few typical scenarios: generating large collections, simplifying code structure, and implementing coroutines and concurrency.

Generation of large sets

If you want to create a very large collection, building it as a list allocates a large amount of memory, as in the following:

# coding: utf8

def big_list():
    result = []
    for i in range(10000000000):
        result.append(i)
    return result

# The memory footprint is very large
for i in big_list():
    print(i)

In this scenario, a generator solves the problem well.

Because a generator produces each element only when execution reaches yield, it needs memory only for the element currently being returned, not for the whole collection. The code can be written like this:

# coding: utf8

def big_list():
    for i in range(10000000000):
        yield i

# Elements are generated one by one only during iteration, which reduces the memory footprint
for i in big_list():
    print(i)
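To get a feel for the difference, the sketch below compares the size of a list with the size of a generator object using sys.getsizeof. The names big_gen and N are hypothetical, and N is kept small so that the list version stays cheap to build. Note that sys.getsizeof reports only the container itself, not the integers it refers to, so the real gap is even larger.

```python
import sys

N = 100_000  # kept small so the list version is cheap to build

def big_list(n):
    result = []
    for i in range(n):
        result.append(i)
    return result

def big_gen(n):
    for i in range(n):
        yield i

list_size = sys.getsizeof(big_list(N))   # grows with N
gen_size = sys.getsizeof(big_gen(N))     # small: no elements are stored

print(list_size > gen_size)                  # True
print(sum(big_list(N)) == sum(big_gen(N)))   # True: same elements
```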

Simplified code structure

In development, we often encounter a scenario where a method returns a list, but the list is composed of multiple logical blocks, which makes our code structure very complicated:

# coding: utf8

def gen_list():
    # Multiple logical blocks build up one list
    result = []
    for i in range(10):
        result.append(i)
    for j in range(5):
        result.append(j * j)
    for k in [100, 200, 300]:
        result.append(k)
    return result

for item in gen_list():
    print(item)

In this case, each logical block has to add its elements to the list with append, which is rather verbose to write.

If you use yield to produce the elements instead, the code is much cleaner:

# coding: utf8

def gen_list():
    # Multiple logical blocks use yield to produce the elements
    for i in range(10):
        yield i
    for j in range(5):
        yield j * j
    for k in [100, 200, 300]:
        yield k

for item in gen_list():
    print(item)

With yield, you no longer need to define a variable of type list. You simply yield the element in each logical block, which achieves the same result as the previous example.

As we can see, the code that uses yield is cleaner and better structured, with the added benefit of lower memory consumption, because memory is only requested for the element currently being iterated.
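As a small check that the two styles produce the same elements, here is a shortened sketch. The names gen_list_eager and gen_list_lazy are hypothetical, and the ranges are reduced to keep the output short.

```python
def gen_list_eager():
    result = []
    for i in range(3):
        result.append(i)
    for j in range(3):
        result.append(j * j)
    return result

def gen_list_lazy():
    for i in range(3):
        yield i
    for j in range(3):
        yield j * j

eager = gen_list_eager()
lazy = list(gen_list_lazy())   # materialize only when a real list is needed

g = gen_list_lazy()
first_pass = list(g)
second_pass = list(g)          # a generator can be consumed only once

print(eager == lazy)   # True
print(lazy)            # [0, 1, 2, 0, 1, 4]
print(second_pass)     # []
```

One trade-off to keep in mind: unlike a list, a generator is exhausted after one pass, so callers that need to iterate twice should wrap the call in list().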

Coroutines and concurrency

Another scenario where yield is used a lot is “coroutines and concurrency.”

To make a program more efficient, we usually write it with multiple processes or multiple threads. The most common programming model is the “producer-consumer” model, in which one process/thread produces data and other processes/threads consume it.

When developing multi-process and multi-threaded programs, we usually need locks to keep shared resources from being corrupted by concurrent access, which increases the complexity of programming.

In Python, in addition to processes and threads, you can use “coroutines” to make your code run more efficiently.

What is a coroutine?

Simply put, a coroutine is a routine that can pause its own execution and hand control to another routine, so that several blocks of code cooperate to complete a task.

In Python, coroutines can be implemented with the yield keyword.

This may be hard to grasp in the abstract, so let’s use yield to implement a coroutine-based producer/consumer example:

# coding: utf8

def consumer():
    i = None
    while True:
        # Get the data from the producer
        j = yield i
        print('consume %s' % j)

def producer(c):
    c.__next__()
    for i in range(5):
        print('produce %s' % i)
        # Send data to the consumer
        c.send(i)
    c.close()

c = consumer()
producer(c)

# Output:
# produce 0
# consume 0
# produce 1
# consume 1
# produce 2
# consume 2
# produce 3
# consume 3
# produce 4
# consume 4

The execution flow of this program is as follows:

  1. c = consumer() creates a generator object
  2. producer(c) starts executing. c.__next__() starts the consumer generator, which runs until it reaches j = yield i. At this point the consumer’s first run is complete and control returns to the producer
  3. The producer function continues down to c.send(i), where it uses the generator’s send method to send data to the consumer
  4. The consumer function is woken up at j = yield i and continues executing. It receives the data sent by the producer, assigns it to j, and prints it, then runs until it reaches yield again and returns control
  5. The producer continues its loop, sending the data to the consumer item by item, until the loop ends
  6. Finally, c.close() shuts down the consumer generator and the program exits
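The steps above only send data in one direction. As a hedged variation, the sketch below also lets the consumer hand a result back: whatever the consumer yields becomes the return value of send in the producer. The reply and replies names, and the extra parameter n, are hypothetical additions for this illustration.

```python
def consumer():
    reply = None
    while True:
        # Hand the previous reply back to the producer and wait for new data.
        item = yield reply
        reply = 'done %s' % item

def producer(c, n):
    replies = []
    next(c)                        # prime the consumer: run it to the first yield
    for i in range(n):
        replies.append(c.send(i))  # send() returns what the consumer yields next
    c.close()
    return replies

print(producer(consumer(), 3))  # ['done 0', 'done 1', 'done 2']
```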

In this example, the program switches back and forth between the producer and consumer functions, which cooperate to complete the production and consumption tasks. Most importantly, the entire program runs in a single process and a single thread.

This example uses yield together with the generator’s __next__, send, and close methods described above. If it is hard to follow, read through the example a few times and try it out yourself.

When we use coroutines to write producer/consumer programs, the benefits are as follows:

  • No locks are used anywhere in the program and shared variables need no protection, because only one function runs at a time. This reduces programming complexity
  • The program switches between functions in user mode, unlike processes/threads, whose switches go through the kernel. This avoids the cost of kernel-mode context switches and makes execution more efficient

Thus, coroutines implemented with Python’s yield and generators provide a programming foundation for running tasks concurrently.
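To illustrate how generators can serve as a basis for concurrency, here is a minimal sketch of a round-robin scheduler that interleaves several generator-based tasks in a single thread. The names task and run_round_robin are hypothetical, and real libraries are considerably more elaborate than this.

```python
from collections import deque

def task(name, steps, log):
    for step in range(steps):
        log.append('%s step %s' % (name, step))
        yield                      # give control back to the scheduler

def run_round_robin(tasks):
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task until its next yield
        except StopIteration:
            continue               # task finished: do not reschedule it
        queue.append(current)      # otherwise put it at the back of the queue

log = []
run_round_robin([task('A', 2, log), task('B', 3, log)])
print(log)
# ['A step 0', 'B step 0', 'A step 1', 'B step 1', 'B step 2']
```

Each yield is a point where a task voluntarily gives up control, which is why no locks are needed: only one task runs at any moment.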

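Some third-party Python libraries are built around coroutines and can improve the performance of I/O-heavy programs. Tornado’s older generator-based coroutines, for example, rely on yield, while gevent provides coroutines through greenlets rather than yield.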

Conclusion

To summarize, this article has focused on how yield is used and on the main features of generators.

A generator is a special type of iterator that, in addition to iterating over data, preserves the state of the function between executions. It also lets outside code change that internal state by passing values into the generator.

In development, yield and generators can be used for generating large collections, simplifying code structure, and implementing coroutines and concurrency.

In Python, yield is also a basis for implementing coroutines and concurrency. It provides a user-mode programming model for coroutines that can improve the efficiency of programs.
