
Coroutines

What is a coroutine?

A coroutine is essentially a lightweight, user-mode thread: it is not managed by the operating-system kernel but is scheduled entirely by the program itself (it runs in user mode). A coroutine can suspend itself inside a subroutine, switch to execute another subroutine, and later resume exactly where it left off.

What’s the advantage of coroutines?

A coroutine carries its own context and register stack. On a switch it saves the registers and stack elsewhere, and on switching back it restores the previously saved context. Because this is a direct stack operation with essentially no kernel-level switching overhead, and because shared state can usually be accessed without locks, switching between coroutines is very fast.

The yield keyword

  1. In a coroutine, yield usually appears on the right-hand side of an assignment. If there is no expression to the right of yield, the value produced for the caller is None. Here there is an expression, so data is returned:
x = yield data
  2. The coroutine can receive data from the caller, fed in through send(x); send() also advances the coroutine the way next() does, so the program continues.
  3. A coroutine can suspend its own execution to let another coroutine run.

Code examples:

def hello():
    data = "mima"
    while True:
        x = yield data
        print(x)

a = hello()
next(a)
data = a.send("hello")
print(data)

Code details:

  • When the program starts, the function hello does not actually execute; calling it just returns a generator, which is bound to a.
  • When next(a) is called, the body of hello starts executing and runs until it reaches the first yield, where it pauses and hands "mima" back to the caller;
  • a.send("hello") resumes the coroutine from the yield expression, assigning "hello" to x; print(x) runs, the while loop comes back around to yield, and the coroutine pauses again, handing data back as the return value of send().

Coroutine running states:

  • GEN_CREATED: waiting to start execution
  • GEN_RUNNING: currently being executed by the interpreter
  • GEN_SUSPENDED: suspended at a yield expression
  • GEN_CLOSED: execution has completed
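These states can be observed directly with the standard library's inspect.getgeneratorstate(), using the hello generator from above:

```python
from inspect import getgeneratorstate

def hello():
    data = "mima"
    while True:
        x = yield data
        print(x)

g = hello()
print(getgeneratorstate(g))  # GEN_CREATED: not started yet
next(g)                      # run to the first yield
print(getgeneratorstate(g))  # GEN_SUSPENDED: paused at yield
g.close()
print(getgeneratorstate(g))  # GEN_CLOSED: finished
```

(GEN_RUNNING can only be observed from inside the generator while it is executing.)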

Producer-consumer pattern (coroutine)

import time

def consumer():
    r = ""
    while True:
        res = yield r
        if not res:
            print("Starting.....")
            return
        print("[CONSUMER] Consuming %s...." % res)
        time.sleep(1)
        r = "200 OK"

def produce(c):
    next(c)
    n = 0
    while n < 6:
        n += 1
        print("[PRODUCER] Producing %s ...." % n)
        r = c.send(n)
        print("[CONSUMER] Consumer return: %s ...." % r)
    c.close()

c = consumer()
produce(c)

Code details:

  • Call next(c) to prime the generator, running it to the first yield;
  • Once the producer has produced something, it switches to the consumer via c.send(n);
  • The consumer receives the message through the yield expression, processes it, and hands its result back at the next yield;
  • The producer takes the consumer's result and produces the next message;
  • When the producer stops producing, it calls c.close() to shut down the consumer, and the whole process ends.
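What c.close() actually does is raise GeneratorExit inside the suspended coroutine, which gives it a chance to clean up before it finishes. A minimal sketch (the worker name is illustrative, not part of the example above):

```python
def worker():
    try:
        while True:
            task = yield
            print("processing", task)
    except GeneratorExit:
        # Raised by close() at the suspended yield; clean up here.
        print("worker shutting down")

w = worker()
next(w)          # prime the coroutine to the first yield
w.send("job-1")  # prints: processing job-1
w.close()        # raises GeneratorExit inside worker; prints: worker shutting down
```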

Gevent third-party library coroutine support

Principle of use:

gevent is a coroutine-based Python networking library. When a greenlet encounters an IO operation (such as a network access), it automatically switches to another greenlet, then switches back at an appropriate time once the IO completes. In other words, gevent keeps greenlets running by switching between them automatically instead of blocking on IO.

Classic code

Since the switch must happen automatically whenever an IO operation occurs, gevent needs to modify Python's built-in blocking libraries. This is done with a monkey patch (dynamically modifying existing code at run time, without touching the original source).

#!/usr/bin/python2
# coding=utf8

from gevent import monkey
monkey.patch_all()

import gevent
import requests


def handle_html(url):
    print("Starting %s..." % url)
    response = requests.get(url)
    code = response.status_code

    print("%s: %s" % (url, str(code)))


if __name__ == "__main__":
    urls = ["https://www.baidu.com", "https://www.douban.com", "https://www.qq.com"]
    jobs = [gevent.spawn(handle_html, url) for url in urls]

    gevent.joinall(jobs)

Code details:

  • First apply the monkey patch with monkey.patch_all();
  • The example simulates concurrent requests to multiple URLs, which would otherwise run serially.

Running results:

Starting https://www.baidu.com...
Starting https://www.douban.com...
Starting https://www.qq.com...
https://www.baidu.com: 200
https://www.douban.com: 418
https://www.qq.com: 200
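The monkey-patching idea itself can be illustrated in plain Python: rebind a name at run time and every later caller picks up the replacement. A toy sketch of the idea (fake_sleep is a made-up stand-in, not gevent's actual implementation, which swaps in cooperative versions of blocking functions like time.sleep and the socket module):

```python
import time

def fake_sleep(seconds):
    # Instead of blocking, just note that a switch point was reached.
    print("would have slept %s s, switching instead" % seconds)

# Monkey patch: rebind time.sleep at run time, without editing time's source.
original_sleep = time.sleep
time.sleep = fake_sleep

time.sleep(2)  # returns instantly and prints the message instead of blocking

time.sleep = original_sleep  # restore the real function
```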

Asyncio built-in library coroutine support

Principle of use:

The programming model of asyncio is an event loop: obtain an EventLoop from the asyncio module, then put the coroutines to be executed into it to achieve asynchronous IO.

Code examples:

import asyncio
import threading

async def hello():
    print("hello, world: %s" % threading.currentThread())
    await asyncio.sleep(1)  # simulate an IO operation taking 1 second
    print('hello, man %s' % threading.currentThread())

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait([hello(), hello()]))
    loop.close()

Code parsing:

  • First get an EventLoop;
  • Then put the hello coroutines into the EventLoop and run it; it runs until the futures complete;
  • Inside the hello coroutine, await asyncio.sleep(1) simulates an IO operation that takes 1 second. During that second the thread does not block: the event loop switches to the other coroutine in the same thread, which is why both run concurrently on a single thread (the output below shows MainThread for every line).

Running results:

hello, world: <_MainThread(MainThread, started 139944938350400)>
hello, world: <_MainThread(MainThread, started 139944938350400)>
hello, man <_MainThread(MainThread, started 139944938350400)>
hello, man <_MainThread(MainThread, started 139944938350400)>
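The single-thread concurrency can also be checked by timing: two one-second sleeps awaited together finish in about one second, not two. A minimal sketch (using asyncio.run, available since Python 3.7):

```python
import asyncio
import time

async def nap():
    await asyncio.sleep(1)  # yields control back to the event loop

async def main():
    start = time.monotonic()
    await asyncio.gather(nap(), nap())  # both sleeps overlap
    return time.monotonic() - start

elapsed = asyncio.run(main())
print("elapsed: %.2f s" % elapsed)  # roughly 1.00, not 2.00
```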

Asynchronous crawler instance

#!/usr/bin/python3

import aiohttp
import asyncio

async def fetch(url, session):
    print("starting: %s" % url)
    async with session.get(url) as response:
        print("%s : %s" % (url, response.status))
        return await response.read()

async def run():
    urls = ["https://www.baidu.com", "https://www.douban.com", "http://www.mi.com"]
    tasks = []
    async with aiohttp.ClientSession() as session:
        tasks = [asyncio.ensure_future(fetch(url, session)) for url in urls]  # create the tasks
        responses = await asyncio.gather(*tasks)  # execute the tasks concurrently

        for body in responses:
            print(len(responses))

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(run())
    loop.close()

Code parsing:

  • Create an event loop and put the run() coroutine into it;
  • The run() method creates the tasks, executes them concurrently, and reads back the page content;
  • The fetch() method issues the request through aiohttp and returns an awaitable that resolves to the response body.

Running results:

starting: https://www.baidu.com
starting: https://www.douban.com
starting: http://www.mi.com
https://www.douban.com : 200
https://www.baidu.com : 200
http://www.mi.com : 200
3
3
3

(The completion order of the URLs in the output differs from their order in the list, which is evidence that the coroutines perform asynchronous I/O.)

About aiohttp

asyncio implements the TCP, UDP, SSL and other protocols; aiohttp is an HTTP framework built on top of asyncio, which can be used to write a miniature HTTP server.

Code examples:

import asyncio
from aiohttp import web

async def index(request):
    await asyncio.sleep(0.5)
    print(request.path)
    return web.Response(body=b'<h1>Hello, World</h1>', content_type='text/html')

async def hello(request):
    await asyncio.sleep(0.5)
    text = '<h1>hello, %s</h1>' % request.match_info['name']
    print(request.path)
    return web.Response(body=text.encode('utf-8'), content_type='text/html')

async def init(loop):
    app = web.Application(loop=loop)
    app.router.add_route("GET", "/", index)
    app.router.add_route("GET", "/hello/{name}", hello)
    srv = await loop.create_server(app.make_handler(), '127.0.0.1', 8000)
    print("Server started at http://127.0.0.1:8000...")
    return srv

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(init(loop))
    loop.run_forever()

Code parsing:

  • Create an event loop and pass it into the init coroutine;
  • Create an Application instance and then add a route to handle the specified request;
  • Create the TCP service through loop, and start the event loop.

Reference

  • www.liaoxuefeng.com/wiki/101695…
  • docs.aiohttp.org/en/stable/w…
  • docs.python.org/zh-cn/3.7/l…