Preface

Many Python developers already know about coroutines and how to use them. There are several common approaches to high concurrency in network programming: multithreading, multiprocessing, and coroutines. With coroutines, scheduling is driven by the user program itself: a function can yield control at a suspension point so that the program does not block on a single task, which makes efficient concurrency possible. Since Python 3.5, coroutines are written with the async/await syntax. This article describes the use of async/await in detail, along with how to implement a non-blocking server with Tornado.

The state of Python coroutines and I/O calls

General use of coroutines

import asyncio

async def do_some_work(x):
    print('Waiting: ', x)
    # When a coroutine hits an await, the event loop suspends it and runs
    # other coroutines until they are also suspended or completed, then
    # resumes this one.
    await asyncio.sleep(x)
    return 'Done after {}s'.format(x)
    
coroutine = do_some_work(2)
loop = asyncio.get_event_loop()
loop.run_until_complete(coroutine)

In the example above, if multiple coroutines run do_some_work at the same time, the program does not block waiting for each sleep to complete: while one coroutine is suspended in await asyncio.sleep, the others can make progress.
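To make this concrete, here is a small sketch (not from the original article) that runs three such coroutines concurrently with asyncio.gather; the total wall time is roughly the longest sleep, not the sum:

```python
import asyncio
import time

async def do_some_work(x):
    # Suspends here; the event loop runs the other coroutines meanwhile.
    await asyncio.sleep(x)
    return 'Done after {}s'.format(x)

async def main():
    start = time.monotonic()
    # Schedule three coroutines concurrently and wait for all of them.
    results = await asyncio.gather(
        do_some_work(0.1), do_some_work(0.2), do_some_work(0.3))
    elapsed = time.monotonic() - start
    return results, elapsed

loop = asyncio.new_event_loop()
results, elapsed = loop.run_until_complete(main())
loop.close()
print(results)
print('elapsed ~{:.1f}s'.format(elapsed))  # ~0.3s, not 0.6s
```

If the three sleeps ran one after another they would take 0.6 seconds; because each await hands control back to the event loop, the total is only about 0.3 seconds.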

However, in Python development we encounter many I/O calls, such as network requests and database queries, and these take a long time to complete. If many requests perform blocking I/O, a single-process Python server spends most of its time waiting on I/O, so later requests respond slowly or even time out. Time-consuming I/O operations therefore need to run in non-blocking mode, which greatly improves the throughput and performance of the system.

The general I/O call method in Python

Python commonly uses the requests or urllib library for I/O calls such as HTTP requests.

from urllib import parse,request
import json

# If the request carries data it is sent as a POST request; without data it is a GET request
textmod = {"username": "admin", "password": "123456"}
textmod = json.dumps(textmod).encode(encoding='utf-8')
header_dict = {'Accept': 'application/json', 'Content-Type': 'application/json'}
url = 'http://localhost:8080/api/xxx'
req = request.Request(url=url, data=textmod, headers=header_dict)
res = request.urlopen(req)
res = res.read()

The generic I/O invocation above is blocking, and the same is true of common database layers such as SQLAlchemy and the Django ORM. If an API server built with Flask, Django, etc. is deployed this way, each blocking call stalls the single thread, so you have to rely on multiprocessing, multithreading, or a library like gevent to support concurrency.
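For comparison, blocking calls can be parallelized with a thread pool. A minimal stdlib sketch, with time.sleep standing in for a blocking call such as request.urlopen:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_io(x):
    time.sleep(x)  # stands in for urlopen, a database query, etc.
    return 'Done after {}s'.format(x)

start = time.monotonic()
# Three blocking calls run in parallel threads instead of back to back.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(blocking_io, [0.1, 0.2, 0.3]))
elapsed = time.monotonic() - start
print(results)
print('elapsed ~{:.1f}s'.format(elapsed))  # ~0.3s rather than 0.6s
```

Threads do work, but each one carries its own stack and scheduling overhead, which is why the article recommends coroutines when the number of concurrent I/O tasks is large.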

Next, let's introduce some libraries that implement asynchronous (non-blocking) I/O, modeled on the common I/O call libraries above.

Common asynchronous I/O libraries

Suppose we try to implement the sleep inside a coroutine like this:

import asyncio
import time

async def do_some_work(x):
    print('Waiting: ', x)
    time.sleep(x)
    return 'Done after {}s'.format(x)
    
coroutine = do_some_work(2)
loop = asyncio.get_event_loop()
loop.run_until_complete(coroutine)

Although the method is declared async, it is not non-blocking: when execution reaches time.sleep, the program still blocks for x seconds, and await cannot help. We must change time.sleep(x) to await asyncio.sleep(x) to make it non-blocking. Likewise, libraries such as urllib, requests, and SQLAlchemy are blocking: a thread that reaches one of their I/O calls waits until the call returns. The alternatives are multithreading, whose efficiency is limited and which is harder to manage, or coroutines, which are the recommended solution. Below are a few third-party HTTP and MySQL libraries that implement asynchronous I/O; their detailed implementations can be viewed on GitHub.
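The blocking behaviour is easy to demonstrate (a sketch, not from the original article): even under asyncio.gather, coroutines that call time.sleep run one after another, because the sleep never yields control to the event loop.

```python
import asyncio
import time

async def blocked(x):
    time.sleep(x)  # blocks the whole event loop; nothing else can run
    return x

async def main():
    start = time.monotonic()
    await asyncio.gather(blocked(0.1), blocked(0.1))
    return time.monotonic() - start

loop = asyncio.new_event_loop()
elapsed = loop.run_until_complete(main())
loop.close()
print('elapsed ~{:.1f}s'.format(elapsed))  # ~0.2s: the sleeps ran serially
```

Swapping time.sleep for await asyncio.sleep drops the total back to roughly the longest single sleep.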

Aiohttp library

import asyncio
from aiohttp import ClientSession

url = "https://www.baidu.com/{}"
async def hello(url):
    async with ClientSession() as session:
        async with session.get(url) as response:
            response = await response.read()
            print(response)

if __name__ == '__main__':
    coroutine = hello(url)
    loop = asyncio.get_event_loop()
    loop.run_until_complete(coroutine)

The example above makes the HTTP request non-blocking: when multiple coroutine tasks are run through loop.run_until_complete() and coroutine1 reaches await response.read(), coroutine2 can start executing instead of waiting for coroutine1 to finish. The same applies to aiomysql, aioredis, and similar libraries.

Aiomysql library

import asyncio
import aiomysql

async def test_example(loop):
    pool = await aiomysql.create_pool(host='127.0.0.1', port=3306,
                                      user='root', password=' ',
                                      db='mysql', loop=loop)
    async with pool.acquire() as conn:
        async with conn.cursor() as cur:
            await cur.execute("SELECT 42;")
            print(cur.description)
            (r,) = await cur.fetchone()
            assert r == 42
    pool.close()
    await pool.wait_closed()

loop = asyncio.get_event_loop()
loop.run_until_complete(test_example(loop))

Using the aiomysql library, creating database tables and performing CRUD operations can all be non-blocking; after all, MySQL itself supports concurrent (multithreaded) access on the server side.

About async and await

How are async and await implemented in Python? In older versions, the yield keyword turned a function into a generator, and each coroutine was created by wrapping such a function with the @asyncio.coroutine decorator. async and await are new syntax since version 3.5, but the underlying implementation is much the same. Take the asyncio.sleep() example above: when several tasks such as coroutine1 and coroutine2 are passed to loop.run_until_complete(), the sleep suspends coroutine1 the way a generator pauses at a yield. The loop then runs coroutine2, and run_until_complete() keeps iterating over the suspended generators, collecting each coroutine's result and resuming the steps after each suspension point, which produces the non-blocking effect. However, this approach still cannot take advantage of multiple CPU cores, so the best server deployment mode remains multi-process + coroutines.
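The suspend-and-resume mechanism described above can be sketched with plain generators and a toy round-robin scheduler. This is a deliberate simplification: real asyncio waits on I/O readiness instead of spinning through tasks, but the idea of pausing at a yield point and resuming later is the same.

```python
from collections import deque

def worker(name, steps):
    # Each yield is a suspension point, playing the role of an await.
    for _ in range(steps):
        yield
    return '{} finished'.format(name)

def run_all(gens):
    """Resume each generator in turn until every one has finished."""
    ready = deque(gens)
    results = []
    while ready:
        gen = ready.popleft()
        try:
            next(gen)          # resume until the next suspension point
            ready.append(gen)  # not done yet: go to the back of the queue
        except StopIteration as stop:
            results.append(stop.value)  # PEP 380: return value of the generator
    return results

print(run_all([worker('a', 2), worker('b', 1)]))  # ['b finished', 'a finished']
```

Note that 'b' finishes first even though it was scheduled second, because it has fewer suspension points; no task ever blocks the others.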

Implementing an asynchronous server with Tornado

So far we have seen Python coroutines in standalone scripts; now let's look at how a server implements them. Frameworks commonly used with Python, such as Flask and Django, cannot run non-blocking while serving an API and listening on a port, so they are usually deployed with multiple threads or processes to improve concurrency. Tornado, which has become popular in recent years, is built on I/O multiplexing (the epoll mechanism on Linux): its event loop, originally the ioloop package, has gradually converged with the asyncio library introduced in Python 3.4, so a Tornado API service is event-driven.
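The I/O-multiplexing idea underlying Tornado's event loop can be sketched with the stdlib selectors module (which picks epoll on Linux and kqueue on macOS). In this illustrative snippet a socket pair stands in for a real client connection:

```python
import selectors
import socket

sel = selectors.DefaultSelector()
r, w = socket.socketpair()
r.setblocking(False)

# Ask the kernel to watch this socket for readability.
sel.register(r, selectors.EVENT_READ)
w.send(b'ping')

# One iteration of an event loop: block until some registered fd is
# readable, then run the work associated with it.
received = []
for key, mask in sel.select(timeout=1):
    received.append(key.fileobj.recv(1024))

sel.unregister(r)
r.close()
w.close()
print(received)  # [b'ping']
```

A real server keeps looping over sel.select(), registering listening and client sockets; one thread can thereby service thousands of connections, which is exactly what Tornado's IOLoop (and asyncio) does under the hood.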

The basic implementation

import tornado.web
import tornado.httpserver
import tornado.ioloop

class IndexPageHandler(tornado.web.RequestHandler):
    def get(self):
        self.render('tornado_index.html')

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            (r'/', IndexPageHandler),
        ]

        settings = {"template_path": "templates/"}
        tornado.web.Application.__init__(self, handlers, **settings)

if __name__ == '__main__':
    app = Application()
    server = tornado.httpserver.HTTPServer(app)
    server.listen(5000)
    tornado.ioloop.IOLoop.instance().start()

The Tornado framework above is started with its plain IOLoop. In this mode, if an async get method is defined in our IndexPageHandler controller, an await inside it cannot be non-blocking. We therefore need the following asyncio + uvloop event loop mechanism to achieve non-blocking behaviour:

Non-blocking implementation

import tornado.web
import tornado.httpserver
import tornado.ioloop
import tornado.platform.asyncio as tornado_asyncio
import asyncio
import uvloop

class IndexPageHandler(tornado.web.RequestHandler):
    def get(self):
        self.render('tornado_index.html')

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            (r'/', IndexPageHandler),
        ]

        settings = {"template_path": "templates/"}
        tornado.web.Application.__init__(self, handlers, **settings)

if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()
    app = Application()
    server = tornado.httpserver.HTTPServer(app)
    server.listen(5000)
    asyncio.get_event_loop().run_forever()

When Tornado is started this way, we can declare async methods in IndexPageHandler and the other handlers, import the asynchronous I/O libraries introduced above, and await their calls.

Tornado deployment best practices

  1. Drive the Tornado framework in non-blocking mode;
  2. Run the API service in multi-process mode;
  3. If environment isolation or rapid scaling is required, deployment with Docker is recommended.
