This article was first published on: Walker AI

While writing code recently, I came across functions defined with async and await. After some research, I learned that these are asynchronous functions built on coroutines. This style is called asynchronous programming, and it is common in I/O-heavy systems such as the Tornado web framework, file downloaders, and web crawlers. While one task waits on I/O, a coroutine can hand control to another task and resume automatically once the I/O finishes, which can save considerable resources and improve performance. This article briefly introduces the concepts of asynchronous programming and walks through some examples.

1. Introduction to coroutines

1.1 Definition and implementation of coroutines

A coroutine, sometimes called a micro-thread, is a user-space context-switching technique. In short, it lets a single thread switch execution between blocks of code. For example:

def func1():
    print(1)
    ...  # a coroutine could switch away here
    print(2)

def func2():
    print(3)
    ...  # a coroutine could switch away here
    print(4)

func1()
func2()


The code above defines and calls two ordinary functions. Each function body runs from top to bottom, so the output is 1, 2, 3, 4. With coroutines, execution can switch between the two functions partway through, giving the output 1, 3, 2, 4.
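
As a rough illustration of that switching, here is a minimal sketch built on plain generators. The helper run_round_robin is hypothetical (it is not part of any library), and the functions append to a list instead of printing so the order is easy to inspect.

```python
def func1(log):
    log.append(1)
    yield          # pause here and hand control back to the scheduler
    log.append(2)

def func2(log):
    log.append(3)
    yield          # pause here and hand control back to the scheduler
    log.append(4)

def run_round_robin(generators):
    """Resume each generator in turn until all of them are exhausted."""
    pending = list(generators)
    while pending:
        for gen in list(pending):      # iterate over a copy so removal is safe
            try:
                next(gen)
            except StopIteration:
                pending.remove(gen)

log = []
run_round_robin([func1(log), func2(log)])
print(log)  # [1, 3, 2, 4]
```

Each yield marks a point where one function steps aside, which is the role await plays in the asyncio examples later on.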

There are several ways to implement coroutines in Python:

  • greenlet, a third-party module for writing coroutine code (gevent's coroutines are built on greenlet);

  • yield, which lets generators be used to write coroutine code;

  • asyncio, a module introduced in Python 3.4 for writing coroutine code;

  • async & await, two keywords introduced in Python 3.5 that work with the asyncio module to make coroutine code easier to write.

The first two approaches are older, so this article focuses on the last two.

Standard library implementation method

asyncio is a standard-library module introduced in Python 3.4 that provides built-in support for asynchronous I/O.

import asyncio

# Note: @asyncio.coroutine was deprecated in Python 3.8 and removed in 3.11,
# so this generator-based style only runs on older interpreters.
@asyncio.coroutine
def func1():
    print(1)
    yield from asyncio.sleep(2)  # on a slow I/O operation, switch automatically to another task in tasks
    print(2)

@asyncio.coroutine
def func2():
    print(3)
    yield from asyncio.sleep(2)  # on a slow I/O operation, switch automatically to another task in tasks
    print(4)

tasks = [
    asyncio.ensure_future(func1()),
    asyncio.ensure_future(func2())
]

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(tasks))


Keyword implementation method

The async and await keywords were introduced in Python 3.5 and replace the asyncio.coroutine decorator. Coroutine code written with them is essentially an improved version of the previous example and is much easier to read.


import asyncio

async def func1():
    print(1)
    await asyncio.sleep(2)  # time-consuming operation
    print(2)

async def func2():
    print(3)
    await asyncio.sleep(2)  # time-consuming operation
    print(4)

async def main():
    # Create the tasks inside main() so that an event loop is already running
    tasks = [
        asyncio.ensure_future(func1()),
        asyncio.ensure_future(func2())
    ]
    await asyncio.wait(tasks)

asyncio.run(main())


1.2 Case Demonstration

As an example, the code below downloads the images listed in url_list.

  • Method 1: Synchronous programming
# The requests library only supports synchronous HTTP requests
import requests

def download_image(url):
    print("Download:", url)
    # Send a network request to download the image
    response = requests.get(url)
    # Save the image to a local file, using the last path segment as the file name
    file_name = url.rsplit('/', 1)[-1]
    with open(file_name, mode='wb') as file_object:
        file_object.write(response.content)
    print("Download completed")


if __name__ == '__main__':
    url_list = [
        'https://www.1.jpg', 'https://www.2.jpg', 'https://www.3.jpg'
    ]
    for item in url_list:
        download_image(item)


Result: the requests are sent one after another, one image at a time. If each download takes 1s, the whole task takes a little over 3s.

  • Method 2: Coroutine-based asynchronous programming
# aiohttp is an HTTP client library that supports asynchronous programming
import aiohttp
import asyncio

async def fetch(session, url):
    print("Send request:", url)
    # ssl=False disables certificate verification (it replaces the deprecated verify_ssl=False)
    async with session.get(url, ssl=False) as response:
        content = await response.content.read()
        # Use the last path segment as the file name
        file_name = url.rsplit('/', 1)[-1]
        with open(file_name, mode='wb') as file_object:
            file_object.write(content)

async def main():
    async with aiohttp.ClientSession() as session:
        url_list = [
            'https://www.1.jpg', 'https://www.2.jpg', 'https://www.3.jpg'
        ]
        tasks = [asyncio.create_task(fetch(session, url)) for url in url_list]
        await asyncio.wait(tasks)


if __name__ == '__main__':
    asyncio.run(main())


Result: all three download requests are sent at once and the downloads proceed concurrently. If each download takes 1s, the whole task takes only about 1s, so the first method takes roughly three times as long as the second.
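
The timing difference can be checked without any network access. The following is a minimal sketch that assumes asyncio.sleep is a fair stand-in for a download; fake_download, timed, and the file names are hypothetical, and the delay is shortened to 0.1s so the script finishes quickly.

```python
import asyncio
import time

async def fake_download(url, delay=0.1):
    # asyncio.sleep stands in for the network wait of a real download
    await asyncio.sleep(delay)
    return url

async def download_sequentially(urls):
    # Each download is awaited before the next one starts
    return [await fake_download(url) for url in urls]

async def download_concurrently(urls):
    # All downloads are started together and awaited as a group
    return await asyncio.gather(*(fake_download(url) for url in urls))

def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start

urls = ["image_1.jpg", "image_2.jpg", "image_3.jpg"]
seq_result, seq_seconds = timed(download_sequentially(urls))
con_result, con_seconds = timed(download_concurrently(urls))
print(f"sequential: {seq_seconds:.2f}s, concurrent: {con_seconds:.2f}s")
```

The sequential run should take roughly three delays and the concurrent run roughly one, mirroring the 3s versus 1s estimate above.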

1.3 Summary

Coroutines let you write code that would otherwise need an unintuitive asynchronous-plus-callback style in a form that looks synchronous.
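
To make the contrast concrete, here is a small sketch of the same two-step job written both ways. The function names and the 0.01s delays are made up for illustration; loop.call_later and loop.create_future are standard asyncio calls.

```python
import asyncio

def callback_style(loop, log, done):
    # Each step must be scheduled explicitly and chained through callbacks.
    def step_two():
        log.append("callback: step 2")
        done.set_result(None)          # signal that the chain has finished

    def step_one():
        log.append("callback: step 1")
        loop.call_later(0.01, step_two)

    loop.call_later(0.01, step_one)

async def coroutine_style(log):
    # The same two steps read top to bottom, like synchronous code.
    await asyncio.sleep(0.01)
    log.append("coroutine: step 1")
    await asyncio.sleep(0.01)
    log.append("coroutine: step 2")

async def main():
    log = []
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    callback_style(loop, log, done)
    await done                         # wait for the callback chain to finish
    await coroutine_style(log)
    return log

log = asyncio.run(main())
print(log)
```

Both versions wait without blocking the thread, but only the coroutine version keeps the steps in one readable block.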

2. Introduction to asynchronous programming

2.1 Differences between synchronous and asynchronous

Synchronous: operations and requests are performed step by step, each one finishing before the next begins.

Asynchronous: the next step can start without waiting for the previous operation or request to complete (each operation still has its own internal order).

The mainstream way to write asynchronous code in Python today is the asyncio module together with the async and await keywords.

2.2 Asynchronous programming – Event loops

An event loop can be thought of as a while loop that repeatedly checks for tasks and runs them, and that terminates under certain conditions.

# pseudocode
task_list = [task1, task2, task3, ...]
while True:
    ready_tasks, completed_tasks = check every task in task_list and return the 'ready' and 'completed' ones
    for task in ready_tasks:
        run the ready task
    for task in completed_tasks:
        remove the completed task from task_list
    if every task in task_list has completed:
        terminate the loop
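
The pseudocode above can be approximated with plain generators. This is only a toy sketch under a strong simplification: every unfinished task is treated as ready, whereas a real event loop decides readiness from I/O notifications and timers. The names countdown and run_event_loop are hypothetical.

```python
def countdown(name, steps, log):
    """A hypothetical task: each `yield` marks a point where it would wait."""
    for remaining in range(steps, 0, -1):
        log.append(f"{name}:{remaining}")
        yield

def run_event_loop(tasks):
    task_list = list(tasks)
    while task_list:                   # stop once every task has finished
        finished = []
        for task in task_list:         # treat every remaining task as ready
            try:
                next(task)             # run the task up to its next pause
            except StopIteration:
                finished.append(task)
        for task in finished:          # remove completed tasks from the list
            task_list.remove(task)

log = []
run_event_loop([countdown("a", 2, log), countdown("b", 1, log)])
print(log)  # ['a:2', 'b:1', 'a:1']
```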

When writing programs, the following code can be used to get or create an event loop.

# Method 1:
import asyncio
# Create or get an event loop
loop = asyncio.get_event_loop()
# Add the task to the event loop
loop.run_until_complete(task)

# Method 2 (Python 3.7 and above):
asyncio.run(task)
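
A runnable version of the two methods might look like the sketch below. It swaps asyncio.get_event_loop() for asyncio.new_event_loop(), on the assumption that explicitly creating the loop is the safer choice on recent Python versions, where calling get_event_loop() with no loop running is deprecated. The coroutine task() is a placeholder.

```python
import asyncio

async def task():
    await asyncio.sleep(0.01)
    return "done"

# Method 1: manage the event loop by hand.
loop = asyncio.new_event_loop()
try:
    result_1 = loop.run_until_complete(task())
finally:
    loop.close()

# Method 2 (Python 3.7+): asyncio.run creates and closes a loop for you.
result_2 = asyncio.run(task())

print(result_1, result_2)
```

Method 2 is shorter and handles cleanup automatically, which is why the remaining examples use it.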

2.3 Asynchronous programming – Get started quickly

The async keyword

  • Coroutine function: a function defined with the async keyword, i.e. async def function_name.
  • Coroutine object: the object returned by calling a coroutine function.
# coroutine function
async def func():
    pass

# coroutine object
result = func()

Note: calling a coroutine function only creates a coroutine object; the function body does not run. To run the body, you must hand the coroutine object to an event loop.

import asyncio

async def func():
    print("Execute coroutine function internal code!")

result = func()

# Call method 1:
# loop = asyncio.get_event_loop()
# loop.run_until_complete(result)

# Call method 2:
asyncio.run(result)
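
The point that the body does not run until the event loop takes over can be verified directly. In this small sketch the calls list is a hypothetical probe that records whether the body has executed.

```python
import asyncio
import inspect

calls = []

async def func():
    calls.append("body executed")
    return 42

coro = func()                         # creates a coroutine object only
is_coroutine = inspect.iscoroutine(coro)
calls_before_run = list(calls)        # still empty: the body has not run yet

value = asyncio.run(coro)             # the event loop actually runs the body
print(is_coroutine, calls_before_run, calls, value)
```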

The await keyword

await is followed by an awaitable object (a coroutine, Future, or Task, i.e. something that waits on I/O). When the awaited I/O operation starts, the current coroutine (task) is suspended, and it resumes once the operation completes. While it is suspended, the event loop can run other coroutines (tasks).

import asyncio

async def others():
    print("start")
    await asyncio.sleep(2)
    print('end')
    return 'Return value'

async def func():
    print("Execute the code inside the coroutine function.")
    # Wait for others() to return its value before continuing
    response = await others()
    print("I/O request ends with:", response)

asyncio.run(func())

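
For comparison, the sketch below awaits each of the three awaitable kinds mentioned above. The return strings and the 0.01s delays are made up; the Future is resolved with loop.call_later purely to simulate some other code supplying the result.

```python
import asyncio

async def others():
    await asyncio.sleep(0.01)
    return "coroutine result"

async def main():
    loop = asyncio.get_running_loop()

    # 1. Awaiting a coroutine object runs it to completion right here.
    from_coroutine = await others()

    # 2. A Task wraps a coroutine and schedules it to run on the event loop.
    task = asyncio.create_task(others())
    from_task = await task

    # 3. A Future is a low-level placeholder that some other code resolves.
    future = loop.create_future()
    loop.call_later(0.01, future.set_result, "future result")
    from_future = await future

    return from_coroutine, from_task, from_future

results = asyncio.run(main())
print(results)
```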

Task object

Tasks are used to schedule coroutines concurrently in an event loop. A Task is created with asyncio.create_task, which adds the coroutine to the event loop so that it can be run.

import asyncio

async def module_a():
    print("start module_a")
    await asyncio.sleep(2)  # simulate an I/O operation in module_a
    print('end module_a')
    return 'module_a finish'

async def module_b():
    print("start module_b")
    await asyncio.sleep(1)  # simulate an I/O operation in module_b
    print('end module_b')
    return 'module_b finish'

async def main():
    # asyncio.wait expects Tasks; passing bare coroutines fails on Python 3.11+
    task_list = [
        asyncio.create_task(module_a()),
        asyncio.create_task(module_b()),
    ]
    done, pending = await asyncio.wait(task_list)
    return done

done = asyncio.run(main())
print(done)
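
When the return values matter more than the Task objects, asyncio.gather is one alternative to asyncio.wait. The sketch below reuses the module_a and module_b names from above with shortened delays; it is a variation for illustration, not the original author's code.

```python
import asyncio

async def module_a():
    await asyncio.sleep(0.02)   # simulated I/O for module_a
    return "module_a finish"

async def module_b():
    await asyncio.sleep(0.01)   # simulated I/O for module_b
    return "module_b finish"

async def main():
    task_a = asyncio.create_task(module_a())
    task_b = asyncio.create_task(module_b())
    # gather returns results in the order the awaitables were passed in,
    # even though module_b finishes first.
    return await asyncio.gather(task_a, task_b)

results = asyncio.run(main())
print(results)  # ['module_a finish', 'module_b finish']
```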

2.4 Case Demonstration

For example, the code below connects to and queries a database while downloading an APK file to the local disk.

import asyncio
import aiomysql
import aiofiles
from aiohttp import ClientSession

async def get_app():
    url = "http://www.123.apk"
    async with ClientSession() as session:
        # Network I/O: send the request and get the response
        async with session.get(url) as res:
            if res.status == 200:
                print("Download successful", res)
                # Network I/O: read the response data
                apk = await res.content.read()
                async with aiofiles.open("demo2.apk", "wb") as f:
                    # Disk I/O: write the data to the local disk
                    await f.write(apk)
            else:
                print("Download failed")

async def execute_sql(sql):
    # Network I/O: connect to MySQL
    conn = await aiomysql.connect(host='127.0.0.1', port=3306, user='root', password='123', db='mysql')
    # Network I/O: create a cursor
    cur = await conn.cursor()
    # Network I/O: execute the SQL
    await cur.execute(sql)
    # Network I/O: fetch the SQL results
    result = await cur.fetchall()
    print(result)
    # Network I/O: close the connection
    await cur.close()
    conn.close()

async def main():
    # asyncio.wait expects Tasks; passing bare coroutines fails on Python 3.11+
    task_list = [
        asyncio.create_task(get_app()),
        asyncio.create_task(execute_sql(sql="SELECT Host,User FROM user")),
    ]
    await asyncio.wait(task_list)

asyncio.run(main())

Code logic analysis:

[Step 1] asyncio.run() creates an event loop, and wait() adds the task list to that event loop. (Note: the event loop must exist before tasks are added to it, otherwise an error is raised.)

[Step 2] The event loop monitors the state of the tasks and starts executing code, beginning with get_app(), the first item in the list. When execution reaches async with session.get(url) as res:, it hits an awaited, time-consuming I/O operation. The thread suspends that task and switches to the other asynchronous function, execute_sql().

[Step 3] When execution reaches the first I/O operation in execute_sql(), the thread does the same thing: it suspends that task and looks for other code that is ready to run. If the event loop learns that the first I/O operation in get_app() has completed, the thread resumes get_app() right after that operation and runs it until the next awaited I/O operation. If instead the first I/O operation in execute_sql() completes before the one in get_app(), the thread continues with execute_sql().

[Step 4] The thread repeats Step 3 until all the code has run, at which point the event loop stops.
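
The switching described in these steps can be traced without a network or a database. The following sketch assumes asyncio.sleep can stand in for each I/O wait; the delays are invented so that the simulated database calls finish before the simulated download, and the log messages are hypothetical.

```python
import asyncio

async def get_app(log):
    log.append("get_app: request sent")
    await asyncio.sleep(0.1)           # stands in for the HTTP download
    log.append("get_app: response received")
    await asyncio.sleep(0.1)           # stands in for writing the file
    log.append("get_app: file written")

async def execute_sql(log):
    log.append("execute_sql: connecting")
    await asyncio.sleep(0.01)          # stands in for the MySQL connection
    log.append("execute_sql: query sent")
    await asyncio.sleep(0.01)          # stands in for fetching the rows
    log.append("execute_sql: rows fetched")

async def main():
    log = []
    await asyncio.gather(get_app(log), execute_sql(log))
    return log

log = asyncio.run(main())
print("\n".join(log))
```

With these delays, get_app() starts first, execute_sql() runs through all of its steps while the download is still waiting, and get_app() finishes last.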

2.5 Summary

Generally speaking, tasks fall into two categories according to how they use the CPU:

Compute-intensive operations: tasks such as calculating pi or decoding high-definition video, which involve heavy computation and logical decisions and consume a lot of CPU.

I/O-intensive operations: tasks that involve network or disk I/O. They consume little CPU and spend most of their time waiting for I/O to complete (because I/O is far slower than the CPU and memory).

Asynchronous programming is built on coroutines. Using coroutines for compute-intensive operations lowers overall performance, because the thread keeps going through a cycle of "compute" --> "save state" --> "set up a new context" as it switches back and forth between tasks. Asynchronous programming is therefore not suited to compute-intensive programs. For I/O-intensive operations, however, a coroutine switches to other tasks while waiting on I/O and resumes automatically when the I/O finishes, which can save considerable resources and improve performance.
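
One way to see why compute-intensive work gains nothing from coroutines is to log when two tasks start and end. The sketch below shows a related point rather than the switching overhead itself: a coroutine that never awaits holds the event loop until it returns, so there is no waiting time for another task to use. The helper count_primes and the workload size are arbitrary choices for illustration.

```python
import asyncio

def count_primes(limit):
    """Pure computation: there is no await, so nothing can interrupt it."""
    return sum(
        all(n % d for d in range(2, int(n ** 0.5) + 1))
        for n in range(2, limit)
    )

async def cpu_task(name, log):
    log.append(f"{name} start")
    count_primes(20_000)               # holds the event loop until it returns
    log.append(f"{name} end")

async def io_task(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0.01)          # yields to the event loop while waiting
    log.append(f"{name} end")

async def run_pair(task_factory):
    log = []
    await asyncio.gather(task_factory("a", log), task_factory("b", log))
    return log

cpu_log = asyncio.run(run_pair(cpu_task))
io_log = asyncio.run(run_pair(io_task))
print(cpu_log)  # ['a start', 'a end', 'b start', 'b end']
print(io_log)   # both tasks start before either one ends
```

The compute-bound tasks run strictly one after the other, while the I/O-bound tasks overlap their waiting time.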

