What is asynchronous programming?

Note: This article aims to give an intuitive feel for the concept. To keep things simple, “simultaneously” here does not strictly mean “at the very same moment”.

We’re all familiar with synchronous code, which finishes one step before running the next. The easiest and most intuitive way to run multiple tasks “simultaneously” in synchronous code is to use multiple threads or processes. At this level, running multiple tasks is handled by the operating system. Its scheduler decides when to run a task and when to switch to another, and you, as an application-level programmer, have no way to intervene.

I’m sure you’ve heard the gripes about threads and processes: processes are too heavy, and threads involve headache-inducing locks. For a Python developer in particular, multithreading can’t really use multiple cores because of the GIL (global interpreter lock), and if you use multiple threads for CPU-bound work, it can even be slower than a single thread.
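The GIL claim is easy to check for yourself. The following is a minimal sketch, assuming standard CPython with the GIL enabled (free-threaded builds behave differently); the names count_down, run_sequential and run_threaded are made up for this illustration, and the timings will vary from machine to machine.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; it never waits on I/O."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n, workers):
    """Run the loop `workers` times, one after another, and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(workers):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, workers):
    """Run the loop in `workers` threads and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N, WORKERS = 1_000_000, 4
    print("sequential: %.2f seconds" % run_sequential(N, WORKERS))
    print("threaded:   %.2f seconds" % run_threaded(N, WORKERS))
```

On a GIL build, the threaded run is typically no faster than the sequential one, because only one thread executes Python bytecode at a time.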

Asynchronous programming is different: it uses a single thread in a single process, yet it can still run multiple tasks “at the same time” (the tasks are really just functions).

These functions have a nice feature: they can be paused when necessary, handing the right to run over to other functions. When the time is right, a paused function can restore its previous state and continue running. Does that sound like a process? It can be paused, and it can be resumed. The difference is that a process is scheduled by the operating system, while these functions are scheduled by the process itself (that is, by you, the programmer). This saves a lot of computer resources, because scheduling by the operating system requires many system calls (syscalls), and syscalls are expensive.
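The pause-and-resume behaviour described above can be sketched with asyncio. This is only an illustration: worker and main are hypothetical names, await asyncio.sleep(0) is used purely as a "pause here" point, and asyncio.run() requires Python 3.7 or later.

```python
import asyncio


async def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        # Pause here and hand control back to the event loop,
        # which lets the other worker run its next step.
        await asyncio.sleep(0)


async def main():
    log = []
    # Both workers run in the same thread, taking turns at each pause point.
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


if __name__ == "__main__":
    # asyncio.run() needs Python 3.7+.
    print(asyncio.run(main()))
    # [('A', 0), ('B', 0), ('A', 1), ('B', 1), ('A', 2), ('B', 2)]
```

The two workers interleave even though no threads are involved: each one resumes exactly where it paused.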

Asynchronous programming considerations

It is important to note that asynchronous code must not call blocking functions! This means your code should not include any of the following:

  • time.sleep()
  • Blocking socket operations
  • requests.get()
  • Blocking database calls

Why is that? When you use threads or processes and one of them blocks, the operating system schedules another one for you, so you never end up in a “one task blocks, everything waits” situation.
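A small sketch of that point, with hypothetical helper names (blocking_task, run_in_threads): five threads each make a blocking time.sleep() call, and because the operating system switches between them, the total time is close to one sleep rather than five.

```python
import threading
import time


def blocking_task(seconds):
    # time.sleep() blocks only the thread that calls it.
    time.sleep(seconds)


def run_in_threads(count, seconds):
    """Start `count` threads that each block for `seconds`; return the total elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=blocking_task, args=(seconds,)) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_in_threads(5, 0.2)
    print("5 threads, each sleeping 0.2 s: %.2f seconds" % elapsed)
```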

But now, as far as the operating system is concerned, your process is just an ordinary process. It has no idea what tasks you have divided your work into; scheduling them is up to you. If there is a blocking call in your code, all the other tasks simply wait for it. (This is easy to verify for yourself.)
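One way to check this is to time the same set of tasks with a blocking sleep and with a non-blocking one. The sketch below uses made-up names (bad_task, good_task, timed) and assumes Python 3.7+ for asyncio.run().

```python
import asyncio
import time


async def bad_task(seconds):
    # Blocking call: the whole event loop is stuck until it returns.
    time.sleep(seconds)


async def good_task(seconds):
    # Non-blocking: only this task pauses, the loop runs the others meanwhile.
    await asyncio.sleep(seconds)


async def timed(task_func, count, seconds):
    """Run `count` copies of task_func concurrently and return the elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(*(task_func(seconds) for _ in range(count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print("blocking:     %.2f seconds" % asyncio.run(timed(bad_task, 5, 0.2)))
    print("non-blocking: %.2f seconds" % asyncio.run(timed(good_task, 5, 0.2)))
```

With the blocking version the five sleeps run back to back (about 1 second in total); with asyncio.sleep() they overlap (about 0.2 seconds).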

Taking Python’s asyncio as an example

Python version support

  • The asyncio module was first released with Python 3.4.
  • The async and await keywords were introduced in Python 3.5.
  • Versions before Python 3.3 are not supported.

Start coding

Synchronous version

A simple example: visit the Baidu home page 100 times and print the status code each time.

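import time
import requests

# Assumed target: the Baidu home page mentioned above.
URL = "https://www.baidu.com"


def visit_sync():
    start = time.time()
    for _ in range(100):
        # Each request blocks until the response arrives.
        r = requests.get(URL)
        print(r.status_code)
    end = time.time()
    print("visit_sync takes %.2f seconds" % (end - start))


if __name__ == '__main__':
    visit_sync()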

Running it takes 6.64 seconds.

The asynchronous version

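import time
import asyncio
import aiohttp

# Assumed target: the Baidu home page mentioned above.
URL = "https://www.baidu.com"


async def fetch_async(url):
    # Opening a session per request keeps the example short;
    # sharing one session across requests is usually preferable.
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            status_code = resp.status
            print(status_code)


async def visit_async():
    start = time.time()
    tasks = []
    for _ in range(100):
        # Calling fetch_async() only creates a coroutine; nothing runs yet.
        tasks.append(fetch_async(URL))
    # Run all 100 coroutines concurrently and wait for them to finish.
    await asyncio.gather(*tasks)
    end = time.time()
    print("visit_async takes %.2f seconds" % (end - start))


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(visit_async())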

A few caveats:

  • The network access part has changed from requests.get() to aiohttp (which is not in the standard library, so you need to install it yourself).
  • The way the function is called has changed. visit_sync() can be run by calling it directly, but visit_async() cannot: calling it directly does not run it, and Python emits a RuntimeWarning saying that the coroutine was never awaited.

If you print the type of the return value from visit_async(), you can see that this is a coroutine.
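A quick way to see this, using a shortened stand-in for visit_async() (the body here is hypothetical and makes no network requests):

```python
import asyncio
import inspect


async def visit_async():
    # Hypothetical stand-in body; the real function would fetch URLs.
    await asyncio.sleep(0)
    return "done"


if __name__ == "__main__":
    coro = visit_async()              # nothing has run yet
    print(type(coro))                 # <class 'coroutine'>
    print(inspect.iscoroutine(coro))  # True
    coro.close()                      # discard it cleanly, avoiding the "never awaited" warning
```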

The proper way is to call await visit_async(), just like await asyncio.gather(*tasks) in the code. The catch is that await can only be used inside functions defined with the async keyword, and our if __name__ == '__main__' block is not inside any function, so instead we pass the coroutine to an event loop.

loop = asyncio.get_event_loop()
loop.run_until_complete(visit_async())
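On Python 3.7 and later, asyncio.run() wraps the same steps: it creates an event loop, runs the coroutine to completion, and closes the loop. A minimal sketch, again with a hypothetical stand-in for visit_async() that returns a made-up count instead of making requests:

```python
import asyncio


async def visit_async():
    # Hypothetical stand-in: pretend some work was done and return a count.
    await asyncio.sleep(0.1)
    return 100


if __name__ == "__main__":
    # Equivalent in spirit to get_event_loop() + run_until_complete(),
    # but the loop is created and closed for you (Python 3.7+).
    result = asyncio.run(visit_async())
    print(result)
```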

Running it takes 0.34 seconds, roughly 20 times faster than the synchronous version. (See the previous article for how to analyze asynchronous efficiency in a convincing way.)

To summarize

In fact, this article has introduced an important concept in asynchronous programming: coroutines. We’ll spend a lot of time talking about coroutines later in the Asynchronous Programming 101 series.

The basis for coroutines running multiple tasks “simultaneously” is that functions can be paused (we’ll talk about how this is done later; in Python it builds on yield). The code above uses the asyncio event loop, which essentially does this: when a function pauses, it moves on to the next task, and when the time is right (in this case, when the request completes), it resumes the function and lets it continue (a bit like an operating system).
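To make the idea concrete, here is a toy round-robin scheduler built on plain generators and yield. It is only a sketch of the principle, not how asyncio is implemented: the real event loop also waits on I/O and timers, while this one simply takes turns. The names task and run_all are made up for the illustration.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append("%s step %d" % (name, i))
        yield  # pause here; the scheduler decides who runs next


def run_all(tasks):
    """Toy scheduler: resume each generator in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)        # resume until the next yield
        except StopIteration:
            continue             # this task is finished; drop it
        ready.append(current)    # still paused; put it at the back of the queue


if __name__ == "__main__":
    log = []
    run_all([task("A", 2, log), task("B", 3, log)])
    print(log)
    # ['A step 0', 'B step 0', 'A step 1', 'B step 1', 'B step 2']
```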

For I/O-bound work like this, it can have a large performance advantage over using multiple threads or processes and leaving the scheduling to the operating system, because it does not require a large number of syscalls. It also avoids much of the locking trouble caused by sharing data between threads, since tasks only switch at explicit pause points. And as an application developer, you probably know better than the operating system when to switch tasks.

I personally feel that two skills are very important for the new generation of programmers: the ability to program asynchronously and the ability to leverage multicore systems.


My WeChat official account: The Full Stack Does Not Exist