The first part

sdiehl.github.io/gevent-tuto…

Written by the Gevent Community

gevent is a concurrency library based around libev. It provides a clean API for a variety of concurrency and network related tasks.

Introduction

The structure of this tutorial assumes an intermediate level knowledge of Python but not much else. No knowledge of concurrency is expected. The goal is to give you the tools you need to get going with gevent, help you tame your existing concurrency problems and start writing asynchronous applications today.

Core

Greenlets

The primary pattern used in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. Greenlets all run inside the OS process of the main program but are scheduled cooperatively. This differs from the real parallelism constructs provided by the multiprocessing and threading libraries, which spin up processes and POSIX threads that are scheduled by the operating system and are truly parallel.

Synchronous & Asynchronous Execution

The core idea of concurrency is that a larger task can be broken down into a collection of subtasks which do not depend on one another and can therefore be run asynchronously rather than one at a time, synchronously. A switch between two such executions is known as a context switch.

A context switch in gevent is done through yielding. In this example we have two contexts which yield to each other by invoking gevent.sleep(0).

import gevent

def foo():
    print('Running in foo')
    gevent.sleep(0)
    print('Explicit context switch to foo again')

def bar():
    print('Explicit context to bar')
    gevent.sleep(0)
    print('Implicit context switch back to bar')

gevent.joinall([
    gevent.spawn(foo),
    gevent.spawn(bar),
])

"""
Running in foo
Explicit context to bar
Explicit context switch to foo again
Implicit context switch back to bar
"""

It is illuminating to visualize the control flow of the program or walk through it with a debugger to see the context switches as they occur.
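
One way to trace those switches without a debugger is to imitate them with plain Python generators. The sketch below is not how gevent is implemented; worker and run_round_robin are hypothetical names, and a bare yield stands in for gevent.sleep(0). It only illustrates the idea of tasks handing control back to a scheduler, and it records the order in which they run.

```python
from collections import deque

def worker(name, steps, log):
    """A generator-based stand-in for a greenlet: each yield hands control back."""
    for step in range(steps):
        log.append((name, step))
        yield  # cooperative yield, playing the role of gevent.sleep(0)

def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)        # run the task up to its next yield
        except StopIteration:
            continue          # the task finished, so drop it
        ready.append(task)    # otherwise put it back in the queue

log = []
run_round_robin([worker('foo', 2, log), worker('bar', 2, log)])
print(log)
```

The recorded log alternates between foo and bar, which mirrors the interleaved output of the gevent example.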

The real power of gevent comes when we use it for network and IO bound functions which can be cooperatively scheduled. Gevent has taken care of all the details to ensure that your network libraries will implicitly yield their greenlet contexts whenever possible. I cannot stress enough what a powerful idiom this is. But maybe an example will illustrate.

In this case the select() function is normally a blocking call that polls on various file descriptors.

import time
import gevent
from gevent import select

start = time.time()
tic = lambda: 'at %1.1f seconds' % (time.time() - start)

def gr1():
    # Waits up to two seconds in select(), but we don't want to stick around...
    print('Started Polling: ', tic())
    select.select([], [], [], 2)
    print('Ended Polling: ', tic())

def gr2():
    # Waits up to two seconds in select(), but we don't want to stick around...
    print('Started Polling: ', tic())
    select.select([], [], [], 2)
    print('Ended Polling: ', tic())

def gr3():
    print("Hey lets do some stuff while the greenlets poll,", tic())
    gevent.sleep(1)

gevent.joinall([
    gevent.spawn(gr1),
    gevent.spawn(gr2),
    gevent.spawn(gr3),
])

"""
Started Polling:  at 0.0 seconds
Started Polling:  at 0.0 seconds
Hey lets do some stuff while the greenlets poll, at 0.0 seconds
Ended Polling:  at 2.0 seconds
Ended Polling:  at 2.0 seconds
"""

A somewhat synthetic example defines a task function which is non-deterministic (i.e. its output is not guaranteed to give the same result for the same inputs). In this case the side effect of running the function is that the task pauses its execution for a random number of seconds.

import gevent
import random

def task(pid):
    """
    Some non-deterministic task
    """
    gevent.sleep(random.randint(0, 2) * 0.001)
    print('Task', pid, 'done')

def synchronous():
    for i in range(1, 10):
        task(i)

def asynchronous():
    threads = [gevent.spawn(task, i) for i in range(10)]
    gevent.joinall(threads)

print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()

# Example output (the asynchronous order varies between runs)
"""
Synchronous:
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Task 8 done
Task 9 done
Asynchronous:
Task 1 done
Task 6 done
Task 5 done
Task 0 done
Task 9 done
Task 8 done
Task 7 done
Task 4 done
Task 3 done
Task 2 done
"""

In the synchronous case all the tasks are run sequentially, which results in the main program blocking (i.e. pausing the execution of the main program) while each task executes.

The important parts of the program are the calls to gevent.spawn, which wraps up the given function inside of a Greenlet thread. The list of initialized greenlets is stored in the array threads, which is passed to the gevent.joinall function. That call blocks the current program and runs all the given greenlets; execution will step forward only when all the greenlets terminate.

The important fact to notice is that the order of execution in the async case is essentially random and that the total execution time in the async case is much less than in the sync case. The sync case takes longest when every task pauses for the full 0.002 seconds, which adds up to roughly 0.02 seconds for the whole queue. In the async case the maximum runtime is roughly 0.002 seconds, since none of the tasks block the execution of the others.
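
The arithmetic behind that comparison can be sketched without gevent: pauses that happen one after another add up, while pauses that overlap cost only as much as the longest one. The snippet below is a rough model under that assumption (it ignores scheduling overhead), and the names pauses, sequential_total and concurrent_total are made up for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# One simulated pause per task, drawn the same way as in task() above.
pauses = [random.randint(0, 2) * 0.001 for _ in range(10)]

sequential_total = sum(pauses)  # tasks wait one after another
concurrent_total = max(pauses)  # tasks wait at the same time

print('sequential: %.3f s' % sequential_total)
print('concurrent: %.3f s' % concurrent_total)
```

Under this model the concurrent total can never exceed the sequential one, and it is bounded by the longest single pause.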

A more common use case is fetching data from a server asynchronously. The runtime of fetch() will differ between requests, depending on the load on the remote server at the time of the request.

import gevent.monkey
gevent.monkey.patch_socket()

import gevent
import json
# Python 3 spelling of the original urllib2 / simplejson imports
from urllib.request import urlopen

def fetch(pid):
    response = urlopen('http://json-time.appspot.com/time.json')
    result = response.read()
    json_result = json.loads(result)
    datetime = json_result['datetime']

    print('Process', pid, datetime)
    return json_result['datetime']

def synchronous():
    for i in range(1, 10):
        fetch(i)

def asynchronous():
    threads = []
    for i in range(1, 10):
        threads.append(gevent.spawn(fetch, i))
    gevent.joinall(threads)

print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()

Determinism

As mentioned previously, greenlets are deterministic. Given the same inputs, they always produce the same output. For example, let's spread a task across a multiprocessing pool and compare the results to those of a gevent pool.

import time

def echo(i):
    time.sleep(0.001)
    return i

# Non Deterministic Process Pool

from multiprocessing.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]

print(run1 == run2 == run3 == run4)

# Deterministic Gevent Pool

from gevent.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]

print(run1 == run2 == run3 == run4)

"""
False
True
"""

Even though gevent is normally deterministic, sources of non-determinism can creep into your program when you begin to interact with outside services such as sockets and files. Thus even though green threads are a form of “deterministic concurrency”, they still can experience some of the same problems that POSIX threads and processes experience.

The perennial problem involved with concurrency is known as a race condition. Simply put, a race condition occurs when two concurrent threads or processes depend on some shared resource but also attempt to modify its value. This results in resources whose values depend on the order of execution. This is a problem, and in general one should try very hard to avoid race conditions, since they result in program behavior which is globally non-deterministic.

The best approach to this is to simply avoid all global state at all times. Global state and import-time side effects will always come back to bite you!
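
To make the lost-update problem concrete, here is a small sketch that forces the bad interleaving by hand. It uses plain generators rather than greenlets, with a bare yield marking the point where a context switch could happen; counter and increment are hypothetical names chosen for this illustration.

```python
counter = {'value': 0}  # shared state that both tasks read and modify

def increment(shared):
    """Read-modify-write with a possible context switch in the middle."""
    current = shared['value']       # read the shared value
    yield                           # another task may run here
    shared['value'] = current + 1   # write back a possibly stale result

a = increment(counter)
b = increment(counter)

# Interleave the two tasks by hand: both read before either writes.
next(a)
next(b)
for task in (a, b):
    try:
        next(task)
    except StopIteration:
        pass

print(counter['value'])
```

Both tasks read 0 before either one writes, so the final value is 1 rather than 2: one of the two increments is lost.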

Spawning Threads

gevent provides a few wrappers around Greenlet initialization. Some of the most common patterns are:

import gevent
from gevent import Greenlet

def foo(message, n):
    """
    Each thread will be passed the message, and n arguments
    in its initialization.
    """
    gevent.sleep(n)
    print(message)

# Initialize a new Greenlet instance running the named function
# foo
thread1 = Greenlet.spawn(foo, "Hello", 1)

# Wrapper for creating and running a new Greenlet from the named
# function foo, with the passed arguments
thread2 = gevent.spawn(foo, "I live!", 2)

# Lambda expressions
thread3 = gevent.spawn(lambda x: (x+1), 2)

threads = [thread1, thread2, thread3]

# Block until all threads complete.
gevent.joinall(threads)

"""
Hello
I live!
"""

In addition to using the base Greenlet class, you may also subclass the Greenlet class and override its _run method.

import gevent
from gevent import Greenlet

class MyGreenlet(Greenlet):

    def __init__(self, message, n):
        Greenlet.__init__(self)
        self.message = message
        self.n = n

    def _run(self):
        print(self.message)
        gevent.sleep(self.n)

g = MyGreenlet("Hi there!", 3)
g.start()
g.join()

"""
Hi there!
"""

Greenlet State

Like any other segment of code, Greenlets can fail in various ways. A greenlet may throw an exception, fail to halt, or consume too many system resources.

The internal state of a greenlet is generally a time-dependent parameter. There are a number of flags on greenlets which let you monitor the state of the thread:

  • started: Boolean, indicates whether the Greenlet has been started
  • ready(): Boolean, indicates whether the Greenlet has halted
  • successful(): Boolean, indicates whether the Greenlet has halted and not thrown an exception
  • value: arbitrary, the value returned by the Greenlet
  • exception: Exception, the uncaught exception instance thrown inside the greenlet

import gevent

def win():
    return 'You win!'

def fail():
    raise Exception('You fail at failing.')

winner = gevent.spawn(win)
loser = gevent.spawn(fail)

print(winner.started) # True
print(loser.started)  # True

# Exceptions raised in the Greenlet, stay inside the Greenlet.
try:
    gevent.joinall([winner, loser])
except Exception as e:
    print('This will never be reached')

print(winner.value) # 'You win!'
print(loser.value)  # None

print(winner.ready()) # True
print(loser.ready())  # True

print(winner.successful()) # True
print(loser.successful())  # False

# The exception raised in fail will not propagate outside the
# greenlet. A stack trace will be printed to stderr but it
# will not unwind the stack of the parent.

print(loser.exception)

# It is possible though to raise the exception again outside
# raise loser.exception
# or with
# loser.get()

"""
True
True
You win!
None
True
True
True
False
You fail at failing.
"""

Program Shutdown

Greenlets that fail to yield when the main program receives a SIGQUIT may hold the program’s execution longer than expected. This results in so called “zombie processes” which need to be killed from outside of the Python interpreter.

A common pattern is to listen for SIGQUIT events in the main program and to invoke gevent.shutdown before exit.

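import gevent
import signal

def run_forever():
    gevent.sleep(1000)

if __name__ == '__main__':
    # Note: gevent.signal and gevent.shutdown reflect the gevent API this
    # tutorial was written against; check that your installed version
    # still provides them.
    gevent.signal(signal.SIGQUIT, gevent.shutdown)
    thread = gevent.spawn(run_forever)
    thread.join()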

Timeouts

Timeouts are a constraint on the runtime of a block of code or a Greenlet.

import gevent
from gevent import Timeout

seconds = 10

timeout = Timeout(seconds)
timeout.start()

def wait():
    gevent.sleep(10)

try:
    gevent.spawn(wait).join()
except Timeout:
    print('Could not complete')

Or with a context manager in a with statement.

import gevent
from gevent import Timeout

time_to_wait = 5 # seconds

class TooLong(Exception):
    pass

with Timeout(time_to_wait, TooLong):
    gevent.sleep(10)

In addition, gevent also provides timeout arguments for a variety of Greenlet and data structure related calls. For example:

import gevent
from gevent import Timeout

def wait():
    gevent.sleep(2)

timer = Timeout(1).start()
thread1 = gevent.spawn(wait)

try:
    thread1.join(timeout=timer)
except Timeout:
    print('Thread 1 timed out')

# -

timer = Timeout.start_new(1)
thread2 = gevent.spawn(wait)

try:
    thread2.get(timeout=timer)
except Timeout:
    print('Thread 2 timed out')

# -

try:
    gevent.with_timeout(1, wait)
except Timeout:
    print('Thread 3 timed out')

"""
Thread 1 timed out
Thread 2 timed out
Thread 3 timed out
"""

Monkeypatching

Alas we come to dark corners of Gevent. I’ve avoided mentioning monkey patching up until now to try and motivate the powerful coroutine patterns, but the time has come to discuss the dark arts of monkey-patching. If you noticed above we invoked the command monkey.patch_socket(). This is a purely side-effectful command to modify the standard library’s socket library.

import socket
print(socket.socket)

print("After monkey patch")
from gevent import monkey
monkey.patch_socket()
print(socket.socket)

import select
print(select.select)
monkey.patch_select()
print("After monkey patch")
print(select.select)

"""
class 'socket.socket'
After monkey patch
class 'gevent.socket.socket'

built-in function select
After monkey patch
function select at 0x1924de8
"""

Python’s runtime allows for most objects to be modified at runtime, including modules, classes, and even functions. This is generally an astoundingly bad idea since it creates an “implicit side-effect” that is most often extremely difficult to debug if problems occur. Nevertheless, in extreme situations where a library needs to alter the fundamental behavior of Python itself, monkey patches can be used. In this case gevent is capable of patching most of the blocking system calls in the standard library, including those in the socket, ssl, threading and select modules, to instead behave cooperatively.
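
The mechanism itself is ordinary attribute assignment, as this standard-library-only sketch suggests. It swaps time.sleep for a hypothetical fake_sleep function and then restores the original; it is meant to show what patching a module at runtime looks like, not what gevent.monkey does internally.

```python
import time

calls = []

def fake_sleep(seconds):
    """Hypothetical replacement that records the request instead of blocking."""
    calls.append(seconds)

original_sleep = time.sleep
time.sleep = fake_sleep            # patch the module attribute in place
try:
    time.sleep(5)                  # returns immediately, no real delay
finally:
    time.sleep = original_sleep    # always restore the real function

print(calls)
```

Every piece of code that looks up time.sleep while the patch is active gets the replacement, which is exactly why such patches are both powerful and hard to debug.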

For example, the Redis Python bindings normally use regular TCP sockets to communicate with the redis-server instance. Simply by invoking gevent.monkey.patch_all() we can make the Redis bindings schedule requests cooperatively and work with the rest of our gevent stack.

This lets us integrate libraries that would not normally work with gevent without ever writing a single line of code. While monkey-patching is still evil, in this case it is a “useful evil”.
