Introduction

gevent is a coroutine-based Python networking library.

Features include:

  • Fast event loop based on libev (epoll on Linux, kqueue on FreeBSD).
  • Lightweight execution units based on greenlet.
  • API that re-uses concepts from the Python standard library (e.g. Event, Queue).
  • Cooperative socket and ssl modules.
  • Ability to use standard library and 3rd party modules written for standard blocking sockets (gevent.monkey).
  • DNS queries performed through a thread pool (default) or through c-ares (enabled via the GEVENT_RESOLVER=ares environment variable).
  • TCP/UDP/HTTP servers.
  • Subprocess support (through gevent.subprocess).
  • Thread pools.

Installation

gevent runs on Python 2.5 and newer and requires

  • greenlet which can be installed with pip install greenlet.

For ssl to work on Python older than 2.6, the ssl package is required.

Example

The following example shows how to run tasks concurrently.

>>> import gevent
>>> from gevent import socket
>>> urls = ['www.google.com', 'www.example.com', 'www.python.org']
>>> jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
>>> gevent.joinall(jobs, timeout=2)
>>> [job.value for job in jobs]
['74.125.79.106', '208.77.188.166', '82.94.164.162']

After the jobs have been spawned, gevent.joinall() waits for them to complete, but for no longer than 2 seconds. The results are then collected by reading the gevent.Greenlet.value property. The gevent.socket.gethostbyname() function has the same interface as the standard socket.gethostbyname(), but it does not block the whole interpreter and thus lets the other greenlets proceed with their requests unhindered.

Monkey patching

The example above used gevent.socket for socket operations. If the standard socket module were used, the example would have taken 3 times longer to complete because the DNS requests would be sequential. Using the standard socket module inside greenlets makes gevent rather pointless, so what about modules and packages that are built on top of socket?

That’s what monkey patching is for. The functions in gevent.monkey carefully replace functions and classes in the standard socket module with their cooperative counterparts. That way even the modules that are unaware of gevent can benefit from running in a multi-greenlet environment.

>>> from gevent import monkey; monkey.patch_socket()
>>> import urllib2 # it's usable from multiple greenlets now

See examples/concurrent_download.py
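As a small check of what patching does, the sketch below compares the class exposed by the standard socket module with gevent's cooperative one after monkey.patch_socket() has run. It assumes a reasonably recent gevent is installed and makes no network calls.

```python
import socket

import gevent.socket
from gevent import monkey

# Replace the blocking names in the standard socket module in place.
monkey.patch_socket()

# Code that does "import socket" now gets gevent's cooperative class,
# without being aware of gevent at all.
patched = socket.socket is gevent.socket.socket
print(patched)
```

Patching is best done as early as possible, before other modules have had a chance to keep references to the original blocking objects.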

Event loop

Unlike most other network libraries, but like eventlet, gevent starts the event loop implicitly in a dedicated greenlet. There’s no reactor on which you must call a run() or dispatch() function. When a function from gevent’s API wants to block, it obtains the Hub instance (a greenlet that runs the event loop) and switches to it. If there’s no Hub instance yet, one is created on the fly.
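The implicit Hub can be observed directly. This is a minimal sketch, assuming a recent gevent in which gevent.get_hub() is available at the top level; it only inspects the Hub and does not start anything explicitly.

```python
import gevent
from gevent.hub import Hub

# No run() or dispatch() call: the Hub for this thread is created on first use.
hub = gevent.get_hub()

# Asking again returns the same instance rather than creating another one.
same_hub = gevent.get_hub() is hub

# A blocking call such as sleep() switches to the Hub, which runs the event
# loop and then switches back here.
gevent.sleep(0)

print(isinstance(hub, Hub), same_hub)
```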

The event loop provided by libev uses the fastest polling mechanism available on the system by default. It is possible to instruct libev to use a particular polling mechanism by setting the LIBEV_FLAGS environment variable. Possible values include LIBEV_FLAGS=1 for the select backend, LIBEV_FLAGS=2 for the poll backend, LIBEV_FLAGS=4 for the epoll backend and LIBEV_FLAGS=8 for the kqueue backend. Please read the libev documentation for more information.

The libev API is available under the gevent.core module. Note that the callbacks supplied to the libev API are run in the Hub greenlet and thus cannot use the synchronous gevent API. It is possible to use the asynchronous API there, like spawn() and Event.set().
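The restriction on loop callbacks can be illustrated with a short sketch. It schedules a callback through the Hub's loop object (hub.loop.run_callback) rather than through gevent.core, which is an assumption about recent gevent versions; the callback name and the dictionary used to record results are hypothetical.

```python
import gevent
from gevent.event import Event

hub = gevent.get_hub()
done = Event()
seen = {}

def on_loop_callback():
    # Runs inside the Hub greenlet, so it must not block:
    # no sleep(), wait(), join() or get() here.
    seen["in_hub"] = gevent.getcurrent() is hub
    # Asynchronous calls are fine: they only schedule work.
    gevent.spawn(seen.__setitem__, "spawned", True)
    done.set()

hub.loop.run_callback(on_loop_callback)

# Blocking here is fine: this is an ordinary greenlet, not the Hub.
finished = done.wait(timeout=1)
gevent.sleep(0)  # let the greenlet spawned from the callback run

print(finished, seen)
```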

Cooperative multitasking

The greenlets all run in the same OS thread and are scheduled cooperatively. This means that until a particular greenlet gives up control (by calling a blocking function that will switch to the Hub), other greenlets won’t get a chance to run. This is typically not an issue for an I/O-bound app, but one should be aware of it when doing something CPU-intensive, or when calling blocking I/O functions that bypass the libev event loop.
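The effect of giving up control can be seen in a small sketch. The worker and run functions are hypothetical names; gevent.sleep(0) stands in for any blocking call that switches to the Hub.

```python
import gevent

def worker(name, log, cooperative):
    for step in range(3):
        log.append((name, step))
        if cooperative:
            gevent.sleep(0)  # give up control so other greenlets can run

def run(cooperative):
    log = []
    jobs = [gevent.spawn(worker, name, log, cooperative) for name in "ab"]
    gevent.joinall(jobs)
    return log

# Without yielding, each greenlet runs to completion before the next starts.
hogging = run(cooperative=False)

# With a yield in the loop, the two greenlets take turns.
sharing = run(cooperative=True)

print(hogging)
print(sharing)
```

A CPU-bound loop behaves like the first case: nothing else runs until it finishes or yields.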

Synchronizing access to objects shared across the greenlets is unnecessary in most cases, thus Lock and Semaphore classes, although present, aren’t used very often. Other abstractions from threading and multiprocessing remain useful in the cooperative world:

  • Event wakes up any number of greenlets that are blocked in the Event.wait() method.
  • AsyncResult is similar to Event but allows passing a value or an exception to the waiters.
  • Queue and JoinableQueue.
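The three abstractions listed above can be combined in one short sketch. The function and variable names are made up for illustration; the classes come from gevent.event and gevent.queue.

```python
import gevent
from gevent.event import AsyncResult, Event
from gevent.queue import Queue

ready = Event()
answer = AsyncResult()
tasks = Queue()
woken = []

def waiter(name):
    ready.wait()          # blocks this greenlet only
    woken.append(name)

def producer():
    for item in (1, 2, 3):
        tasks.put(item)   # unbounded queue, so put() does not block
    answer.set(42)        # pass a value to whoever calls answer.get()
    ready.set()           # wake every greenlet blocked in ready.wait()

waiters = [gevent.spawn(waiter, name) for name in ("w1", "w2")]
gevent.spawn(producer)
gevent.joinall(waiters, timeout=1)

value = answer.get(timeout=1)
items = [tasks.get() for _ in range(3)]

print(sorted(woken), value, items)
```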

Lightweight pseudothreads

The greenlets are spawned by creating a Greenlet instance and calling its start method. (The spawn() function is a shortcut that does exactly that.) The start method schedules a switch to the greenlet that will happen as soon as the current greenlet gives up control. If there is more than one active event, they will be executed one by one, in an undefined order.
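The two ways of starting a greenlet look like this in practice. The add function is a hypothetical example; the sketch also shows that start() only schedules the greenlet, which does not run until the current greenlet gives up control.

```python
import gevent
from gevent import Greenlet

def add(a, b):
    return a + b

# Explicit form: create the instance, then schedule it.
g1 = Greenlet(add, 1, 2)
g1.start()

# Shortcut: spawn() creates and starts the greenlet in one call.
g2 = gevent.spawn(add, 3, 4)

# Nothing has run yet, because this greenlet has not given up control.
ran_before_yield = g1.ready() or g2.ready()

gevent.joinall([g1, g2])  # blocks, so the Hub gets to run both greenlets
values = (g1.value, g2.value)

print(ran_before_yield, values)
```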

If there is an error during execution, it won’t escape the greenlet’s boundaries. An unhandled error results in a stacktrace being printed, complemented by the failed function’s signature and arguments:

>>> gevent.spawn(lambda : 1/0)
>>> gevent.sleep(1)
Traceback (most recent call last):
 ...
ZeroDivisionError: integer division or modulo by zero
<Greenlet at 0x7f2ec3a4e490: <function <lambda...>> failed with ZeroDivisionError

The traceback is asynchronously printed to sys.stderr when the greenlet dies.

Greenlet instances have a number of useful methods:

  • join – waits until the greenlet exits;
  • kill – interrupts the greenlet’s execution;
  • get – returns the value returned by the greenlet or re-raises the exception that killed it.
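The difference between join and get can be sketched as follows. The two task functions are hypothetical; note that the failing greenlet still prints its traceback to sys.stderr when it dies.

```python
import gevent

def ok():
    return "done"

def broken():
    return 1 / 0

good = gevent.spawn(ok)
bad = gevent.spawn(broken)

# Waiting does not re-raise errors from the greenlets by default.
gevent.joinall([good, bad])

result = good.get()       # the value returned by the function

caught = None
try:
    bad.get()             # re-raises the exception that ended the greenlet
except ZeroDivisionError as exc:
    caught = exc

print(result, type(caught).__name__, good.successful(), bad.successful())
```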

It is possible to customize the string printed after the traceback by subclassing the Greenlet class and redefining its __str__ method.

To subclass a Greenlet, override its _run() method and call Greenlet.__init__(self) in __init__:

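import gevent
from gevent import Greenlet

class MyNoopGreenlet(Greenlet):

    def __init__(self, seconds):
        # The base initializer must be called before the greenlet is used.
        Greenlet.__init__(self)
        self.seconds = seconds

    def _run(self):
        # The body of the greenlet: here it just sleeps.
        gevent.sleep(self.seconds)

    def __str__(self):
        # Used in the message printed after a traceback.
        return 'MyNoopGreenlet(%s)' % self.seconds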

Greenlets can be killed asynchronously. Killing will resume the sleeping greenlet, but instead of continuing execution, a GreenletExit will be raised.

>>> g = MyNoopGreenlet(4)
>>> g.start()
>>> g.kill()
>>> g.dead
True

The GreenletExit exception and its subclasses are handled differently than other exceptions. Raising GreenletExit is not considered an exceptional situation, so the traceback is not printed. The GreenletExit is returned by get as if it were returned by the greenlet, not raised.
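A short sketch of this special case: a killed greenlet hands back the GreenletExit instance from get instead of raising it, and counts as having finished successfully.

```python
import gevent

g = gevent.spawn(gevent.sleep, 10)
gevent.sleep(0)   # let the greenlet start and begin sleeping

g.kill()          # raises GreenletExit inside the greenlet and waits for it

# GreenletExit is returned, not raised, and no traceback is printed.
outcome = g.get()

print(isinstance(outcome, gevent.GreenletExit), g.successful())
```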

The kill method can accept a custom exception to be raised:

>>> g = MyNoopGreenlet.spawn(5) # spawn() creates a Greenlet and starts it
>>> g.kill(Exception("A time to kill"))
Traceback (most recent call last):
 ...
Exception: A time to kill
MyNoopGreenlet(5) failed with Exception

The kill method can also accept a timeout argument specifying the number of seconds to wait for the greenlet to exit. Note that kill cannot guarantee that the target greenlet will not ignore the exception, so it’s a good idea always to pass a timeout to kill.
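The reason for passing a timeout can be sketched with a greenlet that swallows the first kill request. The stubborn function is a contrived, hypothetical example; real code would rarely catch GreenletExit like this on purpose, but a broad except clause can have the same effect.

```python
import gevent

def stubborn():
    try:
        gevent.sleep(10)
    except gevent.GreenletExit:
        # Ignore the first kill request and carry on.
        gevent.sleep(10)

g = gevent.spawn(stubborn)
gevent.sleep(0)            # let it start

# Without a timeout this call would wait for the second sleep to finish.
g.kill(timeout=0.05)
still_alive = not g.dead

# The second request is not caught, so the greenlet exits.
g.kill(timeout=1)
dead_now = g.dead

print(still_alive, dead_now)
```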

Timeouts

Many functions in the gevent API are synchronous, blocking the current greenlet until the operation is done. For example, kill waits until the target greenlet is dead before returning [1]. Many of those functions can be made asynchronous by passing the argument block=False.
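A brief sketch of block=False with two functions that accept it, Queue.get() and Greenlet.kill(). The non-blocking kill only schedules the exception, so the target is still alive when the call returns.

```python
import gevent
from gevent.queue import Empty, Queue

queue = Queue()
was_empty = False
try:
    queue.get(block=False)     # raises Empty instead of waiting for an item
except Empty:
    was_empty = True

g = gevent.spawn(gevent.sleep, 10)
gevent.sleep(0)                # let the greenlet start

g.kill(block=False)            # returns at once; the kill is only scheduled
dead_immediately = g.dead

g.join(timeout=1)              # blocking here lets the kill be delivered
dead_after_join = g.dead

print(was_empty, dead_immediately, dead_after_join)
```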

Furthermore, many of the synchronous functions accept a timeout argument, which specifies a limit on how long the function can block (examples: Event.wait(), Greenlet.join(), Greenlet.kill(), AsyncResult.get(), and many more).

The socket and SSLObject instances can also have a timeout, set by the settimeout method.

When these are not enough, the Timeout class can be used to add timeouts to arbitrary sections of (yielding) code.
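Both styles of timeout can be sketched together: a timeout argument on a blocking call, and the Timeout class wrapped around a block of yielding code. The durations are arbitrary. Passing False as the second argument to Timeout makes the with block exit silently instead of raising.

```python
import gevent
from gevent.event import Event

# 1. A timeout argument: wait() gives up after 0.05 seconds.
never_set = Event()
signalled = never_set.wait(timeout=0.05)

# 2. Timeout around arbitrary yielding code; it is raised in this greenlet.
timed_out = False
try:
    with gevent.Timeout(0.05):
        gevent.sleep(1)
except gevent.Timeout:
    timed_out = True

# 3. Silent variant: the block is abandoned and execution continues after it.
reached_end = False
with gevent.Timeout(0.05, False):
    gevent.sleep(1)
    reached_end = True   # skipped, because the sleep is interrupted

print(signalled, timed_out, reached_end)
```

Timeout only interrupts code that yields to the Hub; a CPU-bound loop inside the with block would not be cut short.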

Further reading

To limit concurrency, use the Pool class (see example: dns_mass_resolve.py).
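A minimal sketch of limiting concurrency with Pool, independent of the DNS example referenced above. The task function and the state dictionary are hypothetical and exist only to record how many greenlets were running at once.

```python
import gevent
from gevent.pool import Pool

state = {"active": 0, "peak": 0}

def task(n):
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    gevent.sleep(0.01)        # stand-in for a network request
    state["active"] -= 1
    return n * n

pool = Pool(2)                # at most 2 greenlets run at the same time
squares = pool.map(task, range(6))

print(squares, state["peak"])
```

pool.map() returns the results in input order, while the pool keeps the number of concurrently running tasks at or below its size.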

Gevent comes with TCP/SSL/HTTP/WSGI servers. See Implementing servers.

External resources

Gevent for the Working Python Developer is a comprehensive tutorial.

Footnotes

[1] This was not the case before 0.13.0; the kill method in 0.12.2 and older was asynchronous by default.