gevent.threadpool - A pool of native threads#

class ThreadPool(maxsize, hub=None, idle_task_timeout=-1)[source]#

Bases: GroupMappingMixin

A pool of native worker threads.

This can be useful for CPU-intensive functions, or those that otherwise will not cooperate with gevent. The best functions to execute in a thread pool are small functions with a single purpose, ideally ones that release the CPython GIL; such functions are typically extension functions implemented in C.

It implements the same operations as a gevent.pool.Pool, but using threads instead of greenlets.

Note

The method apply_async() will always return a new greenlet, bypassing the threadpool entirely.

Most users will not need to create instances of this class. Instead, use the threadpool already associated with gevent’s hub:

pool = gevent.get_hub().threadpool
result = pool.spawn(lambda: "Some func").get()

Important

It is only possible to use instances of this class from the thread running their hub. Typically that means from the thread that created them. Using the pattern shown above takes care of this.

There is no gevent-provided way to have a single process-wide limit on the number of threads in various pools when doing that, however. The suggested way to use gevent and threadpools is to have a single gevent hub and its one threadpool (which is the default without doing any extra work). Only dispatch minimal blocking functions to the threadpool, functions that do not use the gevent hub.

The len of instances of this class is the number of enqueued (unfinished) tasks.

Just before a task starts running in a worker thread, the values of threading.setprofile() and threading.settrace() are consulted. Any values there are installed in that thread for the duration of the task (using sys.setprofile() and sys.settrace(), respectively). (Because worker threads are long-lived and outlast any given task, this arrangement lets the hook functions change between tasks, but does not let them see the bookkeeping done by the worker thread itself.)
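A minimal sketch of the hook behavior described above, assuming gevent 20.12.0 or later is installed (the `profiler` and `events` names are illustrative):

```python
import threading
from gevent.threadpool import ThreadPool

events = []

def profiler(frame, event, arg):
    # Record which thread executed the profiled code.
    events.append((threading.get_ident(), event))

pool = ThreadPool(1)

# The hook is consulted just before the task starts, so it can be
# changed between tasks without restarting the worker threads.
threading.setprofile(profiler)
pool.spawn(lambda: sum(range(100))).get()
threading.setprofile(None)

# The recorded events came from the worker thread, not this thread.
assert len(events) > 0
assert all(ident != threading.get_ident() for ident, _ in events)
```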

Caution

Instances of this class are only true if they have unfinished tasks.

Changed in version 1.5a3: The undocumented apply_e function, deprecated since 1.1, was removed.

Changed in version 20.12.0: Install the profile and trace functions in the worker thread while the worker thread is running the supplied task.

Changed in version 22.08.0: Add the option to let idle threads expire and be removed from the pool after idle_task_timeout seconds (-1 for no timeout).

apply(func, args=None, kwds=None)#

Rough equivalent of the apply() builtin function, blocking until the result is ready and returning it.

The func will usually, but not always, be run in a way that allows the current greenlet to switch out (for example, in a new greenlet or thread, depending on implementation). But if the current greenlet or thread is already one that was spawned by this pool, the pool may choose to immediately run the func synchronously.

Note

As implemented, attempting to use ThreadPool.apply() from inside another function that was itself spawned in a threadpool (any threadpool) will cause the function to be run immediately.

Changed in version 1.1a2: Now raises any exception raised by func instead of dropping it.
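A brief sketch of apply() usage, assuming gevent is installed (the `pool` and `caught` names are illustrative):

```python
from gevent.threadpool import ThreadPool

pool = ThreadPool(2)

# Blocks the calling greenlet until the result is ready.
assert pool.apply(pow, (2, 10)) == 1024

# Keyword arguments are passed via the kwds mapping.
assert pool.apply(int, ("ff",), {"base": 16}) == 255

# Since 1.1a2, exceptions raised by func propagate to the caller.
caught = None
try:
    pool.apply(int, ("not a number",))
except ValueError as e:
    caught = e
assert isinstance(caught, ValueError)
```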

apply_async(func, args=None, kwds=None, callback=None)#

A variant of the apply() method which returns a Greenlet object.

When the returned greenlet gets to run, it will call apply(), passing in func, args and kwds.

If callback is specified, then it should be a callable which accepts a single argument. When the result becomes ready callback is applied to it (unless the call failed).

This method will never block, even if this group is full (that is, even if spawn() would block, this method will not).

Caution

The returned greenlet may or may not be tracked as part of this group, so joining this group is not a reliable way to wait for the results to be available or for the returned greenlet to run; instead, join the returned greenlet.

Tip

Because ThreadPool objects do not track greenlets, the returned greenlet will never be a part of it. To reduce overhead and improve performance, Group and Pool may choose to track the returned greenlet. These are implementation details that may change.
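For illustration, a minimal sketch of apply_async() with a callback, assuming gevent is installed (the `results` name is illustrative):

```python
from gevent.threadpool import ThreadPool

pool = ThreadPool(2)
results = []

g = pool.apply_async(pow, (3, 3), callback=results.append)

# Join the returned greenlet, not the pool: the greenlet is never
# tracked by a ThreadPool, so joining the pool cannot be used to
# wait for it.
g.join()
assert results == [27]
```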

apply_cb(func, args=None, kwds=None, callback=None)#

apply() the given func(*args, **kwds), and, if a callback is given, run it with the results of func (unless an exception was raised).

The callback may be called synchronously or asynchronously. If called asynchronously, it will not be tracked by this group. (Group and Pool call it asynchronously in a new greenlet; ThreadPool calls it synchronously in the current greenlet.)

imap(func, *iterables, maxsize=None) iterable#

An equivalent of itertools.imap(), operating in parallel. The func is applied to each element yielded from each iterable in iterables in turn, collecting the result.

If this object has a bound on the number of active greenlets it can contain (such as Pool), then at most that number of tasks will operate in parallel.

Parameters:

maxsize (int) –

If given and not-None, specifies the maximum number of finished results that will be allowed to accumulate awaiting the reader; more than that number of results will cause map function greenlets to begin to block. This is most useful if there is a great disparity in the speed of the mapping code and the consumer and the results consume a great deal of resources.

Note

This is separate from any bound on the number of active parallel tasks, though they may have some interaction (for example, limiting the number of parallel tasks to the smallest bound).

Note

Using a bound is slightly more computationally expensive than not using a bound.

Tip

The imap_unordered() method makes much better use of this parameter. To maintain order, this function may need to keep some additional, unspecified number of objects in memory.

Returns:

An iterable object.

Changed in version 1.1b3: Added the maxsize keyword parameter.

Changed in version 1.1a1: Accept multiple iterables to iterate in parallel.
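A short sketch of imap() on a thread pool, assuming gevent is installed (the variable names are illustrative):

```python
from gevent.threadpool import ThreadPool

pool = ThreadPool(3)

# Results are yielded in input order; maxsize=2 caps how many finished
# results may accumulate before the mapping greenlets begin to block.
squares = list(pool.imap(lambda x: x * x, range(5), maxsize=2))
assert squares == [0, 1, 4, 9, 16]

# Multiple iterables are consumed in parallel, like itertools.imap.
sums = list(pool.imap(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
assert sums == [11, 22, 33]
```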

imap_unordered(func, *iterables, maxsize=None) iterable#

The same as imap() except that the results from the returned iterator may arrive in arbitrary order.

This is lighter weight than imap() and should be preferred if order doesn’t matter.

See also

imap() for more details.
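Since ordering is arbitrary, results should be normalized (for example, sorted) before comparison. A minimal sketch, assuming gevent is installed:

```python
from gevent.threadpool import ThreadPool

pool = ThreadPool(3)

# The iterator may yield results in any order, so sort before comparing.
doubled = sorted(pool.imap_unordered(lambda x: x * 2, range(4)))
assert doubled == [0, 2, 4, 6]
```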

join()[source]#

Waits until all outstanding tasks have been completed.

map(func, iterable)#

Return a list made by applying the func to each element of the iterable.

See also

imap()

map_async(func, iterable, callback=None)#

A variant of the map() method which returns a Greenlet object that is executing the map function.

If callback is specified then it should be a callable which accepts a single argument.
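A brief sketch contrasting map() and map_async(), assuming gevent is installed (the `collected` name is illustrative; note the callback receives the entire result list):

```python
from gevent.threadpool import ThreadPool

pool = ThreadPool(2)

# map() blocks and returns a plain list.
assert pool.map(len, ["a", "bb", "ccc"]) == [1, 2, 3]

# map_async() returns a greenlet immediately; the callback is
# applied to the whole list of results.
collected = []
g = pool.map_async(len, ["x", "yy"], callback=collected.append)
g.join()
assert collected == [[1, 2]]
```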

spawn(func, *args, **kwargs)[source]#

Add a new task to the threadpool that will run func(*args, **kwargs).

Waits until a slot is available. Creates a new native thread if necessary.

This must only be called from the native thread that owns this object’s hub. This is because creating the necessary data structures to communicate back to this thread isn’t thread safe, so the hub must not be running something else. Also, ensuring the pool size stays correct only works within a single thread.

Returns:

A gevent.event.AsyncResult.

Raises:

InvalidThreadUseError – If called from a different thread.

Changed in version 1.5: Document the thread-safety requirements.
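A minimal sketch of spawn() from the thread that owns the hub, assuming gevent is installed:

```python
from gevent.threadpool import ThreadPool

pool = ThreadPool(1)

# spawn() returns a gevent.event.AsyncResult, not a greenlet.
async_result = pool.spawn(pow, 2, 8)

# get() blocks only the calling greenlet until the worker finishes.
assert async_result.get() == 256
```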

property maxsize#

The maximum allowed number of worker threads.

This is also (approximately) a limit on the number of tasks that can be queued without blocking the waiting greenlet. If this many tasks are already running, then the next greenlet that submits a task will block waiting for a task to finish.

property size#

The number of running pooled worker threads.

Setting this attribute will add or remove running worker threads, up to maxsize.

Initially there are no pooled running worker threads, and threads are created on demand to satisfy concurrent requests up to maxsize threads.
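A sketch of the size and maxsize properties, assuming gevent is installed (the exact thread counts shown are what the documented behavior implies):

```python
from gevent.threadpool import ThreadPool

pool = ThreadPool(maxsize=4)

# No worker threads exist until work arrives...
assert pool.size == 0

# ...but they can be created eagerly by setting the property.
pool.size = 2
assert pool.size == 2

# On-demand growth never exceeds maxsize.
pool.spawn(lambda: None).get()
assert pool.size <= pool.maxsize
```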

class ThreadPoolExecutor(*args, **kwargs)[source]#

Bases: ThreadPoolExecutor

A version of concurrent.futures.ThreadPoolExecutor that always uses native threads, even when threading is monkey-patched.

The Future objects returned from this object can be used with gevent waiting primitives like gevent.wait().

Caution

If threading is not monkey-patched, then the Future objects returned by this object are not guaranteed to work with as_completed() and wait(). The individual blocking methods like result() and exception() will always work.

New in version 1.2a1: This is a provisional API.

Takes the same arguments as concurrent.futures.ThreadPoolExecutor, which vary between Python versions.

The first argument is always max_workers, the maximum number of threads to use. Most other arguments, while accepted, are ignored.
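For illustration, a minimal sketch of this executor used with a gevent waiting primitive, assuming gevent is installed:

```python
import gevent
from gevent.threadpool import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(pow, 2, 5)
    # Futures from this executor cooperate with gevent.wait(), per the
    # documentation above, even without monkey-patching.
    gevent.wait([future])
    assert future.result() == 32
```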

kill(wait=True, **kwargs)#

Clean up the resources associated with the Executor.

It is safe to call this method several times. Otherwise, no other methods can be called after this one.

Args:
    wait: If True, then shutdown will not return until all running futures have finished executing and the resources used by the executor have been reclaimed.
    cancel_futures: If True, then shutdown will cancel all pending futures. Futures that are completed or running will not be cancelled.

shutdown(wait=True, **kwargs)[source]#

Clean up the resources associated with the Executor.

It is safe to call this method several times. Otherwise, no other methods can be called after this one.

Args:
    wait: If True, then shutdown will not return until all running futures have finished executing and the resources used by the executor have been reclaimed.
    cancel_futures: If True, then shutdown will cancel all pending futures. Futures that are completed or running will not be cancelled.

submit(fn, *args, **kwargs)[source]#

Submits a callable to be executed with the given arguments.

Schedules the callable to be executed as fn(*args, **kwargs) and returns a Future instance representing the execution of the callable.

Returns:

A Future representing the given call.