
Python Gevent in practice: common pitfalls to keep in mind

Python, performance
22 February 2024
Sümer Cip
Cloud Engineer

Gevent is a Python networking library that uses libev or libuv for its event loop and greenlet for lightweight cooperative tasks, offering essential abstractions for server development.

Historically, Eve Online adopted Stackless Python for its backend servers, utilizing tasklets or microthreads. These tasklets allowed thousands of requests to run concurrently within a single thread, avoiding the performance and complexity issues tied to threads. The greenlet package, a spin-off of Stackless Python, later became the foundation for Eventlet, which in turn inspired Gevent: an asynchronous library ideal for developing high-performance networking applications while benefiting from Python's readability and rapid development. These factors are major contributors to Upsun's choice of Gevent for some internal services like our Project API.

This article assumes that you, our reader, are familiar with Python Gevent's core concepts like cooperative multitasking, event loops, greenlets, and concurrent scheduling. You can find some good articles on these subjects here and here. That allows us to focus on best practices and common pitfalls in Gevent, many of which also apply to other asynchronous libraries.

Common pitfalls

Monkey patching

In Gevent, monkey patching is an optional technique that replaces the blocking calls in the standard library with cooperative alternatives, and is widely used in practice. If you choose not to employ monkey patching, you'll need to call Gevent's cooperative APIs (such as gevent.socket) explicitly and manage greenlets yourself, which could be desirable depending on your use case.

Gevent originated at a time when Python didn't natively support asynchronous programming—no async or await keywords. Traditional web server applications were either built using a pre-fork model or relied on multiple threads for concurrency. Gevent aimed to make existing multithreaded code concurrent without relying on actual Operating System (OS) threads. Sometimes, this could be done even without changing a single line of code in your application.

For example, in a multithreaded environment, when a blocking function like `socket.recv` is called, the kernel handles the scheduling of threads. In contrast, Gevent employs lightweight threads, known as greenlets, that run concurrently within a single OS thread. To handle multiple blocking functions within this single thread, Gevent monkey patches the Python standard library. This involves replacing the library's blocking functions with versions that use the event loop to wait asynchronously for operations to complete.
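As a minimal sketch of that replacement, assuming Gevent is installed: the hypothetical `worker` below is written like ordinary blocking code, yet after `monkey.patch_all()` its `time.sleep` call yields to the event loop instead of stalling the OS thread. The names `worker` and `run_workers` are illustrative and not from any library.

```python
from gevent import monkey
monkey.patch_all()  # patch first, before the imports below

import time
import gevent


def worker(delay):
    # Looks like a blocking call; once patched it suspends only this greenlet.
    time.sleep(delay)
    return delay


def run_workers(count=5, delay=0.1):
    # Spawn `count` greenlets and measure how long they take together.
    start = time.perf_counter()
    jobs = [gevent.spawn(worker, delay) for _ in range(count)]
    gevent.joinall(jobs)
    return [job.value for job in jobs], time.perf_counter() - start
```

Because the sleeps overlap, `run_workers()` should take roughly as long as one `delay` and not `count` times as long.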

For monkey patching to work effectively, certain guidelines must be strictly followed:

  • The patching should be done as early as possible in the code. If you delay, there's a risk that another part of your code might import a module that uses the original, unpatched functions, leading to conflicts.
  • Monkey patching should be performed on the main thread. This ensures that the patches take effect across all greenlets.
  • The patching process should occur when the application is still single-threaded. This is crucial because Gevent's monkey patching also alters the threading library. If you've already spawned multiple threads using the unpatched threading module, conflicts could arise.

These guidelines are not just best practices; they come directly from the official documentation for Gevent's monkey patching. Not adhering to these rules could result in unpredictable, hard-to-diagnose errors due to conflicts between patched and unpatched code.
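One way to enforce the last two guidelines is a small guard that runs before patching. This is only a sketch under our own assumptions: `check_safe_to_patch` and `patch_early` are hypothetical helper names, not part of Gevent, and the "as early as possible" rule still depends on you calling the helper on the first line of your entry point.

```python
import threading


def check_safe_to_patch():
    """Raise if the main-thread or single-thread guideline is violated."""
    if threading.current_thread() is not threading.main_thread():
        raise RuntimeError("monkey patching must run on the main thread")
    if threading.active_count() > 1:
        raise RuntimeError("monkey patching must run before other threads start")


def patch_early():
    """Hypothetical entry-point helper: call it before any other import."""
    check_safe_to_patch()
    from gevent import monkey  # imported lazily so the checks need only the stdlib
    monkey.patch_all()
```

Failing loudly at startup is usually easier to debug than a conflict between patched and unpatched code later on.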

Blocking calls

Even when you've carefully implemented monkey patching in Gevent, a single greenlet can still block the entire process. This is especially true for file read/write operations. Despite the benefits of monkey patching, file I/O operations are still synchronous on some operating systems, which means they can become a bottleneck.

However, there is a strategy to get around this limitation. Python Gevent offers the option to perform concurrent file I/O by utilizing a thread pool. This approach allows the file operations to be handled in a way that doesn't block the rest of the application. So, while monkey patching solves many issues related to asynchronous programming, it's essential to be aware of its limitations and know how to navigate them to ensure a truly non-blocking application.
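A minimal sketch of the thread pool approach, assuming the hub's default thread pool is acceptable for your workload. `read_bytes` and `read_file_cooperatively` are illustrative names; `gevent.get_hub().threadpool` is the pool Gevent itself maintains.

```python
import gevent


def read_bytes(path):
    # Plain blocking read; it executes in a worker thread, not the event loop thread.
    with open(path, "rb") as handle:
        return handle.read()


def read_file_cooperatively(path):
    # apply() suspends only the calling greenlet until the worker thread returns.
    return gevent.get_hub().threadpool.apply(read_bytes, (path,))
```

Other greenlets keep running while the read is in flight, at the cost of handing the work to a real OS thread.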

Third-party libraries

Even with Gevent's capabilities, it's crucial to note that it can't make third-party libraries asynchronous if they don't use Python's standard library for blocking calls, for example a driver implemented as a C extension that opens its own sockets. If you're using an external library that makes its own blocking calls, you may not discover the issue until your application is in a production environment.

This is where performance regression tests or load testing can come in handy. When a greenlet is blocked on I/O, the issue usually manifests as suboptimal CPU utilization alongside stagnant throughput. Under normal circumstances, you should only hit throughput limits when the process is using close to 100% of a CPU core. Therefore, if you observe low CPU usage and no increase in throughput during load testing, it's likely a sign of a blocking greenlet.

To resolve this, you'll need to identify the problematic third-party library and its specific blocking function. Solutions might involve switching to an alternative library or modifying the current one, although neither is typically straightforward.
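Besides load testing, one rough way to confirm that something is blocking the loop is to measure how late a sleeping greenlet wakes up. The sketch below is an assumption-laden illustration, not a Gevent feature: `blocking_library_call` is a hypothetical stand-in for a third-party call that bypasses the patched standard library, and `max_loop_lag` is our own helper.

```python
import time
import gevent
from gevent import monkey

# get_original() returns the real, blocking time.sleep even after patching,
# which lets us imitate a library that ignores the patched stdlib.
_blocking_sleep = monkey.get_original("time", "sleep")


def blocking_library_call(seconds):
    _blocking_sleep(seconds)


def _watch(interval, rounds, lags):
    for _ in range(rounds):
        start = time.perf_counter()
        gevent.sleep(interval)
        # How much later than requested did the loop wake us up?
        lags.append(time.perf_counter() - start - interval)


def max_loop_lag(work, interval=0.01, rounds=20):
    lags = []
    watcher = gevent.spawn(_watch, interval, rounds, lags)
    gevent.sleep(0)  # let the watcher start its first timer
    worker = gevent.spawn(work)
    gevent.joinall([watcher, worker])
    return max(lags)
```

A lag close to the duration of the suspect call suggests that call never yielded. Gevent also ships an optional monitor thread for this purpose; see `gevent.config` in the official documentation.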

CPU intensive code

Running CPU-intensive code in a single Python greenlet can block other greenlets from executing, leading to what's known as greenlet starvation. Debugging this issue is complex, as you need to observe the context-switching trends to identify if certain greenlets are monopolizing CPU time.

To address this, you have a few options:

  1. Offload CPU-heavy tasks to a separate process or thread, ideally using a pool. This segregates I/O-bound and CPU-bound work, preventing them from interfering with each other.
  2. If separating the workloads isn't feasible—perhaps because you need to perform a CPU-intensive task and immediately return the result—explicitly yield the greenlet to allow other greenlets to run.
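The second option can be sketched as follows, assuming the work can be cut into chunks. `checksum` and its `yield_every` parameter are hypothetical; `gevent.sleep(0)` is the explicit yield.

```python
import gevent


def checksum(data, yield_every=10_000):
    """CPU-bound loop that periodically hands control back to the event loop."""
    total = 0
    for i, byte in enumerate(data, 1):
        total = (total + byte) % 65521
        if i % yield_every == 0:
            gevent.sleep(0)  # explicit yield: let other greenlets run
    return total
```

Choosing `yield_every` is a trade-off: yielding too often adds switching overhead, while yielding too rarely brings the starvation back.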

It's important to note that Gevent is not ideal for CPU-intensive tasks. This limitation is not unique to Gevent; other event-loop-based libraries and runtimes such as asyncio and Node.js face similar challenges.

Greenlet safety

Thread safety is a well-known concept in the multithreaded world. It involves protecting shared resources from concurrent access by multiple threads. While traditional threads can context-switch at any time, requiring manual conflict resolution (using mutexes), one might assume that asynchronous libraries like Python Gevent would automatically resolve such issues. However, that's not the case.

Even though asynchronous libraries like Gevent don't have the same level of complexity as traditional preemptive threads, they still involve shared resources accessed by different greenlets. This can lead to subtle, sometimes hard-to-detect issues.

Think about the following code:

def withdraw(self, amount):
    if self.balance >= amount:      # check
        withdraw_from_db(amount)    # may yield to another greenlet
        self.balance -= amount      # act on a possibly stale check

Suppose the code above is called by multiple greenlets and withdraw_from_db() is a call that yields execution. Since self.balance is a shared resource, the withdrawal can easily go through multiple times even when the balance is insufficient. The problem is that a context switch can happen between the check of self.balance and the update, so by the time we subtract, the balance may no longer be the value we checked.
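To make the race concrete, here is a small runnable sketch. The `Account` class, the `overdraw_demo` helper, and the body of `withdraw_from_db` (a short `gevent.sleep` standing in for a database call) are assumptions added for illustration.

```python
import gevent


def withdraw_from_db(amount):
    # Hypothetical stand-in for a database call: it yields to other greenlets.
    gevent.sleep(0.01)


class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if self.balance >= amount:      # check
            withdraw_from_db(amount)    # a context switch can happen here
            self.balance -= amount      # act on a possibly stale check


def overdraw_demo():
    account = Account(100)
    jobs = [gevent.spawn(account.withdraw, 100) for _ in range(2)]
    gevent.joinall(jobs)
    return account.balance
```

Both greenlets pass the check before either one subtracts, so `overdraw_demo()` ends with a balance of -100.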

Gevent has native lock support. But I would suggest designing access to your shared resources so that you need locks as little as possible, because whenever a greenlet holds a lock, every other greenlet that needs it has to wait, which is exactly the opposite of concurrency. Moreover, you will most likely be holding the lock across a blocking call, which can be a recipe for disaster. For example, maybe you can implement the withdrawal logic using a database transaction and remove self.balance from the code. There may well be other ways to circumvent this problem; it just needs to be thought out carefully.
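For completeness, this is roughly what the lock-based option looks like, using `BoundedSemaphore` from `gevent.lock`. It is a sketch and not a recommendation: `LockedAccount` and the sleep-based `withdraw_from_db` are hypothetical, and the lock is held across the yielding call, which is exactly the cost described above.

```python
import gevent
from gevent.lock import BoundedSemaphore


def withdraw_from_db(amount):
    # Hypothetical stand-in for a database call: it yields to other greenlets.
    gevent.sleep(0.01)


class LockedAccount:
    def __init__(self, balance):
        self.balance = balance
        self._lock = BoundedSemaphore(1)

    def withdraw(self, amount):
        # Only one greenlet at a time may run the check-then-act sequence.
        with self._lock:
            if self.balance >= amount:
                withdraw_from_db(amount)
                self.balance -= amount
                return True
            return False
```

With the lock in place, the second withdrawal sees the updated balance and is refused, but it also has to wait for the first one's database call to finish.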

Conclusion

Like any framework, Gevent has its ups and downs. While it offers the advantage of asynchronous programming with less complexity than traditional multithreading, it's not a silver bullet for all concurrency challenges. It's essential to be vigilant about these limitations and possible pitfalls to make the most effective use of Gevent in your applications. Let us know what your Gevent pitfalls have been on Discord.

FAQ

How can developers ensure compatibility of Gevent with different operating systems?
To ensure compatibility of Gevent with different operating systems, developers should conduct extensive testing on all targeted OS environments. While Gevent handles many asynchronous tasks efficiently, some OS-specific behaviors, particularly with file I/O operations, may require additional adjustments or workarounds. Using cross-platform testing tools and CI/CD pipelines can help identify and resolve any discrepancies early in the development process.
