Concurrency & Async Programming

Master threading, multiprocessing, and asyncio for parallel and concurrent Python programs.

Advanced 45 min read 🐍 Python

Concurrency in Python

Concurrency means doing multiple things at once — or at least appearing to. Python offers three approaches: threading (for I/O-bound tasks like network calls), multiprocessing (for CPU-bound tasks like number crunching), and asyncio (for high-concurrency I/O like web servers). Choosing the right one is critical for performance.

Threading

Multiple threads share memory. Great for I/O-bound work (API calls, file reads). Limited by the GIL for CPU work.

Multiprocessing

Separate processes with separate memory. Bypasses the GIL. Best for CPU-bound work (data processing, ML).

asyncio

Single thread, cooperative multitasking. Best for high-concurrency I/O (web servers, chat apps, crawlers).

The GIL (Global Interpreter Lock)

CPython's GIL allows only one thread to execute Python bytecode at a time. This means threading doesn't speed up CPU-bound work. But it works great for I/O-bound work because threads release the GIL while waiting for I/O. For CPU parallelism, use multiprocessing.
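You can see the GIL's effect directly by timing pure-Python CPU work sequentially versus in two threads. This is a minimal sketch (the exact timings will vary by machine), but the threaded version will not be meaningfully faster, and is often slower:

```python
import threading
import time

def count(n):
    """Pure-Python CPU work: no I/O, so the GIL is never released."""
    while n > 0:
        n -= 1

N = 5_000_000

# Sequential: two calls back to back
start = time.time()
count(N)
count(N)
seq = time.time() - start

# Two threads: the GIL forces them to take turns anyway
start = time.time()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.time() - start

print(f"Sequential: {seq:.2f}s, Threaded: {threaded:.2f}s")
```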

Threading

Threads are lightweight and share memory with the main program. They're perfect when your program spends most of its time waiting — waiting for an API response, waiting for a file to download, waiting for a database query. While one thread waits, others can work:

import threading
import time

def download(url, delay):
    """Simulate downloading a URL."""
    print(f"  Starting {url}...")
    time.sleep(delay)  # Simulate network I/O
    print(f"  Finished {url} ({delay}s)")
    return f"Data from {url}"

# Sequential — slow!
start = time.time()
download("api/users", 1)
download("api/posts", 2)
download("api/comments", 1)
print(f"Sequential: {time.time() - start:.1f}s\n")

# Threaded — fast!
start = time.time()
threads = [
    threading.Thread(target=download, args=("api/users", 1)),
    threading.Thread(target=download, args=("api/posts", 2)),
    threading.Thread(target=download, args=("api/comments", 1)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # Wait for all to finish
print(f"Threaded: {time.time() - start:.1f}s")
Output
  Starting api/users...
  Finished api/users (1s)
  Starting api/posts...
  Finished api/posts (2s)
  Starting api/comments...
  Finished api/comments (1s)
Sequential: 4.0s

  Starting api/users...
  Starting api/posts...
  Starting api/comments...
  Finished api/users (1s)
  Finished api/comments (1s)
  Finished api/posts (2s)
Threaded: 2.0s

The threaded version runs in 2 seconds (the longest single task) instead of 4 (all tasks added up). That's because the threads run their time.sleep() concurrently.
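One caveat: `threading.Thread` discards the target function's return value, so the `return` in `download()` above goes nowhere. A common pattern is appending to a shared list, with a lock guarding the shared state. Here's a sketch of that (delays shortened for brevity):

```python
import threading
import time

results = []
lock = threading.Lock()

def download(url, delay):
    time.sleep(delay)   # Simulate network I/O
    with lock:          # Guard the shared list against concurrent access
        results.append(f"Data from {url}")

threads = [
    threading.Thread(target=download, args=(url, 0.1))
    for url in ("api/users", "api/posts", "api/comments")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # Order depends on which thread finishes first
```

(`list.append` happens to be thread-safe in CPython thanks to the GIL, but the lock makes the intent explicit and survives refactoring into less atomic operations.)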

ThreadPoolExecutor — The Modern Way

concurrent.futures provides a higher-level interface that's easier and safer than manual thread management:

from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def fetch(url):
    time.sleep(1)  # Simulate I/O
    return f"Result from {url}"

urls = [f"https://api.example.com/page/{i}" for i in range(5)]

with ThreadPoolExecutor(max_workers=3) as executor:
    # Submit all tasks
    futures = {executor.submit(fetch, url): url for url in urls}

    # Process results as they complete
    for future in as_completed(futures):
        url = futures[future]
        result = future.result()
        print(f"  {url}: {result}")
Key Takeaway: Use ThreadPoolExecutor for I/O-bound parallelism. It manages thread lifecycle, limits concurrency, and provides a clean API with submit() and as_completed().
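If you want results back in submission order rather than completion order, `executor.map()` is an even simpler alternative to `submit()` + `as_completed()`. A short sketch (delay shortened for brevity):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch(url):
    time.sleep(0.1)  # Simulate I/O
    return f"Result from {url}"

urls = [f"https://api.example.com/page/{i}" for i in range(5)]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, even though the
    # underlying calls run concurrently across the pool
    for url, result in zip(urls, executor.map(fetch, urls)):
        print(f"  {url}: {result}")
```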

Multiprocessing

For CPU-bound work, threads don't help because of the GIL. multiprocessing spawns separate Python processes, each with its own GIL, enabling true parallelism on multiple CPU cores:

from multiprocessing import Pool
import time

def cpu_intensive(n):
    """Simulate CPU-bound work."""
    total = sum(i * i for i in range(n))
    return total

if __name__ == '__main__':
    numbers = [5_000_000] * 4

    # Sequential
    start = time.time()
    results = [cpu_intensive(n) for n in numbers]
    print(f"Sequential: {time.time() - start:.2f}s")

    # Parallel with 4 processes
    start = time.time()
    with Pool(4) as pool:
        results = pool.map(cpu_intensive, numbers)
    print(f"Parallel (4 cores): {time.time() - start:.2f}s")
Output
Sequential: 3.24s
Parallel (4 cores): 0.89s

⚠️ Common Mistake: Using Threading for CPU Work

Wrong:

# Threads DON'T speed up CPU-bound work due to the GIL
from threading import Thread
threads = [Thread(target=cpu_intensive, args=(n,)) for n in numbers]

Why: The GIL prevents multiple threads from running Python code simultaneously. Threads take turns, so CPU-bound work is actually slower with threads due to context-switching overhead.

Instead: Use multiprocessing.Pool or ProcessPoolExecutor for CPU-bound work.
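ProcessPoolExecutor gives you the same `submit()`/`map()` interface as ThreadPoolExecutor, but backed by processes. A minimal sketch (note the `__main__` guard, which is required because child processes re-import the module on some platforms):

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_intensive(n):
    """CPU-bound work: runs in a separate process, so no GIL contention."""
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    numbers = [1_000_000] * 4
    # Same API shape as ThreadPoolExecutor, but true parallelism
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(cpu_intensive, numbers))
    print(results)
```

Because the two executors share an interface, switching a workload between threads and processes is often a one-line change.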

asyncio — Cooperative Multitasking

asyncio uses a single thread with an event loop. Functions voluntarily give up control with await, letting other tasks run. It's the most efficient approach for high-concurrency I/O (thousands of simultaneous connections):

import asyncio

async def fetch(name, delay):
    """Simulate an async API call."""
    print(f"  Starting {name}...")
    await asyncio.sleep(delay)  # Non-blocking sleep!
    print(f"  Finished {name} ({delay}s)")
    return f"Data from {name}"

async def main():
    # Run all three concurrently
    results = await asyncio.gather(
        fetch("users", 1),
        fetch("posts", 2),
        fetch("comments", 1),
    )
    print(f"Results: {results}")

asyncio.run(main())
Output
  Starting users...
  Starting posts...
  Starting comments...
  Finished users (1s)
  Finished comments (1s)
  Finished posts (2s)
Results: ['Data from users', 'Data from posts', 'Data from comments']

asyncio.gather() runs multiple coroutines concurrently and returns all results. asyncio.run() creates the event loop and runs the top-level coroutine. The key insight: await asyncio.sleep() is non-blocking — while one task sleeps, others can run.
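By default, `asyncio.gather()` raises the first exception any coroutine hits. Passing `return_exceptions=True` returns exceptions as results instead, so one failure doesn't hide the others. A sketch (the `fail` flag is a stand-in for a real network error):

```python
import asyncio

async def fetch(name, delay, fail=False):
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"Data from {name}"

async def main():
    # Exceptions come back as ordinary list entries instead of
    # propagating out of gather()
    results = await asyncio.gather(
        fetch("users", 0.1),
        fetch("posts", 0.1, fail=True),
        fetch("comments", 0.1),
        return_exceptions=True,
    )
    for r in results:
        if isinstance(r, Exception):
            print(f"  Error: {r}")
        else:
            print(f"  OK: {r}")
    return results

results = asyncio.run(main())
```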

When to Use What

| Scenario | Use | Why |
|---|---|---|
| API calls, file I/O, DB queries | threading or asyncio | I/O-bound; threads release the GIL during I/O |
| Number crunching, image processing | multiprocessing | CPU-bound; bypasses the GIL with separate processes |
| Thousands of connections (web server) | asyncio | Most efficient for high-concurrency I/O |
| Simple parallel tasks | ThreadPoolExecutor | Clean API, managed thread pool |
| Data processing pipeline | ProcessPoolExecutor | Parallel CPU work with a clean API |
🔍 Deep Dive: async for and async with

Python supports async iteration (async for item in aiter:) and async context managers (async with resource:). These let you work with async data sources (like database cursors or websocket streams) and async resources (like database connections) using familiar Python syntax. Libraries like aiohttp (HTTP client), asyncpg (PostgreSQL), and aiofiles (file I/O) use these patterns extensively.