Working with APIs

Master HTTP requests, JSON handling, authentication, error handling, and pagination with the requests library.

Intermediate 35 min read 🐍 Python

HTTP Basics

APIs (Application Programming Interfaces) let your Python programs talk to web services — fetching weather data, posting to social media, querying databases, or integrating with payment systems. Most modern APIs use HTTP and return JSON data.

GET

Retrieve data. Like asking a question: "Give me the user with ID 42."

POST

Create new data. Like filling out a form: "Create a new user with this info."

PUT / PATCH

Update existing data. PUT replaces entirely, PATCH updates partially.

DELETE

Remove data. Like clicking "Delete account."
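The four verbs map one-to-one onto `requests` functions with the same names. As a sketch (the api.example.com endpoint is hypothetical), you can prepare requests without sending them to see the method and URL each call would put on the wire:

```python
import requests

# api.example.com is a hypothetical endpoint used for illustration.
BASE = "https://api.example.com/users"

# Each HTTP verb has a matching requests function (requests.get,
# requests.post, ...). Preparing a Request shows what would be sent
# without making any network call.
get_req = requests.Request("GET", f"{BASE}/42").prepare()
post_req = requests.Request("POST", BASE, json={"name": "Ada"}).prepare()
patch_req = requests.Request("PATCH", f"{BASE}/42", json={"name": "Ada L."}).prepare()
delete_req = requests.Request("DELETE", f"{BASE}/42").prepare()

for req in (get_req, post_req, patch_req, delete_req):
    print(req.method, req.url)
```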

The requests Library

requests is the de facto standard library for HTTP in Python. It isn't built in, so install it first:

pip install requests

GET Requests

import requests

# Simple GET request
response = requests.get("https://api.github.com/users/python")

print(f"Status code: {response.status_code}")
print(f"Content type: {response.headers['content-type']}")

# Parse JSON response
data = response.json()
print(f"Name: {data['name']}")
print(f"Public repos: {data['public_repos']}")
print(f"Followers: {data['followers']}")
Output (example — live values will differ)
Status code: 200
Content type: application/json; charset=utf-8
Name: Python
Public repos: 28
Followers: 12500

Query Parameters

Instead of building URLs manually, pass parameters as a dictionary:

import requests

# Instead of: requests.get("https://api.example.com/search?q=python&limit=5")
params = {"q": "python", "limit": 5, "sort": "stars"}
response = requests.get("https://api.github.com/search/repositories", params=params)

data = response.json()
print(f"Total results: {data['total_count']}")
for repo in data['items'][:3]:
    print(f"  {repo['full_name']} - {repo['stargazers_count']} stars")
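You can see exactly how requests URL-encodes the dictionary by preparing the request without sending it:

```python
import requests

params = {"q": "python", "limit": 5, "sort": "stars"}

# prepare() builds the final request locally — no network call is made.
prepared = requests.Request(
    "GET", "https://api.github.com/search/repositories", params=params
).prepare()

print(prepared.url)
# https://api.github.com/search/repositories?q=python&limit=5&sort=stars
```

Values are percent-encoded automatically, so strings with spaces or special characters are safe to pass.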

POST Requests & Headers

import requests

# POST with JSON body
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
}

data = {
    "title": "New Blog Post",
    "body": "This is the content.",
    "tags": ["python", "tutorial"],
}

response = requests.post(
    "https://api.example.com/posts",
    json=data,        # Automatically serializes to JSON
    headers=headers,
    timeout=10,       # Always set a timeout!
)

print(f"Status: {response.status_code}")
if response.ok:  # True for 2xx status codes
    result = response.json()
    print(f"Created post ID: {result['id']}")
Key Takeaway: Use json=data (not data=data) to send JSON. Use params= for query parameters. Always set timeout= to avoid hanging forever on unresponsive servers.
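To see why json= vs data= matters, prepare the same payload both ways (the endpoint is hypothetical; nothing is actually sent):

```python
import requests

payload = {"title": "New Blog Post", "tags": ["python", "tutorial"]}
url = "https://api.example.com/posts"  # hypothetical endpoint

with_json = requests.Request("POST", url, json=payload).prepare()
with_data = requests.Request("POST", url, data=payload).prepare()

# json= serializes the dict to a JSON body and sets the right Content-Type
print(with_json.headers["Content-Type"])  # application/json
print(with_json.body)

# data= form-encodes instead — usually not what a JSON API expects
print(with_data.headers["Content-Type"])  # application/x-www-form-urlencoded
print(with_data.body)
```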

Error Handling

Network requests fail. Servers go down, connections time out, APIs return errors. Robust code handles all of these:

import requests

def fetch_user(user_id):
    """Fetch a user with proper error handling."""
    try:
        response = requests.get(
            f"https://api.example.com/users/{user_id}",
            timeout=5,
        )
        response.raise_for_status()  # Raises HTTPError for 4xx/5xx
        return response.json()

    except requests.exceptions.Timeout:
        print("Request timed out")
    except requests.exceptions.ConnectionError:
        print("Could not connect to server")
    except requests.exceptions.HTTPError as e:
        print(f"HTTP error: {e.response.status_code}")
        if e.response.status_code == 404:
            print("User not found")
        elif e.response.status_code == 429:
            print("Rate limited - slow down!")
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")

    return None
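A 429 usually deserves a retry with a growing delay rather than just a message. One possible sketch — the retry policy and status codes here are illustrative choices, not requirements of any particular API:

```python
import time
import requests

def fetch_with_retry(url, max_attempts=3, backoff=1.0):
    """Retry on timeouts, connection errors, and retryable status codes."""
    for attempt in range(max_attempts):
        try:
            response = requests.get(url, timeout=5)
            response.raise_for_status()
            return response.json()
        except (requests.exceptions.Timeout,
                requests.exceptions.ConnectionError) as e:
            last_error = e
        except requests.exceptions.HTTPError as e:
            if e.response.status_code not in (429, 500, 502, 503):
                raise  # a 404 won't improve with retries
            last_error = e
        if attempt < max_attempts - 1:
            time.sleep(backoff * 2 ** attempt)  # wait 1s, 2s, 4s, ...
    raise last_error
```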

Sessions & Authentication

A Session persists settings (headers, cookies, auth) across multiple requests. Much cleaner than repeating headers every time:

import requests

with requests.Session() as session:
    # Set auth header once — applies to all requests
    session.headers.update({
        "Authorization": "Bearer YOUR_TOKEN",
        "Accept": "application/json",
    })

    # All requests in this session share the headers
    users = session.get("https://api.example.com/users").json()
    posts = session.get("https://api.example.com/posts").json()
    profile = session.get("https://api.example.com/me").json()

    print(f"Users: {len(users)}, Posts: {len(posts)}")
    print(f"Logged in as: {profile['name']}")
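Sessions are also where transport-level retries live: requests can delegate retry logic to urllib3 through an HTTPAdapter. A sketch (the retry counts and status codes are illustrative):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retries = Retry(
    total=5,                                # give up after 5 attempts
    backoff_factor=0.5,                     # exponential wait between tries
    status_forcelist=[429, 500, 502, 503],  # retry these status codes
)
adapter = HTTPAdapter(max_retries=retries)
session.mount("https://", adapter)
session.mount("http://", adapter)

# Every session.get()/post() to these prefixes now retries transparently
```

This handles transient failures at the connection layer, so your application code stays free of retry loops.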

Handling Pagination

Most APIs limit results per page. You need to loop through pages to get everything:

import requests

def get_all_repos(username):
    """Fetch all repos, handling pagination."""
    repos = []
    page = 1

    while True:
        response = requests.get(
            f"https://api.github.com/users/{username}/repos",
            params={"page": page, "per_page": 100},
            timeout=10,
        )
        response.raise_for_status()
        data = response.json()

        if not data:  # Empty page = we've got everything
            break

        repos.extend(data)
        print(f"  Fetched page {page}: {len(data)} repos")
        page += 1

    return repos

# repos = get_all_repos("python")
# print(f"Total repos: {len(repos)}")
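GitHub also advertises the next page in a Link header, which requests parses into response.links — often more robust than counting pages yourself. A sketch that accepts any session-like object with a get method (which also makes it easy to test):

```python
import requests

def get_all_pages(session, url):
    """Follow `Link: <...>; rel="next"` headers until they run out."""
    results = []
    while url:
        response = session.get(url, timeout=10)
        response.raise_for_status()
        results.extend(response.json())
        # response.links is {} on the last page, otherwise contains
        # entries like {"next": {"url": ...}, "last": {"url": ...}}
        url = response.links.get("next", {}).get("url")
    return results

# with requests.Session() as session:
#     repos = get_all_pages(session, "https://api.github.com/users/python/repos")
```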

⚠️ Common Mistake: No Timeout

Wrong:

response = requests.get("https://api.example.com/data")  # No timeout!

Why: Without a timeout, your program hangs indefinitely if the server never responds. In production, this blocks your entire application.

Instead:

response = requests.get("https://api.example.com/data", timeout=10)
🔍 Deep Dive: Async HTTP with aiohttp

For high-concurrency scenarios (fetching hundreds of URLs), requests becomes a bottleneck because each call blocks until the response arrives. Use aiohttp with asyncio instead: async with aiohttp.ClientSession() as session: resp = await session.get(url). This lets you run hundreds of requests concurrently on a single thread.