FastAPI · Python

Fix HTTPException: 429 Too Many Requests: Rate limit exceeded in FastAPI

This error occurs when your rate limiter blocks requests but is misconfigured, applying limits too aggressively or not distinguishing between authenticated and anonymous users. Fix it by properly configuring slowapi or a custom rate limiter with appropriate limits per endpoint, user-based keys, and clear Retry-After headers.

Reading the Stack Trace

Traceback (most recent call last):
  File "/app/venv/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
    result = await app(scope, receive, send)
  File "/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 677, in __call__
    await route.handle(scope, receive, send)
  File "/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 275, in handle
    await self.app(scope, receive, send)
  File "/app/venv/lib/python3.11/site-packages/slowapi/middleware.py", line 46, in dispatch
    response = await call_next(request)
  File "/app/venv/lib/python3.11/site-packages/slowapi/extension.py", line 182, in check_rate_limit
    raise HTTPException(status_code=429, detail="Rate limit exceeded")
  File "/app/src/main.py", line 35, in get_api_data
    return {"data": fetch_data()}
fastapi.exceptions.HTTPException: 429 Too Many Requests: Rate limit exceeded

Here's what each part means:

  - The uvicorn and starlette frames show the request passing through the ASGI server and router: normal plumbing, not the problem.
  - The slowapi/middleware.py and slowapi/extension.py frames show the rate limiter intercepting the request and raising the 429 before your handler logic completes.
  - The final frame, src/main.py line 35 in get_api_data, points at your code: the endpoint whose rate limit was exceeded.

Common Causes

1. Using IP-based rate limiting behind a reverse proxy

All requests appear to come from the same IP (the proxy) because the rate limiter uses request.client.host instead of the X-Forwarded-For header.

from fastapi import FastAPI, Request
from slowapi import Limiter
from slowapi.util import get_remote_address

app = FastAPI()
limiter = Limiter(key_func=get_remote_address)  # Behind a proxy, every user shares the proxy's IP
app.state.limiter = limiter

@app.get("/api/data")
@limiter.limit("10/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
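To see why this misfires, here is a self-contained sketch (no slowapi required; the IPs are made up for illustration) counting the same traffic by the two possible keys:

```python
from collections import Counter

# Hypothetical traffic: (IP the server sees, X-Forwarded-For sent by the proxy)
requests = [
    ("10.0.0.1", "203.0.113.5"),   # real client A
    ("10.0.0.1", "198.51.100.7"),  # real client B
    ("10.0.0.1", "203.0.113.5"),   # client A again
]

# Keyed on request.client.host: every request counts against the proxy's IP
by_remote_addr = Counter(server_ip for server_ip, _ in requests)

# Keyed on X-Forwarded-For: each real client is counted separately
by_forwarded_for = Counter(fwd for _, fwd in requests)

print(by_remote_addr)    # Counter({'10.0.0.1': 3})
print(by_forwarded_for)  # Counter({'203.0.113.5': 2, '198.51.100.7': 1})
```

With a 10/minute limit, the first keying scheme exhausts the shared budget after ten requests from *any* mix of users, while the second gives each client its own budget.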

2. Rate limit too low for the endpoint's usage pattern

A critical endpoint used by the frontend on every page load is rate limited to a very low threshold, causing legitimate users to be blocked.

@app.get("/api/config")
@limiter.limit("5/minute")  # Too low for an endpoint called on every page load
async def get_config(request: Request):
    return {"theme": "dark", "locale": "en"}
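A quick back-of-envelope check shows why 5/minute is too tight here. All the traffic figures below are assumptions for illustration, not measurements:

```python
# Rough sizing for a per-IP limit on /api/config (all figures are illustrative)
page_loads_per_minute = 6   # one user actively navigating the app
calls_per_page_load = 1     # /api/config is fetched once per page load
users_behind_one_ip = 5     # office NAT: several users share one public IP
safety_factor = 2           # headroom for retries and bursts

needed = page_loads_per_minute * calls_per_page_load * users_behind_one_ip * safety_factor
print(f"{needed}/minute")  # 60/minute - well above the configured 5/minute
```

The point is not the exact number but the method: size the limit from realistic per-key traffic, then add headroom.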

3. No custom exception handler for rate limit errors

The default 429 response does not include a Retry-After header, leaving clients unable to determine when they can retry.

# No custom exception handler registered
app = FastAPI()
app.state.limiter = limiter
# Missing: app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

The Fix

Use a custom key function that extracts the real client IP from the X-Forwarded-For header when behind a reverse proxy. Increase the rate limit to a realistic threshold for the endpoint's usage pattern. Register the RateLimitExceeded exception handler to return proper 429 responses with Retry-After headers.

Before (broken)
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter

@app.get("/api/data")
@limiter.limit("10/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
After (fixed)
from fastapi import FastAPI
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from starlette.requests import Request

app = FastAPI()

def get_real_client_ip(request: Request) -> str:
    # Behind a reverse proxy, the original client is the first entry in X-Forwarded-For
    forwarded = request.headers.get("X-Forwarded-For")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return request.client.host

limiter = Limiter(key_func=get_real_client_ip)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/data")
@limiter.limit("100/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
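If you would rather not depend on slowapi at all, the same idea can be hand-rolled as a token bucket. This is an illustrative in-process sketch (the class and names are mine, not a library API) and deliberately not production code: its counters live in one process and are not shared across workers or instances:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-key token bucket: each key gets `rate_per_minute` requests per minute."""

    def __init__(self, rate_per_minute: int):
        self.capacity = float(rate_per_minute)
        self.refill_per_sec = rate_per_minute / 60.0
        self.tokens = defaultdict(lambda: self.capacity)   # start each key full
        self.last_seen = defaultdict(time.monotonic)

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        elapsed = now - self.last_seen[key]
        self.last_seen[key] = now
        # Refill proportionally to elapsed time, capped at the bucket size
        self.tokens[key] = min(self.capacity,
                               self.tokens[key] + elapsed * self.refill_per_sec)
        if self.tokens[key] >= 1.0:
            self.tokens[key] -= 1.0
            return True
        return False

bucket = TokenBucket(rate_per_minute=6)
results = [bucket.allow("203.0.113.5") for _ in range(7)]
print(results)  # first 6 allowed, 7th denied
```

Because each key refills independently, a burst from one client never eats into another client's budget, which is exactly the property the proxy-aware key function buys you in slowapi.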

Testing the Fix

import pytest
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)


def test_request_within_rate_limit():
    response = client.get("/api/data")
    assert response.status_code == 200


def test_rate_limit_headers_present():
    response = client.get("/api/data")
    assert response.status_code == 200
    # X-RateLimit-* headers are only emitted when the Limiter is configured
    # to inject them (headers_enabled=True in slowapi)
    if "x-ratelimit-limit" in response.headers:
        assert int(response.headers["x-ratelimit-limit"]) > 0


def test_rate_limit_exceeded_returns_429():
    # Send requests until the 100/minute limit is hit
    for _ in range(150):
        response = client.get("/api/data")
        if response.status_code == 429:
            # Header lookup is case-insensitive in the TestClient
            assert "retry-after" in response.headers
            return
    pytest.fail("rate limit was never triggered after 150 requests")


def test_different_ips_have_separate_limits():
    response1 = client.get(
        "/api/data",
        headers={"X-Forwarded-For": "1.2.3.4"},
    )
    response2 = client.get(
        "/api/data",
        headers={"X-Forwarded-For": "5.6.7.8"},
    )
    assert response1.status_code == 200
    assert response2.status_code == 200

Run your tests:

pytest tests/test_rate_limit.py -v

Pushing Through CI/CD

git checkout -b fix/fastapi-rate-limit
git add src/main.py tests/test_rate_limit.py
git commit -m "fix: use real client IP for rate limiting and add exception handler"
git push origin fix/fastapi-rate-limit

Your CI config should look something like this:

name: CI
on:
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'
      - run: pip install -r requirements.txt
      - run: pytest --tb=short -q

The Full Manual Process: 18 Steps

Here's every step you just went through to fix this one bug:

  1. Notice the error alert or see it in your monitoring tool
  2. Open the error dashboard and read the stack trace
  3. Identify the file and line number from the stack trace
  4. Open your IDE and navigate to the file
  5. Read the surrounding code to understand context
  6. Reproduce the error locally
  7. Identify the root cause
  8. Write the fix
  9. Run the test suite locally
  10. Fix any failing tests
  11. Write new tests covering the edge case
  12. Run the full test suite again
  13. Create a new git branch
  14. Commit and push your changes
  15. Open a pull request
  16. Wait for code review
  17. Merge and deploy to production
  18. Monitor production to confirm the error is resolved

Total time: 30-60 minutes. For one bug.

Or Let bugstack Fix It in Under 2 Minutes

Every step above? bugstack does it automatically.

Step 1: Install the SDK

pip install bugstack

Step 2: Initialize

import os

import bugstack

bugstack.init(api_key=os.environ["BUGSTACK_API_KEY"])

Step 3: There is no step 3.

bugstack handles everything from here:

  1. Captures the stack trace and request context
  2. Pulls the relevant source files from your GitHub repo
  3. Analyzes the error and understands the code context
  4. Generates a minimal, verified fix
  5. Runs your existing test suite
  6. Pushes through your CI/CD pipeline
  7. Deploys to production (or opens a PR for review)

Time from error to fix deployed: Under 2 minutes.

Human involvement: zero.

Try bugstack Free →

No credit card. 5-minute setup. Cancel anytime.

Deploying the Fix (Manual Path)

  1. Run the test suite locally to confirm rate limiting works with the correct client IP extraction.
  2. Open a pull request with the rate limiter configuration fix.
  3. Wait for CI checks to pass on the PR.
  4. Have a teammate review and approve the PR.
  5. Merge to main and monitor 429 response rates in staging to tune limits.

Frequently Asked Questions

How does BugStack verify a rate limiting fix?
BugStack tests rate limiting with multiple simulated client IPs, verifies the Retry-After header is present in 429 responses, and confirms legitimate requests are not blocked under normal load.

Does BugStack deploy fixes straight to production?
BugStack never pushes directly to production. Every fix goes through a pull request with full CI checks, so your team can review the rate limiting configuration before merging.

Should rate limit counters live in memory or in Redis?
In-memory storage works for single-instance deployments. For multiple instances behind a load balancer, use Redis so rate limit counters are shared across all instances.

How do I apply different limits to authenticated and anonymous users?
Create a custom key function that returns the user ID for authenticated requests and the client IP for anonymous ones. Apply higher limits to authenticated users with separate @limiter.limit decorators.