FastAPI · Python

Fix HTTPException: 429 Too Many Requests: Rate limit exceeded in FastAPI

This error occurs when your rate limiter blocks requests but is misconfigured, applying limits too aggressively or not distinguishing between authenticated and anonymous users. Fix it by properly configuring slowapi or a custom rate limiter with appropriate limits per endpoint, user-based keys, and clear Retry-After headers.

Reading the Stack Trace

Traceback (most recent call last):
  File "/app/venv/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
    result = await app(scope, receive, send)
  File "/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 677, in __call__
    await route.handle(scope, receive, send)
  File "/app/venv/lib/python3.11/site-packages/starlette/routing.py", line 275, in handle
    await self.app(scope, receive, send)
  File "/app/venv/lib/python3.11/site-packages/slowapi/middleware.py", line 46, in dispatch
    response = await call_next(request)
  File "/app/venv/lib/python3.11/site-packages/slowapi/extension.py", line 182, in check_rate_limit
    raise HTTPException(status_code=429, detail="Rate limit exceeded")
  File "/app/src/main.py", line 35, in get_api_data
    return {"data": fetch_data()}
fastapi.exceptions.HTTPException: 429 Too Many Requests: Rate limit exceeded

Here's what each part means:

  - The uvicorn and starlette frames show the request passing through the ASGI server and router: normal plumbing, not the problem.
  - The slowapi/middleware.py and slowapi/extension.py frames show the rate limiter intercepting the request and raising the 429 before your handler logic completes.
  - The final frame, src/main.py line 35 in get_api_data, points at your code: the endpoint whose rate limit was exceeded.

Common Causes

1. Using IP-based rate limiting behind a reverse proxy

All requests appear to come from the same IP (the proxy) because the rate limiter uses request.client.host instead of the X-Forwarded-For header.

from fastapi import FastAPI, Request
from slowapi import Limiter
from slowapi.util import get_remote_address

app = FastAPI()
limiter = Limiter(key_func=get_remote_address)  # Behind a proxy, every user shares the proxy's IP
app.state.limiter = limiter

@app.get("/api/data")
@limiter.limit("10/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
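To see why this misfires, here is a self-contained sketch (no slowapi required; the IPs are made up for illustration) counting the same traffic by the two possible keys:

```python
from collections import Counter

# Hypothetical traffic: (IP the server sees, X-Forwarded-For sent by the proxy)
requests = [
    ("10.0.0.1", "203.0.113.5"),   # real client A
    ("10.0.0.1", "198.51.100.7"),  # real client B
    ("10.0.0.1", "203.0.113.5"),   # client A again
]

# Keyed on request.client.host: every request counts against the proxy's IP
by_remote_addr = Counter(server_ip for server_ip, _ in requests)

# Keyed on X-Forwarded-For: each real client is counted separately
by_forwarded_for = Counter(fwd for _, fwd in requests)

print(by_remote_addr)    # Counter({'10.0.0.1': 3})
print(by_forwarded_for)  # Counter({'203.0.113.5': 2, '198.51.100.7': 1})
```

With a 10/minute limit, the first keying scheme exhausts the shared budget after ten requests from *any* mix of users, while the second gives each client its own budget.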

2. Rate limit too low for the endpoint's usage pattern

A critical endpoint used by the frontend on every page load is rate limited to a very low threshold, causing legitimate users to be blocked.

@app.get("/api/config")
@limiter.limit("5/minute")  # Too low for an endpoint called on every page load
async def get_config(request: Request):
    return {"theme": "dark", "locale": "en"}
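A quick back-of-envelope check shows why 5/minute is too tight here. All the traffic figures below are assumptions for illustration, not measurements:

```python
# Rough sizing for a per-IP limit on /api/config (all figures are illustrative)
page_loads_per_minute = 6   # one user actively navigating the app
calls_per_page_load = 1     # /api/config is fetched once per page load
users_behind_one_ip = 5     # office NAT: several users share one public IP
safety_factor = 2           # headroom for retries and bursts

needed = page_loads_per_minute * calls_per_page_load * users_behind_one_ip * safety_factor
print(f"{needed}/minute")  # 60/minute - well above the configured 5/minute
```

The point is not the exact number but the method: size the limit from realistic per-key traffic, then add headroom.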

3. No custom exception handler for rate limit errors

The default 429 response does not include a Retry-After header, leaving clients unable to determine when they can retry.

# No custom exception handler registered
app = FastAPI()
app.state.limiter = limiter
# Missing: app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

The Fix

Use a custom key function that extracts the real client IP from the X-Forwarded-For header when behind a reverse proxy. Increase the rate limit to a realistic threshold for the endpoint's usage pattern. Register the RateLimitExceeded exception handler to return proper 429 responses with Retry-After headers.

Before (broken)
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter

@app.get("/api/data")
@limiter.limit("10/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
After (fixed)
from fastapi import FastAPI
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from starlette.requests import Request

app = FastAPI()

def get_real_client_ip(request: Request) -> str:
    # Behind a reverse proxy, the original client is the first entry in X-Forwarded-For
    forwarded = request.headers.get("X-Forwarded-For")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return request.client.host

limiter = Limiter(key_func=get_real_client_ip)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/data")
@limiter.limit("100/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
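If you would rather not depend on slowapi at all, the same idea can be hand-rolled as a token bucket. This is an illustrative in-process sketch (the class and names are mine, not a library API) and deliberately not production code: its counters live in one process and are not shared across workers or instances:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-key token bucket: each key gets `rate_per_minute` requests per minute."""

    def __init__(self, rate_per_minute: int):
        self.capacity = float(rate_per_minute)
        self.refill_per_sec = rate_per_minute / 60.0
        self.tokens = defaultdict(lambda: self.capacity)   # start each key full
        self.last_seen = defaultdict(time.monotonic)

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        elapsed = now - self.last_seen[key]
        self.last_seen[key] = now
        # Refill proportionally to elapsed time, capped at the bucket size
        self.tokens[key] = min(self.capacity,
                               self.tokens[key] + elapsed * self.refill_per_sec)
        if self.tokens[key] >= 1.0:
            self.tokens[key] -= 1.0
            return True
        return False

bucket = TokenBucket(rate_per_minute=6)
results = [bucket.allow("203.0.113.5") for _ in range(7)]
print(results)  # first 6 allowed, 7th denied
```

Because each key refills independently, a burst from one client never eats into another client's budget, which is exactly the property the proxy-aware key function buys you in slowapi.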

Testing the Fix

import pytest
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)


def test_request_within_rate_limit():
    response = client.get("/api/data")
    assert response.status_code == 200


def test_rate_limit_headers_present():
    response = client.get("/api/data")
    assert response.status_code == 200
    # X-RateLimit-* headers are only emitted when the Limiter is configured
    # to inject them (headers_enabled=True in slowapi)
    if "x-ratelimit-limit" in response.headers:
        assert int(response.headers["x-ratelimit-limit"]) > 0


def test_rate_limit_exceeded_returns_429():
    # Send requests until the 100/minute limit is hit
    for _ in range(150):
        response = client.get("/api/data")
        if response.status_code == 429:
            # Header lookup is case-insensitive in the TestClient
            assert "retry-after" in response.headers
            return
    pytest.fail("rate limit was never triggered after 150 requests")


def test_different_ips_have_separate_limits():
    response1 = client.get(
        "/api/data",
        headers={"X-Forwarded-For": "1.2.3.4"},
    )
    response2 = client.get(
        "/api/data",
        headers={"X-Forwarded-For": "5.6.7.8"},
    )
    assert response1.status_code == 200
    assert response2.status_code == 200

Run your tests:

pytest tests/test_rate_limit.py -v

Pushing Through CI/CD

git checkout -b fix/fastapi-rate-limit
git add src/main.py tests/test_rate_limit.py
git commit -m "fix: use real client IP for rate limiting and add exception handler"
git push origin fix/fastapi-rate-limit

Your CI config should look something like this:

name: CI
on:
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'
      - run: pip install -r requirements.txt
      - run: pytest --tb=short -q

The Full Manual Process: 18 Steps

Here's every step you just went through to fix this one bug:

  1. Notice the error alert or see it in your monitoring tool
  2. Open the error dashboard and read the stack trace
  3. Identify the file and line number from the stack trace
  4. Open your IDE and navigate to the file
  5. Read the surrounding code to understand context
  6. Reproduce the error locally
  7. Identify the root cause
  8. Write the fix
  9. Run the test suite locally
  10. Fix any failing tests
  11. Write new tests covering the edge case
  12. Run the full test suite again
  13. Create a new git branch
  14. Commit and push your changes
  15. Open a pull request
  16. Wait for code review
  17. Merge and deploy to production
  18. Monitor production to confirm the error is resolved

Total time: 30-60 minutes. For one bug.

Or Let bugstack Fix It in Under 2 Minutes

Every step above? bugstack does it automatically.

Step 1: Install the SDK

pip install bugstack

Step 2: Initialize

import os

import bugstack

bugstack.init(api_key=os.environ["BUGSTACK_API_KEY"])

Step 3: There is no step 3.

bugstack handles everything from here:

  1. Captures the stack trace and request context
  2. Pulls the relevant source files from your GitHub repo
  3. Analyzes the error and understands the code context
  4. Generates a minimal, verified fix
  5. Runs your existing test suite
  6. Pushes through your CI/CD pipeline
  7. Deploys to production (or opens a PR for review)

Time from error to fix deployed: Under 2 minutes.

Human involvement: zero.

Try bugstack Free →

No credit card. 5-minute setup. Cancel anytime.

Deploying the Fix (Manual Path)

  1. Run the test suite locally to confirm rate limiting works with the correct client IP extraction.
  2. Open a pull request with the rate limiter configuration fix.
  3. Wait for CI checks to pass on the PR.
  4. Have a teammate review and approve the PR.
  5. Merge to main and monitor 429 response rates in staging to tune limits.

Frequently Asked Questions

How does BugStack verify a rate limiting fix?
BugStack tests rate limiting with multiple simulated client IPs, verifies the Retry-After header is present in 429 responses, and confirms legitimate requests are not blocked under normal load.

Does BugStack deploy fixes straight to production?
BugStack never pushes directly to production. Every fix goes through a pull request with full CI checks, so your team can review the rate limiting configuration before merging.

Should rate limit counters live in memory or in Redis?
In-memory storage works for single-instance deployments. For multiple instances behind a load balancer, use Redis so rate limit counters are shared across all instances.

How do I apply different limits to authenticated and anonymous users?
Create a custom key function that returns the user ID for authenticated requests and the client IP for anonymous ones. Apply higher limits to authenticated users with separate @limiter.limit decorators.