Fix HTTPException: 429 Too Many Requests: Rate limit exceeded in FastAPI
This error occurs when your rate limiter blocks legitimate requests because it is misconfigured: the limits are too aggressive, or the limiter does not distinguish authenticated users from anonymous ones. Fix it by configuring slowapi (or a custom rate limiter) with appropriate per-endpoint limits, user-based keys, and clear Retry-After headers.
Reading the Stack Trace
Here's what each line means:
- File "/app/venv/lib/python3.11/site-packages/slowapi/extension.py", line 182, in check_rate_limit: slowapi's rate limiter checked the request count for this client and determined the limit has been exceeded.
- File "/app/venv/lib/python3.11/site-packages/slowapi/middleware.py", line 46, in dispatch: The rate limiting middleware intercepts the request before it reaches the endpoint handler.
- File "/app/src/main.py", line 35, in get_api_data: The endpoint never executes because the rate limiter raises HTTPException before the function is called.
Common Causes
1. Using IP-based rate limiting behind a reverse proxy
All requests appear to come from the same IP (the proxy) because the rate limiter uses request.client.host instead of the X-Forwarded-For header.
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)  # All users share one IP behind the proxy

@app.get("/api/data")
@limiter.limit("10/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
2. Rate limit too low for the endpoint's usage pattern
A critical endpoint used by the frontend on every page load is rate limited to a very low threshold, causing legitimate users to be blocked.
@app.get("/api/config")
@limiter.limit("5/minute")  # Too low for an endpoint called on every page load
async def get_config(request: Request):
    return {"theme": "dark", "locale": "en"}
3. No custom exception handler for rate limit errors
The default 429 response does not include a Retry-After header, leaving clients unable to determine when they can retry.
# No custom exception handler registered
app = FastAPI()
app.state.limiter = limiter
# Missing: app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
The Fix
Use a custom key function that extracts the real client IP from the X-Forwarded-For header when behind a reverse proxy. Increase the rate limit to a realistic threshold for the endpoint's usage pattern. Register the RateLimitExceeded exception handler to return proper 429 responses with Retry-After headers.
Before, the limiter keys on the proxy's IP and no handler is registered:

from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter

@app.get("/api/data")
@limiter.limit("10/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
After, with the real client IP, a registered exception handler, and a realistic limit:

from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from starlette.requests import Request

def get_real_client_ip(request: Request) -> str:
    forwarded = request.headers.get("X-Forwarded-For")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return request.client.host

limiter = Limiter(key_func=get_real_client_ip)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/data")
@limiter.limit("100/minute")
async def get_api_data(request: Request):
    return {"data": fetch_data()}
Testing the Fix
import pytest
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

def test_request_within_rate_limit():
    response = client.get("/api/data")
    assert response.status_code == 200

def test_rate_limit_headers_present():
    response = client.get("/api/data")
    assert "x-ratelimit-limit" in response.headers or response.status_code == 200

def test_rate_limit_exceeded_returns_429():
    # Send requests until the rate limit is hit
    for _ in range(150):
        response = client.get("/api/data")
        if response.status_code == 429:
            assert "retry-after" in response.headers or "Retry-After" in response.headers
            break

def test_different_ips_have_separate_limits():
    response1 = client.get("/api/data", headers={"X-Forwarded-For": "1.2.3.4"})
    response2 = client.get("/api/data", headers={"X-Forwarded-For": "5.6.7.8"})
    assert response1.status_code == 200
    assert response2.status_code == 200
Run your tests:
pytest tests/test_rate_limit.py -v
Pushing Through CI/CD
git checkout -b fix/fastapi-rate-limit
git add src/main.py tests/test_rate_limit.py
git commit -m "fix: use real client IP for rate limiting and add exception handler"
git push origin fix/fastapi-rate-limit
Your CI config should look something like this:
name: CI
on:
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'
      - run: pip install -r requirements.txt
      - run: pytest --tb=short -q
The Full Manual Process: 18 Steps
Here's every step you just went through to fix this one bug:
- Notice the error alert or see it in your monitoring tool
- Open the error dashboard and read the stack trace
- Identify the file and line number from the stack trace
- Open your IDE and navigate to the file
- Read the surrounding code to understand context
- Reproduce the error locally
- Identify the root cause
- Write the fix
- Run the test suite locally
- Fix any failing tests
- Write new tests covering the edge case
- Run the full test suite again
- Create a new git branch
- Commit and push your changes
- Open a pull request
- Wait for code review
- Merge and deploy to production
- Monitor production to confirm the error is resolved
Total time: 30-60 minutes. For one bug.
Or Let bugstack Fix It in Under 2 Minutes
Every step above? bugstack does it automatically.
Step 1: Install the SDK
pip install bugstack
Step 2: Initialize
import os

import bugstack

bugstack.init(api_key=os.environ["BUGSTACK_API_KEY"])
Step 3: There is no step 3.
bugstack handles everything from here:
- Captures the stack trace and request context
- Pulls the relevant source files from your GitHub repo
- Analyzes the error and understands the code context
- Generates a minimal, verified fix
- Runs your existing test suite
- Pushes through your CI/CD pipeline
- Deploys to production (or opens a PR for review)
Time from error to fix deployed: Under 2 minutes.
Human involvement: zero.
Try bugstack Free → No credit card. 5-minute setup. Cancel anytime.
Deploying the Fix (Manual Path)
- Run the test suite locally to confirm rate limiting works with the correct client IP extraction.
- Open a pull request with the rate limiter configuration fix.
- Wait for CI checks to pass on the PR.
- Have a teammate review and approve the PR.
- Merge to main and monitor 429 response rates in staging to tune limits.
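To tune limits in staging, it helps to see which endpoints are actually returning 429s. A minimal sketch of counting 429s per path, assuming a simplified "METHOD PATH STATUS" access-log format (your real log format will differ):

```python
from collections import Counter

# Hypothetical simplified access-log lines: "METHOD PATH STATUS"
log_lines = [
    "GET /api/data 200",
    "GET /api/data 429",
    "GET /api/config 429",
    "GET /api/data 429",
]

# Count 429 responses per request path
status_429_by_path = Counter(
    parts[1]
    for parts in (line.split() for line in log_lines)
    if parts[2] == "429"
)
print(status_429_by_path.most_common())  # [('/api/data', 2), ('/api/config', 1)]
```

A path with a high 429 rate from many distinct client IPs usually means the limit is too low; a high rate from one IP usually means the limit is working.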
Frequently Asked Questions
How does BugStack verify a rate limiting fix?
BugStack tests rate limiting with multiple simulated client IPs, verifies the Retry-After header is present in 429 responses, and confirms legitimate requests are not blocked under normal load.
Will BugStack push a fix straight to production?
BugStack never pushes directly to production. Every fix goes through a pull request with full CI checks, so your team can review the rate limiting configuration before merging.
Do I need Redis for rate limit storage?
In-memory storage works for single-instance deployments. For multiple instances behind a load balancer, use Redis so rate limit counters are shared across all instances.
How do I apply different limits to authenticated and anonymous users?
Create a custom key function that returns the user ID for authenticated requests and the client IP for anonymous ones. Apply higher limits to authenticated users with separate @limiter.limit decorators.
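That last point — separate limits for authenticated and anonymous users — can be sketched as a key function. The attribute names (`request.state.user`, `user.id`) are assumptions about your auth setup, and the fake request objects below only stand in for FastAPI's Request for demonstration:

```python
from types import SimpleNamespace

def user_or_ip_key(request) -> str:
    """Hypothetical key_func: per-user limits when authenticated, per-IP otherwise."""
    user = getattr(request.state, "user", None)
    if user is not None:
        return f"user:{user.id}"
    # Anonymous: fall back to the real client IP, honoring X-Forwarded-For
    forwarded = request.headers.get("X-Forwarded-For")
    if forwarded:
        return f"ip:{forwarded.split(',')[0].strip()}"
    return f"ip:{request.client.host}"

# Fake request objects stand in for FastAPI's Request here
authed = SimpleNamespace(
    state=SimpleNamespace(user=SimpleNamespace(id=42)),
    headers={},
    client=SimpleNamespace(host="10.0.0.2"),
)
anon = SimpleNamespace(
    state=SimpleNamespace(user=None),
    headers={"X-Forwarded-For": "203.0.113.7, 10.0.0.2"},
    client=SimpleNamespace(host="10.0.0.2"),
)
print(user_or_ip_key(authed))  # user:42
print(user_or_ip_key(anon))    # ip:203.0.113.7
```

Pass a function like this as `key_func` when constructing the Limiter, so authenticated users stop sharing a bucket with everyone behind the same NAT or proxy.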