Ahmed Hassan
Tiered rate limiter with an atomic Lua script that's correct across all my pods and fails open
Designs distributed rate limiting systems with multiple algorithms, tier-based quotas, and real-time enforcement for APIs and user-facing features.
Rate Limiting & Throttling Architecture
You are a Backend Engineer specializing in API infrastructure and abuse prevention. Design a comprehensive rate limiting architecture.
**API/Service Type**: {{api_service_type}} (public REST API, internal microservice, GraphQL endpoint, WebSocket service)
**Traffic Profile**: {{traffic_profile}} (normal traffic, peak traffic, abuse patterns, expected growth)
**Client Tiers**: {{client_tiers}} (free tier, paid tiers, internal services, partners - with different limits per tier)
**Infrastructure**: {{infrastructure}} (Kubernetes, serverless, VMs, existing Redis/cache infrastructure)
Design the complete rate limiting system:
1. **Rate Limiting Algorithms** - Token bucket vs sliding window vs fixed window vs leaky bucket comparison, algorithm selection per use case
2. **Limit Dimensions** - Per-user, per-API-key, per-IP, per-resource, per-method, global limits, concurrent request limits
3. **Tier Configuration** - Rate limit matrix per tier: requests/second, requests/minute, requests/day, burst capacity, concurrent limits
4. **Distributed Implementation** - Redis Cell (CL.THROTTLE), Redis Lua scripts, sliding window log, Redis sorted sets implementation
5. **Gateway Integration** - Kong rate limiting, NGINX limit_req, Envoy rate limiting, AWS API Gateway throttling, Istio rate limiting
6. **Application-Level Limits** - Middleware design, decorator/annotation-based limits, custom limit logic per endpoint
7. **Response Headers** - X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After (with 429 Too Many Requests as defined in RFC 6585)
8. **Graceful Degradation** - Queue requests when limit approaching, priority classes (critical vs background), client-side backpressure
9. **Burst Handling** - Token bucket burst capacity calculation, spike absorption, warm-up after idle periods
10. **Monitoring & Alerting** - Rate limit hit rate per client, blocked request analysis, false positive detection, limit adjustment recommendations
11. **Testing Strategy** - Load test scenarios verifying limits, edge case tests (exactly at limit, limit + 1), clock skew handling
12. **Dynamic Limits** - Admin API for adjusting limits, auto-scaling limits based on system health, promotional limit increases
13. **Multi-Region** - Rate limit state synchronization across regions, eventual consistency trade-offs, regional limits vs global limits
Include the Redis Lua script for atomic rate limit checking and the complete rate limit configuration for each tier.
Results
Rate limiting for a **public REST API** (free/pro/enterprise tiers, K8s + existing Redis). Sliding-window log via an atomic Lua script so it's correct across all pods.
**Tier matrix**
| Tier | req/min | burst | concurrent |
|------|---------|-------|-----------|
| free | 60 | 10 | 5 |
| pro | 1,000 | 100 | 50 |
| enterprise | 10,000 | 1,000 | 500 |
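The matrix above could be carried into code as a typed config object. This is only a sketch: `TierName`, `TierLimits`, `TIERS`, and `limitsFor` are hypothetical names that do not appear in the original post, and falling back to the free tier for unknown tiers is an assumption.

```typescript
// Hypothetical names (TierName, TierLimits, TIERS, limitsFor); values mirror the tier matrix.
type TierName = "free" | "pro" | "enterprise";

interface TierLimits {
  perMin: number;     // requests allowed per 60 s window
  burst: number;      // burst allowance from the table
  concurrent: number; // maximum in-flight requests
}

const TIERS: Record<TierName, TierLimits> = {
  free:       { perMin: 60,     burst: 10,    concurrent: 5 },
  pro:        { perMin: 1_000,  burst: 100,   concurrent: 50 },
  enterprise: { perMin: 10_000, burst: 1_000, concurrent: 500 },
};

// Assumption: unknown or missing tiers get the most restrictive (free) limits.
function limitsFor(tier: string | undefined): TierLimits {
  const known = tier !== undefined && Object.prototype.hasOwnProperty.call(TIERS, tier);
  return known ? TIERS[tier as TierName] : TIERS.free;
}
```

`tier.perMin` in the middleware would then come from `limitsFor(...)`.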
**Atomic Lua (single round-trip, race-free):**
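```lua
-- KEYS[1] = bucket; ARGV = now (ms), windowMs, limit, optional unique request id
local now, window, limit = tonumber(ARGV[1]), tonumber(ARGV[2]), tonumber(ARGV[3])
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window) -- drop entries older than the window
local used = redis.call('ZCARD', KEYS[1])
if used >= limit then
  return { 0, 0 } -- blocked, nothing remaining
end
-- Members must be unique, or same-millisecond requests overwrite each other and go uncounted.
-- math.random() is not a safe source of uniqueness inside Redis scripts (older versions seed it
-- identically on every run), so prefer a caller-supplied id and fall back to the current count.
local member = now .. ':' .. (ARGV[4] or used)
redis.call('ZADD', KEYS[1], now, member)
redis.call('PEXPIRE', KEYS[1], window)
return { 1, limit - used - 1 } -- allowed, remaining
```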
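To reason about the script's edge cases (exactly at the limit, limit + 1, expiry at the window boundary) without a Redis instance, the same prune, count, and add steps can be modeled in memory. This is a minimal sketch for unit tests only, not a replacement for the atomic script; `SlidingWindowLog` is a hypothetical name.

```typescript
// In-memory stand-in for the sorted set. SlidingWindowLog is a hypothetical name.
class SlidingWindowLog {
  private hits: number[] = []; // timestamps (ms) of accepted requests
  private readonly windowMs: number;
  private readonly limit: number;

  constructor(windowMs: number, limit: number) {
    this.windowMs = windowMs;
    this.limit = limit;
  }

  // Returns [allowed, remaining], the same shape as the script's reply.
  check(now: number): [number, number] {
    // Like ZREMRANGEBYSCORE key 0 (now - window): drop hits at or before the cutoff.
    this.hits = this.hits.filter((t) => t > now - this.windowMs);
    const used = this.hits.length; // like ZCARD
    if (used >= this.limit) return [0, 0]; // blocked
    this.hits.push(now); // like ZADD
    return [1, this.limit - used - 1];
  }
}
```

With a limit of 3 and a 60 s window, three calls in quick succession are allowed, the fourth is rejected, and a slot frees up once the oldest hit is a full window old.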
**Middleware** sets the de facto `X-RateLimit-*` headers and, on rejection, returns 429 (defined in RFC 6585) with `Retry-After`:
```typescript
const [ok, remaining] = await redis.eval(SCRIPT, 1, `rl:${apiKey}`, Date.now(), 60_000, tier.perMin);
res.set("X-RateLimit-Limit", String(tier.perMin));
res.set("X-RateLimit-Remaining", String(Math.max(0, remaining)));
if (!ok) { res.set("Retry-After", "60"); return res.status(429).json({ error: "rate_limited" }); }
```
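The header logic can also be pulled into a pure function so it is testable without an HTTP server. This is a sketch; `rateLimitHeaders` is a hypothetical helper, and using the full window length as `Retry-After` is a conservative assumption rather than an exact reset time.

```typescript
// Hypothetical helper: builds the response headers for one limiter decision.
function rateLimitHeaders(
  limit: number,
  remaining: number,
  ok: boolean,
  windowSec: number = 60,
): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, remaining)), // never report a negative value
  };
  // Retry-After only applies to rejections; the full window is a safe upper bound.
  if (!ok) headers["Retry-After"] = String(windowSec);
  return headers;
}
```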
**Fail-open:** if Redis throws, the limiter calls `next()`, because an infra blip must never block paying customers. **Dimensions:** key on the API key when present and fall back to the client IP for anonymous traffic. Hit rate per tier is logged so limits can be tuned from real data. **Multi-region:** each region enforces its own limits (eventually consistent) rather than consulting a slow global counter, trading a strictly exact global count for lower latency.
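The fail-open and key-selection rules might look roughly like the sketch below. `bucketKey`, `checkOrFailOpen`, and `Verdict` are hypothetical names, the `rl:ip:` prefix is an assumption chosen so that IP buckets cannot collide with API-key buckets, and the `check` callback stands in for the Redis `eval` call.

```typescript
// Hypothetical helpers; not taken from the original post.
type Verdict = { ok: boolean; remaining: number };

// Key on the API key when present, otherwise fall back to the client IP.
function bucketKey(apiKey: string | undefined, ip: string): string {
  return apiKey ? `rl:${apiKey}` : `rl:ip:${ip}`;
}

// Runs the limiter check; any error from the store is treated as "allow".
async function checkOrFailOpen(
  check: () => Promise<Verdict>,
  limit: number,
  onError: (err: unknown) => void = () => {},
): Promise<Verdict> {
  try {
    return await check();
  } catch (err) {
    onError(err); // log or count it so silent fail-open periods stay visible
    return { ok: true, remaining: limit };
  }
}
```

Reporting the error through `onError` matters here: a limiter that fails open without any signal can be down for hours with nobody noticing.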
Model: Claude Opus 4
42 Likes · 12 Saves · Score: 34
2 Comments
Priya Nair
This belongs in every onboarding doc.
Daniel Cohen
Genuinely better than the Stack Overflow answer I'd been copying.