# Rate Limiting Module
Production-ready rate limiting for Gradio applications with Redis support and graceful fallback.
## Features
- **Token Bucket Algorithm**: Configurable capacity and refill rate
- **Thread-Safe**: Works with concurrent requests
- **Async Support**: Compatible with async/await handlers
- **Redis Integration**: Distributed rate limiting with Lua scripts
- **Graceful Fallback**: Automatic in-memory fallback when Redis is unavailable
- **Multi-Tier**: Support for anonymous, authenticated, and premium users
- **Gradio Integration**: Built-in middleware for Gradio applications
- **Production-Ready**: Comprehensive error handling and logging
## Quick Start
```python
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier,
)
import gradio as gr

# Create rate limiter
limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (10, 0.1),  # 10 requests, 0.1 refill/sec
    },
    redis_url=None,  # Optional Redis URL
)

# Create middleware
middleware = GradioRateLimitMiddleware(limiter)

# Use in a Gradio handler
def my_handler(text: str, request: gr.Request = None):
    middleware.enforce(request)  # Raises gr.Error if limit exceeded
    # ... your handler code
```
## Classes
### ThreadSafeTokenBucket
In-memory token bucket with thread safety.
```python
from backend.rate_limiting import ThreadSafeTokenBucket

bucket = ThreadSafeTokenBucket(capacity=10, refill_rate=1.0)

result = bucket.consume()
if result.allowed:
    print(f"Request allowed, {result.remaining} remaining")
else:
    print(f"Rate limited, retry after {result.retry_after}s")
```
### AsyncTokenBucket
Async-compatible token bucket.
```python
from backend.rate_limiting import AsyncTokenBucket

bucket = AsyncTokenBucket(capacity=10, refill_rate=1.0)

# From within an async function / running event loop:
result = await bucket.consume()
```
### HybridRateLimiter
Redis primary with in-memory fallback.
```python
from backend.rate_limiting import HybridRateLimiter
limiter = HybridRateLimiter(
    capacity=10,
    refill_rate=1.0,
    redis_url="redis://localhost:6379/0",  # Optional
    key_prefix="myapp",
)

result = limiter.consume(identifier="user_123")
```
### TieredRateLimiter
Multi-tier rate limiting.
```python
from backend.rate_limiting import TieredRateLimiter, UserTier
limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (10, 0.1),
        UserTier.AUTHENTICATED: (50, 0.5),
        UserTier.PREMIUM: (200, 2.0),
    },
    redis_url="redis://localhost:6379/0",
)

result = limiter.consume("user_123", UserTier.AUTHENTICATED)
```
### GradioRateLimitMiddleware
Gradio integration middleware.
```python
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier,
)

limiter = TieredRateLimiter(...)
middleware = GradioRateLimitMiddleware(limiter)

# Inside a handler that received `request: gr.Request`:

# Check rate limit
info = middleware.check_rate_limit(request)

# Enforce rate limit (raises gr.Error if exceeded)
middleware.enforce(
    request,
    tokens=1,
    error_message="Custom error message",
)
```
## Configuration
Rate limits are configured via capacity and refill rate:
- **Capacity**: Maximum number of tokens (burst requests)
- **Refill Rate**: Tokens added per second (sustained rate)
Example configurations:
```python
# 10-request burst, 1 request per 10 seconds sustained
capacity, refill_rate = 10, 0.1

# 50-request burst, 1 request per 2 seconds sustained
capacity, refill_rate = 50, 0.5

# 100-request burst, 10 requests per second sustained
capacity, refill_rate = 100, 10.0
```
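As a worked illustration of how the two numbers interact (a sketch using the `ThreadSafeTokenBucket` API documented above): with capacity 10 and refill rate 0.1, the first 10 requests pass immediately, and the next request must wait roughly 10 seconds for a token to refill.

```python
from backend.rate_limiting import ThreadSafeTokenBucket

# capacity=10, refill_rate=0.1 -> 10-request burst, then ~1 request / 10 s
bucket = ThreadSafeTokenBucket(capacity=10, refill_rate=0.1)

for i in range(11):
    result = bucket.consume()
    if not result.allowed:
        # One token is needed and tokens refill at 0.1/s, so retry_after is ~10 s
        print(f"Request {i + 1} limited; retry after ~{result.retry_after:.0f}s")
```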
## Redis Integration
The HybridRateLimiter uses Redis for distributed rate limiting:
1. Lua scripts for atomic operations
2. Automatic script caching
3. Key expiration via TTL
4. Connection pooling
5. Graceful fallback to in-memory
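For illustration, the kind of atomic token-bucket update that a Lua script enables looks roughly like the sketch below. This is not the module's actual script; the key layout, argument order, and return format are assumptions, shown here with plain `redis-py`.

```python
import time

import redis  # redis>=5.0.0

# Illustrative Lua token bucket: read state, refill by elapsed time, try to
# consume, write state back -- all in one atomic server-side step.
TOKEN_BUCKET_LUA = """
local key       = KEYS[1]
local capacity  = tonumber(ARGV[1])
local refill    = tonumber(ARGV[2])
local now       = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])
local ttl       = tonumber(ARGV[5])

local state  = redis.call('HMGET', key, 'tokens', 'ts')
local tokens = tonumber(state[1]) or capacity
local ts     = tonumber(state[2]) or now

-- Refill based on elapsed time, capped at capacity
tokens = math.min(capacity, tokens + (now - ts) * refill)

local allowed = 0
if tokens >= requested then
  tokens = tokens - requested
  allowed = 1
end

redis.call('HSET', key, 'tokens', tokens, 'ts', now)
redis.call('EXPIRE', key, ttl)
return {allowed, math.floor(tokens)}
"""

client = redis.Redis.from_url("redis://localhost:6379/0")
consume = client.register_script(TOKEN_BUCKET_LUA)  # cached server-side by SHA

capacity, refill_rate = 10, 1.0
allowed, remaining = consume(
    keys=["myapp:user_123"],
    args=[capacity, refill_rate, time.time(), 1, 120],  # 120 s key TTL
)
```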
Example Redis URL formats:
```python
# Local Redis
redis_url="redis://localhost:6379/0"
# Redis with authentication
redis_url="redis://:password@localhost:6379/0"
# Redis SSL
redis_url="rediss://host:port/0"
# Upstash Redis
redis_url="rediss://user:pass@endpoint:port"
```
## Error Handling
All components include comprehensive error handling:
- Redis connection failures β†’ automatic fallback to in-memory
- Invalid requests β†’ safe defaults
- Rate limit exceeded β†’ clear error messages with retry timing
```python
try:
    middleware.enforce(request)
except gr.Error as e:
    # Gradio will display this error to the user
    print(f"Rate limited: {e}")
```
## Testing
Run the test suite:
```bash
python test_rate_limiting_simple.py
```
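If you want to extend coverage, a minimal additional check of the bucket behaviour might look like this (hypothetical; not part of `test_rate_limiting_simple.py`):

```python
from backend.rate_limiting import ThreadSafeTokenBucket

def test_bucket_denies_after_capacity():
    # With a negligible refill rate, only `capacity` requests should pass.
    bucket = ThreadSafeTokenBucket(capacity=2, refill_rate=0.001)
    assert bucket.consume().allowed
    assert bucket.consume().allowed
    assert not bucket.consume().allowed
```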
## Best Practices
1. **Use appropriate tier limits**: Choose each tier's burst capacity and refill rate to match your application's expected traffic
2. **Use Redis for production**: Enables distributed rate limiting across instances
3. **Monitor logs**: Watch for rate limit violations
4. **Customise error messages**: Provide clear feedback to users
5. **Test fallback**: Ensure in-memory fallback works when Redis is down
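A quick way to exercise the fallback path is a sketch like the following, pointing at a Redis URL that is deliberately unreachable (the port below is arbitrary):

```python
from backend.rate_limiting import HybridRateLimiter

# With an unreachable Redis instance, the limiter should fall back to its
# in-memory bucket rather than raising.
limiter = HybridRateLimiter(
    capacity=10,
    refill_rate=1.0,
    redis_url="redis://localhost:6399/0",  # hypothetical unreachable instance
    key_prefix="fallback_test",
)

result = limiter.consume(identifier="user_123")
assert result.allowed  # Served from the in-memory fallback
```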
## Example: Full Gradio Integration
```python
import gradio as gr

from backend.config import settings
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier,
)

# Initialize rate limiter
limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (
            settings.rate_limit_anonymous_capacity,
            settings.rate_limit_anonymous_refill_rate,
        ),
    },
    redis_url=settings.redis_url,
)

middleware = GradioRateLimitMiddleware(limiter)

# Gradio handler
async def analyse_portfolio(
    portfolio_text: str,
    request: gr.Request = None,
):
    # Enforce rate limit
    middleware.enforce(request)

    # Your analysis code here
    return "Analysis complete"

# Gradio interface
with gr.Blocks() as demo:
    text_input = gr.Textbox()
    submit_btn = gr.Button("Analyse")
    output = gr.Textbox()

    submit_btn.click(
        analyse_portfolio,
        inputs=text_input,  # gr.Request is injected automatically, not passed as an input
        outputs=output,
    )

demo.launch()
```
## Logging
The module logs important events:
```python
import logging

# basicConfig attaches a handler so INFO-level messages are actually emitted
logging.basicConfig(level=logging.INFO)

logger = logging.getLogger('backend.rate_limiting')
logger.setLevel(logging.INFO)
```
Log messages:
- `INFO`: Rate limiter initialisation
- `WARNING`: Rate limit exceeded
- `ERROR`: Redis errors, fallback activation
## Performance
- **In-memory**: <1ms overhead per request
- **Redis**: ~2-5ms overhead per request
- **Fallback**: Automatic, no service interruption
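To check the in-memory overhead on your own hardware, a rough timing sketch (using the `ThreadSafeTokenBucket` API documented above; numbers vary by machine):

```python
import time

from backend.rate_limiting import ThreadSafeTokenBucket

# Large capacity/refill so the bucket never throttles during the measurement
bucket = ThreadSafeTokenBucket(capacity=1_000_000, refill_rate=1_000_000.0)

n = 100_000
start = time.perf_counter()
for _ in range(n):
    bucket.consume()
elapsed = time.perf_counter() - start

print(f"~{elapsed / n * 1e6:.1f} microseconds per consume() call")
```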
## Dependencies
- `redis>=5.0.0` (optional, for distributed rate limiting)
- `upstash-redis>=0.15.0` (optional, for serverless Redis)
- `gradio` (for middleware integration)
## License
Part of the Portfolio Intelligence Platform.