# Rate Limiting Module

Production-ready rate limiting for Gradio applications, with Redis support and graceful fallback.
## Features
- Token Bucket Algorithm: Configurable capacity and refill rate
- Thread-Safe: Works with concurrent requests
- Async Support: Compatible with async/await handlers
- Redis Integration: Distributed rate limiting with Lua scripts
- Graceful Fallback: Automatic in-memory fallback when Redis unavailable
- Multi-Tier: Support for anonymous, authenticated, and premium users
- Gradio Integration: Built-in middleware for Gradio applications
- Production-Ready: Comprehensive error handling and logging
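The token-bucket algorithm named above can be sketched as follows. `SimpleTokenBucket` is an illustrative stand-in, not the module's actual class: tokens refill continuously at `refill_rate` per second up to `capacity`, and a request is allowed only if a token is available.

```python
import threading
import time

class SimpleTokenBucket:
    """Illustrative token bucket; not the module's ThreadSafeTokenBucket."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full (allows an initial burst)
        self.last_refill = time.monotonic()
        self._lock = threading.Lock()

    def consume(self, tokens: float = 1.0) -> bool:
        with self._lock:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at capacity
            self.tokens = min(
                self.capacity,
                self.tokens + (now - self.last_refill) * self.refill_rate,
            )
            self.last_refill = now
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False

bucket = SimpleTokenBucket(capacity=2, refill_rate=1.0)
print(bucket.consume())  # True  (burst)
print(bucket.consume())  # True
print(bucket.consume())  # False (bucket empty; refills at 1 token/sec)
```

The burst capacity is spent immediately; sustained throughput is then bounded by the refill rate.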
## Quick Start

```python
import gradio as gr

from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier,
)

# Create the rate limiter
limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (10, 0.1),  # 10-request burst, 0.1 tokens refilled per second
    },
    redis_url=None,  # Optional Redis URL
)

# Create the middleware
middleware = GradioRateLimitMiddleware(limiter)

# Use in a Gradio handler
def my_handler(text: str, request: gr.Request = None):
    middleware.enforce(request)  # Raises gr.Error if the limit is exceeded
    # ... your handler code
```
## Classes

### ThreadSafeTokenBucket

An in-memory token bucket with thread safety.

```python
from backend.rate_limiting import ThreadSafeTokenBucket

bucket = ThreadSafeTokenBucket(capacity=10, refill_rate=1.0)

result = bucket.consume()
if result.allowed:
    print(f"Request allowed, {result.remaining} remaining")
else:
    print(f"Rate limited, retry after {result.retry_after}s")
```
### AsyncTokenBucket

An async-compatible token bucket.

```python
from backend.rate_limiting import AsyncTokenBucket

bucket = AsyncTokenBucket(capacity=10, refill_rate=1.0)

result = await bucket.consume()  # inside an async function
```
### HybridRateLimiter

Uses Redis as the primary store with an in-memory fallback.

```python
from backend.rate_limiting import HybridRateLimiter

limiter = HybridRateLimiter(
    capacity=10,
    refill_rate=1.0,
    redis_url="redis://localhost:6379/0",  # Optional
    key_prefix="myapp",
)

result = limiter.consume(identifier="user_123")
```
### TieredRateLimiter

Multi-tier rate limiting.

```python
from backend.rate_limiting import TieredRateLimiter, UserTier

limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (10, 0.1),
        UserTier.AUTHENTICATED: (50, 0.5),
        UserTier.PREMIUM: (200, 2.0),
    },
    redis_url="redis://localhost:6379/0",
)

result = limiter.consume("user_123", UserTier.AUTHENTICATED)
```
### GradioRateLimitMiddleware

Gradio integration middleware.

```python
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier,
)

limiter = TieredRateLimiter(...)
middleware = GradioRateLimitMiddleware(limiter)

# Check the rate limit
info = middleware.check_rate_limit(request)

# Enforce the rate limit (raises gr.Error if exceeded)
middleware.enforce(
    request,
    tokens=1,
    error_message="Custom error message",
)
```
## Configuration

Rate limits are configured via a capacity and a refill rate:

- Capacity: Maximum number of tokens (burst requests)
- Refill Rate: Tokens added per second (sustained rate)

Example configurations:

```python
# 10-request burst, 1 request per 10 seconds sustained
capacity, refill_rate = 10, 0.1

# 50-request burst, 1 request per 2 seconds sustained
capacity, refill_rate = 50, 0.5

# 100-request burst, 10 requests per second sustained
capacity, refill_rate = 100, 10.0
```
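Two quantities fall directly out of these numbers: how long an empty bucket takes to refill, and how long a rate-limited client must wait for the next token. The helper functions below are illustrative, not part of the module:

```python
def full_recovery_seconds(capacity: float, refill_rate: float) -> float:
    """Seconds for an empty bucket to refill back to full capacity."""
    return capacity / refill_rate

def retry_after_seconds(tokens_needed: float, tokens_available: float,
                        refill_rate: float) -> float:
    """Seconds until enough tokens accumulate for the next request."""
    deficit = max(0.0, tokens_needed - tokens_available)
    return deficit / refill_rate

# Anonymous tier (10, 0.1): an exhausted bucket takes 100 s to fully recover
print(full_recovery_seconds(10, 0.1))   # 100.0

# Authenticated tier (50, 0.5): with 0 tokens left, the next token arrives in 2 s
print(retry_after_seconds(1, 0, 0.5))   # 2.0
```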
## Redis Integration

The HybridRateLimiter uses Redis for distributed rate limiting:

- Lua scripts for atomic operations
- Automatic script caching
- Key expiration via TTL
- Connection pooling
- Graceful fallback to in-memory
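The fallback behaviour can be sketched as a thin wrapper. All names below are hypothetical stand-ins (the real HybridRateLimiter handles this internally): the distributed limiter is tried first, and a connection failure degrades to per-process limiting rather than failing the request.

```python
class InMemoryStub:
    """Stand-in for a local token bucket; always allows here, for brevity."""
    def consume(self, key: str) -> bool:
        return True

class FlakyRedisStub:
    """Stand-in for a Redis-backed limiter whose connection is down."""
    def consume(self, key: str) -> bool:
        raise ConnectionError("redis unreachable")

class FallbackLimiter:
    """Try the distributed limiter first; fall back to in-memory on failure."""
    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback

    def consume(self, key: str) -> bool:
        try:
            return self.primary.consume(key)
        except ConnectionError:
            # Redis is unavailable: degrade to per-process limiting
            return self.fallback.consume(key)

limiter = FallbackLimiter(FlakyRedisStub(), InMemoryStub())
print(limiter.consume("user_123"))  # True: served by the in-memory fallback
```

Note that the in-memory fallback is per process, so limits are enforced less strictly across multiple instances until Redis recovers.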
Example Redis URL formats:

```python
# Local Redis
redis_url = "redis://localhost:6379/0"

# Redis with authentication
redis_url = "redis://:password@localhost:6379/0"

# Redis over SSL/TLS
redis_url = "rediss://host:port/0"

# Upstash Redis
redis_url = "rediss://user:pass@endpoint:port"
```
## Error Handling

All components include comprehensive error handling:

- Redis connection failures → automatic fallback to in-memory
- Invalid requests → safe defaults
- Rate limit exceeded → clear error messages with retry timing

```python
try:
    middleware.enforce(request)
except gr.Error as e:
    # Gradio will display this error to the user
    print(f"Rate limited: {e}")
```
## Testing

Run the test suite:

```shell
python test_rate_limiting_simple.py
```
## Best Practices

- Use appropriate tier limits: Set limits based on your application's needs
- Use Redis for production: Enables distributed rate limiting across instances
- Monitor logs: Watch for rate limit violations
- Customise error messages: Provide clear feedback to users
- Test the fallback: Ensure the in-memory fallback works when Redis is down
## Example: Full Gradio Integration

```python
import gradio as gr

from backend.config import settings
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier,
)

# Initialise the rate limiter
limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (
            settings.rate_limit_anonymous_capacity,
            settings.rate_limit_anonymous_refill_rate,
        ),
    },
    redis_url=settings.redis_url,
)

middleware = GradioRateLimitMiddleware(limiter)

# Gradio handler; Gradio injects the gr.Request automatically
# based on the type-annotated parameter
async def analyse_portfolio(
    portfolio_text: str,
    request: gr.Request = None,
):
    # Enforce the rate limit
    middleware.enforce(request)
    # Your analysis code here
    return "Analysis complete"

# Gradio interface
with gr.Blocks() as demo:
    text_input = gr.Textbox()
    submit_btn = gr.Button("Analyse")
    output = gr.Textbox()

    submit_btn.click(
        analyse_portfolio,
        inputs=[text_input],  # gr.Request is injected automatically, not passed as an input
        outputs=output,
    )

demo.launch()
```
## Logging

The module logs important events:

```python
import logging

logger = logging.getLogger('backend.rate_limiting')
logger.setLevel(logging.INFO)
```

Log messages:

- INFO: Rate limiter initialisation
- WARNING: Rate limit exceeded
- ERROR: Redis errors, fallback activation
## Performance

- In-memory: <1 ms overhead per request
- Redis: ~2-5 ms overhead per request
- Fallback: Automatic, no service interruption
## Dependencies

- `redis>=5.0.0` (optional, for distributed rate limiting)
- `upstash-redis>=0.15.0` (optional, for serverless Redis)
- `gradio` (for middleware integration)
## License

Part of the Portfolio Intelligence Platform.