# Rate Limiting Module

Production-ready rate limiting for Gradio applications with Redis support and graceful fallback.

## Features

- **Token Bucket Algorithm**: Configurable capacity and refill rate
- **Thread-Safe**: Works with concurrent requests
- **Async Support**: Compatible with async/await handlers
- **Redis Integration**: Distributed rate limiting with Lua scripts
- **Graceful Fallback**: Automatic in-memory fallback when Redis is unavailable
- **Multi-Tier**: Support for anonymous, authenticated, and premium users
- **Gradio Integration**: Built-in middleware for Gradio applications
- **Production-Ready**: Comprehensive error handling and logging
## Quick Start

```python
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier
)
import gradio as gr

# Create rate limiter
limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (10, 0.1),  # burst of 10 requests, 0.1 tokens/sec refill
    },
    redis_url=None  # Optional Redis URL
)

# Create middleware
middleware = GradioRateLimitMiddleware(limiter)

# Use in a Gradio handler
def my_handler(text: str, request: gr.Request = None):
    middleware.enforce(request)  # Raises gr.Error if limit exceeded
    # ... your handler code
```
## Classes

### ThreadSafeTokenBucket

In-memory token bucket with thread safety.

```python
from backend.rate_limiting import ThreadSafeTokenBucket

bucket = ThreadSafeTokenBucket(capacity=10, refill_rate=1.0)

result = bucket.consume()
if result.allowed:
    print(f"Request allowed, {result.remaining} remaining")
else:
    print(f"Rate limited, retry after {result.retry_after}s")
```
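Because `consume()` is safe to call from multiple threads, a minimal sketch like the following (assuming the `ThreadSafeTokenBucket` API shown above) should admit roughly `capacity` of the concurrent requests:

```python
import threading

from backend.rate_limiting import ThreadSafeTokenBucket

bucket = ThreadSafeTokenBucket(capacity=5, refill_rate=1.0)
allowed_count = 0
count_lock = threading.Lock()

def worker():
    """Consume one token and track how many requests got through."""
    global allowed_count
    if bucket.consume().allowed:
        with count_lock:
            allowed_count += 1

threads = [threading.Thread(target=worker) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With capacity=5 and a short run, roughly 5 of the 20 requests are allowed.
print(f"{allowed_count} of 20 requests allowed")
```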
### AsyncTokenBucket

Async-compatible token bucket.

```python
from backend.rate_limiting import AsyncTokenBucket

bucket = AsyncTokenBucket(capacity=10, refill_rate=1.0)

result = await bucket.consume()
```
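For concurrent coroutines the usage is the same, just awaited; a small sketch (assuming the `consume()` coroutine and result fields shown above):

```python
import asyncio

from backend.rate_limiting import AsyncTokenBucket

bucket = AsyncTokenBucket(capacity=10, refill_rate=1.0)

async def handle(i: int) -> str:
    """One simulated request against the shared bucket."""
    result = await bucket.consume()
    if result.allowed:
        return f"request {i}: allowed ({result.remaining} remaining)"
    return f"request {i}: rate limited, retry after {result.retry_after}s"

async def main():
    # 15 concurrent requests against a burst capacity of 10:
    # roughly 10 succeed, the rest report a retry delay.
    for line in await asyncio.gather(*(handle(i) for i in range(15))):
        print(line)

asyncio.run(main())
```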
### HybridRateLimiter

Redis primary with in-memory fallback.

```python
from backend.rate_limiting import HybridRateLimiter

limiter = HybridRateLimiter(
    capacity=10,
    refill_rate=1.0,
    redis_url="redis://localhost:6379/0",  # Optional
    key_prefix="myapp"
)

result = limiter.consume(identifier="user_123")
```
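Each identifier gets its own bucket, so one client exhausting its quota does not affect another. A small sketch (identifiers are illustrative; `redis_url=None` keeps everything in memory):

```python
from backend.rate_limiting import HybridRateLimiter

limiter = HybridRateLimiter(
    capacity=3,
    refill_rate=1.0,
    redis_url=None,
    key_prefix="myapp"
)

# Burn through one user's quota; another user is unaffected.
for _ in range(5):
    limiter.consume(identifier="user_123")

print(limiter.consume(identifier="user_123").allowed)  # False: bucket drained
print(limiter.consume(identifier="user_456").allowed)  # True: independent bucket
```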
### TieredRateLimiter

Multi-tier rate limiting.

```python
from backend.rate_limiting import TieredRateLimiter, UserTier

limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (10, 0.1),
        UserTier.AUTHENTICATED: (50, 0.5),
        UserTier.PREMIUM: (200, 2.0),
    },
    redis_url="redis://localhost:6379/0"
)

result = limiter.consume("user_123", UserTier.AUTHENTICATED)
```
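The tier usually comes from your auth layer; a minimal sketch with a hypothetical `resolve_tier()` helper (not part of the module):

```python
from backend.rate_limiting import TieredRateLimiter, UserTier

limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (10, 0.1),
        UserTier.AUTHENTICATED: (50, 0.5),
        UserTier.PREMIUM: (200, 2.0),
    },
    redis_url=None,
)

PREMIUM_USERS = {"alice", "bob"}  # stand-in for a real subscription lookup

def resolve_tier(username):
    """Map a (possibly missing) username to a rate-limit tier."""
    if username is None:
        return UserTier.ANONYMOUS
    return UserTier.PREMIUM if username in PREMIUM_USERS else UserTier.AUTHENTICATED

result = limiter.consume("alice", resolve_tier("alice"))    # premium limits
result = limiter.consume("anon-ip-1", resolve_tier(None))   # anonymous limits
```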
### GradioRateLimitMiddleware

Gradio integration middleware.

```python
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier
)

limiter = TieredRateLimiter(...)
middleware = GradioRateLimitMiddleware(limiter)

# Check rate limit
info = middleware.check_rate_limit(request)

# Enforce rate limit (raises gr.Error if exceeded)
middleware.enforce(
    request,
    tokens=1,
    error_message="Custom error message"
)
```
## Configuration

Rate limits are configured via capacity and refill rate:

- **Capacity**: Maximum number of tokens (burst requests)
- **Refill Rate**: Tokens added per second (sustained rate)

Example configurations:

```python
# 10-request burst, 1 request per 10 seconds sustained
(10, 0.1)    # capacity=10, refill_rate=0.1

# 50-request burst, 1 request per 2 seconds sustained
(50, 0.5)    # capacity=50, refill_rate=0.5

# 100-request burst, 10 requests per second sustained
(100, 10.0)  # capacity=100, refill_rate=10.0
```
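The two numbers translate directly into burst and sustained timings; for example, with `(10, 0.1)` a client that has just emptied its bucket waits 10 seconds for the next request and about 100 seconds before a full burst is available again:

```python
capacity, refill_rate = 10, 0.1

seconds_between_sustained_requests = 1 / refill_rate    # 10.0 s per token
seconds_to_rebuild_full_burst = capacity / refill_rate  # 100.0 s for all 10 tokens

print(seconds_between_sustained_requests, seconds_to_rebuild_full_burst)
```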
## Redis Integration

The HybridRateLimiter uses Redis for distributed rate limiting:

1. Lua scripts for atomic operations (see the sketch after this list)
2. Automatic script caching
3. Key expiration via TTL
4. Connection pooling
5. Graceful fallback to in-memory
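The module ships its own scripts; purely as an illustration of the approach, an atomic token-bucket consume expressed as a Lua script and registered through `redis-py` might look like this (the script, key layout, and field names are assumptions, not the module's actual implementation):

```python
import time

import redis

# Token-bucket consume as a single atomic server-side operation.
TOKEN_BUCKET_LUA = """
local capacity = tonumber(ARGV[1])
local rate     = tonumber(ARGV[2])
local now      = tonumber(ARGV[3])

local tokens  = tonumber(redis.call('HGET', KEYS[1], 'tokens') or capacity)
local updated = tonumber(redis.call('HGET', KEYS[1], 'ts') or now)

tokens = math.min(capacity, tokens + (now - updated) * rate)
local allowed = 0
if tokens >= 1 then
    tokens = tokens - 1
    allowed = 1
end

redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
redis.call('EXPIRE', KEYS[1], math.ceil(capacity / rate) * 2)  -- idle buckets expire
return allowed
"""

client = redis.Redis.from_url("redis://localhost:6379/0")
consume = client.register_script(TOKEN_BUCKET_LUA)  # EVALSHA with automatic caching

# One round trip per check; the timestamp comes from the caller,
# so keep instance clocks reasonably in sync.
allowed = consume(keys=["myapp:user_123"], args=[10, 0.1, time.time()])
print("allowed" if allowed == 1 else "rate limited")
```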
Example Redis URL formats:

```python
# Local Redis
redis_url="redis://localhost:6379/0"

# Redis with authentication
redis_url="redis://:password@localhost:6379/0"

# Redis SSL
redis_url="rediss://host:port/0"

# Upstash Redis
redis_url="rediss://user:pass@endpoint:port"
```
## Error Handling

All components include comprehensive error handling:

- Redis connection failures → automatic fallback to in-memory
- Invalid requests → safe defaults
- Rate limit exceeded → clear error messages with retry timing

```python
try:
    middleware.enforce(request)
except gr.Error as e:
    # Gradio will display this error to the user
    print(f"Rate limited: {e}")
```
## Testing

Run the test suite:

```bash
python test_rate_limiting_simple.py
```
## Best Practices

1. **Use appropriate tier limits**: Set limits based on your application's needs
2. **Use Redis for production**: Enables distributed rate limiting across instances
3. **Monitor logs**: Watch for rate limit violations
4. **Customise error messages**: Provide clear feedback to users
5. **Test fallback**: Ensure the in-memory fallback works when Redis is down (see the sketch after this list)
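One quick way to exercise the last point is to start the limiter against a Redis that is deliberately unreachable and confirm requests are still served from memory (the URL and port below are placeholders):

```python
from backend.rate_limiting import HybridRateLimiter

# Deliberately unreachable Redis: the limiter should fall back to its
# in-memory bucket instead of raising.
limiter = HybridRateLimiter(
    capacity=5,
    refill_rate=1.0,
    redis_url="redis://localhost:6390/0",  # nothing listening on this port
    key_prefix="fallback-test",
)

result = limiter.consume(identifier="smoke-test")
assert result.allowed, "in-memory fallback should allow the first request"
print("fallback OK, remaining:", result.remaining)
```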
## Example: Full Gradio Integration

```python
import gradio as gr
from backend.config import settings
from backend.rate_limiting import (
    TieredRateLimiter,
    GradioRateLimitMiddleware,
    UserTier
)

# Initialize rate limiter
limiter = TieredRateLimiter(
    tier_limits={
        UserTier.ANONYMOUS: (
            settings.rate_limit_anonymous_capacity,
            settings.rate_limit_anonymous_refill_rate
        ),
    },
    redis_url=settings.redis_url
)

middleware = GradioRateLimitMiddleware(limiter)

# Gradio handler
async def analyse_portfolio(
    portfolio_text: str,
    request: gr.Request = None
):
    # Enforce rate limit
    middleware.enforce(request)
    # Your analysis code here
    return "Analysis complete"

# Gradio interface
with gr.Blocks() as demo:
    text_input = gr.Textbox()
    submit_btn = gr.Button("Analyse")
    output = gr.Textbox()

    submit_btn.click(
        analyse_portfolio,
        inputs=[text_input],  # gr.Request is injected automatically via the handler signature
        outputs=output
    )

demo.launch()
```
## Logging

The module logs important events:

```python
import logging

logger = logging.getLogger('backend.rate_limiting')
logger.setLevel(logging.INFO)
```

Log messages:

- `INFO`: Rate limiter initialisation
- `WARNING`: Rate limit exceeded
- `ERROR`: Redis errors, fallback activation
## Performance

- **In-memory**: <1ms overhead per request
- **Redis**: ~2-5ms overhead per request
- **Fallback**: Automatic, no service interruption
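These figures depend on hardware and Redis latency; a rough way to check the in-memory number yourself (using the `ThreadSafeTokenBucket` API from above, with a large capacity so every call succeeds):

```python
import time

from backend.rate_limiting import ThreadSafeTokenBucket

bucket = ThreadSafeTokenBucket(capacity=1_000_000, refill_rate=1_000_000.0)

n = 100_000
start = time.perf_counter()
for _ in range(n):
    bucket.consume()
elapsed = time.perf_counter() - start

print(f"{elapsed / n * 1_000_000:.1f} µs per consume() call")
```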
## Dependencies

- `redis>=5.0.0` (optional, for distributed rate limiting)
- `upstash-redis>=0.15.0` (optional, for serverless Redis)
- `gradio` (for middleware integration)

## License

Part of Portfolio Intelligence Platform.