Welcome to Choked
Choked is a simple and powerful Python rate-limiting library that uses the token bucket algorithm to control the rate of function calls, with support for both request-based and token-based limiting.
Features
- Easy to use: Simple class-based API with decorator pattern (see the sketch after this list)
- Dual limiting: Support both request limits and token limits (for AI/ML APIs)
- Flexible backends: Supports both Redis and managed proxy service backends
- Async/Sync support: Works with both synchronous and asynchronous functions
- Smart token estimation: Built-in estimators for OpenAI, VoyageAI, and general text
- Exponential backoff: Smart retry logic with jitter to prevent thundering herd
- Distributed: Share rate limits across multiple processes or servers
- Multi-worker scaling: Perfect for managing multiple workers using the same API key
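As a quick illustration of these features, here is a hedged sketch of what decorating a function might look like. The `Choked` class name, its `requests`, `tokens`, and `redis_url` parameters, and the import path are assumptions made for this example rather than the library's confirmed API; check the reference documentation for the exact signatures.

```python
# Hedged sketch only: the Choked class, its parameter names, and the import
# path below are assumptions for illustration, not the confirmed API.
import asyncio

from choked import Choked  # assumed import path

# Request-only limit: at most 100 calls per second, shared across workers
# through Redis so every process draws from the same bucket.
api_limit = Choked(requests="100/s", redis_url="redis://localhost:6379/0")

@api_limit
def fetch_user(user_id: int) -> dict:
    ...  # call a request-limited HTTP API here
    return {"id": user_id}

# Dual limit for an AI/ML API: 500 requests per minute AND 1M tokens per minute.
embed_limit = Choked(requests="500/m", tokens="1000000/m")

@embed_limit
async def embed(texts: list[str]) -> list[list[float]]:
    ...  # call a token-limited embeddings API here
    return [[0.0, 0.0, 0.0] for _ in texts]

fetch_user(42)
asyncio.run(embed(["hello", "world"]))
```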
Quick Start
Install choked:
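Assuming the package is published on PyPI under the name `choked`:

```bash
pip install choked
```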
How it Works
Choked uses a token bucket algorithm with dual limiting support:
- Request limiting: Each function call consumes 1 request token
- Token limiting: Each function call consumes estimated tokens based on input text
- Buckets refill at steady rates (e.g., “100/s” = 100 tokens per second)
- When limits are reached, functions wait with exponential backoff
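To make this concrete, here is a minimal, self-contained sketch of a token bucket paired with jittered exponential backoff. It illustrates the behavior described above; the class and function names are invented for the example and are not choked's internals.

```python
import random
import time


class TokenBucket:
    """Minimal token bucket: illustrative only, not choked's implementation."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # refill rate, e.g. 100.0 tokens per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.updated = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_consume(self, amount: float = 1.0) -> bool:
        self._refill()
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


def wait_for_tokens(bucket: TokenBucket, amount: float = 1.0,
                    base: float = 0.1, cap: float = 5.0) -> None:
    """Block until the bucket grants `amount`, backing off exponentially with jitter."""
    attempt = 0
    while not bucket.try_consume(amount):
        # Full jitter: sleep a random time up to an exponentially growing cap,
        # so many blocked workers do not retry in lockstep (thundering herd).
        delay = random.uniform(0, min(cap, base * 2 ** attempt))
        time.sleep(delay)
        attempt += 1


# "100/s" request limit: 1 request token consumed per call.
requests_bucket = TokenBucket(rate=100.0, capacity=100.0)
wait_for_tokens(requests_bucket)            # consume 1 request token

# Token limit: consume the estimated token count of the input text instead.
tokens_bucket = TokenBucket(rate=10_000.0, capacity=10_000.0)
wait_for_tokens(tokens_bucket, amount=350)  # e.g. a prompt estimated at 350 tokens
```

With dual limiting, one bucket tracks requests and a second tracks tokens, and a call proceeds only once both buckets grant what it needs.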
Perfect for AI/ML APIs
Choked excels with token-based APIs like OpenAI, VoyageAI, and others:
- Dual limiting: Respect both request and token limits simultaneously
- Smart estimation: Automatic token counting for popular AI services
- Auto-scaling: Add/remove workers without changing rate limits
- No overages: Never exceed your API provider’s limits
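As a concrete example of token-aware limiting, the sketch below estimates the token cost of a batch of inputs with tiktoken (OpenAI's tokenizer, assumed to be installed) before drawing that amount from a token bucket. It illustrates the idea of smart estimation; it is not necessarily the exact estimator choked ships with.

```python
import tiktoken

# tiktoken is OpenAI's tokenizer; "cl100k_base" is the encoding used by
# recent OpenAI embedding and chat models.
encoding = tiktoken.get_encoding("cl100k_base")


def estimate_tokens(texts: list[str]) -> int:
    """Rough token estimate for a batch of inputs (illustrative only)."""
    return sum(len(encoding.encode(text)) for text in texts)


texts = ["Rate limiting keeps you under your provider's quota.",
         "Token buckets refill at a steady rate."]
needed = estimate_tokens(texts)

# With dual limiting, a call like this would consume 1 request token from the
# request bucket and `needed` tokens from the token bucket before the API call
# is made, so both the requests/min and tokens/min limits are respected.
print(f"This batch would consume 1 request and about {needed} tokens.")
```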