Configuration
Choked offers flexible configuration options to suit different deployment scenarios and performance requirements.

Backend Configuration
Choked supports two backend types for storing token bucket state:

Redis Backend (Recommended for Production)
Use Redis for distributed rate limiting across multiple processes or servers:
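A minimal sketch, assuming the package exposes the Choked class at the top level and that calling the instance with decorator parameters (described below) produces the decorator; the decorated function is illustrative:

```python
# Assumes `from choked import Choked` is the public import path.
from choked import Choked

# Redis backend: every process pointing at the same Redis instance
# shares the same token buckets.
choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="example_api", request_limit="10/s")
def call_example_api(payload):
    # Hypothetical API call; replace with your own client code.
    ...
```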
Managed Proxy Service

Use the managed proxy service for zero-infrastructure rate limiting:
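A sketch under the same assumptions, swapping the Redis URL for an API token (the token value is a placeholder):

```python
from choked import Choked

# Managed proxy backend: rate limit state is coordinated by the
# hosted service instead of your own Redis instance.
choke = Choked(api_token="YOUR_API_TOKEN")

@choke(key="example_api", request_limit="100/m")
def call_example_api(payload):
    ...
```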
Choked Class Parameters

The Choked class accepts these initialization parameters:
Required Parameters (One Of)
- redis_url (str): Redis connection URL for distributed rate limiting
  - Example: "redis://localhost:6379/0"
  - Example: "redis://user:pass@host:6379/0"
- api_token (str): API token for the managed rate limiting service
  - Contact us for access to the managed service
Validation
- Exactly one of redis_url or api_token must be provided
- Specifying both parameters is an error
- Specifying neither parameter is an error
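For instance (assuming invalid combinations are rejected at construction time; the exact exception type is not covered here):

```python
from choked import Choked

Choked(redis_url="redis://localhost:6379/0")  # valid: exactly one backend
Choked(api_token="YOUR_API_TOKEN")            # valid: exactly one backend

# Invalid: both parameters, or neither, fail validation.
# Choked(redis_url="redis://localhost:6379/0", api_token="YOUR_API_TOKEN")
# Choked()
```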
Decorator Parameters
The Choked instance, when used as a decorator, accepts these parameters:

Required Parameters
- key (str): Unique identifier for the rate limit bucket
Rate Limit Parameters (At Least One Required)
- request_limit (str): Request rate limit in the format "number/period"
  - Examples: "10/s" (10 per second), "100/m" (100 per minute)
  - Optional if token_limit is provided
- token_limit (str): Token rate limit in the format "number/period"
  - Examples: "1000/s" (1000 tokens per second), "100000/m" (100K tokens per minute)
  - Optional if request_limit is provided
Optional Parameters
- token_estimator (str): Token estimation method
  - "openai": Use OpenAI/tiktoken for text estimation
  - "voyageai": Use the VoyageAI tokenizer for text estimation
  - "default": Use tiktoken with the GPT-4 tokenizer (same as "openai")
  - Required when using token_limit
Rate Limit Format
Rate limits use the format"number/period" where:
- number: Positive integer (tokens or requests allowed)
- period: Either
"s"(seconds) or"m"(minutes)
Valid Examples
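For instance, each of these follows the "number/period" format:
- "1/s" (1 per second)
- "10/s" (10 per second)
- "500/m" (500 per minute)
- "100000/m" (100,000 per minute)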
Invalid Examples
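And these do not:
- "10/h" (only "s" and "m" are supported periods)
- "10 per second" (must use the "number/period" form)
- "1.5/s" (number must be a positive integer)
- "0/m" (number must be positive)
- "10" (missing the period)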
Usage Examples
Request-Only Limiting
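A sketch, assuming the same import path and decorator pattern as in the backend examples above; the function and its limit are illustrative:

```python
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

# Limit this function to 10 calls per second across every process
# that shares the same Redis backend and key.
@choke(key="search_api", request_limit="10/s")
def search(query):
    # Hypothetical API call; replace with your own client code.
    ...
```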
Token-Only Limiting
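Under the same assumptions, limiting by estimated tokens instead of call count; note that token_estimator is required whenever token_limit is used:

```python
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

# Text is extracted from the function arguments to estimate token usage.
@choke(key="embeddings", token_limit="100000/m", token_estimator="openai")
def embed(texts):
    ...
```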
Dual Limiting
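A sketch combining both limits on one function (numbers are illustrative):

```python
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

# Enforce both budgets at once: at most 100 calls per minute and
# at most 100,000 estimated tokens per minute.
@choke(key="chat_api", request_limit="100/m",
       token_limit="100000/m", token_estimator="openai")
def chat(messages):
    ...
```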
Multiple Services
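Different keys create independent buckets, so one Choked instance can cover several services (service names and limits here are illustrative):

```python
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

# Each key gets its own bucket, so the two services are limited
# independently of each other.
@choke(key="openai_api", request_limit="60/m")
def call_openai(messages):
    ...

@choke(key="voyageai_api", request_limit="300/m")
def call_voyageai(texts):
    ...
```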
Advanced Configuration
Per-User Rate Limiting
Create dynamic rate limits based on user context:
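One possible pattern, assuming the key is an ordinary string, so a per-user bucket can be built by applying the decorator with a key derived from the user (whether Choked also supports callable or templated keys is not covered here):

```python
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

def make_user_limiter(user_id, tier):
    # Paid users get a larger budget than free users (illustrative numbers).
    limit = "100/m" if tier == "paid" else "10/m"

    # A per-user key gives each user an independent bucket.
    @choke(key=f"api:{user_id}", request_limit=limit)
    def call_api(payload):
        ...

    return call_api
```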
Shared Rate Limits

Functions with the same key share rate limits:
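For example, two functions decorated with the same key draw from the same bucket:

```python
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

# Both functions count against the same "llm_provider" bucket,
# so together they stay under 60 requests per minute.
@choke(key="llm_provider", request_limit="60/m")
def generate_summary(text):
    ...

@choke(key="llm_provider", request_limit="60/m")
def generate_title(text):
    ...
```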
Multi-Worker Coordination

Perfect for scenarios where multiple workers share API keys:
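A sketch of the idea: each worker process builds its own Choked instance pointing at the same Redis URL, so the shared bucket coordinates all of them (the worker entry point is illustrative):

```python
from choked import Choked

# Every worker constructs its own instance, but because they all share
# the same Redis backend and key, the 60/m budget is enforced globally.
choke = Choked(redis_url="redis://shared-redis:6379/0")

@choke(key="shared_provider_key", request_limit="60/m")
def call_provider(payload):
    ...

def worker_loop(jobs):
    for job in jobs:
        call_provider(job)
```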
Token Estimation Details

OpenAI Estimator ("openai")
- Uses the tiktoken library with the GPT-4 tokenizer
- Extracts text from function arguments automatically
- Handles the OpenAI message format: [{"role": "user", "content": "text"}]
- Falls back to word-based estimation if tiktoken fails
VoyageAI Estimator ("voyageai")
- Uses HuggingFace transformers with the voyageai/voyage-3.5 tokenizer
- Extracts text from function arguments automatically
- Falls back to the OpenAI estimator if the VoyageAI tokenizer fails
Default Estimator ("default")
- Same as OpenAI estimator
- Uses tiktoken with GPT-4 tokenizer
Text Extraction
All estimators automatically extract text from:
- String arguments and keyword arguments
- List arguments containing strings
- Dictionary arguments with a messages key (OpenAI format)
- Nested content in message dictionaries
Fallback Behavior
If token estimation fails:
- VoyageAI → OpenAI estimator
- OpenAI → Word-based estimation (~0.75 tokens per word)
- Word-based → Returns 1 token minimum
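A conceptual sketch of that chain, not the library's actual implementation; the estimator functions here are placeholder stand-ins:

```python
def _voyageai_estimate(text):
    # Stand-in for the HuggingFace-based VoyageAI estimator.
    raise RuntimeError("tokenizer unavailable")

def _openai_estimate(text):
    # Stand-in for the tiktoken-based estimator.
    raise RuntimeError("tokenizer unavailable")

def estimate_tokens(text):
    # Try each estimator in order, falling through on failure.
    for estimator in (_voyageai_estimate, _openai_estimate):
        try:
            return estimator(text)
        except Exception:
            continue
    # Word-based fallback: roughly 0.75 tokens per word, at least 1 token.
    return max(1, int(len(text.split()) * 0.75))
```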
Performance Tuning
Rate Limit Design
Choose rate limits based on your API's characteristics:
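For example, if a provider key allows 500 requests and 200,000 tokens per minute (illustrative numbers), mirror those limits, leaving a small margin for retries and other clients sharing the key:

```python
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

# Stay slightly under the provider's documented budget.
@choke(key="provider_api", request_limit="450/m",
       token_limit="180000/m", token_estimator="openai")
def call_provider(messages):
    ...
```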
Backend Selection

- Redis: Best for high-throughput, self-hosted, multi-process deployments
- Managed Service: Best for simplicity, zero-infrastructure, getting started
Memory and Performance
- Redis backend: Minimal memory usage, Lua scripts for atomic operations
- Managed service: Zero local memory usage, HTTP-based coordination
- Token estimation: Cached tokenizers, minimal computational overhead