Configuration

Choked offers flexible configuration options to suit different deployment scenarios and performance requirements.

Backend Configuration

Choked supports two backend types for storing token bucket state: a self-hosted Redis instance and a managed proxy service.

Redis Backend

Use Redis for distributed rate limiting across multiple processes or servers:
from choked import Choked

# Create instance with Redis backend
choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="api_calls", request_limit="10/m")
def my_function():
    return "Limited by Redis"

Managed Proxy Service

Use the managed proxy service for zero-infrastructure rate limiting:
from choked import Choked

# Create instance with managed service
choke = Choked(api_token="your-api-token")

@choke(key="api_calls", request_limit="10/m")
def my_function():
    return "Limited by managed service"

Choked Class Parameters

The Choked class accepts these initialization parameters:

Required Parameters (One Of)

  • redis_url (str): Redis connection URL for distributed rate limiting
    • Example: "redis://localhost:6379/0"
    • Example: "redis://user:pass@host:6379/0"
  • api_token (str): API token for managed rate limiting service
    • Contact us for access to the managed service

Validation

  • Exactly one of redis_url or api_token must be provided
  • Cannot specify both parameters
  • Cannot specify neither parameter
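For example, only constructions with exactly one backend succeed. The sketch below illustrates the rule; the exact exception type raised for invalid combinations is not documented here, so it catches broadly:

from choked import Choked

# Valid: exactly one backend is configured
choke = Choked(redis_url="redis://localhost:6379/0")

# Invalid: both backends, or neither, are rejected at construction time.
# (The exact exception type is an assumption; we catch broadly here.)
for kwargs in ({"redis_url": "redis://localhost:6379/0", "api_token": "t"}, {}):
    try:
        Choked(**kwargs)
    except Exception as exc:
        print(f"rejected: {exc!r}")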

Decorator Parameters

When used as a decorator, a Choked instance accepts these parameters:

Required Parameters

  • key (str): Unique identifier for the rate limit bucket

Rate Limit Parameters (At Least One Required)

  • request_limit (str): Request rate limit in format "number/period"
    • Examples: "10/s" (10 per second), "100/m" (100 per minute)
    • Optional if token_limit is provided
  • token_limit (str): Token rate limit in format "number/period"
    • Examples: "1000/s" (1000 tokens per second), "100000/m" (100K tokens per minute)
    • Optional if request_limit is provided

Optional Parameters

  • token_estimator (str): Token estimation method; required when token_limit is provided
    • "openai": Use OpenAI/tiktoken for text estimation
    • "voyageai": Use VoyageAI tokenizer for text estimation
    • "default": Use tiktoken with the GPT-4 tokenizer (same as "openai")

Rate Limit Format

Rate limits use the format "number/period" where:
  • number: Positive integer (tokens or requests allowed)
  • period: Either "s" (seconds) or "m" (minutes)

Valid Examples

"10/s"      # 10 per second
"100/m"     # 100 per minute  
"1000/s"    # 1000 per second
"50000/m"   # 50,000 per minute

Invalid Examples

"10/h"      # Hours not supported
"10"        # Missing period
"/s"        # Missing number
"10.5/s"    # Decimals not supported

Usage Examples

Request-Only Limiting

from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="fast_api", request_limit="100/s")
def fast_api_call():
    return "Quick API call"

Token-Only Limiting

from choked import Choked

choke = Choked(api_token="your-token")

@choke(key="openai_embed", token_limit="1000000/m", token_estimator="openai")
def get_embeddings(texts):
    # Automatically estimates tokens from texts
    return create_embeddings(texts)

Dual Limiting

from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="gpt4_chat", request_limit="50/s", token_limit="100000/m", token_estimator="openai")
def chat_completion(messages):
    # Limited by both requests AND tokens
    return openai.chat.completions.create(
        model="gpt-4",
        messages=messages
    )

Multiple Services

from choked import Choked

# Different backends for different services
openai_choke = Choked(api_token="openai-service-token")
redis_choke = Choked(redis_url="redis://localhost:6379/0")

@openai_choke(key="gpt4", request_limit="50/s", token_limit="100000/m", token_estimator="openai")
def openai_call(messages):
    return "OpenAI response"

@redis_choke(key="internal_api", request_limit="1000/s")
def internal_call():
    return "Internal API response"

Advanced Configuration

Per-User Rate Limiting

Create dynamic rate limits based on user context:
from choked import Choked

def create_user_limiter(user_id, redis_url):
    choke = Choked(redis_url=redis_url)
    
    @choke(key=f"user_{user_id}", request_limit="10/m")
    def user_api_call():
        return f"API call for user {user_id}"
    
    return user_api_call

# Each user gets their own rate limit
user_123_call = create_user_limiter("123", "redis://localhost:6379/0")
user_456_call = create_user_limiter("456", "redis://localhost:6379/0")

Shared Rate Limits

Functions with the same key share rate limits:
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="shared_resource", request_limit="10/m")
def function_a():
    return "A"

@choke(key="shared_resource", request_limit="10/m")
def function_b():
    return "B"

# Both functions compete for the same 10 requests/minute

Multi-Worker Coordination

Perfect for scenarios where multiple workers share API keys:
from choked import Choked

# All workers use the same configuration
choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="shared_api_key", request_limit="1000/h")
def worker_api_call():
    # All workers automatically coordinate through Redis
    return make_external_api_call()

# Scale workers up/down without changing rate limit configuration
# The token bucket automatically handles fair distribution

Token Estimation Details

OpenAI Estimator ("openai")

  • Uses tiktoken library with GPT-4 tokenizer
  • Extracts text from function arguments automatically
  • Handles OpenAI message format: [{"role": "user", "content": "text"}]
  • Falls back to word-based estimation if tiktoken fails
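A minimal sketch of what tiktoken-based counting looks like (Choked's actual implementation may differ):

import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")

def count_tokens(text: str) -> int:
    # Encode the text and count the resulting tokens
    return len(encoding.encode(text))

print(count_tokens("Hello, world!"))  # 4 tokens with the GPT-4 encoding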

VoyageAI Estimator ("voyageai")

  • Uses HuggingFace transformers with voyageai/voyage-3.5 tokenizer
  • Extracts text from function arguments automatically
  • Falls back to OpenAI estimator if VoyageAI tokenizer fails

Default Estimator ("default")

  • Same as OpenAI estimator
  • Uses tiktoken with GPT-4 tokenizer

Text Extraction

All estimators automatically extract text from:
  • String arguments and keyword arguments
  • List arguments containing strings
  • Dictionary arguments with messages key (OpenAI format)
  • Nested content in message dictionaries
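These rules can be pictured as a recursive walk over the arguments. The helper below is a hypothetical illustration of the list above, not Choked's implementation:

def extract_text(value):
    # Hypothetical sketch of the extraction rules listed above
    if isinstance(value, str):
        return [value]
    if isinstance(value, list):
        return [t for item in value for t in extract_text(item)]
    if isinstance(value, dict):
        texts = []
        if "messages" in value:           # OpenAI-style request dict
            texts.extend(extract_text(value["messages"]))
        if "content" in value:            # individual message dict
            texts.extend(extract_text(value["content"]))
        return texts
    return []

print(extract_text([{"role": "user", "content": "hello"}]))  # ['hello']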

Fallback Behavior

If token estimation fails:
  1. VoyageAI → OpenAI estimator
  2. OpenAI → Word-based estimation (~0.75 tokens per word)
  3. Word-based → Returns 1 token minimum
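Put together, the chain behaves roughly like the sketch below (names and structure are illustrative; Choked's internals may differ):

def estimate_tokens(text: str) -> int:
    # Hypothetical sketch of the fallback chain described above
    try:
        from transformers import AutoTokenizer           # 1. VoyageAI tokenizer
        tok = AutoTokenizer.from_pretrained("voyageai/voyage-3.5")
        return len(tok.encode(text))
    except Exception:
        pass
    try:
        import tiktoken                                  # 2. OpenAI / tiktoken
        enc = tiktoken.encoding_for_model("gpt-4")
        return len(enc.encode(text))
    except Exception:
        # 3. word-based heuristic (~0.75 tokens per word), never below 1
        return max(1, int(len(text.split()) * 0.75))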

Performance Tuning

Rate Limit Design

Choose rate limits based on your API’s characteristics:
# High burst, steady average
@choke(key="bursty", request_limit="100/s")  # Allow bursts

# Strict rate limiting  
@choke(key="strict", request_limit="1/s")    # Exactly 1 per second

# Token-focused for AI APIs
@choke(key="ai_api", token_limit="100000/m", token_estimator="openai")

Backend Selection

  • Redis: Best for high-throughput, self-hosted, multiple processes
  • Managed Service: Best for simplicity, zero-infrastructure, getting started
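A common pattern is to choose the backend at deploy time. The sketch below is one way to do that; the environment variable names are assumptions for illustration, not a Choked feature:

import os
from choked import Choked

# Prefer Redis when REDIS_URL is set, else fall back to the managed service.
# (REDIS_URL and CHOKED_API_TOKEN are hypothetical variable names.)
if os.environ.get("REDIS_URL"):
    choke = Choked(redis_url=os.environ["REDIS_URL"])
else:
    choke = Choked(api_token=os.environ["CHOKED_API_TOKEN"])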

Memory and Performance

  • Redis backend: Minimal memory usage, Lua scripts for atomic operations
  • Managed service: Zero local memory usage, HTTP-based coordination
  • Token estimation: Cached tokenizers, minimal computational overhead
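To make the Redis point concrete: an atomic token bucket is typically implemented as a short server-side Lua script, so the check-and-decrement happens in a single round trip. The script below is an illustrative stand-in for that technique, not Choked's actual script:

import time
import redis

# Illustrative only: a minimal atomic token bucket in Lua via redis-py.
TOKEN_BUCKET_LUA = """
local key      = KEYS[1]
local capacity = tonumber(ARGV[1])
local rate     = tonumber(ARGV[2])   -- tokens refilled per second
local now      = tonumber(ARGV[3])

local state  = redis.call('HMGET', key, 'tokens', 'ts')
local tokens = tonumber(state[1]) or capacity
local ts     = tonumber(state[2]) or now

-- refill based on elapsed time, capped at capacity
tokens = math.min(capacity, tokens + (now - ts) * rate)

local allowed = 0
if tokens >= 1 then
    tokens = tokens - 1
    allowed = 1
end

redis.call('HSET', key, 'tokens', tokens, 'ts', now)
redis.call('EXPIRE', key, 60)
return allowed
"""

r = redis.Redis.from_url("redis://localhost:6379/0")
take_token = r.register_script(TOKEN_BUCKET_LUA)

# Capacity 10, refilled at 10 tokens per minute, i.e. roughly "10/m"
if take_token(keys=["demo_bucket"], args=[10, 10 / 60, time.time()]):
    print("request allowed")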