Get up and running with Choked in minutes.

Installation

pip install choked

Basic Usage

Create a Choked instance and use it as a decorator:
from choked import Choked

# Create instance with Redis backend
choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="my_function", request_limit="5/10s")
def my_rate_limited_function():
    print("This function can be called 5 times every 10 seconds")
    return "success"

# Call the function
result = my_rate_limited_function()
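
Calls beyond the limit are throttled rather than failing outright. A quick way to see this, assuming the decorated call blocks until the bucket has capacity again (exact waiting behavior may vary):
import time

start = time.time()
for _ in range(6):  # one call more than the 5-per-10s budget
    my_rate_limited_function()
print(f"elapsed: {time.time() - start:.1f}s")  # noticeably longer if the sixth call had to wait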

Backend Options

Choose between a self-hosted Redis backend and the managed proxy service:

Redis Backend

from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

Managed Proxy Service

from choked import Choked

choke = Choked(api_token="your-api-token")

Parameters Explained

  • key: Unique identifier for this rate limit. Functions decorated with the same key share the same bucket (see the sketch below).
  • request_limit: Request rate limit (e.g., "10/s", "100/m"). Optional if token_limit is provided.
  • token_limit: Token rate limit (e.g., "1000/s", "100000/m"). Optional if request_limit is provided.
  • token_estimator: Token estimation method ("openai", "voyageai", "default"). Required when token_limit is set.
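
Because buckets are identified by key, separate functions can share one budget. A minimal sketch (the function names are illustrative):
@choke(key="shared_api", request_limit="10/s")
def fetch_users():
    return "users"

@choke(key="shared_api", request_limit="10/s")
def fetch_orders():
    return "orders"

# Together, fetch_users and fetch_orders get at most 10 calls per second.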

Async Functions

Choked automatically detects and handles async functions:
import asyncio
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="async_api", request_limit="3/5s")
async def async_api_call():
    print("Async function with rate limiting")
    return "async result"

# Use with asyncio
async def main():
    result = await async_api_call()
    print(result)

asyncio.run(main())
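
Because the bucket lives in the backend rather than in process memory, concurrent tasks draw from the same budget. A short sketch that fires several calls at once (assuming excess calls simply wait their turn):
async def burst():
    results = await asyncio.gather(*(async_api_call() for _ in range(5)))
    print(results)

asyncio.run(burst())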

Token-Based Limiting

Perfect for AI/ML APIs that have token limits:
import openai

from choked import Choked

choke = Choked(api_token="your-token")

@choke(key="openai_chat", token_limit="100000/m", token_estimator="openai")
def chat_completion(messages):
    # Automatically estimates tokens from messages
    return openai.chat.completions.create(
        model="gpt-4",
        messages=messages
    )
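
For example, a call might look like this (the message content is illustrative):
messages = [{"role": "user", "content": "Explain token buckets in one sentence."}]
response = chat_completion(messages)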

Dual Limiting

Apply both request and token limits simultaneously:
@choke(key="api_dual", request_limit="50/s", token_limit="100000/m", token_estimator="openai")
def dual_limited_function(text):
    # Limited by both requests (50/s) AND tokens (100K/m)
    return process_text(text)
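
Whichever bucket empties first governs the effective rate: a burst of long inputs hits the token ceiling before the request ceiling, while many short calls hit the request ceiling first.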

Multiple Rate Limits

You can create multiple Choked instances or use different keys:
from choked import Choked

# Different instances for different services
openai_choke = Choked(api_token="openai-token")
voyage_choke = Choked(api_token="voyage-token")

@openai_choke(key="gpt4", request_limit="50/s", token_limit="100000/m", token_estimator="openai")
def gpt4_call(messages):
    return "GPT-4 response"

@voyage_choke(key="embed", token_limit="1000000/m", token_estimator="voyageai")
def voyage_embed(texts):
    return "Embeddings"

Next Steps