Get up and running with Choked in minutes.

Installation

pip install choked

Basic Usage

Create a Choked instance and use it as a decorator:
from choked import Choked

# Create instance with Redis backend
choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="my_function", request_limit="5/10s")
def my_rate_limited_function():
    print("This function can be called 5 times every 10 seconds")
    return "success"

# Call the function
result = my_rate_limited_function()
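
Calls beyond the limit are throttled rather than failing outright. A quick way to see this, assuming the decorated call blocks until the bucket has capacity again (exact waiting behavior may vary):
import time

start = time.time()
for _ in range(6):  # one call more than the 5-per-10s budget
    my_rate_limited_function()
print(f"elapsed: {time.time() - start:.1f}s")  # noticeably longer if the sixth call had to wait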

Backend Options

Choose between a self-hosted Redis backend and the managed proxy service:

Redis Backend

from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

Managed Proxy Service

from choked import Choked

choke = Choked(api_token="your-api-token")

Parameters Explained

  • key: Unique identifier for this rate limit. Functions decorated with the same key share the same bucket (see the sketch below).
  • request_limit: Request rate limit (e.g., "10/s", "100/m"). Optional if token_limit is provided.
  • token_limit: Token rate limit (e.g., "1000/s", "100000/m"). Optional if request_limit is provided.
  • token_estimator: Token estimation method ("openai", "voyageai", "default"). Required when token_limit is set.
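
Because buckets are identified by key, separate functions can share one budget. A minimal sketch (the function names are illustrative):
@choke(key="shared_api", request_limit="10/s")
def fetch_users():
    return "users"

@choke(key="shared_api", request_limit="10/s")
def fetch_orders():
    return "orders"

# Together, fetch_users and fetch_orders get at most 10 calls per second.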

Async Functions

Choked automatically detects and handles async functions:
import asyncio
from choked import Choked

choke = Choked(redis_url="redis://localhost:6379/0")

@choke(key="async_api", request_limit="3/5s")
async def async_api_call():
    print("Async function with rate limiting")
    return "async result"

# Use with asyncio
async def main():
    result = await async_api_call()
    print(result)

asyncio.run(main())
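
Because the bucket lives in the backend rather than in process memory, concurrent tasks draw from the same budget. A short sketch that fires several calls at once (assuming excess calls simply wait their turn):
async def burst():
    results = await asyncio.gather(*(async_api_call() for _ in range(5)))
    print(results)

asyncio.run(burst())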

Token-Based Limiting

Perfect for AI/ML APIs that have token limits:
import openai

from choked import Choked

choke = Choked(api_token="your-token")

@choke(key="openai_chat", token_limit="100000/m", token_estimator="openai")
def chat_completion(messages):
    # Automatically estimates tokens from messages
    return openai.chat.completions.create(
        model="gpt-4",
        messages=messages
    )
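
For example, a call might look like this (the message content is illustrative):
messages = [{"role": "user", "content": "Explain token buckets in one sentence."}]
response = chat_completion(messages)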

Dual Limiting

Apply both request and token limits simultaneously:
@choke(key="api_dual", request_limit="50/s", token_limit="100000/m", token_estimator="openai")
def dual_limited_function(text):
    # Limited by both requests (50/s) AND tokens (100K/m)
    return process_text(text)
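
Whichever bucket empties first governs the effective rate: a burst of long inputs hits the token ceiling before the request ceiling, while many short calls hit the request ceiling first.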

Multiple Rate Limits

You can create multiple Choked instances or use different keys:
from choked import Choked

# Different instances for different services
openai_choke = Choked(api_token="openai-token")
voyage_choke = Choked(api_token="voyage-token")

@openai_choke(key="gpt4", request_limit="50/s", token_limit="100000/m", token_estimator="openai")
def gpt4_call(messages):
    return "GPT-4 response"

@voyage_choke(key="embed", token_limit="1000000/m", token_estimator="voyageai")
def voyage_embed(texts):
    return "Embeddings"

Next Steps