
Rate Limits

Rate limits protect the platform from abuse and ensure fair usage for all customers. This page explains how rate limits work and how to handle them.

Rate Limit Tiers

By Plan

Plan         | API Requests/Min | Concurrent Calls | Outbound Calls/Min
-------------|------------------|------------------|-------------------
Starter      | 60               | 5                | 10
Professional | 300              | 25               | 60
Enterprise   | Custom           | Custom           | Custom

By Endpoint

Some endpoints have specific limits:
Endpoint           | Limit      | Window
-------------------|------------|-----------
/v1/calls/outbound | Plan limit | Per minute
/v1/sms            | 100/min    | Per minute
/v1/knowledge-base | 60/min     | Per minute
/v1/exports        | 10/hour    | Per hour
/v1/bulk/*         | 5/min      | Per minute

Rate Limit Headers

Every API response includes rate limit information:
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 298
X-RateLimit-Reset: 1705312860
X-RateLimit-Window: 60
Header                | Description
----------------------|-----------------------------------
X-RateLimit-Limit     | Maximum requests per window
X-RateLimit-Remaining | Requests remaining in window
X-RateLimit-Reset     | Unix timestamp when window resets
X-RateLimit-Window    | Window duration in seconds
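
You can read these headers to throttle proactively and avoid 429s entirely. A minimal sketch using fetch (the pause-until-reset policy is illustrative, not required by the API):
// Pause when the window budget is exhausted, based on the
// documented X-RateLimit-* headers.
async function fetchWithBudget(url, options) {
  const response = await fetch(url, options);

  const remaining = Number(response.headers.get('X-RateLimit-Remaining'));
  const reset = Number(response.headers.get('X-RateLimit-Reset'));

  if (remaining === 0) {
    // X-RateLimit-Reset is a Unix timestamp in seconds; wait until
    // the window resets before sending more requests.
    const waitMs = Math.max(0, reset * 1000 - Date.now());
    console.log(`Rate limit budget exhausted; waiting ${waitMs}ms`);
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }

  return response;
}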

Rate Limit Exceeded

When you exceed a rate limit, the API returns HTTP 429 Too Many Requests:

Response

{
  "success": false,
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Too many requests. Try again in 45 seconds.",
    "retry_after": 45
  }
}

Headers

HTTP/1.1 429 Too Many Requests
Retry-After: 45
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1705312860

Handling Rate Limits

Retry with Backoff

Implement exponential backoff:
async function apiCallWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Honor the server's Retry-After header (in seconds) when present;
      // otherwise fall back to exponential backoff: 1s, 2s, 4s, ...
      const retryAfter = Number(response.headers.get('Retry-After'));
      const delay = retryAfter > 0
        ? retryAfter * 1000
        : Math.pow(2, attempt) * 1000;

      console.log(`Rate limited. Retrying in ${delay}ms`);
      await new Promise((resolve) => setTimeout(resolve, delay));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

Python Example

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry automatically on 429 and transient 5xx responses,
# honoring the server's Retry-After header.
session = requests.Session()
retries = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    respect_retry_after_header=True,
)
session.mount('https://', HTTPAdapter(max_retries=retries))

response = session.get(
    'https://api.usecrew.ai/v1/agents',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)

Optimizing API Usage

Batch Requests

Instead of sending many individual requests, use a single batch request:
// Bad: 100 individual requests
for (const entry of entries) {
  await api.knowledgeBase.create(entry);
}

// Good: Single batch request
await api.knowledgeBase.createBatch(entries);

Caching

Cache responses when appropriate:
const cache = new Map();
const CACHE_TTL = 60000; // 1 minute

async function getAgentsWithCache() {
  const cached = cache.get('agents');
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }
  
  const response = await api.agents.list();
  cache.set('agents', { data: response, timestamp: Date.now() });
  return response;
}

Webhooks Instead of Polling

Use webhooks instead of polling for updates:
// Bad: Polling every 5 seconds
setInterval(async () => {
  const calls = await api.calls.list({ status: 'active' });
  processActiveCalls(calls);
}, 5000);

// Good: Webhook subscription
// Configure webhook for call.* events
// Process events as they arrive
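
The webhook side can be a small HTTP handler. A minimal sketch assuming an Express server and a JSON payload with a type field (both illustrative; match your actual webhook configuration):
// Receive call.* events instead of polling the calls list.
const express = require('express');
const app = express();

app.post('/webhooks/crew', express.json(), (req, res) => {
  const event = req.body; // payload shape assumed for illustration

  if (typeof event.type === 'string' && event.type.startsWith('call.')) {
    processCallEvent(event); // your handler, replacing the polling loop
  }

  // Acknowledge quickly so the platform doesn't retry delivery.
  res.sendStatus(200);
});

app.listen(3000);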

Monitoring Usage

Check Current Usage

curl https://api.usecrew.ai/v1/usage/rate-limits \
  -H "Authorization: Bearer YOUR_API_KEY"

Response:
{
  "current_window": {
    "requests_made": 150,
    "requests_limit": 300,
    "window_reset": "2024-01-15T10:31:00Z"
  },
  "endpoints": {
    "/v1/calls/outbound": {
      "requests_made": 25,
      "requests_limit": 60
    }
  }
}
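
You can automate this check to warn before a window is exhausted. A minimal sketch against the usage endpoint shown above (the 80% threshold is an arbitrary choice):
// Warn when more than 80% of the current window's budget is used.
async function checkRateLimitBudget(apiKey) {
  const res = await fetch('https://api.usecrew.ai/v1/usage/rate-limits', {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  const usage = await res.json();

  const { requests_made, requests_limit, window_reset } = usage.current_window;
  if (requests_made / requests_limit > 0.8) {
    console.warn(
      `Used ${requests_made}/${requests_limit} requests; window resets at ${window_reset}`
    );
  }
  return usage;
}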

Usage Dashboard

View rate limit usage in the dashboard:
  1. Go to Settings → Usage
  2. Select API Usage tab
  3. View requests by endpoint and time

Concurrency Limits

Concurrent Calls

Limit on simultaneous active calls:
Plan         | Concurrent Inbound | Concurrent Outbound
-------------|--------------------|--------------------
Starter      | 5                  | 5
Professional | 25                 | 25
Enterprise   | Custom             | Custom

Handling Concurrency Limits

When all concurrent call slots are in use, new call requests are rejected:
{
  "success": false,
  "error": {
    "code": "concurrent_limit_exceeded",
    "message": "Maximum concurrent calls reached. Please wait for active calls to complete.",
    "active_calls": 25,
    "limit": 25
  }
}
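
To stay under the limit, queue outbound call requests client-side so that no more than your plan's allowance is in flight at once. A minimal semaphore sketch (api.calls.createOutbound is illustrative; use your actual client method):
// Cap in-flight outbound calls at the plan's concurrent limit.
function createLimiter(maxConcurrent) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task()
      .then(resolve, reject)
      .finally(() => {
        active--;
        next();
      });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}

// Never more than 25 simultaneous calls (Professional plan limit).
const limit = createLimiter(25);
const results = await Promise.all(
  numbers.map((to) => limit(() => api.calls.createOutbound({ to })))
);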

Burst Handling

Short bursts above the rate limit are handled gracefully:
  • Small bursts (10-20% over) are typically allowed
  • Sustained over-limit traffic is rate limited
  • Burst allowance resets after a quiet period
Time:      t0    t1    t2    t3    t4    t5
Limit:     100   100   100   100   100   100
Requests:  120   90    110   150   80    100
Result:    ✓     ✓     ✓     ✓*    ✓     ✓

* Some requests may be delayed but not rejected

Enterprise Rate Limits

Custom Limits

Enterprise customers can request:
  • Higher request limits
  • Increased concurrent call capacity
  • Dedicated rate limit pools
  • Priority processing

Requesting Increases

Contact your account manager or sales@usecrew.ai with:
  • Current usage patterns
  • Expected growth
  • Specific endpoint requirements
  • Business justification

Dedicated Resources

Enterprise plans can include:
  • Dedicated API endpoints
  • Reserved capacity
  • Isolated rate limit pools
  • Custom SLAs

Best Practices

  • Track remaining requests via the rate limit headers and plan accordingly.
  • Don’t hammer the API when rate limited; honor Retry-After.
  • Reduce request count with batch operations.
  • Don’t re-fetch data that hasn’t changed; cache it.
  • Subscribe to webhook events instead of polling.
  • Design systems that gracefully handle rate limits, for example by pacing requests client-side as sketched below.
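
One way to combine several of these practices is a client-side token bucket that paces requests to your plan's per-minute limit. A minimal sketch (the 300/min Professional limit is used for illustration):
// Refill tokens at the plan rate; make callers wait when the
// bucket is empty so the API never has to reject a burst.
function createTokenBucket(ratePerMinute) {
  let tokens = ratePerMinute;
  let lastRefill = Date.now();

  return async function take() {
    // Refill proportionally to elapsed time.
    const now = Date.now();
    tokens = Math.min(
      ratePerMinute,
      tokens + ((now - lastRefill) / 60000) * ratePerMinute
    );
    lastRefill = now;

    if (tokens < 1) {
      // Wait just long enough for one token to accrue, then retry.
      const waitMs = ((1 - tokens) / ratePerMinute) * 60000;
      await new Promise((resolve) => setTimeout(resolve, waitMs));
      return take();
    }
    tokens -= 1;
  };
}

const take = createTokenBucket(300); // Professional plan limit

async function pacedFetch(url, options) {
  await take();
  return fetch(url, options);
}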

Troubleshooting

Unexpected Rate Limiting

  1. Check if multiple applications share the same API key
  2. Verify you’re not running duplicate processes
  3. Review automated scripts for infinite loops
  4. Check for webhook retry storms

Consistent Rate Limiting

If you’re consistently hitting limits:
  1. Review your usage patterns
  2. Optimize with batching and caching
  3. Consider upgrading your plan
  4. Contact support for guidance
