retellai-performance-tuning
Optimize Retell AI API performance with caching, batching, and connection pooling. Use when experiencing slow API responses, implementing caching strategies, or optimizing request throughput for Retell AI integrations. Trigger with phrases like "retellai performance", "optimize retellai", "retellai latency", "retellai caching", "retellai slow", "retellai batch".
allowed-tools: Read, Write, Edit
version: 1.0.0
license: MIT
author: Jeremy Longshore <jeremy@intentsolutions.io>
Allowed Tools
No tools specified
Provided by Plugin
retellai-pack
Claude Code skill pack for Retell AI (30 skills)
Installation
This skill is included in the retellai-pack plugin:
/plugin install retellai-pack@claude-code-plugins-plus
Instructions
# Retell AI Performance Tuning
## Overview
Optimize Retell AI API performance with caching, batching, and connection pooling.
## Prerequisites
- Retell AI SDK installed
- Understanding of async patterns
- Redis or in-memory cache available (optional)
- Performance monitoring in place
## Latency Benchmarks
| Operation | P50 | P95 | P99 |
|-----------|-----|-----|-----|
| Read | 50ms | 150ms | 300ms |
| Write | 100ms | 250ms | 500ms |
| List | 75ms | 200ms | 400ms |
## Caching Strategy
### Response Caching
```typescript
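import { LRUCache } from 'lru-cache';

// In-process cache: at most 1000 entries, each fresh for 1 minute by default.
const cache = new LRUCache<string, {}>({
  max: 1000,
  ttl: 60000, // 1 minute
  updateAgeOnGet: true, // each read resets the entry's TTL, so hot keys can outlive the nominal minute
});

async function cachedRetellAIRequest<T extends {}>(
  key: string,
  fetcher: () => Promise<T>,
  ttl?: number
): Promise<T> {
  const cached = cache.get(key);
  // Compare against undefined so falsy values (0, '', false) still count as hits.
  if (cached !== undefined) return cached as T;
  const result = await fetcher();
  cache.set(key, result, { ttl });
  return result;
}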
```
### Redis Caching (Distributed)
```typescript
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

async function cachedWithRedis<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds = 60
): Promise<T> {
  const cached = await redis.get(key);
  // redis.get resolves to null on a miss.
  if (cached !== null) return JSON.parse(cached) as T;
  const result = await fetcher();
  // SETEX stores the value and its expiry (in seconds) in one command.
  await redis.setex(key, ttlSeconds, JSON.stringify(result));
  return result;
}
```
## Request Batching
```typescript
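import DataLoader from 'dataloader';

// NOTE: `retellaiClient.batchGet` is a placeholder for whatever bulk lookup
// your integration exposes; it is not a confirmed Retell SDK method.
const retellaiLoader = new DataLoader<string, { id: string } | null>(
  async (ids) => {
    // Batch fetch from Retell AI
    const results: { id: string }[] = await retellaiClient.batchGet(ids);
    const byId = new Map(results.map(r => [r.id, r] as const));
    // DataLoader requires one result per key, in the same order as `ids`.
    return ids.map(id => byId.get(id) ?? null);
  },
  {
    maxBatchSize: 100,
    // Collect load() calls for 10 ms before dispatching a batch.
    batchScheduleFn: callback => setTimeout(callback, 10),
  }
);

// Usage - automatically batched
const [item1, item2, item3] = await Promise.all([
  retellaiLoader.load('id-1'),
  retellaiLoader.load('id-2'),
  retellaiLoader.load('id-3'),
]);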
```
## Connection Optimization
```typescript
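import { Agent } from 'https';

// Keep-alive connection pooling: reuse TCP/TLS connections across requests.
const agent = new Agent({
  keepAlive: true,
  maxSockets: 10,    // max concurrent sockets per host
  maxFreeSockets: 5, // idle sockets kept open for reuse
  timeout: 30000,    // socket timeout in ms
});

// NOTE: `RetellAIClient` and its `httpAgent` option are placeholders; check
// how your SDK version accepts a custom agent.
const client = new RetellAIClient({
  apiKey: process.env.RETELLAI_API_KEY!,
  httpAgent: agent,
});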
```
## Pagination Optimization
```typescript
// Streams items page by page so the full list is never held in memory.
async function* paginatedRetellAIList<T>(
  fetcher: (cursor?: string) => Promise<{ data: T[]; nextCursor?: string }>
): AsyncGenerator<T> {
  let cursor: string | undefined;
  do {
    const { data, nextCursor } = await fetcher(cursor);
    for (const item of data) {
      yield item;
    }
    cursor = nextCursor;
  } while (cursor);
}

// Usage (`retellaiClient.list` and `handleItem` are placeholders)
for await (const item of paginatedRetellAIList(cursor =>
  retellaiClient.list({ cursor, limit: 100 })
)) {
  await handleItem(item); // not named `process`, which would shadow Node's global
}
```
## Performance Monitoring
```typescript
// Logs the wall-clock duration (in ms) of any async call, on success or failure.
async function measuredRetellAICall<T>(
  operation: string,
  fn: () => Promise<T>
): Promise<T> {
  const start = performance.now();
  try {
    const result = await fn();
    const duration = performance.now() - start;
    console.log({ operation, duration, status: 'success' });
    return result;
  } catch (error) {
    const duration = performance.now() - start;
    console.error({ operation, duration, status: 'error', error });
    throw error;
  }
}
```
## Instructions
### Step 1: Establish Baseline
Measure current latency for critical Retell AI operations.
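One way to turn raw timings into a baseline comparable with the P50/P95/P99 table is a nearest-rank percentile summary. This is a minimal sketch: `percentile` and `summarizeLatency` are hypothetical helper names, and it assumes you have already collected durations in milliseconds (for example from the monitoring wrapper).

```typescript
// Nearest-rank percentile over an ascending list of durations (ms).
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) return NaN;
  // Integer arithmetic first, to avoid floating-point drift in p / 100.
  const rank = Math.ceil((p * sortedMs.length) / 100);
  const index = Math.min(sortedMs.length - 1, Math.max(0, rank - 1));
  return sortedMs[index];
}

// Hypothetical helper: summarizes recorded durations as P50/P95/P99.
function summarizeLatency(
  durationsMs: number[]
): { p50: number; p95: number; p99: number } {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}
```

Collect a few hundred samples per operation before trusting the P99 figure; with small samples the tail percentiles are dominated by one or two outliers.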
### Step 2: Implement Caching
Add response caching for frequently accessed data.
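Caching only helps if identical requests map to identical keys. The sketch below builds a stable key from an operation name and its parameters; `cacheKey`, the `retellai:` prefix, and the `getAgent` label in the usage line are all illustrative assumptions, not part of any SDK.

```typescript
// Hypothetical helper: builds a deterministic cache key.
// Parameter names are sorted so { a, b } and { b, a } produce the same key.
function cacheKey(
  operation: string,
  params: Record<string, string | number | boolean> = {}
): string {
  const parts = Object.keys(params)
    .sort()
    .map(name => `${name}=${encodeURIComponent(String(params[name]))}`);
  return parts.length > 0
    ? `retellai:${operation}:${parts.join('&')}`
    : `retellai:${operation}`;
}

// Usage: the result can be passed as the `key` argument of a caching wrapper.
const exampleKey = cacheKey('getAgent', { id: 'a1' });
```

Encoding the values keeps separators such as `&` and `=` inside a parameter from colliding with the key's own structure.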
### Step 3: Enable Batching
Use DataLoader or similar for automatic request batching.
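Where a loader library is not an option, the same idea can be approximated by splitting a key list into fixed-size batches by hand. This is a rough sketch under assumptions: `chunk` and `fetchInChunks` are hypothetical helpers, and `fetchBatch` stands in for whatever bulk call your integration actually has.

```typescript
// Splits a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// `fetchBatch` is a caller-supplied placeholder for a bulk lookup.
async function fetchInChunks<K, V>(
  keys: readonly K[],
  size: number,
  fetchBatch: (batch: K[]) => Promise<V[]>
): Promise<V[]> {
  const results: V[] = [];
  for (const batch of chunk(keys, size)) {
    // Sequential on purpose: only one batch is in flight at a time.
    results.push(...(await fetchBatch(batch)));
  }
  return results;
}
```

Running batches one after another trades throughput for predictable load; if the API tolerates it, a small number of batches could be issued concurrently instead.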
### Step 4: Optimize Connections
Configure connection pooling with keep-alive.
## Output
- Reduced API latency
- Caching layer implemented
- Request batching enabled
- Connection pooling configured
## Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
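| Cache miss storm | Hot entries expire and many callers refetch at once | Use stale-while-revalidate |
| Batch timeout | Too many items in one batch | Reduce `maxBatchSize` |
| Connection exhausted | Each request opens a new socket (no pooling) | Reuse a keep-alive agent and set `maxSockets` |
| Memory pressure | Cache grows without a bound | Cap entries with the cache's `max` option |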
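The stale-while-revalidate remedy in the table can be sketched with a small in-memory cache: fresh entries are returned directly, stale entries are returned immediately while a single background refresh runs, and only a cold miss makes the caller wait. `createSwrCache` is a hypothetical helper written for illustration, not an API of `lru-cache` or the Retell SDK.

```typescript
interface SwrEntry<T> {
  value: T;
  freshUntil: number; // epoch ms after which the value counts as stale
}

// `now` is injectable so the behavior can be tested without real waiting.
function createSwrCache<T>(freshMs: number, now: () => number = Date.now) {
  const entries = new Map<string, SwrEntry<T>>();
  const inFlight = new Map<string, Promise<T>>();

  // At most one fetch per key runs at a time; concurrent callers share it.
  function refresh(key: string, fetcher: () => Promise<T>): Promise<T> {
    const pending = inFlight.get(key);
    if (pending) return pending;
    const started = fetcher()
      .then(value => {
        entries.set(key, { value, freshUntil: now() + freshMs });
        return value;
      })
      .finally(() => inFlight.delete(key));
    inFlight.set(key, started);
    return started;
  }

  return async function get(key: string, fetcher: () => Promise<T>): Promise<T> {
    const entry = entries.get(key);
    if (!entry) return refresh(key, fetcher); // cold miss: caller must wait
    if (now() >= entry.freshUntil) {
      // Stale: answer with the old value and refresh in the background.
      // A failed refresh is swallowed so the stale value keeps being served.
      refresh(key, fetcher).catch(() => undefined);
    }
    return entry.value;
  };
}
```

This sketch never evicts entries, so in practice it would need a size cap as well, in line with the memory pressure row above.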
## Examples
### Quick Performance Wrapper
```typescript
// Combines timing and caching: the cached lookup itself is what gets measured.
const withPerformance = <T extends {}>(name: string, fn: () => Promise<T>) =>
  measuredRetellAICall(name, () =>
    cachedRetellAIRequest(`cache:${name}`, fn)
  );
```
## Resources
- [Retell AI Performance Guide](https://docs.retellai.com/performance)
- [DataLoader Documentation](https://github.com/graphql/dataloader)
- [LRU Cache Documentation](https://github.com/isaacs/node-lru-cache)
## Next Steps
For cost optimization, see `retellai-cost-tuning`.