openevidence-rate-limits

Implement OpenEvidence rate limiting, backoff, and request optimization. Use when handling rate limit errors, implementing retry logic, or optimizing API request throughput for clinical queries. Trigger with phrases like "openevidence rate limit", "openevidence throttling", "openevidence 429", "openevidence retry", "openevidence backoff". allowed-tools: Read, Write, Edit version: 1.0.0 license: MIT author: Jeremy Longshore <jeremy@intentsolutions.io>

v1.0.0

Jeremy Longshore

MIT

Allowed Tools

No tools specified

Provided by Plugin

openevidence-pack

Claude Code skill pack for OpenEvidence medical AI (24 skills)

saas packs v1.0.0

View Plugin

Installation

This skill is included in the openevidence-pack plugin:

/plugin install openevidence-pack@claude-code-plugins-plus

Click to copy

Instructions

# OpenEvidence Rate Limits ## Overview Handle OpenEvidence rate limits gracefully with exponential backoff, request queuing, and clinical priority management. ## Prerequisites - OpenEvidence SDK installed - Understanding of async/await patterns - Access to rate limit headers ## Rate Limit Tiers | Tier | Clinical Queries/min | DeepConsult/hour | Burst | Use Case | |------|---------------------|------------------|-------|----------| | Standard | 60 | 5 | 10 | Small practices | | Professional | 300 | 20 | 50 | Clinics, small hospitals | | Enterprise | 1,000 | 100 | 200 | Health systems | | Unlimited | Custom | Custom | Custom | Large integrations | ## Rate Limit Headers | Header | Description | Example | |--------|-------------|---------| | `X-RateLimit-Limit` | Max requests per window | `60` | | `X-RateLimit-Remaining` | Requests remaining | `45` | | `X-RateLimit-Reset` | Unix timestamp for reset | `1706295600` | | `Retry-After` | Seconds to wait (on 429) | `30` | | `X-DeepConsult-Remaining` | DeepConsult quota | `5` | ## Instructions ### Step 1: Implement Exponential Backoff with Jitter ```typescript // src/openevidence/rate-limit.ts interface RetryConfig { maxRetries: number; baseDelayMs: number; maxDelayMs: number; jitterMs: number; } const DEFAULT_CONFIG: RetryConfig = { maxRetries: 5, baseDelayMs: 1000, maxDelayMs: 60000, jitterMs: 500, }; export async function withExponentialBackoff( operation: () => Promise, config: Partial = {} ): Promise { const cfg = { ...DEFAULT_CONFIG, ...config }; for (let attempt = 0; attempt <= cfg.maxRetries; attempt++) { try { return await operation(); } catch (error: any) { if (attempt === cfg.maxRetries) throw error; const status = error.status || error.response?.status; if (status !== 429 && (status < 500 || status >= 600)) throw error; // Use Retry-After header if provided let delay: number; const retryAfter = error.response?.headers?.['retry-after']; if (retryAfter) { delay = parseInt(retryAfter) * 1000; } else { // Exponential backoff with jitter const exponentialDelay = cfg.baseDelayMs * Math.pow(2, attempt); const jitter = Math.random() * cfg.jitterMs; delay = Math.min(exponentialDelay + jitter, cfg.maxDelayMs); } console.log(`Rate limited. Attempt ${attempt + 1}/${cfg.maxRetries}. Retrying in ${delay}ms...`); await new Promise(r => setTimeout(r, delay)); } } throw new Error('Unreachable'); } ``` ### Step 2: Rate Limit Monitor ```typescript // src/openevidence/rate-monitor.ts export class RateLimitMonitor { private remaining: number = 60; private limit: number = 60; private resetAt: Date = new Date(); private deepConsultRemaining: number = 5; updateFromHeaders(headers: Headers | Record): void { const get = (key: string) => headers instanceof Headers ? headers.get(key) : headers[key.toLowerCase()]; const remaining = get('x-ratelimit-remaining'); if (remaining) this.remaining = parseInt(remaining); const limit = get('x-ratelimit-limit'); if (limit) this.limit = parseInt(limit); const resetTimestamp = get('x-ratelimit-reset'); if (resetTimestamp) { this.resetAt = new Date(parseInt(resetTimestamp) * 1000); } const deepConsult = get('x-deepconsult-remaining'); if (deepConsult) this.deepConsultRemaining = parseInt(deepConsult); } shouldThrottle(): boolean { return this.remaining < 5 && new Date() < this.resetAt; } getWaitTime(): number { return Math.max(0, this.resetAt.getTime() - Date.now()); } canDeepConsult(): boolean { return this.deepConsultRemaining > 0; } getStatus(): RateLimitStatus { return { remaining: this.remaining, limit: this.limit, resetAt: this.resetAt.toISOString(), usagePercent: ((this.limit - this.remaining) / this.limit) * 100, deepConsultRemaining: this.deepConsultRemaining, }; } } interface RateLimitStatus { remaining: number; limit: number; resetAt: string; usagePercent: number; deepConsultRemaining: number; } ``` ### Step 3: Priority-Based Request Queue ```typescript // src/openevidence/request-queue.ts import PQueue from 'p-queue'; type ClinicalPriority = 'stat' | 'urgent' | 'routine' | 'research'; interface QueuedRequest { operation: () => Promise; priority: ClinicalPriority; resolve: (value: T) => void; reject: (error: any) => void; } export class ClinicalRequestQueue { private queues: Record; private monitor: RateLimitMonitor; constructor(monitor: RateLimitMonitor) { this.monitor = monitor; // Priority-based concurrency this.queues = { stat: new PQueue({ concurrency: 5, interval: 1000, intervalCap: 10 }), urgent: new PQueue({ concurrency: 3, interval: 1000, intervalCap: 6 }), routine: new PQueue({ concurrency: 2, interval: 1000, intervalCap: 4 }), research: new PQueue({ concurrency: 1, interval: 1000, intervalCap: 2 }), }; } async enqueue( operation: () => Promise, priority: ClinicalPriority = 'routine' ): Promise { const queue = this.queues[priority]; return queue.add(async () => { // Check if we should throttle if (this.monitor.shouldThrottle()) { const waitTime = this.monitor.getWaitTime(); console.log(`Throttling ${priority} request for ${waitTime}ms`); await new Promise(r => setTimeout(r, waitTime)); } // Execute with retry logic return withExponentialBackoff(operation); }); } getQueueStatus(): Record { return { stat: { pending: this.queues.stat.pending, size: this.queues.stat.size }, urgent: { pending: this.queues.urgent.pending, size: this.queues.urgent.size }, routine: { pending: this.queues.routine.pending, size: this.queues.routine.size }, research: { pending: this.queues.research.pending, size: this.queues.research.size }, }; } async pause(priority?: ClinicalPriority): Promise { if (priority) { this.queues[priority].pause(); } else { Object.values(this.queues).forEach(q => q.pause()); } } async resume(priority?: ClinicalPriority): Promise { if (priority) { this.queues[priority].start(); } else { Object.values(this.queues).forEach(q => q.start()); } } } ``` ### Step 4: Adaptive Rate Limiting ```typescript // src/openevidence/adaptive-limiter.ts export class AdaptiveRateLimiter { private windowMs = 60000; // 1 minute window private requestTimes: number[] = []; private targetUsagePercent = 80; // Stay below 80% of limit private currentLimit = 60; recordRequest(): void { const now = Date.now(); this.requestTimes.push(now); // Clean old entries this.requestTimes = this.requestTimes.filter( t => t > now - this.windowMs ); } updateLimit(limit: number): void { this.currentLimit = limit; } shouldDelay(): { delay: boolean; waitMs: number } { const requestsInWindow = this.requestTimes.length; const targetMax = Math.floor(this.currentLimit * (this.targetUsagePercent / 100)); if (requestsInWindow >= targetMax) { // Calculate time until oldest request exits window const oldestRequest = Math.min(...this.requestTimes); const waitMs = Math.max(0, oldestRequest + this.windowMs - Date.now()); return { delay: true, waitMs }; } return { delay: false, waitMs: 0 }; } getMetrics(): { requestsInWindow: number; usagePercent: number; headroom: number; } { const requestsInWindow = this.requestTimes.length; return { requestsInWindow, usagePercent: (requestsInWindow / this.currentLimit) * 100, headroom: this.currentLimit - requestsInWindow, }; } } ``` ## Output - Reliable API calls with automatic retry - Priority-based request queuing - Rate limit monitoring and alerting - Adaptive throttling based on usage ## Error Handling | Header | Description | Action | |--------|-------------|--------| | `X-RateLimit-Limit` | Max requests | Adjust queue concurrency | | `X-RateLimit-Remaining` | Remaining requests | Throttle if low | | `X-RateLimit-Reset` | Reset timestamp | Wait until reset | | `Retry-After` | Seconds to wait | Honor exactly | ## Examples ### Complete Rate-Limited Client ```typescript // src/services/rate-limited-client.ts import { OpenEvidenceClient } from '@openevidence/sdk'; import { RateLimitMonitor } from '../openevidence/rate-monitor'; import { ClinicalRequestQueue } from '../openevidence/request-queue'; const monitor = new RateLimitMonitor(); const queue = new ClinicalRequestQueue(monitor); const rawClient = new OpenEvidenceClient({ apiKey: process.env.OPENEVIDENCE_API_KEY, orgId: process.env.OPENEVIDENCE_ORG_ID, }); export async function clinicalQuery( question: string, priority: 'stat' | 'urgent' | 'routine' = 'routine' ) { return queue.enqueue( async () => { const response = await rawClient.query({ question, context: { specialty: 'internal-medicine', urgency: priority }, }); // Update monitor from response headers monitor.updateFromHeaders(response.headers); return response.data; }, priority ); } // Usage await clinicalQuery('What is the dose of amoxicillin for sinusitis?', 'routine'); await clinicalQuery('Contraindications for tPA in stroke patient?', 'stat'); ``` ### Monitor Dashboard ```typescript // Expose rate limit metrics for monitoring app.get('/health/openevidence', (req, res) => { const status = monitor.getStatus(); const queueStatus = queue.getQueueStatus(); res.json({ rateLimits: status, queues: queueStatus, healthy: status.remaining > 5, }); }); ``` ## Resources - [OpenEvidence API Terms](https://www.openevidence.com/policies/api) - [p-queue Documentation](https://github.com/sindresorhus/p-queue) ## Next Steps For security configuration, see `openevidence-security-basics`.