langfuse-reference-architecture
Production-grade Langfuse architecture patterns and best practices. Use when designing LLM observability infrastructure, planning Langfuse deployment, or implementing enterprise-grade tracing architecture. Trigger with phrases like "langfuse architecture", "langfuse design", "langfuse infrastructure", "langfuse enterprise", "langfuse at scale".

allowed-tools: Read, Write, Edit
version: 1.0.0
license: MIT
author: Jeremy Longshore <jeremy@intentsolutions.io>
Allowed Tools
Read, Write, Edit
Provided by Plugin
langfuse-pack
Claude Code skill pack for Langfuse LLM observability (24 skills)
Installation
This skill is included in the langfuse-pack plugin:
/plugin install langfuse-pack@claude-code-plugins-plus
Instructions
# Langfuse Reference Architecture
## Overview
Production-grade architecture patterns for Langfuse LLM observability at scale.
## Prerequisites
- Understanding of distributed systems
- Knowledge of cloud infrastructure
- Familiarity with observability patterns
## Architecture Patterns
### Pattern 1: Basic Cloud Architecture
```
+---------------------------------------------------------------+
|                       Application Layer                       |
+---------------------------------------------------------------+
|                                                               |
|   +-----------+     +-----------+     +-----------+           |
|   |    API    |     |  Worker   |     |   Cron    |           |
|   |  Service  |     |  Service  |     |   Jobs    |           |
|   +-----+-----+     +-----+-----+     +-----+-----+           |
|         |                 |                 |                 |
|         +-----------------+-----------------+                 |
|                           |                                   |
|                  +--------+--------+                          |
|                  |  Langfuse SDK   |                          |
|                  |   (Singleton)   |                          |
|                  +--------+--------+                          |
|                           |                                   |
+---------------------------+-----------------------------------+
                            |
                            v
+---------------------------------------------------------------+
|                        Langfuse Cloud                         |
|  +---------------------------------------------------------+  |
|  | Ingestion API -> Processing -> PostgreSQL -> Dashboard  |  |
|  +---------------------------------------------------------+  |
+---------------------------------------------------------------+
```
### Pattern 2: Self-Hosted Architecture
```
+---------------------------------------------------------------+
|                              VPC                              |
+---------------------------------------------------------------+
|                                                               |
|  +---------------------------------------------------------+  |
|  |                   Application Cluster                   |  |
|  |   +-------+   +-------+   +-------+   +-------+         |  |
|  |   | Pod 1 |   | Pod 2 |   | Pod 3 |   | Pod N |         |  |
|  |   +---+---+   +---+---+   +---+---+   +---+---+         |  |
|  |       +-----------+---+-------+-----------+             |  |
|  +-----------------------+---------------------------------+  |
|                          |                                    |
|                 +--------+--------+                           |
|                 |   Internal LB   |                           |
|                 +--------+--------+                           |
|                          |                                    |
|  +-----------------------+---------------------------------+  |
|  |              Langfuse Self-Hosted Cluster               |  |
|  |   +------------+   +------------+   +------------+      |  |
|  |   |  Langfuse  |   |  Langfuse  |   |  Langfuse  |      |  |
|  |   | Instance 1 |   | Instance 2 |   | Instance 3 |      |  |
|  |   +-----+------+   +-----+------+   +-----+------+      |  |
|  |         +----------------+----------------+             |  |
|  |                          |                              |  |
|  |                +---------+---------+                    |  |
|  |                |  PostgreSQL RDS   |                    |  |
|  |                |    (Multi-AZ)     |                    |  |
|  |                +-------------------+                    |  |
|  +---------------------------------------------------------+  |
+---------------------------------------------------------------+
```
### Pattern 3: High-Scale Architecture with Buffer
```
+---------------------------------------------------------------+
|                       Application Layer                       |
+---------------------------------------------------------------+
|                                                               |
|  +---------------------------------------------------------+  |
|  |              Regional Application Clusters              |  |
|  |   +----------+    +----------+    +----------+          |  |
|  |   | US-East  |    | EU-West  |    | AP-South |          |  |
|  |   +----+-----+    +----+-----+    +----+-----+          |  |
|  |        +---------------+---------------+                |  |
|  +------------------------+--------------------------------+  |
|                           |                                   |
|                  +--------+--------+                          |
|                  |  Langfuse SDK   |                          |
|                  |    (Batched)    |                          |
|                  +--------+--------+                          |
|                           |                                   |
|                  +--------+--------+                          |
|                  |  Message Queue  | <- Buffer for high volume|
|                  |   (SQS/Kafka)   |                          |
|                  +--------+--------+                          |
|                           |                                   |
|                  +--------+--------+                          |
|                  |    Ingestion    | <- Async workers         |
|                  |     Workers     |                          |
|                  +--------+--------+                          |
|                           |                                   |
+---------------------------+-----------------------------------+
                            |
                            v
+---------------------------------------------------------------+
|                 Langfuse (Cloud/Self-Hosted)                  |
+---------------------------------------------------------------+
```
## Instructions
### Step 1: Implement Singleton SDK Pattern
```typescript
// lib/langfuse/client.ts
import { Langfuse } from "langfuse";

class LangfuseClient {
  private static instance: Langfuse | null = null;
  private static shutdownRegistered = false;

  static getInstance(): Langfuse {
    if (!LangfuseClient.instance) {
      LangfuseClient.instance = new Langfuse({
        publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
        secretKey: process.env.LANGFUSE_SECRET_KEY!,
        baseUrl: process.env.LANGFUSE_HOST,
        // Production settings: batch up to 25 events, flush every 5s
        flushAt: parseInt(process.env.LANGFUSE_FLUSH_AT || "25", 10),
        flushInterval: parseInt(process.env.LANGFUSE_FLUSH_INTERVAL || "5000", 10),
        requestTimeout: 15000,
      });
      if (!LangfuseClient.shutdownRegistered) {
        LangfuseClient.registerShutdown();
      }
    }
    return LangfuseClient.instance;
  }

  private static registerShutdown() {
    const shutdown = async (signal: string) => {
      console.log(`${signal} received. Flushing Langfuse...`);
      if (LangfuseClient.instance) {
        await LangfuseClient.instance.shutdownAsync();
        LangfuseClient.instance = null;
      }
    };
    process.on("SIGTERM", () => shutdown("SIGTERM"));
    process.on("SIGINT", () => shutdown("SIGINT"));
    process.on("beforeExit", () => shutdown("beforeExit"));
    LangfuseClient.shutdownRegistered = true;
  }
}

export const langfuse = LangfuseClient.getInstance();
```
### Step 2: Implement Trace Context Propagation
```typescript
// lib/langfuse/context.ts
import { AsyncLocalStorage } from "async_hooks";
import type { Request, Response, NextFunction } from "express";
import { langfuse } from "./client";

export interface TraceContext {
  traceId: string;
  parentSpanId?: string;
  userId?: string;
  sessionId?: string;
}

// Request shape assumed by the middleware; augment Express's Request type
// in your own declarations if user/session are attached elsewhere.
interface TracedRequest extends Request {
  user?: { id: string };
  session?: { id: string };
  langfuseTrace?: ReturnType<typeof langfuse.trace>;
}

const traceStorage = new AsyncLocalStorage<TraceContext>();

export function withTraceContext<T>(context: TraceContext, fn: () => T): T {
  return traceStorage.run(context, fn);
}

export function getTraceContext(): TraceContext | undefined {
  return traceStorage.getStore();
}

// Middleware for Express
export function langfuseMiddleware() {
  return (req: TracedRequest, res: Response, next: NextFunction) => {
    const trace = langfuse.trace({
      name: `${req.method} ${req.path}`,
      userId: req.user?.id,
      sessionId: req.session?.id,
      metadata: {
        method: req.method,
        path: req.path,
        userAgent: req.headers["user-agent"],
      },
    });
    const context: TraceContext = {
      traceId: trace.id,
      userId: req.user?.id,
      sessionId: req.session?.id,
    };
    withTraceContext(context, () => {
      // Attach trace to request for easy access
      req.langfuseTrace = trace;
      // Finish trace on response
      res.on("finish", () => {
        trace.update({
          output: { statusCode: res.statusCode },
          level: res.statusCode >= 400 ? "ERROR" : undefined,
        });
      });
      next();
    });
  };
}
```
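The AsyncLocalStorage mechanism above is what keeps the trace context attached to a request across async boundaries. A minimal, dependency-free sketch (no Langfuse SDK, no Express) shows the context set by `withTraceContext` surviving an `await`:

```typescript
import { AsyncLocalStorage } from "async_hooks";

interface TraceContext {
  traceId: string;
  userId?: string;
}

const storage = new AsyncLocalStorage<TraceContext>();

function withTraceContext<T>(context: TraceContext, fn: () => T): T {
  return storage.run(context, fn);
}

async function handler(): Promise<string | undefined> {
  // Simulate async work (e.g. an LLM call) before reading the context
  await new Promise((resolve) => setTimeout(resolve, 10));
  return storage.getStore()?.traceId;
}

// The context set here is still visible inside handler(), even after the await
const result = withTraceContext(
  { traceId: "trace-123", userId: "user-1" },
  () => handler()
);
result.then((id) => console.log(id)); // prints "trace-123"
```

This is why no trace ID needs to be passed through function arguments: any code running within the `withTraceContext` callback, however deeply nested or asynchronous, can recover the context from the store.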
### Step 3: Implement Queue-Based Ingestion
```typescript
// lib/langfuse/queue.ts
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";
import { Langfuse } from "langfuse";

interface QueuedTrace {
  name: string;
  input?: any;
  output?: any;
  metadata?: Record<string, any>;
  userId?: string;
  sessionId?: string;
  timestamp: string;
}

// Producer: Application sends to queue
class QueuedLangfuseProducer {
  private sqs: SQSClient;
  private queueUrl: string;

  constructor() {
    this.sqs = new SQSClient({});
    this.queueUrl = process.env.LANGFUSE_QUEUE_URL!;
  }

  async trace(params: Omit<QueuedTrace, "timestamp">) {
    const message: QueuedTrace = {
      ...params,
      timestamp: new Date().toISOString(),
    };
    await this.sqs.send(
      new SendMessageCommand({
        QueueUrl: this.queueUrl,
        MessageBody: JSON.stringify(message),
        // MessageGroupId requires a FIFO queue; it preserves per-session ordering
        MessageGroupId: params.sessionId || "default",
      })
    );
  }
}

// Consumer: Worker processes queue
class QueuedLangfuseConsumer {
  private langfuse: Langfuse;

  constructor() {
    this.langfuse = new Langfuse();
  }

  async processMessage(message: QueuedTrace) {
    const trace = this.langfuse.trace({
      name: message.name,
      input: message.input,
      output: message.output,
      metadata: message.metadata,
      userId: message.userId,
      sessionId: message.sessionId,
      timestamp: new Date(message.timestamp),
    });
    return trace.id;
  }

  async processBatch(messages: QueuedTrace[]) {
    for (const message of messages) {
      await this.processMessage(message);
    }
    await this.langfuse.flushAsync();
  }
}
```
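Keying `MessageGroupId` to the session ID means a FIFO queue delivers each session's traces in arrival order, while different sessions can be processed in parallel. An in-memory sketch of that grouping (the `groupBySession` helper is hypothetical, for illustration only) shows what the consumer effectively sees per group:

```typescript
interface QueuedTrace {
  name: string;
  sessionId?: string;
  timestamp: string;
}

function groupBySession(messages: QueuedTrace[]): Map<string, QueuedTrace[]> {
  const groups = new Map<string, QueuedTrace[]>();
  for (const msg of messages) {
    const key = msg.sessionId || "default"; // same fallback as MessageGroupId
    const group = groups.get(key) ?? [];
    group.push(msg); // arrival order is preserved within each group
    groups.set(key, group);
  }
  return groups;
}

const groups = groupBySession([
  { name: "a", sessionId: "s1", timestamp: "t1" },
  { name: "b", sessionId: "s2", timestamp: "t2" },
  { name: "c", sessionId: "s1", timestamp: "t3" },
]);
console.log(groups.get("s1")!.map((m) => m.name)); // [ 'a', 'c' ]
```

Note the trade-off: everything without a session ID lands in the single "default" group, which serializes those messages; high-volume sessionless traffic may want a random group ID instead.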
### Step 4: Multi-Environment Configuration
```typescript
// config/langfuse.ts
type Environment = "development" | "staging" | "production";

interface LangfuseEnvironmentConfig {
  publicKey: string;
  secretKey: string;
  host: string;
  flushAt: number;
  flushInterval: number;
  enabled: boolean;
  sampling: {
    rate: number;
    alwaysSampleErrors: boolean;
  };
}

const ENVIRONMENT_CONFIGS: Record<Environment, LangfuseEnvironmentConfig> = {
  development: {
    publicKey: process.env.LANGFUSE_PUBLIC_KEY_DEV!,
    secretKey: process.env.LANGFUSE_SECRET_KEY_DEV!,
    host: process.env.LANGFUSE_HOST_DEV || "http://localhost:3000",
    flushAt: 1, // flush every event immediately for fast feedback
    flushInterval: 1000,
    enabled: true,
    sampling: { rate: 1.0, alwaysSampleErrors: true },
  },
  staging: {
    publicKey: process.env.LANGFUSE_PUBLIC_KEY_STAGING!,
    secretKey: process.env.LANGFUSE_SECRET_KEY_STAGING!,
    host: process.env.LANGFUSE_HOST_STAGING || "https://cloud.langfuse.com",
    flushAt: 15,
    flushInterval: 5000,
    enabled: true,
    sampling: { rate: 0.5, alwaysSampleErrors: true },
  },
  production: {
    publicKey: process.env.LANGFUSE_PUBLIC_KEY_PROD!,
    secretKey: process.env.LANGFUSE_SECRET_KEY_PROD!,
    host: process.env.LANGFUSE_HOST || "https://cloud.langfuse.com",
    flushAt: 25, // larger batches to reduce request volume
    flushInterval: 5000,
    enabled: true,
    sampling: { rate: 0.1, alwaysSampleErrors: true },
  },
};

export function getLangfuseConfig(): LangfuseEnvironmentConfig {
  const env = (process.env.NODE_ENV || "development") as Environment;
  return ENVIRONMENT_CONFIGS[env] || ENVIRONMENT_CONFIGS.development;
}
```
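One way to apply the `sampling` settings above is a small gate evaluated before creating a trace. The `shouldSample` helper below is hypothetical (not part of the Langfuse SDK); it keeps a fixed fraction of traces while always keeping errors:

```typescript
interface SamplingConfig {
  rate: number; // 0.0 - 1.0
  alwaysSampleErrors: boolean;
}

// Decide whether to record a trace. The random source is injectable so the
// decision is testable; callers normally omit it and get Math.random.
function shouldSample(
  config: SamplingConfig,
  isError: boolean,
  random: () => number = Math.random
): boolean {
  if (isError && config.alwaysSampleErrors) return true;
  return random() < config.rate;
}

// Production config samples 10% of traces but keeps every error
const prod: SamplingConfig = { rate: 0.1, alwaysSampleErrors: true };
console.log(shouldSample(prod, true, () => 0.99));  // true: errors always kept
console.log(shouldSample(prod, false, () => 0.99)); // false: 0.99 >= 0.1
console.log(shouldSample(prod, false, () => 0.05)); // true: 0.05 < 0.1
```

Sampling per trace (rather than per observation) keeps traces internally complete: either a whole request is recorded or none of it is.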
### Step 5: Implement Service Mesh Tracing
```typescript
// lib/langfuse/propagation.ts
// For microservices: propagate trace context across service boundaries
import { langfuse } from "./client";
import { getTraceContext } from "./context";

interface TraceContext {
  traceId: string;
  parentSpanId?: string;
  sessionId?: string;
}

// Header contract shared by all services
interface TraceHeaders {
  "x-langfuse-trace-id": string;
  "x-langfuse-parent-id"?: string;
  "x-langfuse-session-id"?: string;
}

// Outgoing request: inject headers
function injectTraceHeaders(headers: Headers) {
  const context = getTraceContext();
  if (context) {
    headers.set("x-langfuse-trace-id", context.traceId);
    if (context.parentSpanId) {
      headers.set("x-langfuse-parent-id", context.parentSpanId);
    }
    if (context.sessionId) {
      headers.set("x-langfuse-session-id", context.sessionId);
    }
  }
}

// Incoming request: extract headers and continue trace
function extractTraceContext(request: Request): TraceContext | null {
  const traceId = request.headers.get("x-langfuse-trace-id");
  if (!traceId) return null;
  return {
    traceId,
    parentSpanId: request.headers.get("x-langfuse-parent-id") || undefined,
    sessionId: request.headers.get("x-langfuse-session-id") || undefined,
  };
}

// Create linked trace in downstream service
function createLinkedTrace(parentContext: TraceContext, name: string) {
  return langfuse.trace({
    name,
    sessionId: parentContext.sessionId,
    metadata: {
      parentTraceId: parentContext.traceId,
      parentSpanId: parentContext.parentSpanId,
    },
  });
}
```
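The inject/extract pair above defines a simple header contract between services. A dependency-free round trip (using a plain `Map` in place of the fetch `Headers` class, so the sketch runs anywhere) confirms a context survives the hop:

```typescript
interface TraceContext {
  traceId: string;
  parentSpanId?: string;
  sessionId?: string;
}

type HeaderMap = Map<string, string>;

// Caller side: write the context into outgoing headers
function inject(context: TraceContext, headers: HeaderMap): void {
  headers.set("x-langfuse-trace-id", context.traceId);
  if (context.parentSpanId) headers.set("x-langfuse-parent-id", context.parentSpanId);
  if (context.sessionId) headers.set("x-langfuse-session-id", context.sessionId);
}

// Callee side: rebuild the context from incoming headers
function extract(headers: HeaderMap): TraceContext | null {
  const traceId = headers.get("x-langfuse-trace-id");
  if (!traceId) return null;
  return {
    traceId,
    parentSpanId: headers.get("x-langfuse-parent-id") || undefined,
    sessionId: headers.get("x-langfuse-session-id") || undefined,
  };
}

const headers: HeaderMap = new Map();
inject({ traceId: "trace-abc", sessionId: "sess-1" }, headers);
const received = extract(headers);
console.log(received?.traceId); // "trace-abc"
```

Optional headers are only written when present, and `extract` maps their absence back to `undefined`, so the round trip is lossless for the fields the contract defines.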
## Output
- Singleton SDK pattern with graceful shutdown
- Trace context propagation
- Queue-based async ingestion
- Multi-environment configuration
- Service mesh integration
## Architecture Decision Matrix
| Pattern | Use Case | Complexity | Scale |
|---------|----------|------------|-------|
| Basic Cloud | Small apps | Low | 100K traces/day |
| Self-Hosted | Data privacy | Medium | 1M traces/day |
| Queue-Based | High volume | High | 10M+ traces/day |
## Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| Multiple instances | No singleton | Use singleton pattern |
| Lost traces | No shutdown | Register shutdown handlers |
| Cross-service gaps | No propagation | Implement header injection |
| Scale issues | Direct ingestion | Add message queue buffer |
## Resources
- [Langfuse Self-Hosting](https://langfuse.com/docs/deployment/self-host)
- [Langfuse Architecture](https://langfuse.com/docs)
- [OpenTelemetry Context](https://opentelemetry.io/docs/concepts/context-propagation/)
## Next Steps
For multi-environment setup, see `langfuse-multi-env-setup`.