# L0 - Deterministic Streaming Execution Substrate (DSES) for AI
### The missing reliability and observability layer for all AI streams.

<p align="center">
<a href="https://www.npmjs.com/package/@ai2070/l0">
<img src="https://img.shields.io/npm/v/@ai2070/l0?color=brightgreen&label=npm" alt="npm version">
</a>
<a href="https://bundlephobia.com/package/@ai2070/l0">
<img src="https://img.shields.io/bundlephobia/minzip/@ai2070/l0?label=minzipped" alt="minzipped size">
</a>
<a href="https://packagephobia.com/result?p=@ai2070/l0">
<img src="https://packagephobia.com/badge?p=@ai2070/l0" alt="install size">
</a>
<img src="https://img.shields.io/badge/types-included-blue?logo=typescript&logoColor=white" alt="Types Included">
<a href="https://github.com/ai-2070/l0/actions">
<img src="https://img.shields.io/github/actions/workflow/status/ai-2070/l0/ci.yml?label=tests" alt="CI status">
</a>
<img src="https://img.shields.io/badge/license-Apache_2.0-green" alt="MIT License">
</p>
> LLMs produce high-value reasoning over a low-integrity transport layer.
> Streams stall, drop tokens, reorder events, violate timing guarantees, and expose no deterministic contract.
>
> This breaks retries. It breaks supervision. It breaks reproducibility.
> It makes reliable AI systems impossible to build on top of raw provider streams.
>
> **L0 is the deterministic execution substrate that fixes the transport -
> with guardrails designed for the streaming layer itself: stream-neutral, pattern-based, loop-safe, and timing-aware.**

L0 adds deterministic execution, fallbacks, retries, network protection, guardrails, drift detection, and tool tracking to any LLM stream - turning raw model output into production-grade behavior.
It works with **OpenAI**, **Vercel AI SDK**, **Mastra AI**, and **custom adapters**, supports **multimodal streams** and tool calls, and provides full deterministic replay.
```bash
npm install @ai2070/l0
```
_Production-grade reliability. Just pass your stream. L0 will take it from here._

L0 includes 2,600+ tests covering all major reliability features.
```
 Any AI Stream                        L0 Layer                       Your App
─────────────────   ┌──────────────────────────────────────┐   ─────────────
                    │                                      │
 Vercel AI SDK      │  Retry · Fallback · Resume           │      Reliable
 OpenAI / Mastra ──▶│  Guardrails · Timeouts · Consensus   │─────▶ Output
 Custom Streams     │  Full Observability                  │
                    │                                      │
                    └──────────────────────────────────────┘
─────────────────                                               ─────────────
 text / image /           L0 = Token-Level Reliability
 video / audio
```
**Upcoming versions:**
- **1.0.0** - API freeze + Python version

**Bundle sizes (minified):**

| Import | Size | Gzipped | Description |
| ----------------------- | ----- | ------- | ------------------------ |
| `@ai2070/l0` (full) | 191KB | 56KB | Everything |
| `@ai2070/l0/core` | 71KB | 21KB | Runtime + retry + errors |
| `@ai2070/l0/structured` | 61KB | 18KB | Structured output |
| `@ai2070/l0/consensus` | 72KB | 21KB | Multi-model consensus |
| `@ai2070/l0/parallel` | 58KB | 17KB | Parallel/race operations |
| `@ai2070/l0/window` | 62KB | 18KB | Document chunking |
| `@ai2070/l0/guardrails` | 18KB | 6KB | Validation rules |
| `@ai2070/l0/monitoring` | 27KB | 7KB | OTel/Sentry |
| `@ai2070/l0/drift` | 4KB | 2KB | Drift detection |
Dependency-free. Tree-shakeable subpath exports for minimal bundles.
> Most applications should simply use `import { l0 } from "@ai2070/l0"`.
> Only optimize imports if you're targeting edge runtimes or strict bundle constraints.
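
For edge or bundle-constrained targets, the subpath exports can be combined. The exact symbols each subpath exposes are an assumption here, based on the table above; verify against your installed version:

```typescript
// Assumed subpath exports (based on the bundle table above) - verify against your version
import { l0 } from "@ai2070/l0/core"; // runtime + retry + errors
import { recommendedGuardrails } from "@ai2070/l0/guardrails"; // validation rules only
```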
## Features
| Feature | Description |
| ------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **🔁 Smart Retries** | Model-aware retries with fixed-jitter backoff. Automatic retries for zero-token output, network stalls, SSE disconnects, and provider overloads. |
| **🌐 Network Protection** | Automatic recovery from dropped streams, slow responses, backgrounding, 429/503 load shedding, DNS errors, and partial chunks. |
| **🔀 Model Fallbacks** | Automatically fall back to secondary models (e.g., 4o → 4o-mini → Claude/Gemini) with full retry logic. |
| **💥 Zero-Token/Stall Protection** | Detects when model produces nothing or stalls mid-stream. Automatically retries or switches to fallbacks. |
| **📍 Last-Known-Good Token Resumption** | When a stream is interrupted, L0 resumes generation from the last structurally valid token (opt-in). |
| **🧠 Drift Detection** | Detects tone shifts, duplicated sentences, entropy spikes, markdown collapse, and meta-AI patterns before corruption. |
| **🧱 Structured Output** | Guaranteed-valid JSON with Zod (v3/v4), Effect Schema, or JSON Schema. Auto-corrects missing braces, commas, and markdown fences. |
| **🩹 JSON Auto-Healing + Markdown Fence Repair** | Automatic correction of truncated or malformed JSON (missing braces, brackets, quotes), and repair of broken Markdown code fences. Ensures clean extraction of structured data from noisy LLM output. |
| **🛡️ Guardrails** | JSON, Markdown, LaTeX, and pattern validation with fast/slow path execution. Delta-only checks run sync; full-content scans defer to async to never block streaming. |
| **⚡ Race: Fastest-Model Wins** | Run multiple models or providers in parallel and return the fastest valid stream. Ideal for ultra-low-latency chat and high-availability systems. |
| **🌿 Parallel: Fan-Out / Fan-In** | Start multiple streams simultaneously and collect structured or summarized results. Perfect for agent-style multi-model workflows. |
| **🔗 Pipe: Streaming Pipelines** | Compose multiple streaming steps (e.g., summarize → refine → translate) with safe state passing and guardrails between each stage. |
| **🧩 Consensus: Agreement Across Models** | Combine multiple model outputs using unanimous, weighted, or best-match consensus. Guarantees high-confidence generation for safety-critical tasks. |
| **📄 Document Windows** | Built-in chunking (token, paragraph, sentence, character). Ideal for long documents, transcripts, or multi-page processing. |
| **🎨 Formatting Helpers** | Extract JSON/code from markdown fences, strip thinking tags, normalize whitespace, and clean LLM output for downstream processing. |
| **📊 Monitoring** | Built-in integrations with OpenTelemetry and Sentry for metrics, tracing, and error tracking. |
| **🔔 Lifecycle Callbacks** | `onStart`, `onComplete`, `onError`, `onEvent`, `onViolation`, `onRetry`, `onFallback`, `onToolCall` - full observability into every stream phase. |
| **📡 Streaming-First Runtime** | Thin, deterministic wrapper over `streamText()` with unified event types (`token`, `error`, `complete`) for easy UIs. |
| **📼 Atomic Event Logs** | Record every token, retry, fallback, and guardrail check as immutable events. Full audit trail for debugging and compliance. |
| **🔄 Byte-for-Byte Replays** | Deterministically replay any recorded stream to reproduce exact output. Perfect for testing and time-travel debugging. |
| **⛔ Safety-First Defaults** | Continuation off by default. Structured objects never resumed. No silent corruption. Integrity always preserved. |
| **⚡ Tiny & Explicit** | 21KB gzipped core. Tree-shakeable with subpath exports (`/core`, `/structured`, `/consensus`, `/parallel`, `/window`). No frameworks, no heavy abstractions. |
| **🔌 Custom Adapters (BYOA)** | Bring your own adapter for any LLM provider. Built-in adapters for Vercel AI SDK, OpenAI, and Mastra. |
| **🖼️ Multimodal Support** | Build adapters for image/audio/video generation (FLUX.2, Stable Diffusion, Veo 3, CSM). Progress tracking, data events, and state management for non-text outputs. |
| **🧪 Battle-Tested** | 2,600+ unit tests and 250+ integration tests validating real streaming, retries, and advanced behavior. |
## Quick Start
### With Vercel AI SDK: Minimal Usage
```typescript
import { l0 } from "@ai2070/l0";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await l0({
  // Primary model stream
  stream: () =>
    streamText({
      model: openai("gpt-5-mini"),
      prompt,
    }),
});

// Read the stream
for await (const event of result.stream) {
  if (event.type === "token") process.stdout.write(event.value);
}
```
### With Vercel AI SDK: Expanded
```typescript
import { l0, recommendedGuardrails, recommendedRetry } from "@ai2070/l0";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await l0({
  // Primary model stream
  stream: () =>
    streamText({
      model: openai("gpt-5-mini"),
      prompt,
    }),

  // Optional: Fallback models
  fallbackStreams: [() => streamText({ model: openai("gpt-5-mini"), prompt })],

  // Optional: Guardrails, default: none
  guardrails: recommendedGuardrails,
  // Other presets:
  //   minimalGuardrails       // jsonRule, zeroOutputRule
  //   recommendedGuardrails   // jsonRule, markdownRule, zeroOutputRule, patternRule
  //   strictGuardrails        // jsonRule, markdownRule, latexRule, patternRule, zeroOutputRule
  //   jsonOnlyGuardrails      // jsonRule, zeroOutputRule
  //   markdownOnlyGuardrails  // markdownRule, zeroOutputRule
  //   latexOnlyGuardrails     // latexRule, zeroOutputRule

  // Optional: Retry configuration, default as follows
  retry: {
    attempts: 3, // LLM errors only
    maxRetries: 6, // Total (LLM + network)
    baseDelay: 1000,
    maxDelay: 10000,
    backoff: "fixed-jitter", // "exponential" | "linear" | "fixed" | "full-jitter"
  },
  // Or use presets:
  //   minimalRetry      // { attempts: 2, maxRetries: 4, backoff: "linear" }
  //   recommendedRetry  // { attempts: 3, maxRetries: 6, backoff: "fixed-jitter" }
  //   strictRetry       // { attempts: 3, maxRetries: 6, backoff: "full-jitter" }
  //   exponentialRetry  // { attempts: 4, maxRetries: 8, backoff: "exponential" }

  // Optional: Timeout configuration, default as follows
  timeout: {
    initialToken: 5000, // 5s to first token
    interToken: 10000, // 10s between tokens
  },

  // Optional: Guardrail check intervals, default as follows
  checkIntervals: {
    guardrails: 5, // Check every N tokens
    drift: 10,
    checkpoint: 10,
  },

  // Optional: User context (attached to all observability events)
  context: { requestId: "req_123", userId: "user_456" },

  // Optional: Abort signal
  signal: abortController.signal,

  // Optional: Enable telemetry
  monitoring: { enabled: true },

  // Optional: Lifecycle callbacks (all are optional)
  onStart: (attempt, isRetry, isFallback) => {},
  onComplete: (state) => {},
  onError: (error, willRetry, willFallback) => {},
  onViolation: (violation) => {},
  onRetry: (attempt, reason) => {},
  onFallback: (index, reason) => {},
  onToolCall: (toolName, toolCallId, args) => {},
});

// Read the stream
for await (const event of result.stream) {
  if (event.type === "token") {
    process.stdout.write(event.value);
  }
}
```
**See Also: [API.md](./API.md) - Complete API reference**
### With OpenAI SDK
```typescript
import OpenAI from "openai";
import { l0, openaiStream, recommendedGuardrails } from "@ai2070/l0";

const openai = new OpenAI();

const result = await l0({
  stream: openaiStream(openai, {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Generate a haiku about coding" }],
  }),
  guardrails: recommendedGuardrails,
});

for await (const event of result.stream) {
  if (event.type === "token") process.stdout.write(event.value);
}
```
### With Mastra AI
```typescript
import { Agent } from "@mastra/core/agent";
import { l0, mastraStream, recommendedGuardrails } from "@ai2070/l0";

const agent = new Agent({
  name: "haiku-writer",
  instructions: "You are a poet who writes haikus",
  model: "openai/gpt-4o",
});

const result = await l0({
  stream: mastraStream(agent, "Generate a haiku about coding"),
  guardrails: recommendedGuardrails,
});

for await (const event of result.stream) {
  if (event.type === "token") process.stdout.write(event.value);
}
```
## Core Features
| Feature | Description |
| --------------------------------------------------------------------- | --------------------------------------------------------------- |
| [Streaming Runtime](#streaming-runtime) | Token-by-token normalization, checkpoints, resumable generation |
| [Retry Logic](#retry-logic) | Smart retries with backoff, network vs model error distinction |
| [Network Protection](#network-protection) | Auto-recovery from 12+ network failure types |
| [Structured Output](#structured-output) | Guaranteed valid JSON with Zod, Effect Schema, or JSON Schema |
| [Fallback Models](#fallback-models) | Sequential fallback when primary model fails |
| [Document Windows](#document-windows) | Automatic chunking for long documents |
| [Formatting Helpers](#formatting-helpers) | Context, memory, tools, and output formatting utilities |
| [Last-Known-Good Token Resumption](#last-known-good-token-resumption) | Resume from last checkpoint on retry/fallback (opt-in) |
| [Guardrails](#guardrails) | JSON, Markdown, LaTeX validation, pattern detection |
| [Consensus](#consensus) | Multi-model agreement with voting strategies |
| [Parallel Operations](#parallel-operations) | Race, batch, pool patterns for concurrent LLM calls |
| [Type-Safe Generics](#type-safe-generics) | Forward output types through all L0 functions |
| [Custom Adapters (BYOA)](#custom-adapters-byoa) | Bring your own adapter for any LLM provider |
| [Multimodal Support](#multimodal-support) | Image, audio, video generation with progress tracking |
| [Lifecycle Callbacks](#lifecycle-callbacks) | Full observability into every stream phase |
| [Event Sourcing](#event-sourcing) | Record/replay streams for testing and audit trails |
| [Error Handling](#error-handling) | Typed errors with categorization and recovery hints |
| [Monitoring](#monitoring) | Built-in OTel and Sentry integrations |
| [Testing](#testing) | 2,600+ tests covering all features and SDK adapters |
---
## Streaming Runtime
L0 wraps `streamText()` with deterministic behavior:
```typescript
const result = await l0({
  stream: () => streamText({ model, prompt }),

  // Optional: Timeouts (ms)
  timeout: {
    initialToken: 5000, // 5s to first token
    interToken: 10000, // 10s between tokens
  },

  signal: abortController.signal,
});

// Unified event format
for await (const event of result.stream) {
  switch (event.type) {
    case "token":
      console.log(event.value);
      break;
    case "complete":
      console.log("Complete");
      break;
    case "error":
      console.error(event.error, event.reason); // reason: ErrorCategory
      break;
  }
}

// Access final state
console.log(result.state.content); // Full accumulated content
console.log(result.state.tokenCount); // Total tokens received
console.log(result.state.checkpoint); // Last stable checkpoint
```
⚠️ Free and low-priority models may take **3–7 seconds** before emitting the first token and **10 seconds** between tokens.
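
If you regularly target such models, widening both timeouts avoids spurious timeout retries. The values below are illustrative, not recommended defaults:

```typescript
// Illustrative timeouts for slow or free-tier models - tune per provider
const result = await l0({
  stream: () => streamText({ model, prompt }),
  timeout: {
    initialToken: 15000, // allow up to 15s before the first token
    interToken: 20000, // allow up to 20s between tokens
  },
});
```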
---
## Retry Logic
Smart retry system that distinguishes network errors from model errors:
```typescript
const result = await l0({
  stream: () => streamText({ model, prompt }),
  retry: {
    attempts: 3, // Model errors only (default: 3)
    maxRetries: 6, // Absolute cap across all error types (default: 6)
    baseDelay: 1000,
    maxDelay: 10000,
    backoff: "fixed-jitter", // or "exponential", "linear", "fixed", "full-jitter"

    // Optional: specify which error types to retry on, defaults to all recoverable errors
    retryOn: [
      "zero_output",
      "guardrail_violation",
      "drift",
      "incomplete",
      "network_error",
      "timeout",
      "rate_limit",
      "server_error",
    ],

    // Custom delays per error type (overrides baseDelay)
    errorTypeDelays: {
      connectionDropped: 2000,
      timeout: 1500,
      dnsError: 5000,
    },
  },
});
```
### Retry Behavior
| Error Type | Category | Retries | Counts Toward `attempts` | Counts Toward `maxRetries` |
| -------------------- | ----------- | ------- | ------------------------ | -------------------------- |
| Network disconnect | `NETWORK` | Yes | No | Yes |
| Zero output | `CONTENT` | Yes | **Yes** | Yes |
| Timeout | `TRANSIENT` | Yes | No | Yes |
| 429 rate limit | `TRANSIENT` | Yes | No | Yes |
| 503 server error | `TRANSIENT` | Yes | No | Yes |
| Guardrail violation | `CONTENT` | Yes | **Yes** | Yes |
| Drift detected | `CONTENT` | Yes | **Yes** | Yes |
| Model error | `MODEL` | Yes | **Yes** | Yes |
| Auth error (401/403) | `FATAL` | No | - | - |
| Invalid config | `INTERNAL` | No | - | - |
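
In practice this lets you keep the model-error budget small while still tolerating a flaky network. A minimal sketch using only the options documented above:

```typescript
// Network/transient errors consume only the overall maxRetries budget;
// content and model errors also consume `attempts`.
const result = await l0({
  stream: () => streamText({ model, prompt }),
  retry: {
    attempts: 2, // at most 2 retries caused by model/content errors
    maxRetries: 6, // hard cap across every error category
  },
  onRetry: (attempt, reason) => console.log(`retry #${attempt}: ${reason}`),
});
```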
---
## Network Protection
Automatic detection and recovery from network failures:
```typescript
import { isNetworkError, analyzeNetworkError } from "@ai2070/l0";

try {
  await l0({ stream, retry: recommendedRetry });
} catch (error) {
  if (isNetworkError(error)) {
    const analysis = analyzeNetworkError(error);
    console.log(analysis.type); // "connection_dropped", "timeout", etc.
    console.log(analysis.retryable); // true/false
    console.log(analysis.suggestion); // Recovery suggestion
  }
}
```
Detected error types: connection dropped, fetch errors, ECONNRESET, ECONNREFUSED, SSE aborted, DNS errors, timeouts, mobile background throttle, and more.

---
## Structured Output
Guaranteed valid JSON matching your schema. Supports **Zod** (v3/v4), **Effect Schema**, and **JSON Schema**:
### With Zod
```typescript
import { structured } from "@ai2070/l0";
import { z } from "zod";

const schema = z.object({
  name: z.string(),
  age: z.number(),
  email: z.string().email(),
});

const result = await structured({
  schema,
  stream: () => streamText({ model, prompt: "Generate user data as JSON" }),
  autoCorrect: true, // Fix trailing commas, missing braces, etc.
});

// Type-safe access
console.log(result.data.name); // string
console.log(result.data.age); // number
console.log(result.corrected); // true if auto-corrected
```
### With Effect Schema
```typescript
import {
  structured,
  registerEffectSchemaAdapter,
  wrapEffectSchema,
} from "@ai2070/l0";
import { Schema } from "effect";

// Register the adapter once at app startup
registerEffectSchemaAdapter({
  decodeUnknownSync: (schema, data) => Schema.decodeUnknownSync(schema)(data),
  decodeUnknownEither: (schema, data) => {
    try {
      return { _tag: "Right", right: Schema.decodeUnknownSync(schema)(data) };
    } catch (error) {
      return {
        _tag: "Left",
        left: { _tag: "ParseError", issue: error, message: error.message },
      };
    }
  },
  formatError: (error) => error.message,
});

// Define schema with Effect
const schema = Schema.Struct({
  name: Schema.String,
  age: Schema.Number,
  email: Schema.String,
});

// Use with structured()
const result = await structured({
  schema: wrapEffectSchema(schema),
  stream: () => streamText({ model, prompt: "Generate user data as JSON" }),
  autoCorrect: true,
});

console.log(result.data.name); // string - fully typed
```
### With JSON Schema
```typescript
import {
  structured,
  registerJSONSchemaAdapter,
  wrapJSONSchema,
} from "@ai2070/l0";
import Ajv from "ajv"; // Or any JSON Schema validator

// Register adapter once at app startup (example with Ajv)
const ajv = new Ajv({ allErrors: true });

registerJSONSchemaAdapter({
  validate: (schema, data) => {
    const validate = ajv.compile(schema);
    const valid = validate(data);
    if (valid) return { valid: true, data };
    return {
      valid: false,
      errors: (validate.errors || []).map((e) => ({
        path: e.instancePath || "/",
        message: e.message || "Validation failed",
        keyword: e.keyword,
        params: e.params,
      })),
    };
  },
  formatErrors: (errors) =>
    errors.map((e) => `${e.path}: ${e.message}`).join(", "),
});

// Define schema with JSON Schema
const schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age: { type: "number" },
    email: { type: "string", format: "email" },
  },
  required: ["name", "age", "email"],
};

// Use with structured()
const result = await structured({
  schema: wrapJSONSchema<{ name: string; age: number; email: string }>(schema),
  stream: () => streamText({ model, prompt: "Generate user data as JSON" }),
  autoCorrect: true,
});

console.log(result.data.name); // string - typed via generic
```
### Helper Functions
```typescript
import { structuredObject, structuredArray, structuredStream } from "@ai2070/l0";

// Quick object schema
const result = await structuredObject({
  name: z.string(),
  age: z.number(),
}, { stream });

// Quick array schema
const result = await structuredArray(
  z.object({ name: z.string() }),
  { stream },
);

// Streaming with end validation
const { stream, result, abort } = await structuredStream({
  schema,
  stream: () => streamText({ model, prompt }),
});

for await (const event of stream) {
  if (event.type === "token") console.log(event.value);
}

const validated = await result;
```
### Structured Output Presets
```typescript
import { minimalStructured, recommendedStructured, strictStructured } from "@ai2070/l0";

// minimalStructured:     { autoCorrect: false, retry: { attempts: 1 } }
// recommendedStructured: { autoCorrect: true, retry: { attempts: 2 } }
// strictStructured:      { autoCorrect: true, strictMode: true, retry: { attempts: 3 } }

const result = await structured({
  schema,
  stream,
  ...recommendedStructured,
});
```
---
## Fallback Models
Sequential fallback when primary model fails:
```typescript
const result = await l0({
  stream: () => streamText({ model: openai("gpt-4o"), prompt }),
  fallbackStreams: [
    () => streamText({ model: openai("gpt-5-nano"), prompt }),
    () => streamText({ model: anthropic("claude-3-haiku"), prompt }),
  ],
});

// Check which model succeeded
console.log(result.state.fallbackIndex); // 0 = primary, 1+ = fallback
```
---
## Document Windows
Process documents that exceed context limits:
```typescript
import { createWindow } from "@ai2070/l0";

const window = createWindow(longDocument, {
  size: 2000, // Tokens per chunk
  overlap: 200, // Overlap between chunks
  strategy: "paragraph", // or "token", "sentence", "char"
});

// Process all chunks
const results = await window.processAll((chunk) => ({
  stream: () =>
    streamText({
      model,
      prompt: `Summarize: ${chunk.content}`,
    }),
}));

// Or navigate manually
const first = window.current();
const next = window.next();
```
---
## Formatting Helpers
Utilities for context, memory, output instructions, and tool definitions:
```typescript
import { formatContext, formatMemory, formatTool, formatJsonOutput } from "@ai2070/l0";
// Wrap documents with XML/Markdown/bracket delimiters
const context = formatContext(document, { label: "Documentation", delimiter: "xml" });
// Format conversation history (conversational, structured, or compact)
const memory = formatMemory(messages, { style: "conversational", maxEntries: 10 });
// Define tools with JSON schema, TypeScript, or natural language
const tool = formatTool({ name: "search", description: "Search", parameters: [...] });
// Request strict JSON output
const instruction = formatJsonOutput({ strict: true, schema: "..." });
```
See [FORMATTING.md](./FORMATTING.md) for the complete API reference.

---
## Last-Known-Good Token Resumption
When a stream fails mid-generation, L0 can resume from the last known good checkpoint instead of starting over. This preserves already-generated content and reduces latency on retries.
```typescript
const result = await l0({
  stream: () => streamText({ model, prompt }),
  retry: { attempts: 3 },

  // Enable continuation from last checkpoint (opt-in)
  continueFromLastKnownGoodToken: true,
});

// Check if continuation was used
console.log(result.state.resumed); // true if resumed from checkpoint
console.log(result.state.resumePoint); // The checkpoint content
console.log(result.state.resumeFrom); // Character offset where resume occurred
```
### How It Works
1. L0 maintains a checkpoint of successfully received tokens (every N tokens, configurable via `checkIntervals.checkpoint`)
2. When a retry or fallback is triggered, the checkpoint is validated against guardrails and drift detection
3. If validation passes, the checkpoint content is emitted first to the consumer
4. The `buildContinuationPrompt` callback (if provided) is called to allow updating the prompt for continuation
5. Telemetry tracks whether continuation was enabled, used, and the checkpoint details
### Using buildContinuationPrompt
To have the LLM actually continue from where it left off (rather than just replaying tokens locally), use `buildContinuationPrompt` to modify the prompt:
```typescript
let continuationPrompt = "";
const originalPrompt = "Write a detailed analysis of...";

const result = await l0({
  stream: () =>
    streamText({
      model: openai("gpt-4o"),
      prompt: continuationPrompt || originalPrompt,
    }),
  continueFromLastKnownGoodToken: true,
  buildContinuationPrompt: (checkpoint) => {
    // Update the prompt to tell the LLM to continue from checkpoint
    continuationPrompt = `${originalPrompt}\n\nContinue from where you left off:\n${checkpoint}`;
    return continuationPrompt;
  },
  retry: { attempts: 3 },
});
```
When LLMs continue from a checkpoint, they often repeat words from the end. L0 automatically detects and removes this overlap (enabled by default). See [API Reference](./API.md#smart-continuation-deduplication) for configuration options.
### Example: Resuming After Network Error
```typescript
const result = await l0({
  stream: () =>
    streamText({
      model: openai("gpt-4o"),
      prompt: "Write a detailed analysis of...",
    }),
  fallbackStreams: [() => streamText({ model: openai("gpt-5-nano"), prompt })],
  retry: { attempts: 3 },
  continueFromLastKnownGoodToken: true,
  checkIntervals: { checkpoint: 10 }, // Save checkpoint every 10 tokens
  monitoring: { enabled: true },
});

for await (const event of result.stream) {
  if (event.type === "token") {
    process.stdout.write(event.value);
  }
}

// Check telemetry for continuation usage
if (result.telemetry?.continuation?.used) {
  console.log(
    "\nResumed from checkpoint of length:",
    result.telemetry.continuation.checkpointLength,
  );
}
```
### Checkpoint Validation
Before using a checkpoint for continuation, L0 validates it:
- **Guardrails**: All configured guardrails are run against the checkpoint content
- **Drift Detection**: If enabled, checks for format drift in the checkpoint
- **Fatal Violations**: If any guardrail returns a fatal violation, the checkpoint is discarded and retry starts fresh
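
A minimal sketch that wires these pieces together, using only options shown earlier (the log messages are illustrative):

```typescript
const result = await l0({
  stream: () => streamText({ model, prompt }),
  guardrails: recommendedGuardrails,
  continueFromLastKnownGoodToken: true,
  // A fatal guardrail violation discards the checkpoint, so the retry starts fresh
  onViolation: (violation) => console.warn("Guardrail violation:", violation.rule),
  // onResume only fires when a validated checkpoint is actually reused
  onResume: (checkpoint, tokenCount) =>
    console.log(`Resuming from a ${tokenCount}-token checkpoint`),
});
```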
### Important Limitations
> ⚠️ **Do NOT use `continueFromLastKnownGoodToken` with structured output or `streamObject()`.**
>
> Continuation works by prepending checkpoint content to the next generation. For JSON/structured output, this can corrupt the data structure because:
>
> - The model may not properly continue the JSON syntax
> - Partial objects could result in invalid JSON
> - Schema validation may fail on malformed output
>
> For structured output, let L0 retry from scratch to ensure valid JSON.
```typescript
// ✅ GOOD - Text generation with continuation
const result = await l0({
  stream: () => streamText({ model, prompt: "Write an essay..." }),
  continueFromLastKnownGoodToken: true,
});

// ❌ BAD - Do NOT use with structured output
const result = await structured({
  schema: mySchema,
  stream: () => streamText({ model, prompt }),
  continueFromLastKnownGoodToken: true, // DON'T DO THIS
});
```
---
## Guardrails
Pure functions that validate streaming output without rewriting it:
```typescript
import {
  jsonRule,
  markdownRule,
  zeroOutputRule,
  patternRule,
  customPatternRule,
} from "@ai2070/l0";

const result = await l0({
  stream: () => streamText({ model, prompt }),
  guardrails: [
    jsonRule(), // Validates JSON structure
    markdownRule(), // Validates Markdown fences/tables
    zeroOutputRule(), // Detects empty output
    patternRule(), // Detects "As an AI..." patterns
    customPatternRule([/forbidden/i], "Custom violation"),
  ],
});
```
### Presets
```typescript
import {
  minimalGuardrails, // jsonRule, zeroOutputRule
  recommendedGuardrails, // jsonRule, markdownRule, zeroOutputRule, patternRule
  strictGuardrails, // jsonRule, markdownRule, latexRule, patternRule, zeroOutputRule
  jsonOnlyGuardrails, // jsonRule, zeroOutputRule
  markdownOnlyGuardrails, // markdownRule, zeroOutputRule
  latexOnlyGuardrails, // latexRule, zeroOutputRule
} from "@ai2070/l0";
```
| Preset | Rules Included |
| ------------------------ | ------------------------------------------------------------------------ |
| `minimalGuardrails` | `jsonRule`, `zeroOutputRule` |
| `recommendedGuardrails` | `jsonRule`, `markdownRule`, `zeroOutputRule`, `patternRule` |
| `strictGuardrails` | `jsonRule`, `markdownRule`, `latexRule`, `patternRule`, `zeroOutputRule` |
| `jsonOnlyGuardrails` | `jsonRule`, `zeroOutputRule` |
| `markdownOnlyGuardrails` | `markdownRule`, `zeroOutputRule` |
| `latexOnlyGuardrails` | `latexRule`, `zeroOutputRule` |
### Fast/Slow Path Execution
L0 uses a two-path strategy to avoid blocking the streaming loop:

| Path | When | Behavior |
| -------- | ------------------------ | ------------------------------------------- |
| **Fast** | Delta < 1KB, total < 5KB | Synchronous check, immediate result |
| **Slow** | Large content | Deferred via `setImmediate()`, non-blocking |
For long outputs, tune the check frequency:
```typescript
await l0({
  stream,
  guardrails: recommendedGuardrails,
  checkIntervals: {
    guardrails: 50, // Check every 50 tokens (default: 5)
  },
});
```
See [GUARDRAILS.md](./GUARDRAILS.md) for full documentation.

---
## Consensus
Multi-generation consensus for high-confidence results:
```typescript
import { consensus } from "@ai2070/l0";

const result = await consensus({
  streams: [
    () => streamText({ model, prompt }),
    () => streamText({ model, prompt }),
    () => streamText({ model, prompt }),
  ],
  strategy: "majority", // or "unanimous", "weighted", "best"
  threshold: 0.8,
});

console.log(result.consensus); // Agreed output
console.log(result.confidence); // 0-1 confidence score
console.log(result.agreements); // What they agreed on
console.log(result.disagreements); // Where they differed
```
---
## Parallel Operations
Run multiple LLM calls concurrently with different patterns:
### Race - First Response Wins
```typescript
import { race } from "@ai2070/l0";

const result = await race([
  { stream: () => streamText({ model: openai("gpt-4o"), prompt }) },
  { stream: () => streamText({ model: anthropic("claude-3-opus"), prompt }) },
  { stream: () => streamText({ model: google("gemini-pro"), prompt }) },
]);

// Returns first successful response, cancels others
console.log(result.winnerIndex); // 0-based index of winning stream
console.log(result.state.content); // Content from winning stream
```
### Parallel with Concurrency Control
```typescript
import { parallel } from "@ai2070/l0";

const results = await parallel(
  [
    { stream: () => streamText({ model, prompt: "Task 1" }) },
    { stream: () => streamText({ model, prompt: "Task 2" }) },
    { stream: () => streamText({ model, prompt: "Task 3" }) },
  ],
  {
    concurrency: 2, // Max 2 concurrent
    failFast: false, // Continue on errors
  },
);

console.log(results.successCount);
console.log(results.results[0]?.state.content);
```
### Fall-Through vs Race
| Pattern | Execution | Cost | Best For |
| ------------ | --------------------------- | ------------------ | --------------------------------- |
| Fall-through | Sequential, next on failure | Low (pay for 1) | High availability, cost-sensitive |
| Race | Parallel, first wins | High (pay for all) | Low latency, speed-critical |
```typescript
// Fall-through: Try models sequentially
const result = await l0({
  stream: () => streamText({ model: openai("gpt-4o"), prompt }),
  fallbackStreams: [
    () => streamText({ model: openai("gpt-5-nano"), prompt }),
    () => streamText({ model: anthropic("claude-3-haiku"), prompt }),
  ],
});

// Race: All models simultaneously, first wins
const result = await race([
  { stream: () => streamText({ model: openai("gpt-4o"), prompt }) },
  { stream: () => streamText({ model: anthropic("claude-3-opus"), prompt }) },
]);
```
### Operation Pool
For dynamic workloads, use `OperationPool` to process operations with a shared concurrency limit:
```typescript
import { createPool } from "@ai2070/l0";
const pool = createPool(3); // Max 3 concurrent operations
// Add operations dynamically
const result1 = pool.execute({ stream: () => streamText({ model, prompt: "Task 1" }) });
const result2 = pool.execute({ stream: () => streamText({ model, prompt: "Task 2" }) });
// Wait for all operations to complete
await pool.drain();
// Pool methods
pool.getQueueLength(); // Pending operations
pool.getActiveWorkers(); // Currently executing
```
---
## Type-Safe Generics
All L0 functions support generic type parameters to forward your output types:
```typescript
import { l0, parallel, race, consensus } from "@ai2070/l0";

// Typed output (compile-time type annotation)
interface UserProfile {
  name: string;
  age: number;
  email: string;
}

const result = await l0<UserProfile>({
  stream: () => streamText({ model, prompt }),
});
// result is L0Result<UserProfile> - generic enables type inference in callbacks

// Works with all parallel operations
const raceResult = await race<UserProfile>([
  { stream: () => streamText({ model: openai("gpt-4o"), prompt }) },
  { stream: () => streamText({ model: anthropic("claude-3-opus"), prompt }) },
]);

const parallelResults = await parallel<UserProfile>(operations);
// parallelResults.results[0]?.state is typed

// Consensus with type inference
const consensusResult = await consensus<typeof schema>({
  streams: [stream1, stream2, stream3],
  schema,
});
```
---
## Custom Adapters (BYOA)
L0 supports custom adapters for integrating any LLM provider. Built-in adapters include `openaiAdapter`, `mastraAdapter`, and `anthropicAdapter` (reference implementation).
### Explicit Adapter Usage
```typescript
import { l0, openaiAdapter } from "@ai2070/l0";
import OpenAI from "openai";

const openai = new OpenAI();

const result = await l0({
  stream: () =>
    openai.chat.completions.create({
      model: "gpt-4",
      messages: [{ role: "user", content: "Hello!" }],
      stream: true,
    }),
  adapter: openaiAdapter,
});
```
### Building Custom Adapters
```typescript
import { toL0Events, type L0Adapter } from "@ai2070/l0";

interface MyChunk {
  text?: string;
}

const myAdapter: L0Adapter<AsyncIterable<MyChunk>> = {
  name: "myai",

  // Optional: Enable auto-detection
  detect(input): input is AsyncIterable<MyChunk> {
    return !!input && typeof input === "object" && "__myMarker" in input;
  },

  // Convert provider stream to L0 events
  wrap(stream) {
    return toL0Events(stream, (chunk) => chunk.text ?? null);
  },
};
```
### Adapter Invariants
Adapters MUST:
- Preserve text exactly (no trimming, no modification)
- Include timestamps on every event
- Convert errors to error events (never throw)
- Emit a `complete` event exactly once at the end

See [CUSTOM_ADAPTERS.md](./CUSTOM_ADAPTERS.md) for the complete guide, including helper functions, the registry API, and testing patterns.

---
## Multimodal Support
L0 supports image, audio, and video generation with progress tracking and data events:
```typescript
import { l0, toMultimodalL0Events, type L0Adapter } from "@ai2070/l0";

const fluxAdapter: L0Adapter<FluxStream> = {
  name: "flux",
  wrap: (stream) =>
    toMultimodalL0Events(stream, {
      extractProgress: (chunk) =>
        chunk.type === "progress" ? { percent: chunk.percent } : null,
      extractData: (chunk) =>
        chunk.type === "image"
          ? {
              contentType: "image",
              mimeType: "image/png",
              base64: chunk.image,
              metadata: {
                width: chunk.width,
                height: chunk.height,
                seed: chunk.seed,
              },
            }
          : null,
    }),
};

const result = await l0({
  stream: () => fluxGenerate({ prompt: "A cat in space" }),
  adapter: fluxAdapter,
});

for await (const event of result.stream) {
  if (event.type === "progress") console.log(`${event.progress?.percent}%`);
  if (event.type === "data") saveImage(event.data?.base64);
}

// All generated images available in state
console.log(result.state.dataOutputs);
```
See [MULTIMODAL.md](./MULTIMODAL.md) for the complete guide.

---
## Lifecycle Callbacks
L0 provides callbacks for every phase of stream execution, giving you full observability into the streaming lifecycle:
```typescript
const result = await l0({
  stream: () => streamText({ model, prompt }),
  fallbackStreams: [() => streamText({ model: fallbackModel, prompt })],
  guardrails: recommendedGuardrails,
  continueFromLastKnownGoodToken: true,
  retry: { attempts: 3 },

  // Called when a new execution attempt begins
  onStart: (attempt, isRetry, isFallback) => {
    console.log(`Starting attempt ${attempt}`);
    if (isRetry) console.log("  (retry)");
    if (isFallback) console.log("  (fallback model)");
  },

  // Called when stream completes successfully
  onComplete: (state) => {
    console.log(`Completed with ${state.tokenCount} tokens`);
    console.log(`Duration: ${state.duration}ms`);
  },

  // Called when an error occurs (before retry/fallback decision)
  onError: (error, willRetry, willFallback) => {
    console.error(`Error: ${error.message}`);
    if (willRetry) console.log("  Will retry...");
    if (willFallback) console.log("  Will try fallback...");
  },

  // Called for every L0 event
  onEvent: (event) => {
    if (event.type === "token") {
      process.stdout.write(event.value || "");
    }
  },

  // Called when a guardrail violation is detected
  onViolation: (violation) => {
    console.warn(`Violation: ${violation.rule}`);
    console.warn(`  ${violation.message}`);
  },

  // Called when a retry is triggered
  onRetry: (attempt, reason) => {
    console.log(`Retrying (attempt ${attempt}): ${reason}`);
  },

  // Called when switching to a fallback model
  onFallback: (index, reason) => {
    console.log(`Switching to fallback ${index}: ${reason}`);
  },

  // Called when resuming from checkpoint
  onResume: (checkpoint, tokenCount) => {
    console.log(`Resuming from checkpoint (${tokenCount} tokens)`);
  },

  // Called when a checkpoint is saved
  onCheckpoint: (checkpoint, tokenCount) => {
    console.log(`Checkpoint saved (${tokenCount} tokens)`);
  },

  // Called when a timeout occurs
  onTimeout: (type, elapsedMs) => {
    console.log(`Timeout: ${type} after ${elapsedMs}ms`);
  },

  // Called when the stream is aborted
  onAbort: (tokenCount, contentLength) => {
    console.log(`Aborted after ${tokenCount} tokens (${contentLength} chars)`);
  },

  // Called when drift is detected
  onDrift: (types, confidence) => {
    console.log(
      `Drift detected: ${types.join(", ")} (confidence: ${confidence})`,
    );
  },

  // Called when a tool call is detected
  onToolCall: (toolName, toolCallId, args) => {
    console.log(`Tool call: ${toolName} (${toolCallId})`);
    console.log(`  Args: ${JSON.stringify(args)}`);
  },
});
```
## Deterministic Lifecycle Flow
```
┌─────────────────────────────────────────────┐
│              L0 LIFECYCLE FLOW              │
└─────────────────────────────────────────────┘

                   ┌───────┐
                   │ START │
                   └───┬───┘
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │   onStart(attempt, false, false)    │
    └──────────────────┬──────────────────┘
                       │
                       ▼
┌───────────────────────────────────────────────────────────┐
│                      STREAMING PHASE                       │
│                                                            │
│  onEvent(event) fires for every event                      │
│                                                            │
│  During streaming, these fire as conditions occur:         │
│    onCheckpoint(checkpoint, tokenCount)                    │
│    onToolCall(toolName, id, args)                          │
│    onDrift(types, confidence)   ──┐                        │
│    onTimeout(type, elapsedMs)   ──┴── triggers retry       │
└────────────────────────┬──────────────────────────────────┘
                         │
         ┌───────────────┼────────────────┬────────────────┐
         ▼               ▼                ▼                ▼
    ┌─────────┐     ┌─────────┐     ┌───────────┐     ┌─────────┐
    │ SUCCESS │     │  ERROR  │     │ VIOLATION │     │  ABORT  │
    └────┬────┘     └────┬────┘     └─────┬─────┘     └────┬────┘
         │               │                ▼                ▼
         │               │         ┌─────────────┐   ┌──────────┐
         │               │         │ onViolation │   │ onAbort  │
         │               │         └─────┬───────┘   └──────────┘
         │               ▼               ▼
         │      ┌─────────────────────────────────────────┐
         │      │ onError(error, willRetry, willFallback) │
         │      └────────────────────┬────────────────────┘
         │                           │
         │           ┌───────────────┼──────────────┐
         │           ▼               ▼              ▼
         │      ┌─────────┐    ┌──────────┐   ┌─────────┐
         │      │  RETRY  │    │ FALLBACK │   │  FATAL  │
         │      └────┬────┘    └────┬─────┘   └────┬────┘
         │           │              │              │
         │           ▼              ▼              │
         │     onRetry(...)   onFallback(...)      │
         │           │              │              │
         │           └──────┬───────┘              │
         │                  │                      │
         │                  ▼                      │
         │          Has checkpoint?                │
         │      (YES → onResume first, NO → skip)  │
         │                  │                      │
         │                  ▼                      │
         │   ┌───────────────────────────────┐     │
         │   │ onStart(attempt, isRetry,     │─────┼──► Back to STREAMING
         │   │          isFallback)          │     │
         │   └───────────────────────────────┘     │
         ▼                                         ▼
  ┌──────────────┐                           ┌───────────┐
  │  onComplete  │                           │   THROW   │
  │   (state)    │                           │   ERROR   │
  └──────────────┘                           └───────────┘
```
### Callback Reference
| Callback | When Called | Signature |
| -------------- | -------------------------------------- | ------------------------------------------------------------------------------- |
| `onStart