# 🔭 autotel

[![npm version](https://img.shields.io/npm/v/autotel.svg?label=autotel)](https://www.npmjs.com/package/autotel)
[![npm subscribers](https://img.shields.io/npm/v/autotel-subscribers.svg?label=subscribers)](https://www.npmjs.com/package/autotel-subscribers)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

**Write once, observe everywhere.** Instrument your Node.js code a single time, keep the DX you love, and stream traces, metrics, logs, and product events to **any** observability stack without vendor lock-in.

- **Drop-in DX**: one `init()` and ergonomic helpers like `trace()`, `span()`, `withTracing()`, decorators, and batch instrumentation.
- **Platform freedom**: OTLP-first design plus subscribers for PostHog, Mixpanel, Amplitude, and anything else via custom exporters/readers.
- **Production hardening**: adaptive sampling (10% baseline, 100% errors/slow paths), rate limiting, circuit breakers, payload validation, and automatic sensitive-field redaction.
- **Auto enrichment**: service metadata, deployment info, and AsyncLocalStorage-powered correlation IDs automatically flow into spans, metrics, logs, and events.

> Raw OpenTelemetry is verbose, and vendor SDKs create lock-in. Autotel gives you the best parts of both: clean ergonomics **and** total ownership of your telemetry.

## Migrating from OpenTelemetry?

**[Migration Guide](../../docs/MIGRATION.md)** - Pattern-by-pattern migration walkthrough with side-by-side comparisons and deployment checklist. Replace `NODE_OPTIONS` and 30+ lines of SDK boilerplate with `init()`, and wrap functions with `trace()` instead of manual `span.start()`/`span.end()`.

---

## Table of Contents

- [🔭 autotel](#-autotel)
- [Migrating from OpenTelemetry?](#migrating-from-opentelemetry)
- [Table of Contents](#table-of-contents)
- [Why Autotel](#why-autotel)
- [Quick Start](#quick-start)
- [1. Install](#1-install)
- [2. Initialize once at startup](#2-initialize-once-at-startup)
- [3.
Instrument code with `trace()`](#3-instrument-code-with-trace)
- [4. See the value everywhere](#4-see-the-value-everywhere)
- [Choose Any Destination](#choose-any-destination)
- [LLM Observability with OpenLLMetry](#llm-observability-with-openllmetry)
- [Installation](#installation)
- [Usage](#usage)
- [Sampling](#sampling)
- [Preset Shorthand](#preset-shorthand)
- [Tuned Presets](#tuned-presets)
- [YAML Configuration](#yaml-configuration)
- [Precedence Rules](#precedence-rules)
- [Tail-Sampling Attributes](#tail-sampling-attributes)
- [Core Building Blocks](#core-building-blocks)
- [trace()](#trace)
- [span()](#span)
- [Trace Context (`ctx`)](#trace-context-ctx)
- [Baggage (Context Propagation)](#baggage-context-propagation)
- [Reusable Middleware Helpers](#reusable-middleware-helpers)
- [Decorators (TypeScript 5+)](#decorators-typescript-5)
- [Database Instrumentation](#database-instrumentation)
- [Type-Safe Attributes](#type-safe-attributes)
- [Pattern A: Key Builders](#pattern-a-key-builders)
- [Pattern B: Object Builders](#pattern-b-object-builders)
- [Attachers (Signal Helpers)](#attachers-signal-helpers)
- [PII Guardrails](#pii-guardrails)
- [Domain Helpers](#domain-helpers)
- [Available Attribute Domains](#available-attribute-domains)
- [Resource Merging](#resource-merging)
- [Event-Driven Architectures](#event-driven-architectures)
- [Message Producers (Kafka, SQS, RabbitMQ)](#message-producers-kafka-sqs-rabbitmq)
- [Message Consumers](#message-consumers)
- [Consumer Lag Metrics](#consumer-lag-metrics)
- [Custom Messaging System Adapters](#custom-messaging-system-adapters)
- [Safe Baggage Propagation](#safe-baggage-propagation)
- [BusinessBaggage (Pre-built Schema)](#businessbaggage-pre-built-schema)
- [Custom Baggage Schemas](#custom-baggage-schemas)
- [Workflow \& Saga Tracing](#workflow--saga-tracing)
- [Basic Workflows](#basic-workflows)
- [Saga Pattern with Compensation](#saga-pattern-with-compensation)
- [Business Metrics \& Product
Events](#business-metrics--product-events)
- [OpenTelemetry Metrics (Metric class + helpers)](#opentelemetry-metrics-metric-class--helpers)
- [Product Events (PostHog, Mixpanel, Amplitude, …)](#product-events-posthog-mixpanel-amplitude-)
- [Logging with Trace Context](#logging-with-trace-context)
- [Using Pino (recommended)](#using-pino-recommended)
- [Using Winston](#using-winston)
- [Using Bunyan (or other loggers)](#using-bunyan-or-other-loggers)
- [What you get automatically](#what-you-get-automatically)
- [Canonical Log Lines (Wide Events)](#canonical-log-lines-wide-events)
- [Basic Usage](#basic-usage)
- [What You Get](#what-you-get)
- [Query Examples](#query-examples)
- [Configuration Options](#configuration-options)
- [Request Logger DX](#request-logger-dx)
- [Drain Pipeline (Batch + Retry + Flush)](#drain-pipeline-batch--retry--flush)
- [parseError (Frontend/API Consumers)](#parseerror-frontendapi-consumers)
- [Auto Instrumentation \& Advanced Configuration](#auto-instrumentation--advanced-configuration)
- [⚠️ autoInstrumentations vs.
Manual Instrumentations](#️-autoinstrumentations-vs-manual-instrumentations)
- [Option A: Auto-instrumentations only (all defaults)](#option-a-auto-instrumentations-only-all-defaults)
- [Option B: Manual instrumentations with custom configs](#option-b-manual-instrumentations-with-custom-configs)
- [Option C: Mix auto + manual (best of both)](#option-c-mix-auto--manual-best-of-both)
- [⚠️ Auto-Instrumentation Setup Requirements](#️-auto-instrumentation-setup-requirements)
- [Operational Safety \& Runtime Controls](#operational-safety--runtime-controls)
- [Configuration Reference](#configuration-reference)
- [Building Custom Instrumentation](#building-custom-instrumentation)
- [Instrumenting Queue Consumers](#instrumenting-queue-consumers)
- [Instrumenting Scheduled Jobs / Cron](#instrumenting-scheduled-jobs--cron)
- [Creating Custom Event Subscribers](#creating-custom-event-subscribers)
- [Low-Level Span Manipulation](#low-level-span-manipulation)
- [Custom Metrics](#custom-metrics)
- [Serverless \& Short-lived Processes](#serverless--short-lived-processes)
- [Manual Flush (Recommended for Serverless)](#manual-flush-recommended-for-serverless)
- [Auto-Flush Spans (Opt-in)](#auto-flush-spans-opt-in)
- [Edge Runtimes (Cloudflare Workers, Vercel Edge)](#edge-runtimes-cloudflare-workers-vercel-edge)
- [API Reference](#api-reference)
- [FAQ \& Next Steps](#faq--next-steps)
- [Troubleshooting \& Debugging](#troubleshooting--debugging)
- [Quick Debug Mode (Recommended)](#quick-debug-mode-recommended)
- [Manual Configuration (Advanced)](#manual-configuration-advanced)
- [ConsoleSpanExporter (Visual Debugging)](#consolespanexporter-visual-debugging)
- [InMemorySpanExporter (Testing \& Assertions)](#inmemoryspanexporter-testing--assertions)
- [Using Both (Advanced)](#using-both-advanced)
- [Creating Custom Instrumentation](#creating-custom-instrumentation)
- [Quick Start Template](#quick-start-template)
- [Step-by-Step Tutorial: Instrumenting
Axios](#step-by-step-tutorial-instrumenting-axios)
- [Best Practices](#best-practices)
- [1. Idempotent Instrumentation](#1-idempotent-instrumentation)
- [2. Error Handling](#2-error-handling)
- [3. Security - Don't Capture Sensitive Data](#3-security---dont-capture-sensitive-data)
- [4. Follow OpenTelemetry Semantic Conventions](#4-follow-opentelemetry-semantic-conventions)
- [5. Choose the Right SpanKind](#5-choose-the-right-spankind)
- [6. TypeScript Type Safety](#6-typescript-type-safety)
- [Available Utilities](#available-utilities)
- [From `autotel/trace-helpers`](#from-autoteltrace-helpers)
- [From `@opentelemetry/api`](#from-opentelemetryapi)
- [Semantic Conventions (Optional)](#semantic-conventions-optional)
- [Real-World Examples](#real-world-examples)
- [When to Create Custom Instrumentation](#when-to-create-custom-instrumentation)
- [Using Official Instrumentation](#using-official-instrumentation)

## Why Autotel

| Challenge | With autotel |
| --- | --- |
| Writing raw OpenTelemetry spans/metrics takes dozens of lines and manual lifecycle management. | Wrap any function in `trace()` or `span()` and get automatic span lifecycle, error capture, attributes, and adaptive sampling. |
| Vendor SDKs simplify setup but trap your data in a single platform. | Autotel is OTLP-native and works with Grafana Cloud, Datadog, New Relic, Tempo, Honeycomb, Elasticsearch, or your own collector. |
| Teams need both observability **and** product events. | Ship technical telemetry and funnel/behavior events through the same API with contextual enrichment. |
| Production readiness requires redaction, rate limiting, and circuit breakers. | Those guardrails are on by default so you can safely enable telemetry everywhere. |

## Quick Start

> Want to follow along in code? This repo ships with `apps/example-basic` (mirrors the steps below) and `apps/example-http` for an Express server; you can run either with `pnpm start` after `pnpm install && pnpm build` at the root.

### 1. Install

```bash
npm install autotel
npm install -D autotel-devtools # optional but recommended for local DX

# or
pnpm add autotel
pnpm add -D autotel-devtools
```

### 2. Initialize once at startup

```typescript
import { init } from 'autotel';

init({
  service: 'checkout-api',
  devtools: true,
});
```

Defaults:

- OTLP endpoint: `process.env.OTLP_ENDPOINT || http://localhost:4318`
- Metrics: on in every environment
- Sampler: adaptive (10% baseline, 100% for errors/slow spans)
- Version: auto-detected from `package.json`
- Events auto-flush when the root span finishes

Recommended local workflow:

```typescript
init({
  service: 'checkout-api',
  devtools: true,
});
```

- `devtools: true` points traces, metrics, and logs at local `autotel-devtools`
- `devtools: { embedded: true }` tries to start `autotel-devtools` for you
- when you switch to a hosted backend, replace `devtools` with `endpoint` and optional `headers`

Example remote backend config:

```typescript
init({
  service: 'checkout-api',
  endpoint: process.env.OTEL_EXPORTER_OTLP_ENDPOINT,
  headers: process.env.OTEL_EXPORTER_OTLP_HEADERS,
});
```

Sampling presets:

- Simple path: `sampling: 'development' | 'errors-only' | 'production' | 'off'`
- Advanced path: `samplingPresets.development()`, `samplingPresets.errorsOnly()`, `samplingPresets.production({...})`, `samplingPresets.off()`
- Precedence is always `sampler > sampling > default`
- If you use YAML `sampling.preset`, extra tuning fields in that same block are ignored. Use the programmatic API with `sampler` or `samplingPresets.production({...})` when you need overrides.
- Tail-sampling hint attributes use the `autotel.*` namespace, for example `autotel.sampling.tail.keep`.
  This is intentional: OpenTelemetry does not define an official semantic-convention key for these internal hints, so autotel uses library-prefixed custom attributes rather than inventing fake `otel.*` semconv keys.

### 3. Instrument code with `trace()`

```typescript
import { trace } from 'autotel';

export const createUser = trace(async function createUser(
  data: CreateUserData,
) {
  const user = await db.users.insert(data);
  return user;
});
```

- Named function expressions automatically become span names (`code.function`).
- Errors are recorded, spans are ended, and status is set automatically.

### 4. See the value everywhere

```typescript
import { init, trace, track } from 'autotel';
// PostHogSubscriber ships in the separate autotel-subscribers package;
// import it from there (check that package's docs for the exact entry point).

init({
  service: 'checkout-api',
  endpoint: 'https://otlp-gateway-prod.grafana.net/otlp',
  subscribers: [new PostHogSubscriber({ apiKey: process.env.POSTHOG_KEY! })],
});

export const processOrder = trace(async function processOrder(order) {
  track('order.completed', { amount: order.total });
  return charge(order);
});
```

Every span, metric, log line, and event includes `traceId`, `spanId`, `operation.name`, `service.version`, and `deployment.environment` automatically.
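
To make the lifecycle guarantees from step 3 concrete (errors recorded, spans ended, status set), here is a minimal, self-contained sketch of what a `trace()`-style wrapper does conceptually. It is not autotel's implementation: `ToySpan`, `finishedSpans`, and `toyTrace` are hypothetical names used only for illustration, and real spans come from OpenTelemetry rather than an in-memory array.

```typescript
// Hypothetical in-memory span record; real spans come from OpenTelemetry.
interface ToySpan {
  name: string;
  status: 'unset' | 'ok' | 'error';
  error?: string;
  ended: boolean;
}

const finishedSpans: ToySpan[] = [];

// Wraps an async function so every call gets a span that is always ended,
// with errors recorded on the span and rethrown to the caller.
function toyTrace<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    // A named function expression supplies the span name.
    const span: ToySpan = {
      name: fn.name || 'anonymous',
      status: 'unset',
      ended: false,
    };
    try {
      const result = await fn(...args);
      span.status = 'ok';
      return result;
    } catch (err) {
      span.status = 'error';
      span.error = err instanceof Error ? err.message : String(err);
      throw err;
    } finally {
      // Runs on both success and failure, so no span is left open.
      span.ended = true;
      finishedSpans.push(span);
    }
  };
}

const createToyUser = toyTrace(async function createToyUser(name: string) {
  if (!name) throw new Error('name is required');
  return { id: 1, name };
});
```

The `finally` block is the important design choice: the span is closed exactly once whether the wrapped function resolves or throws, which is the bookkeeping that manual `span.start()`/`span.end()` code tends to get wrong.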
## Choose Any Destination

```typescript
import { init } from 'autotel';
// OpenTelemetry classes used by the custom pipeline example below
// (HTTP exporter variants assumed from the /v1/... URLs).
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { JaegerExporter } from '@opentelemetry/exporter-jaeger';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { BatchLogRecordProcessor } from '@opentelemetry/sdk-logs';
import { OTLPLogExporter } from '@opentelemetry/exporter-logs-otlp-http';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
// PostHogSubscriber and MixpanelSubscriber come from the autotel-subscribers package.

// Each init() call below is an alternative; call init() once per process.

init({
  service: 'my-app',
  // Grafana / Tempo / OTLP collector
  endpoint: 'https://otlp-gateway-prod.grafana.net/otlp',
});

init({
  service: 'my-app',
  // Datadog (traces + metrics + logs via OTLP)
  endpoint: 'https://otlp.datadoghq.com',
  headers: 'dd-api-key=...',
});

init({
  service: 'my-app',
  // Honeycomb (gRPC protocol)
  protocol: 'grpc',
  endpoint: 'api.honeycomb.io:443',
  headers: {
    'x-honeycomb-team': process.env.HONEYCOMB_API_KEY!,
  },
});

init({
  service: 'my-app',
  // Custom pipeline with your own exporters/readers
  spanProcessor: new BatchSpanProcessor(
    new JaegerExporter({ endpoint: 'http://otel:14268/api/traces' }),
  ),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'https://metrics.example.com/v1/metrics',
    }),
  }),
  logRecordProcessors: [
    new BatchLogRecordProcessor(
      new OTLPLogExporter({ url: 'https://logs.example.com/v1/logs' }),
    ),
  ],
  instrumentations: [new HttpInstrumentation()],
});

init({
  service: 'my-app',
  // Product events subscribers (ship alongside OTLP)
  subscribers: [
    new PostHogSubscriber({ apiKey: process.env.POSTHOG_KEY! }),
    new MixpanelSubscriber({ projectToken: process.env.MIXPANEL_TOKEN! }),
  ],
});

init({
  service: 'my-app',
  // OpenLLMetry integration for LLM observability
  openllmetry: {
    enabled: true,
    options: {
      disableBatch: process.env.NODE_ENV !== 'production',
      apiKey: process.env.TRACELOOP_API_KEY,
    },
  },
});
```

Autotel never owns your data; it's a thin layer over OpenTelemetry with optional adapters.

## LLM Observability with OpenLLMetry

Autotel integrates with [OpenLLMetry](https://github.com/traceloop/openllmetry) to provide observability for LLM applications. OpenLLMetry automatically instruments LLM providers (OpenAI, Anthropic, etc.), vector databases, and frameworks (LangChain, LlamaIndex, etc.).
### Installation

Install the OpenLLMetry SDK as an optional peer dependency:

```bash
pnpm add @traceloop/node-server-sdk
# or
npm install @traceloop/node-server-sdk
```

### Usage

Enable OpenLLMetry in your autotel configuration:

```typescript
import { init } from 'autotel';

init({
  service: 'my-llm-app',
  endpoint: process.env.OTLP_ENDPOINT,
  openllmetry: {
    enabled: true,
    options: {
      // Disable batching in development for immediate traces
      disableBatch: process.env.NODE_ENV !== 'production',
      // Optional: Traceloop API key if using Traceloop backend
      apiKey: process.env.TRACELOOP_API_KEY,
    },
  },
});
```

OpenLLMetry will automatically:

- Instrument LLM calls (OpenAI, Anthropic, Cohere, etc.)
- Track vector database operations (Pinecone, Chroma, Qdrant, etc.)
- Monitor LLM frameworks (LangChain, LlamaIndex, LangGraph, etc.)
- Reuse autotel's OpenTelemetry tracer provider for unified traces

All LLM spans will appear alongside your application traces in your observability backend.

**AI Workflow Patterns:** See [AI/LLM Workflow Documentation](../../docs/AI_WORKFLOWS.md) for comprehensive patterns including:

- Multi-agent workflows (orchestration and handoffs)
- RAG pipelines (embeddings, search, generation)
- Streaming responses
- Evaluation loops
- Working examples in `apps/example-ai-agent`

## Sampling

Autotel defaults to production-ready adaptive sampling: a 10% baseline, with errors and slow requests kept automatically.

### Preset Shorthand

Use the `sampling` field on `init()` when you want the shortest path:

```typescript
import { init } from 'autotel';

init({
  service: 'checkout-api',
  sampling: 'production',
});
```

Available string presets:

- `'development'`: keep everything
- `'errors-only'`: drop healthy baseline traffic, keep errors
- `'production'`: 10% baseline plus errors and slow traces
- `'off'`: disable sampling entirely

String presets intentionally use kebab-case. For example, the string form is `sampling: 'errors-only'`.
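
The `'production'` behavior described above (a baseline rate for healthy traffic, with errors and slow traces always kept) can be pictured as a small decision function. This is a sketch under stated assumptions, not autotel's sampler: `TraceSummary`, `AdaptivePolicy`, and `shouldKeep` are hypothetical names, the option names mirror `baselineSampleRate` and `slowThresholdMs` from the tuned presets, and the 1000 ms threshold is an assumed value since the default is not stated here.

```typescript
// Hypothetical summary of a finished trace; field names are illustrative.
interface TraceSummary {
  hasError: boolean;
  durationMs: number;
}

interface AdaptivePolicy {
  baselineSampleRate: number; // fraction of healthy traces to keep, 0..1
  slowThresholdMs: number; // traces at or above this duration count as slow
}

// Keep every error and every slow trace; keep healthy traces at the baseline rate.
// `random` is injectable so the decision can be made deterministic in tests.
function shouldKeep(
  summary: TraceSummary,
  policy: AdaptivePolicy,
  random: () => number = Math.random,
): boolean {
  if (summary.hasError) return true;
  if (summary.durationMs >= policy.slowThresholdMs) return true;
  return random() < policy.baselineSampleRate;
}

// Assumed values: 10% baseline from the README, 1000 ms chosen for illustration.
const productionPolicy: AdaptivePolicy = {
  baselineSampleRate: 0.1,
  slowThresholdMs: 1000,
};

// An `errors-only` style policy is the same function with a zero baseline
// and a threshold that no trace reaches.
const errorsOnlyPolicy: AdaptivePolicy = {
  baselineSampleRate: 0,
  slowThresholdMs: Number.POSITIVE_INFINITY,
};
```

Because the error and latency checks run before the random draw, lowering the baseline rate reduces volume without ever discarding the traces you are most likely to debug.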
### Tuned Presets

Use `samplingPresets` when you want preset behavior with tuned thresholds or rates:

```typescript
import { init, samplingPresets } from 'autotel';

init({
  service: 'checkout-api',
  sampler: samplingPresets.production({
    baselineSampleRate: 0.05,
    slowThresholdMs: 500,
  }),
});
```

Factory names intentionally use JavaScript-style camelCase. For example, the factory form is `samplingPresets.errorsOnly()`.

### YAML Configuration

Use `sampling.preset` for the simple YAML path:

```yaml
sampling:
  preset: production
```

If you need tuned sampling in YAML today, prefer the explicit sampler config block:

```yaml
sampling:
  type: adaptive
  baseline_rate: 0.05
  always_sample_errors: true
  always_sample_slow: true
  slow_threshold_ms: 500
```

When `sampling.preset` is set, other keys in the same YAML sampling block are ignored and autotel will warn. Use the programmatic API with `sampler` or `samplingPresets.production({...})` for tuned presets.

### Precedence Rules

Sampling always resolves in this order:

```text
sampler > sampling > default
```

That means:

- `sampler` always wins if you provide both
- `sampling` is the simple preset shorthand
- OpenTelemetry env vars such as `OTEL_TRACES_SAMPLER` are used after explicit config and YAML
- default behavior is `samplingPresets.production()`

Example:

```typescript
import { init, NeverSampler } from 'autotel';

init({
  service: 'checkout-api',
  sampler: new NeverSampler(),
  sampling: 'development', // ignored because sampler wins
});
```

For OpenTelemetry SDK compatibility, autotel also reads `OTEL_TRACES_SAMPLER` and `OTEL_TRACES_SAMPLER_ARG`.
Supported values:

- `always_on`
- `always_off`
- `traceidratio`
- `parentbased_always_on`
- `parentbased_always_off`
- `parentbased_traceidratio`

Currently unsupported and ignored with an error log:

- `jaeger_remote`
- `parentbased_jaeger_remote`
- `xray`

### Tail-Sampling Attributes

Autotel uses internal span attributes such as `autotel.sampling.tail.keep` and `autotel.sampling.tail.evaluated` to communicate tail-sampling decisions.

These use the `autotel.*` namespace intentionally. OpenTelemetry does not define an official semantic-convention key for these internal hints, so autotel uses library-prefixed custom attributes rather than inventing fake `otel.*` semantic convention keys.

## Core Building Blocks

### trace()

Wrap any sync/async function to create spans automatically.

```typescript
import { trace } from 'autotel';

export const updateUser = trace(async function updateUser(
  id: string,
  data: UserInput,
) {
  return db.users.update(id, data);
});

// Explicit name (useful for anonymous/arrow functions)
export const deleteUser = trace('user.delete', async (id: string) => {
  return db.users.delete(id);
});

// Factory form exposes the `ctx` helper (see below)
export const createOrder = trace((ctx) => async (order: Order) => {
  ctx.setAttribute('order.id', order.id);
  return submit(order);
});

// Immediate execution - wraps and executes instantly (for middleware/wrappers)
function timed<T>(operation: string, fn: () => Promise<T>): Promise<T> {
  return trace(operation, async (ctx) => {
    ctx.setAttribute('operation', operation);
    return await fn();
  });
}
// Executes immediately, returns Promise<T> directly
```

**Two patterns supported:**

1. **Factory pattern** `trace(ctx => (...args) => result)`: Returns a wrapped function for reuse
2. **Immediate execution** `trace(ctx => result)`: Executes once immediately, returns the result directly

- Automatic span lifecycle (`start`, `end`, status, and error recording).
- Function names feed `operation.name`, `code.function`, and events enrichment.
- Works with promises, async/await, or sync functions.

### span()

Create nested spans for individual code blocks without wrapping entire functions.

```typescript
import { span, trace } from 'autotel';

export const rollDice = trace(async function rollDice(rolls: number) {
  const results: number[] = [];
  for (let i = 0; i < rolls; i++) {
    await span(
      { name: 'roll.once', attributes: { roll: i + 1 } },
      async (span) => {
        span.setAttribute('range', '1-6');
        // Roll once so the recorded event and the stored result agree
        const value = rollOnce();
        span.addEvent('dice.rolled', { value });
        results.push(value);
      },
    );
  }
  return results;
});
```

Nested spans automatically inherit context and correlation IDs.

### Trace Context (`ctx`)

Every `trace((ctx) => ...)` factory receives a type-safe helper backed by `AsyncLocalStorage`.

```typescript
import { trace } from 'autotel';
import { SpanStatusCode } from '@opentelemetry/api';

export const createUser = trace((ctx) => async (input: CreateUserData) => {
  logger.info({ traceId: ctx.traceId }, 'Handling request');
  ctx.setAttributes({ 'user.id': input.id, 'user.plan': input.plan });

  try {
    const user = await db.users.create(input);
    ctx.setStatus({ code: SpanStatusCode.OK });
    return user;
  } catch (error) {
    ctx.recordException(error as Error);
    ctx.setStatus({
      code: SpanStatusCode.ERROR,
      message: 'Failed to create user',
    });
    throw error;
  }
});
```

Available helpers: `traceId`, `spanId`, `correlationId`, `setAttribute`, `setAttributes`, `setStatus`, `recordException`, `getBaggage`, `setBaggage`, `deleteBaggage`, `getAllBaggage`.

#### Baggage (Context Propagation)

Baggage allows you to propagate custom key-value pairs across distributed traces. Baggage is automatically included in HTTP headers when using `injectTraceContext()` from `autotel/http`.
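
On the wire, baggage travels in a single `baggage` HTTP header whose format is defined by the W3C Baggage specification: comma-separated `key=value` members with percent-encoded values. The sketch below shows only that wire format; `encodeBaggage` and `decodeBaggage` are hypothetical helper names, not autotel exports, and in practice the propagator handles this for you.

```typescript
// Serialize baggage entries into a W3C `baggage` header value:
// comma-separated key=value members with percent-encoded values.
function encodeBaggage(entries: Record<string, string>): string {
  return Object.entries(entries)
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join(',');
}

// Parse a `baggage` header value back into entries, ignoring any
// per-member properties after a `;` and skipping malformed members.
function decodeBaggage(header: string): Record<string, string> {
  const entries: Record<string, string> = {};
  for (const member of header.split(',')) {
    const pair = member.split(';')[0] ?? '';
    const eq = pair.indexOf('=');
    if (eq <= 0) continue; // no key or no '=' separator
    const key = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim();
    entries[key] = decodeURIComponent(value);
  }
  return entries;
}

const baggageHeader = encodeBaggage({
  'tenant.id': 'acme corp',
  'user.id': 'u-42',
});
// baggageHeader === 'tenant.id=acme%20corp,user.id=u-42'

const roundTripped = decodeBaggage(baggageHeader);
```

Every value is serialized as a string, and the header is forwarded to each downstream service, so it is worth keeping entries small and free of sensitive data.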
```typescript
import { trace, withBaggage } from 'autotel';
import { injectTraceContext } from 'autotel/http';

// Set baggage for downstream services
export const createOrder = trace((ctx) => async (order: Order) => {
  return await withBaggage({
    baggage: {
      'tenant.id': order.tenantId,
      'user.id': order.userId,
    },
    fn: async () => {
      // Baggage is available to all child spans and HTTP calls
      const tenantId = ctx.getBaggage('tenant.id');
      ctx.setAttribute('tenant.id', tenantId || 'unknown');

      // HTTP headers automatically include baggage
      const headers = injectTraceContext();
      await fetch('/api/charge', {
        method: 'POST', // a request body requires a non-GET method
        headers,
        body: JSON.stringify(order),
      });
    },
  });
});
```

**Typed Baggage (Optional):**

For type-safe baggage operations, use `defineBaggageSchema()`:

```typescript
import { trace, defineBaggageSchema } from 'autotel';

type TenantBaggage = { tenantId: string; region?: string };
const tenantBaggage = defineBaggageSchema<TenantBaggage>('tenant');

export const handler = trace<TenantBaggage>((ctx) => async () => {
  // Type-safe get
  const tenant = tenantBaggage.get(ctx);
  if (tenant?.tenantId) {
    console.log('Tenant:', tenant.tenantId);
  }

  // Type-safe set with proper scoping
  return await tenantBaggage.with(ctx, { tenantId: 't1' }, async () => {
    // Baggage is available here and in child spans
  });
});
```

**Automatic Baggage → Span Attributes:**

Enable `baggage: true` in `init()` to automatically copy all baggage entries to span attributes, making them visible in trace UIs without manual `ctx.setAttribute()` calls:

```typescript
import { init, trace, withBaggage } from 'autotel';

init({
  service: 'my-app',
  baggage: true, // Auto-copy baggage to span attributes
});

export const processOrder = trace((ctx) => async (order: Order) => {
  return await withBaggage({
    baggage: {
      'tenant.id': order.tenantId,
      'user.id': order.userId,
    },
    fn: async () => {
      // Span automatically has baggage.tenant.id and baggage.user.id attributes!
      // No need for: ctx.setAttribute('tenant.id', ctx.getBaggage('tenant.id'))
      await chargeCustomer(order);
    },
  });
});
```

**Custom prefix:**

```typescript
init({
  service: 'my-app',
  baggage: 'ctx', // Creates ctx.tenant.id, ctx.user.id
  // Or use '' for no prefix: tenant.id, user.id
});
```

**Extracting Baggage from Incoming Requests:**

```typescript
import { extractTraceContext, trace, context } from 'autotel';

// In Express middleware
app.use((req, res, next) => {
  const extractedContext = extractTraceContext(req.headers);
  context.with(extractedContext, () => {
    next();
  });
});
```

**Key Points:**

- Typed baggage is completely optional - existing untyped baggage code continues to work without changes
- `baggage: true` in `init()` eliminates manual attribute setting for baggage
- Baggage values are strings (convert numbers/objects before setting)
- Never put PII in baggage - it propagates in HTTP headers across services!

### Reusable Middleware Helpers

- `withTracing(options)`: create a preconfigured wrapper (service name, default attributes, skip rules).
- `instrument(object, options)`: batch-wrap entire modules while skipping helpers or private functions.

```typescript
import { withTracing, instrument } from 'autotel';

const traceFn = withTracing({ serviceName: 'user' });

export const create = traceFn((ctx) => async (payload) => {
  /* ... */
});
export const update = traceFn((ctx) => async (id, payload) => {
  /* ... */
});

export const repository = instrument(
  {
    createUser: async () => {
      /* ... */
    },
    updateUser: async () => {
      /* ... */
    },
    _internal: async () => {
      /* skipped */
    },
  },
  { serviceName: 'repository', skip: ['_internal'] },
);
```

### Decorators (TypeScript 5+)

Prefer classes or NestJS-style services? Use the `@Trace` decorator.
```typescript
import { Trace } from 'autotel/decorators';

class OrderService {
  @Trace('order.create', { withMetrics: true })
  async createOrder(data: OrderInput) {
    return db.orders.create(data);
  }

  // No arguments → method name becomes the span name
  @Trace()
  async processPayment(orderId: string) {
    return charge(orderId);
  }

  @Trace()
  async refund(orderId: string) {
    const ctx = (this as any).ctx;
    ctx.setAttribute('order.id', orderId);
    return refund(orderId);
  }
}
```

Decorators are optional; everything also works in plain functions.

### Database Instrumentation

Turn on query tracing in one line.

```typescript
import { instrumentDatabase } from 'autotel/db';

const db = drizzle(pool);

instrumentDatabase(db, {
  dbSystem: 'postgresql',
  database: 'myapp',
});

await db.select().from(users); // queries emit spans automatically
```

## Type-Safe Attributes

Autotel provides type-safe attribute builders following OpenTelemetry semantic conventions. These helpers give you autocomplete, compile-time validation, and automatic PII redaction.
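
To show why builders give compile-time validation, here is a minimal sketch of the idea: each builder fixes the attribute key and constrains the value type, and a guardrail step replaces sensitive values before they reach a span. Everything here is hypothetical (`toyAttrs`, `mergeToyAttrs`, `redactPii`, and the hard-coded `PII_KEYS` set) and is not how `autotel/attributes` is implemented.

```typescript
type Attributes = Record<string, string | number | boolean | string[]>;

// Hypothetical key builders: each function fixes the attribute key and
// constrains the value type, so typos and wrong types fail at compile time.
const toyAttrs = {
  user: {
    id: (value: string): Attributes => ({ 'user.id': value }),
    email: (value: string): Attributes => ({ 'user.email': value }),
  },
  http: {
    response: {
      statusCode: (value: number): Attributes => ({
        'http.response.status_code': value,
      }),
    },
  },
};

// Keys treated as PII in this sketch; a real guardrail would be configurable.
const PII_KEYS = new Set(['user.email']);

// Shallow-merge attribute objects; later arguments win on key conflicts.
function mergeToyAttrs(...parts: Attributes[]): Attributes {
  return Object.assign({}, ...parts);
}

// Replace values of PII keys with a fixed marker before they reach a span.
function redactPii(attributes: Attributes): Attributes {
  const safe: Attributes = {};
  for (const [key, value] of Object.entries(attributes)) {
    safe[key] = PII_KEYS.has(key) ? '[REDACTED]' : value;
  }
  return safe;
}

const safeAttributes = redactPii(
  mergeToyAttrs(
    toyAttrs.user.id('user-123'),
    toyAttrs.user.email('user@example.com'),
    toyAttrs.http.response.statusCode(200),
  ),
);
```

With this shape, `toyAttrs.http.response.statusCode('200')` is a type error and a misspelled builder name does not compile, which is the class of mistake that raw string keys let through.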
### Pattern A: Key Builders

Build individual attributes with full autocomplete:

```typescript
import { attrs, mergeAttrs } from 'autotel/attributes';

// Single attribute
ctx.setAttributes(attrs.user.id('user-123'));
// → { 'user.id': 'user-123' }

ctx.setAttributes(attrs.http.request.method('GET'));
// → { 'http.request.method': 'GET' }

ctx.setAttributes(attrs.db.client.system('postgresql'));
// → { 'db.system.name': 'postgresql' }

// Combine multiple attributes
ctx.setAttributes(
  mergeAttrs(
    attrs.user.id('user-123'),
    attrs.session.id('sess-456'),
    attrs.http.response.statusCode(200),
  ),
);
```

### Pattern B: Object Builders

Pass an object to set multiple related attributes at once:

```typescript
import { attrs } from 'autotel/attributes';

// User attributes
ctx.setAttributes(
  attrs.user.data({
    id: 'user-123',
    email: 'user@example.com',
    roles: ['admin', 'editor'],
  }),
);
// → { 'user.id': 'user-123', 'user.email': 'user@example.com', 'user.roles': ['admin', 'editor'] }

// HTTP server attributes
ctx.setAttributes(
  attrs.http.server({
    method: 'POST',
    route: '/api/users/:id',
    statusCode: 201,
  }),
);
// → { 'http.request.method': 'POST', 'http.route': '/api/users/:id', 'http.response.status_code': 201 }

// Database attributes
ctx.setAttributes(
  attrs.db.client.data({
    system: 'postgresql',
    name: 'myapp_db', // Maps to db.namespace
    operation: 'SELECT',
    collectionName: 'users',
  }),
);
```

### Attachers (Signal Helpers)

Attachers know WHERE to attach attributes - they handle spans, resources, and apply guardrails automatically:

```typescript
import { trace } from 'autotel';
import { setUser, httpServer, identify, dbClient } from 'autotel/attributes';

// Set user attributes with automatic PII redaction
export const handleRequest = trace((ctx) => async (req) => {
  setUser(ctx, {
    id: req.userId,
    email: req.userEmail, // Automatically redacted by default
  });

  // HTTP attributes + automatic span name update
  httpServer(ctx, {
    method: req.method,
    route: req.route,
    statusCode: 200,
  });
  // Span name becomes: "HTTP GET /api/users"
});

// Bundle user, session, and device attributes together
export const identifyUser = trace((ctx) => async (data) => {
  identify(ctx, {
    user: { id: data.userId, name: data.userName },
    session: { id: data.sessionId },
    device: { id: data.deviceId, manufacturer: 'Apple' },
  });
});

// Database client attributes
export const queryUsers = trace((ctx) => async () => {
  dbClient(ctx, {
    system: 'postgresql',
    operation: 'SELECT',
    collectionName: 'users',
  });
  return await db.query('SELECT * FROM users');
});
```

### PII Guardrails

`safeSetAttributes()` applies automatic PII detection and configurable guardrails:

```typescript
import { trace } from 'autotel';
import { safeSetAttributes, attrs } from 'autotel/attributes';

export const processUser = trace((ctx) => async (user) => {
  // Default: PII is redacted automatically
  safeSetAttributes(ctx, attrs.user.data({ email: 'user@example.com' }));
  // → { 'user.email': '[REDACTED]' }

  // Allow PII (use with caution)
  safeSetAttributes(ctx, attrs.user.data({ email: 'user@example.com' }), {
    guardrails: { pii: 'allow' },
  });
  // → { 'user.email': 'user@example.com' }

  // Hash PII for correlation without exposing raw values
  safeSetAttributes(ctx, attrs.user.data({ email: 'user@example.com' }), {
    guardrails: { pii: 'hash' },
  });
  // → { 'user.email': 'hash_a1b2c3d4...' }

  // Truncate long values
  safeSetAttributes(ctx, attrs.user.data({ id: 'a'.repeat(500) }), {
    guardrails: { maxLength: 255 },
  });
  // → { 'user.id': 'aaaa...aaa...' } (truncated with ellipsis)

  // Warn on deprecated attributes
  safeSetAttributes(
    ctx,
    { 'http.method': 'GET' }, // Deprecated!
    { guardrails: { warnDeprecated: true } },
  );
  // Console: [autotel/attributes] Attribute "http.method" is deprecated. Use "http.request.method" instead.
});
```

**Guardrail Options:**

| Option | Values | Default | Description |
| --- | --- | --- | --- |
| `pii` | `'allow'`, `'redact'`, `'hash'`, `'block'` | `'redact'` | How to handle PII in attribute values |
| `maxLength` | number | `255` | Maximum string length before truncation |
| `validateEnum` | boolean | `true` | Normalize enum values (e.g., HTTP methods) |
| `warnDeprecated` | boolean | `true` | Log warnings for deprecated attributes |

### Domain Helpers

Domain helpers bundle multiple attribute groups for common scenarios:

```typescript
import { trace } from 'autotel';
import { transaction } from 'autotel/attributes';

// Bundle HTTP request with user context
export const handleRequest = trace((ctx) => async (req) => {
  transaction(ctx, {
    user: { id: req.userId },
    session: { id: req.sessionId },
    method: req.method,
    route: req.route,
    statusCode: 200,
    clientIp: req.ip,
  });
  // Sets: user.id, session.id, http.request.method, http.route,
  // http.response.status_code, network.peer.address
  // Also updates span name to "HTTP GET /api/users"
});
```

### Available Attribute Domains

| Domain | Key Builders | Object Builder |
| --- | --- | --- |
| `user` | `id`, `email`, `name`, `fullName`, `hash`, `roles` | `attrs.user.data()` |
| `session` | `id`, `previousId` | `attrs.session.data()` |
| `device` | `id`, `manufacturer`, `modelIdentifier`, `modelName` | `attrs.device.data()` |
| `http` | `request.*`, `response.*`, `route` | `attrs.http.server()`, `attrs.http.client()` |
| `db` | `client.system`, `client.operation`, etc. | `attrs.db.client.data()` |
| `service` | `name`, `instance`, `version` | `attrs.service.data()` |
| `network` | `peerAddress`, `peerPort`, `transport`, etc. | `attrs.network.data()` |
| `error` | `type`, `message`, `stackTrace`, `code` | `attrs.error.data()` |
| `exception` | `escaped`, `message`, `stackTrace`, `type` | `attrs.exception.data()` |
| `cloud` | `provider`, `accountId`, `region`, etc. | `attrs.cloud.data()` |
| `messaging` | `system`, `destination`, `operation`, etc. | `attrs.messaging.data()` |
| `genAI` | `system`, `requestModel`, `responseModel`, etc. | - |
| `rpc` | `system`, `service`, `method` | - |
| `graphql` | `document`, `operationName`, `operationType` | - |

### Resource Merging

For enriching OpenTelemetry Resources with service attributes (`Resource.attributes` is readonly), use `mergeServiceResource`:

```typescript
import { mergeServiceResource } from 'autotel/attributes';
import { Resource } from '@opentelemetry/resources';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';

// Create enriched resource for custom SDK configurations
const baseResource = Resource.default();
const enrichedResource = mergeServiceResource(baseResource, {
  name: 'my-service',
  version: '1.0.0',
  instance: 'instance-1',
});

// Use with custom TracerProvider
const provider = new NodeTracerProvider({ resource: enrichedResource });
```

## Event-Driven Architectures

Autotel provides first-class support for tracing message-based systems like Kafka, SQS, and RabbitMQ. The `traceProducer` and `traceConsumer` helpers automatically set semantic attributes, handle context propagation, and create proper span links.
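
Context propagation across a message broker comes down to carrying a W3C `traceparent` header with each message and reading it back on the consumer side. The sketch below shows only that header format (`version-traceId-spanId-flags`); `formatTraceparent`, `parseTraceparent`, and the sample IDs are illustrative and are not part of autotel's API.

```typescript
// Fields of a W3C `traceparent` header: version-traceId-spanId-flags.
interface ParsedTraceparent {
  traceId: string; // 32 lowercase hex chars
  spanId: string; // 16 lowercase hex chars
  sampled: boolean; // lowest bit of the flags byte
}

const TRACEPARENT_PATTERN =
  /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function formatTraceparent(ctx: ParsedTraceparent): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? '01' : '00'}`;
}

// Returns undefined for malformed headers, the reserved version `ff`,
// or the all-zero IDs that the specification treats as invalid.
function parseTraceparent(header: string): ParsedTraceparent | undefined {
  const match = TRACEPARENT_PATTERN.exec(header.trim());
  if (!match) return undefined;
  const [, version, traceId, spanId, flags] = match;
  if (version === 'ff') return undefined;
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Producer side: attach the header to an outgoing message.
const outgoingMessage = {
  value: JSON.stringify({ eventId: 'evt-1' }),
  headers: {
    traceparent: formatTraceparent({
      traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
      spanId: '00f067aa0ba902b7',
      sampled: true,
    }),
  },
};

// Consumer side: recover the producer's span context to link against.
const producerLink = parseTraceparent(outgoingMessage.headers.traceparent);
```

Consumers typically record the recovered context as a span link rather than a parent, because a batch can contain messages from many different producer traces.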
### Message Producers (Kafka, SQS, RabbitMQ)

Use `traceProducer` to wrap message publishing functions with automatic tracing:

```typescript
import { traceProducer, type ProducerContext } from 'autotel';

// Kafka producer
export const publishUserEvent = traceProducer({
  system: 'kafka',
  destination: 'user-events',
  messageIdFrom: (args) => args[0].eventId, // Extract message ID from args
})((ctx) => async (event: UserEvent) => {
  // Get W3C trace headers to inject into message
  const headers = ctx.getTraceHeaders();

  await producer.send({
    topic: 'user-events',
    messages: [
      {
        key: event.userId,
        value: JSON.stringify(event),
        headers, // Trace context propagates to consumers
      },
    ],
  });
});

// SQS producer with custom attributes
export const publishOrder = traceProducer({
  system: 'sqs',
  destination: 'orders-queue',
  attributes: { 'custom.priority': 'high' },
})((ctx) => async (order: Order) => {
  ctx.setAttribute('order.total', order.total);

  await sqs.sendMessage({
    QueueUrl: QUEUE_URL,
    MessageBody: JSON.stringify(order),
    MessageAttributes: {
      traceparent: {
        DataType: 'String',
        StringValue: ctx.getTraceHeaders().traceparent,
      },
    },
  });
});
```

**Automatic Span Attributes (OTel Semantic Conventions):**

- `messaging.system` - The messaging system (kafka, sqs, rabbitmq, etc.)
- `messaging.operation` - Always "publish" for producers
- `messaging.destination.name` - Topic/queue name
- `messaging.message.id` - Extracted message ID (if configured)
- `messaging.kafka.destination.partition` - Partition number (Kafka-specific)

### Message Consumers

Use `traceConsumer` to wrap message handlers with automatic link extraction and DLQ support:

```typescript
import { traceConsumer, extractLinksFromBatch } from 'autotel';

// Single message consumer
export const processUserEvent = traceConsumer({
  system: 'kafka',
  destination: 'user-events',
  consumerGroup: 'event-processor',
  headersFrom: (msg) => msg.headers, // Extract trace headers
})((ctx) => async (message: KafkaMessage) => {
  // Links to producer span are automatically created
  const event = JSON.parse(message.value);
  await processEvent(event);
});

// Batch consumer with automatic link extraction
export const processBatch = traceConsumer({
  system: 'kafka',
  destination: 'user-events',
  consumerGroup: 'batch-processor',
  batchMode: true, // Extract links from all messages
  headersFrom: (msg) => msg.headers,
})((ctx) => async (messages: KafkaMessage[]) => {
  // ctx.links contains SpanContext from each message's traceparent
  for (const msg of messages) {
    await processMessage(msg);
  }
});

// Consumer with DLQ handling
export const processWithDLQ = traceConsumer({
  system: 'sqs',
  destination: 'orders-queue',
  headersFrom: (msg) => msg.MessageAttributes,
})((ctx) => async (message: SQSMessage) => {
  try {
    await processOrder(JSON.parse(message.Body));
  } catch (error) {
    if (message.ApproximateReceiveCount > 3) {
      // Record DLQ routing
      ctx.recordDLQ('orders-dlq', error.message);
      throw error; // Let SQS move to DLQ
    }
    throw error; // Retry
  }
});
```

**Consumer-Specific Attributes:**

- `messaging.consumer.group` - Consumer group name
- `messaging.batch.message_count` - Batch size (if batch mode)
- `messaging.operation` - "receive" or "process"

### Consumer Lag Metrics

Track consumer lag for performance monitoring:
```typescript
import { traceConsumer } from 'autotel';

export const processWithLag = traceConsumer({
  system: 'kafka',
  destination: 'events',
  consumerGroup: 'processor',
  lagMetrics: {
    getCurrentOffset: (msg) => Number(msg.offset),
    getEndOffset: async () => {
      const offsets = await admin.fetchTopicOffsets('events');
      return Number(offsets[0].high);
    },
    partition: 0,
  },
})((ctx) => async (message) => {
  // Lag attributes automatically added:
  // - messaging.kafka.consumer_lag
  // - messaging.kafka.message_offset
  await processMessage(message);
});
```

### Custom Messaging System Adapters

For messaging systems not directly supported (NATS, Temporal, Cloudflare Queues, etc.), use pre-built adapters or create your own:

```typescript
import { traceConsumer, traceProducer } from 'autotel/messaging';
import {
  natsAdapter,
  temporalAdapter,
  cloudflareQueuesAdapter,
  datadogContextExtractor,
  b3ContextExtractor,
} from 'autotel/messaging/adapters';

// NATS JetStream consumer with automatic attribute extraction
const processNatsMessage = traceConsumer({
  system: 'nats',
  destination: 'orders.created',
  consumerGroup: 'order-processor',
  ...natsAdapter.consumer, // Adds nats.subject, nats.stream, nats.consumer
})((ctx) => async (msg) => {
  await handleOrder(msg.data);
  msg.ack();
});

// Temporal activity with workflow context
const processActivity = traceConsumer({
  system: 'temporal',
  destination: 'order-activities',
  ...temporalAdapter.consumer, // Adds temporal.workflow_id, temporal.run_id, temporal.attempt
})((ctx) => async (info, input) => {
  return processOrder(input);
});

// Consume messages with Datadog trace context (non-W3C format)
const processFromDatadog = traceConsumer({
  system: 'kafka',
  destination: 'events',
  customContextExtractor: datadogContextExtractor, // Converts Datadog decimal IDs to OTel hex
})((ctx) => async (msg) => {
  // Links to parent Datadog span automatically
});
```

**Available Adapters:**

| Adapter                   | Captures                                              |
| ------------------------- | ----------------------------------------------------- |
| `natsAdapter`             | subject, stream, consumer, pending, redelivery_count  |
| `temporalAdapter`         | workflow_id, run_id, activity_id, task_queue, attempt |
| `cloudflareQueuesAdapter` | message_id, timestamp, attempts                       |
| `datadogContextExtractor` | Converts Datadog decimal trace IDs to OTel hex        |
| `b3ContextExtractor`      | Parses B3/Zipkin single or multi-header format        |
| `xrayContextExtractor`    | Parses AWS X-Ray trace header                         |

**Building Custom Adapters:**

See [Bring Your Own System Guide](./docs/messaging-byos-guide.md) for step-by-step instructions on creating adapters for any messaging system.

## Safe Baggage Propagation

Baggage allows key-value pairs to propagate across service boundaries. Autotel provides safe baggage schemas with built-in guardrails for PII detection, size limits, and high-cardinality value hashing.

### BusinessBaggage (Pre-built Schema)

Use the pre-built `BusinessBaggage` schema for common business context:

```typescript
import { BusinessBaggage, trace } from 'autotel';

export const processOrder = trace((ctx) => async (order: Order) => {
  // Set business context (propagates to downstream services)
  BusinessBaggage.set(ctx, {
    tenantId: order.tenantId,
    userId: order.userId, // Auto-hashed for privacy
    priority: 'high', // Validated against enum
    correlationId: order.id,
  });

  // Make downstream call - baggage propagates automatically
  await fetch('/api/charge', {
    headers: ctx.getTraceHeaders(), // Includes baggage header
  });
});

// In downstream service
export const chargeOrder = trace((ctx) => async () => {
  // Read business context
  const { tenantId, userId, priority } = BusinessBaggage.get(ctx);

  // Use for routing, logging, access control, etc.
  logger.info({ tenantId, priority }, 'Processing charge');
});
```

**Pre-defined Fields:**

- `tenantId` - String, max 64 chars
- `userId` - String, auto-hashed for privacy
- `correlationId` - String, for request correlation
- `workflowId` - String, for saga/workflow tracking
- `priority` - Enum: 'low', 'normal', 'high', 'critical'
- `region` - String, deployment region
- `channel` - String (web, mobile, api, etc.)

### Custom Baggage Schemas

Create type-safe baggage schemas with validation and guardrails:

```typescript
import { createSafeBaggageSchema, trace } from 'autotel';

// Define custom schema
const OrderBaggage = createSafeBaggageSchema(
  {
    orderId: { type: 'string', maxLength: 36 },
    customerId: { type: 'string', hash: true }, // Auto-hash for privacy
    tier: { type: 'enum', values: ['free', 'pro', 'enterprise'] as const },
    amount: { type: 'number' },
    isVip: { type: 'boolean' },
  },
  {
    prefix: 'order', // Baggage keys: order.orderId, order.tier, etc.
    maxKeyLength: 64, // Validate key length
    maxValueLength: 256, // Validate value length
    redactPII: true, // Auto-detect and redact PII patterns
    hashHighCardinality: true, // Hash values that look high-cardinality
  },
);

// Use in traced functions
export const processOrder = trace((ctx) => async (order: Order) => {
  // Type-safe set (TypeScript validates fields)
  OrderBaggage.set(ctx, {
    orderId: order.id,
    customerId: order.customerId, // Will be hashed
    tier: order.tier, // Must be 'free' | 'pro' | 'enterprise'
    amount: order.total,
    isVip: order.customer.isVip,
  });

  // Type-safe get
  const { orderId, tier, isVip } = OrderBaggage.get(ctx);

  // Check if specific field is set
  if (OrderBaggage.has(ctx, 'customerId')) {
    // ...
  }

  // Delete specific field
  OrderBaggage.delete(ctx, 'amount');

  // Clear all fields
  OrderBaggage.clear(ctx);
});
```

**Guardrails:**

- **Size Limits** - Prevents baggage from growing unbounded
- **PII Detection** - Auto-redacts email, phone, SSN patterns
- **High-Cardinality Hashing** - Hashes UUIDs, timestamps to reduce cardinality
- **Enum Validation** - Rejects invalid enum values
- **Type Coercion** - Numbers/booleans serialized correctly

## Workflow & Saga Tracing

Track distributed workflows and sagas with compensation support. Each step creates a linked span, and failed steps can trigger automatic compensation.

### Basic Workflows

Use `traceWorkflow` and `traceStep` for multi-step processes:

```typescript
import { traceWorkflow, traceStep } from 'autotel';

// Define workflow with unique ID
export const processOrder = traceWorkflow({
  name: 'OrderFulfillment',
  workflowId: (order) => order.id, // Generate from first arg
})((ctx) => async (order: Order) => {
  // Step 1: Validate order
  await traceStep({ name: 'ValidateOrder' })((ctx) => async () => {
    await validateOrder(order);
  })();

  // Step 2: Reserve inventory (links to previous step)
  await traceStep({
    name: 'ReserveInventory',
    linkToPrevious: true,
  })((ctx) => async () => {
    await inventoryService.reserve(order.items);
  })();

  // Step 3: Process payment
  await traceStep({
    name: 'ProcessPayment',
    linkToPrevious: true,
  })((ctx) => async () => {
    await paymentService.charge(order);
  })();

  return { success: true };
});
```

**Workflow Attributes:**

- `workflow.name` - Workflow type name
- `workflow.id` - Unique instance ID
- `workflow.version` - Optional version
- `workflow.step.name` - Current step name
- `workflow.step.index` - Step sequence number
- `workflow.step.status` - completed, failed, compensated

### Saga Pattern with Compensation

Define compensating actions for rollback on failure:

```typescript
import { traceWorkflow, traceStep } from 'autotel';

export const orderSaga = traceWorkflow({
  name: 'OrderSaga',
  workflowId: (order) => order.id,
})((ctx) => async (order: Order) => {
  // Step 1: Reserve inventory (with compensation)
  await traceStep({
    name: 'ReserveInventory',
    compensate: async (stepCtx, error) => {
      // Called if later step fails
      await inventoryService.release(order.items);
      stepCtx.setAttribute('compensation.reason', error.message);
    },
  })((ctx) => async () => {
    await inventoryService.reserve(order.items);
  })();

  // Step 2: Charge payment (with compensation)
  await traceStep({
    name: 'ChargePayment',
    linkToPrevious: true,
    compensate: async (stepCtx, error) => {
      await paymentService.refund(order.id);
    },
  })((ctx) => async () => {
    await paymentService.charge(order);
  })();

  // Step 3: Ship order (no compensation - point of no return)
  await traceStep({
    name: 'ShipOrder',
    linkToPrevious: true,
  })((ctx) => async () => {
    await shippingService.ship(order);
  })();
});

// If ShipOrder fails, compensations run in reverse:
// 1. ChargePayment.compensate (refund)
// 2. ReserveInventory.compensate (release)
```

**WorkflowContext Methods:**

- `ctx.getWorkflowId()` - Get current workflow instance ID
- `ctx.getWorkflowName()` - Get workflow type name
- `ctx.getStepIndex()` - Current step number
- `ctx.getPreviousStepContext()` - SpanContext for linking

**Compensation Attributes:**

- `workflow.step.compensated` - Boolean, true if compensation ran
- `workflow.compensation.executed` - Number of compensations executed
- `compensation.reason` - Why compensation was triggered

## Business Metrics & Product Events

Autotel treats metrics and events as first-class citizens so engineers and product teams share the same context.
### OpenTelemetry Metrics (Metric class + helpers)

```typescript
import { Metric, createHistogram, trace } from 'autotel';

const metrics = new Metric('checkout');
const revenue = createHistogram('checkout.revenue');

export const processOrder = trace((ctx) => async (order) => {
  metrics.trackEvent('order.completed', {
    orderId: order.id,
    amount: order.total,
  });
  metrics.trackValue('revenue', order.total, { currency: order.currency });
  revenue.record(order.total, { currency: order.currency });
});
```

- Emits OpenTelemetry counters/histograms via the OTLP endpoint configured in `init()`.
- Infrastructure metrics are enabled by default in **every** environment.

### Product Events (PostHog, Mixpanel, Amplitude, …)

Track user behavior, conversion funnels, and business outcomes alongside your OpenTelemetry traces.

**Recommended: Configure subscribers in `init()`, use global `track()` function:**

```typescript
import { init, track, trace } from 'autotel';
import { PostHogSubscriber } from 'autotel-subscribers/posthog';

init({
  service: 'checkout',
  subscribers: [new PostHogSubscriber({ apiKey: process.env.POSTHOG_KEY! })],
});

export const signup = trace('user.signup', async (user) => {
  // All events use subscribers from init() automatically