@ooples/token-optimizer-mcp

Intelligent context window optimization for Claude Code - store content externally via caching and compression, freeing up your context window for what matters

export interface TokenCountResult {
    tokens: number;
    characters: number;
    estimatedCost?: number;
}
export declare class TokenCounter {
    private encoder;
    private readonly model;
    constructor(model?: string);
    /**
     * Map Claude/Anthropic models to tiktoken model names
     */
    private mapToTiktokenModel;
    /**
     * Count tokens in text
     */
    count(text: string): TokenCountResult;
    /**
     * Count tokens in multiple texts
     */
    countBatch(texts: string[]): TokenCountResult;
    /**
     * Estimate token count without encoding (faster, less accurate)
     */
    estimate(text: string): number;
    /**
     * Calculate token savings based on context window management
     *
     * @param originalText - The original text content
     * @param contextTokens - Number of tokens remaining in LLM context (default: 0 for full caching)
     * @returns Token savings calculation
     *
     * @remarks
     * This method measures context window optimization, NOT compression ratio.
     * When content is cached externally (SQLite, Redis, etc.), it's completely
     * removed from the LLM's context window, resulting in 100% token savings.
     *
     * Use cases:
     * - External caching: contextTokens = 0 (100% savings)
     * - Metadata-only: contextTokens = tokens in metadata (e.g., 8)
     * - Summarization: contextTokens = tokens in summary (e.g., 50)
     */
    calculateSavings(originalText: string, contextTokens?: number): {
        originalTokens: number;
        contextTokens: number;
        tokensSaved: number;
        percentSaved: number;
    };
    /**
     * Calculate context window savings for externally cached content
     *
     * @param originalText - The original text content being cached
     * @returns Token savings calculation with 100% savings
     *
     * @remarks
     * When content is compressed and stored in an external cache (SQLite, Redis, etc.),
     * it's completely removed from the LLM's context window. The compressed/encoded
     * data is NEVER sent to the LLM, so we measure 100% token savings.
     *
     * Key insight: We're measuring CONTEXT WINDOW CLEARANCE, not compression ratio.
     * - ✅ Content removed from LLM context (saves tokens)
     * - ✅ Storage compressed (saves disk space)
     * - ❌ Don't count tokens in compressed data (it's not sent to LLM!)
     *
     * @example
     * ```typescript
     * const tokenCounter = new TokenCounter();
     * const content = "Large file content...";
     * const compressed = compress(content);
     *
     * // Store in external cache
     * await cache.set(key, compressed);
     *
     * // Calculate context window savings
     * const savings = tokenCounter.calculateCacheSavings(content);
     * // Returns: { originalTokens: 250, contextTokens: 0, tokensSaved: 250, percentSaved: 100 }
     * ```
     */
    calculateCacheSavings(originalText: string): {
        originalTokens: number;
        contextTokens: number;
        tokensSaved: number;
        percentSaved: number;
    };
    /**
     * Check if text exceeds token limit
     */
    exceedsLimit(text: string, limit: number): boolean;
    /**
     * Truncate text to fit within token limit
     */
    truncate(text: string, maxTokens: number): string;
    /**
     * Get token-to-character ratio for text
     */
    getTokenCharRatio(text: string): number;
    /**
     * Free the encoder resources
     */
    free(): void;
}
//# sourceMappingURL=token-counter.d.ts.map
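
Read together, these declarations describe a small wrapper around a tokenizer: exact counting, cheap estimation, limit checks, and two savings calculations that differ only in how many tokens are assumed to remain in the context window. Below is a minimal usage sketch assuming the class is importable from the package root; the import specifier, the model string, and the numbers in the comments are illustrative assumptions, not output from the library.

```typescript
// Minimal usage sketch based only on the declarations above.
// Assumption: TokenCounter is re-exported from the package root.
import { TokenCounter } from '@ooples/token-optimizer-mcp';

const counter = new TokenCounter('claude-3-5-sonnet-20241022'); // model string is illustrative

const content = 'Some large tool output that would otherwise sit in the context window...';

// Exact count via the encoder vs. a fast estimate that skips encoding
const exact = counter.count(content);    // { tokens, characters, estimatedCost? }
const rough = counter.estimate(content); // faster, less accurate
console.log(exact.tokens, rough);

// Guard against a token budget and trim if necessary
if (counter.exceedsLimit(content, 1000)) {
  const trimmed = counter.truncate(content, 1000);
  console.log(`Truncated to ${counter.count(trimmed).tokens} tokens`);
}

// Savings when an 8-token metadata stub stays in the context window...
const partial = counter.calculateSavings(content, 8);
// ...versus caching the content externally, so nothing stays in context
const cached = counter.calculateCacheSavings(content);
console.log(partial.percentSaved, cached.percentSaved); // e.g. ~97 and 100

// Release the underlying encoder resources when done
counter.free();
```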