UNPKG

llm-stream-parser

Version:

A TypeScript library for parsing and processing structured data from LLM streaming responses with custom tag definitions and event-driven architecture

1,021 lines (789 loc) โ€ข 27.9 kB
# ๐Ÿš€ LLM Stream Parser <div align="center"> [![npm version](https://img.shields.io/npm/v/llm-stream-parser.svg?style=for-the-badge)](https://www.npmjs.com/package/llm-stream-parser) [![TypeScript](https://img.shields.io/badge/TypeScript-007ACC?style=for-the-badge&logo=typescript&logoColor=white)](https://www.typescriptlang.org/) [![Node.js](https://img.shields.io/badge/Node.js-43853D?style=for-the-badge&logo=node.js&logoColor=white)](https://nodejs.org/) [![MIT License](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT) **๐ŸŽฏ Professional-grade XML parser optimized for Large Language Model streaming responses** _Real-time parsing โ€ข Type-safe โ€ข Event-driven โ€ข Production-ready_ </div> --- ## ๐ŸŒŸ Overview **LLM Stream Parser** is a high-performance, TypeScript-first library designed specifically for parsing structured XML tags from streaming Large Language Model responses. Whether you're building AI chatbots, processing OpenAI streams, or handling Claude responses, this library provides the robust, real-time parsing capabilities you need. ### ๐ŸŽฏ **Why LLM Stream Parser?** - **๐Ÿ”„ Stream-First Design**: Parse content as it arrives, no waiting for completion - **๐Ÿง  LLM-Optimized**: Built specifically for AI model output patterns - **โšก High Performance**: >10,000 tags/second with intelligent buffering - **๐Ÿ›ก๏ธ Type-Safe**: Full TypeScript support with strict typing - **๐Ÿ—๏ธ Dual Parsing Modes**: Simple flat parsing or complex nested hierarchies - **๐ŸŽจ Flexible Configuration**: Extensive customization and validation options - **๐Ÿ“Š Event-Driven**: Real-time monitoring with comprehensive event system - **๐Ÿงช Production Ready**: Comprehensive testing, error handling, and documentation ## ๐Ÿ“‹ Table of Contents - [๐ŸŒŸ Overview](#-overview) - [โšก Quick Start](#-quick-start) - [๐Ÿ“ฆ Installation](#-installation) - [๐ŸŽฏ Real-World Example](#-real-world-example) - [๐Ÿ“– API Reference](#-api-reference) - [๐Ÿ”ง Configuration](#-configuration) - [๐ŸŽจ Advanced Usage](#-advanced-usage) - [๐Ÿš€ Performance](#-performance) - [๐Ÿงช Testing](#-testing) - [๐Ÿค Contributing](#-contributing) ## โœจ Key Features <table> <tr> <td width="50%"> ### ๐ŸŽฏ **Core Capabilities** - โœ… **Stream Processing** - Real-time parsing as data arrives - โœ… **Dual Modes** - Flat parsing or nested hierarchies - โœ… **Type Safety** - Full TypeScript support - โœ… **Event System** - Real-time monitoring and callbacks - โœ… **Error Recovery** - Robust error handling and recovery </td> <td width="50%"> ### ๐Ÿ”ง **Advanced Features** - โœ… **Validation** - Content and attribute validation - โœ… **Transformation** - Built-in content transformers - โœ… **Buffer Management** - Intelligent memory management - โœ… **Performance** - Optimized for high throughput - โœ… **Extensibility** - Modular, customizable architecture </td> </tr> </table> ## ๐Ÿ“ฆ Installation ```bash npm install llm-stream-parser ``` ```bash yarn add llm-stream-parser ``` ```bash pnpm add llm-stream-parser ``` ## โšก Quick Start Get up and running in under 2 minutes: ### ๐Ÿš€ **Basic Usage** ```typescript import { LLMStreamParser } from 'llm-stream-parser'; // 1. Create parser const parser = new LLMStreamParser(); // 2. Define tags parser.addSimpleTags(['thinking', 'response', 'code']); // 3. Handle events parser.on('tag_completed', tag => { console.log(`โœ… ${tag.tagName}: ${tag.content}`); }); // 4. Parse streaming content parser.parse('<thinking>Analyzing the problem...</thinking>'); parser.parse("<response>Here's the solution:</response>"); parser.finalize(); ``` ### ๐Ÿง  **LLM Integration Example** ```typescript // Perfect for Gemini, OpenAI, Claude, or any streaming LLM import * as dotenv from 'dotenv'; import { GoogleGenAI } from '@google/genai'; import { LLMStreamParser } from 'llm-stream-parser'; dotenv.config(); const systemPrompt = ` You are an instructor creating technical training content. You must respond in the following XML format: <lesson> <title>Title here</title> <step number="1" difficulty="easy"> <explanation>exp here</explanation> <code>code here</code> </step> <summary>summary_here</summary> </lesson> `; const GEMINI_API_KEY = process.env.GEMINI_API_KEY; const ai = new GoogleGenAI({ apiKey: GEMINI_API_KEY }); async function main() { const question = 'Create a lesson on developing a REST API with Express.js'; const parser = new LLMStreamParser(); parser.on('tag_completed', tag => { console.log(`${JSON.stringify(tag)}\n`); }); parser.addSimpleTags([ 'lesson', 'title', 'step', 'explanation', 'code', 'summary', ]); const response = await ai.models.generateContentStream({ model: 'gemini-2.0-flash', contents: `${systemPrompt}\n\n${question}`, }); for await (const chunk of response) { if (chunk.text) { parser.parse(chunk.text); } } } main().catch(console.error); ``` ## ๐ŸŽฏ Real-World Example See the parser in action with our **live streaming demo** that simulates actual LLM responses: ### ๐ŸŒŠ **Chunk Streaming Demo** ```bash # Run the real-world streaming demo node demo-streaming.js ``` This demo showcases: - **Realistic chunk streaming** (25-character chunks with delays) - **Complex nested XML** parsing in real-time - **Event-driven processing** with live updates - **Performance monitoring** and statistics - **Error handling** and recovery **Sample Output:** ``` ๐Ÿš€ LLM Stream Parser - Real World Demo ๐Ÿงช Starting Chat Conversation Demo ๐Ÿ“‹ Registered 10 tag types ๐ŸŒŠ Starting streaming simulation... ๐Ÿ“ฆ Chunk 1/47: "<conversation id=\"support" ๐Ÿ”“ Tag opened: <conversation> (depth: 1) ๐Ÿ“ฆ Chunk 2/47: "-chat\" timestamp=\"2024-0" ๐Ÿ“ฆ Chunk 3/47: "1-15T14:30:00Z\">\n <me" ๐Ÿ”“ Tag opened: <message> (depth: 2) ๐Ÿ“ฆ Chunk 4/47: "ssage role=\"user\" id=\"m" ... โœ… Tag completed: <thinking> with 108 chars ๐Ÿ“ Content preview: "The user is experiencing login issues. I should ga..." โœ… Tag completed: <message> with 891 chars ๐ŸŽ‰ Parsing complete! Total tags: 15 ๐Ÿ“Š DEMO RESULTS โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ โฑ๏ธ Processing time: 2485ms ๐Ÿ“ฆ Total chunks: 47 ๐Ÿ“ Content length: 1174 characters ๐Ÿท๏ธ Tags completed: 15 ๐Ÿšจ Errors: 0 โœ… Success rate: 100.0% ``` ### ๐ŸŽจ **Advanced Demo Features** The demo includes sophisticated examples like: - **Chat conversations** with nested thinking blocks - **AI reasoning** with evidence and analysis - **Code generation** with syntax highlighting - **Real-time monitoring** with performance metrics - **Error simulation** and recovery testing Perfect for testing integration with OpenAI, Claude, Gemini, or any LLM that uses XML-structured responses. ## ๐Ÿ“– API Reference ### ๐Ÿ—๏ธ **Core Classes** <details> <summary><strong>๐Ÿ“˜ LLMStreamParser&lt;T&gt; - Main API Class</strong></summary> The primary interface for all parsing operations. ```typescript class LLMStreamParser<T extends BaseTag = BaseTag> { constructor(config?: ParserConfig); } ``` #### **๐Ÿ”ง Tag Management** ```typescript // Define custom tags with validation and transformation defineTag(definition: TagDefinition<T>): this; defineTags(definitions: TagDefinition<T>[]): this; // Quick setup for simple tags addSimpleTag(tagName: string, options?: SimpleTagOptions): this; addSimpleTags(tagNames: string[]): this; // Tag registry management removeTag(tagName: string): boolean; hasTag(tagName: string): boolean; getRegisteredTags(): readonly string[]; ``` #### **โšก Parsing Operations** ```typescript // Stream processing parse(chunk: string): void; // Process individual chunks parseComplete(content: string): void; // Parse complete content at once finalize(): void; // Complete parsing session reset(): void; // Reset parser state ``` #### **๐Ÿ“Š State & Monitoring** ```typescript getState(): ParserState; // Current parser state getStats(): ParserStats; // Performance statistics getCurrentDepth(): number; // Current nesting depth getCurrentPath(): string; // Current tag path getBufferSize(): number; // Current buffer usage ``` #### **๐ŸŽฏ Event System** ```typescript // Event listeners on<K extends keyof ParserEvents<T>>(event: K, listener: ParserEvents<T>[K]): this; off<K extends keyof ParserEvents<T>>(event: K, listener: ParserEvents<T>[K]): this; once<K extends keyof ParserEvents<T>>(event: K, listener: ParserEvents<T>[K]): this; // Utility methods clone(): LLMStreamParser<T>; // Create parser copy ``` </details> ### ๐ŸŽฏ **Key Types & Interfaces** <details> <summary><strong>๐Ÿ“‹ ParserConfig - Configuration Options</strong></summary> ```typescript interface ParserConfig { // ๐Ÿ”ง Basic Configuration maxBufferSize?: number; // Default: 1MB - Buffer size limit trimWhitespace?: boolean; // Default: true - Auto-trim content caseSensitive?: boolean; // Default: true - Case-sensitive tags // ๐Ÿ—๏ธ Nested Mode Configuration enableNested?: boolean; // Default: false - Enable hierarchy maxDepth?: number; // Default: 10 - Max nesting depth autoCloseUnclosed?: boolean; // Default: false - Auto-close tags // โšก Performance Configuration batchSize?: number; // Default: 1000 - Processing batch size enableStatistics?: boolean; // Default: true - Track performance } ``` #### **๐ŸŽฏ Common Configurations** ```typescript // ๐Ÿš€ High Performance Setup const performanceConfig: ParserConfig = { maxBufferSize: 2 * 1024 * 1024, // 2MB buffer batchSize: 2000, // Large batches enableStatistics: false, // Skip stats for speed trimWhitespace: false, // Skip trimming }; // ๐Ÿ”’ Strict Validation Setup const strictConfig: ParserConfig = { caseSensitive: true, // Exact case matching autoCloseUnclosed: false, // Require proper closing enableStatistics: true, // Monitor for errors maxDepth: 5, // Limit nesting }; // ๐Ÿง  LLM Optimized Setup const llmConfig: ParserConfig = { enableNested: true, // Support hierarchies maxDepth: 15, // Deep nesting for AI autoCloseUnclosed: true, // Forgive LLM errors trimWhitespace: true, // Clean AI output }; ``` </details> <details> <summary><strong>๐Ÿท๏ธ TagDefinition - Advanced Tag Configuration</strong></summary> ```typescript interface TagDefinition<T extends BaseTag = BaseTag> { readonly tagName: T['tagName']; // โœ… Validation validateContent?: (content: string) => ValidationResult; validateAttributes?: (attributes?: T['attributes']) => ValidationResult; validateChildren?: (children: NestedTag[]) => ValidationResult; // ๐Ÿ”„ Transformation transformContent?: (content: string) => string; transformAttributes?: ( attributes?: Record<string, unknown> ) => T['attributes']; // โš™๏ธ Configuration allowChildren?: boolean; // Default: true allowSelfClosing?: boolean; // Default: true // ๐ŸŽฏ Event Handlers onStart?: (tagName: T['tagName'], attributes?: T['attributes']) => void; onContentUpdate?: (partialContent: string, tag: Partial<T>) => void; onComplete?: (tag: T) => void; onChildAdded?: (child: NestedTag, parent: T) => void; } ``` </details> ## ๐Ÿ”ง Configuration ### โš™๏ธ **Default Configuration** ```typescript const defaultConfig: ParserConfig = { maxBufferSize: 1024 * 1024, // 1MB buffer trimWhitespace: true, // Auto-trim content caseSensitive: true, // Case-sensitive tags enableNested: false, // Flat parsing mode maxDepth: 10, // Max nesting depth autoCloseUnclosed: false, // Strict tag closing batchSize: 1000, // Processing batch size enableStatistics: true, // Performance tracking }; ``` ### ๐ŸŽฏ **Configuration Examples** <details> <summary><strong>๐Ÿš€ OpenAI/ChatGPT Integration</strong></summary> ```typescript import { LLMStreamParser } from 'llm-stream-parser'; const chatGPTParser = new LLMStreamParser({ enableNested: true, // Support nested reasoning maxDepth: 8, // Reasonable nesting limit autoCloseUnclosed: true, // Forgive incomplete tags trimWhitespace: true, // Clean output caseSensitive: false, // Flexible tag matching }); // Define common ChatGPT tags chatGPTParser.addSimpleTags([ 'thinking', 'analysis', 'answer', 'code', 'explanation', 'summary', ]); // Handle streaming response chatGPTParser.on('tag_completed', tag => { console.log(`ChatGPT ${tag.tagName}:`, tag.content); }); ``` </details> <details> <summary><strong>๐Ÿง  Claude/Anthropic Integration</strong></summary> ```typescript const claudeParser = new LLMStreamParser({ enableNested: true, maxDepth: 12, // Claude can be deeply nested autoCloseUnclosed: true, // Handle incomplete responses maxBufferSize: 2 * 1024 * 1024, // Larger buffer for long responses }); // Claude-specific tags claudeParser.addSimpleTags([ 'thinking', 'reasoning', 'analysis', 'conclusion', 'example', 'code', 'explanation', 'clarification', ]); ``` </details> <details> <summary><strong>โšก High-Performance Setup</strong></summary> ```typescript const performanceParser = new LLMStreamParser({ maxBufferSize: 4 * 1024 * 1024, // 4MB buffer batchSize: 5000, // Large batch processing enableStatistics: false, // Disable overhead trimWhitespace: false, // Skip trimming enableNested: false, // Flat mode for speed }); // Minimal event handling for maximum speed performanceParser.on('tag_completed', tag => { // Process tags with minimal overhead processTag(tag.tagName, tag.content); }); ``` </details> ## ๐ŸŽจ Advanced Usage ### ๐Ÿค– **LLM Integration Patterns** <details> <summary><strong>๐Ÿ’ฌ Real-Time Chat Applications</strong></summary> ```typescript import { LLMStreamParser } from 'llm-stream-parser'; class AIChatHandler { private parser = new LLMStreamParser({ enableNested: true, autoCloseUnclosed: true, }); constructor() { this.setupParser(); } private setupParser() { // Define chat-specific tags this.parser.addSimpleTags([ 'thinking', 'response', 'code', 'explanation', 'question', ]); // Real-time UI updates this.parser.on('tag_started', tagName => { this.showTypingIndicator(tagName); }); this.parser.on('tag_content_update', (tagName, content) => { this.updateChatMessage(tagName, content); }); this.parser.on('tag_completed', tag => { this.finalizeChatMessage(tag); }); } async handleStreamingResponse(stream: ReadableStream) { const reader = stream.getReader(); try { while (true) { const { done, value } = await reader.read(); if (done) break; const chunk = new TextDecoder().decode(value); this.parser.parse(chunk); } } finally { this.parser.finalize(); reader.releaseLock(); } } private showTypingIndicator(tagName: string) { // Update UI to show AI is generating specific content type console.log(`๐Ÿค– AI is generating ${tagName}...`); } private updateChatMessage(tagName: string, content: string) { // Real-time content updates in chat interface document.getElementById(`${tagName}-content`).textContent = content; } private finalizeChatMessage(tag: any) { // Apply final formatting and save to chat history console.log(`โœ… Completed ${tag.tagName}:`, tag.content); } } ``` </details> <details> <summary><strong>๐Ÿ”ฌ Complex AI Reasoning Analysis</strong></summary> ```typescript import { LLMStreamParser, ContentValidators, ContentTransformers, } from 'llm-stream-parser'; class ReasoningAnalyzer { private parser = new LLMStreamParser({ enableNested: true, maxDepth: 15, }); constructor() { this.setupAdvancedTags(); } private setupAdvancedTags() { // Evidence tag with validation this.parser.defineTag({ tagName: 'evidence', validateAttributes: attrs => { if (!attrs?.source) return 'Evidence must have a source'; if ( !attrs?.strength || !['weak', 'medium', 'strong'].includes(attrs.strength as string) ) { return 'Evidence strength must be weak, medium, or strong'; } return true; }, transformContent: ContentTransformers.trim(), onComplete: tag => { this.analyzeEvidence(tag); }, }); // Analysis tag with confidence scoring this.parser.defineTag({ tagName: 'analysis', validateAttributes: attrs => { const confidence = attrs?.confidence as number; if (confidence && (confidence < 0 || confidence > 1)) { return 'Confidence must be between 0 and 1'; } return true; }, onComplete: tag => { this.trackAnalysisQuality(tag); }, }); // Conclusion with summary generation this.parser.defineTag({ tagName: 'conclusion', transformContent: ContentTransformers.chain( ContentTransformers.trim(), ContentTransformers.normalizeWhitespace() ), onComplete: tag => { this.generateSummary(tag); }, }); } private analyzeEvidence(tag: any) { const strength = tag.attributes?.strength; const source = tag.attributes?.source; console.log(`๐Ÿ“Š Evidence Analysis: Source: ${source} Strength: ${strength} Content: ${tag.content.substring(0, 100)}...`); } private trackAnalysisQuality(tag: any) { const confidence = tag.attributes?.confidence || 0.5; console.log(`๐ŸŽฏ Analysis Quality Score: ${(confidence * 100).toFixed(1)}%`); } private generateSummary(tag: any) { console.log(`๐Ÿ“ Conclusion Summary: ${tag.content}`); } } ``` </details> ## ๐Ÿ”ง Advanced Configuration ### Custom Validators ```typescript import { ContentValidators, AttributeValidators } from 'llm-stream-parser'; const parser = createParser(); parser.defineTag({ tagName: 'email', validateContent: ContentValidators.pattern( /^[^\s@]+@[^\s@]+\.[^\s@]+$/, 'Invalid email format' ), validateAttributes: AttributeValidators.required(['type']), transformContent: ContentTransformers.toLowerCase(), }); parser.defineTag({ tagName: 'phone', validateContent: ContentValidators.custom(content => { const cleaned = content.replace(/\D/g, ''); return cleaned.length >= 10 || 'Phone number must have at least 10 digits'; }), transformContent: content => content.replace(/\D/g, ''), }); ``` ### Custom Transformers ```typescript import { ContentTransformers } from 'llm-stream-parser'; // Create custom transformer const sanitizeHtml = ContentTransformers.custom( content => content.replace(/<script[^>]*>.*?<\/script>/gi, ''), 'Failed to sanitize HTML' ); parser.defineTag({ tagName: 'content', transformContent: ContentTransformers.chain( ContentTransformers.stripHtml(), sanitizeHtml, ContentTransformers.trim(), ContentTransformers.normalizeWhitespace() ), }); ``` ### Buffer Management ```typescript const parser = createParser({ maxBufferSize: 512 * 1024, // 512KB limit batchSize: 500, // Process in smaller batches }); // Monitor buffer usage parser.on('stats_updated', stats => { const usage = (stats.bufferSize / (512 * 1024)) * 100; if (usage > 80) { console.warn(`Buffer usage high: ${usage.toFixed(1)}%`); } }); ``` ## ๐Ÿ—๏ธ Architecture ### Modular Design The library follows a clean modular architecture with focused responsibilities: ``` src/ โ”œโ”€โ”€ types/ # Type definitions (6 modules) โ”‚ โ”œโ”€โ”€ base.ts # Core types (BaseTag, NestedTag, etc.) โ”‚ โ”œโ”€โ”€ config.ts # Configuration interfaces โ”‚ โ”œโ”€โ”€ errors.ts # Error handling types โ”‚ โ”œโ”€โ”€ events.ts # Event system types โ”‚ โ”œโ”€โ”€ schema.ts # Schema definition types โ”‚ โ””โ”€โ”€ index.ts # Type exports โ”œโ”€โ”€ core/ # Core functionality (5 modules) โ”‚ โ”œโ”€โ”€ buffer-manager.ts # Stream buffer management โ”‚ โ”œโ”€โ”€ tag-matcher.ts # XML pattern matching โ”‚ โ”œโ”€โ”€ validator.ts # Content validation โ”‚ โ”œโ”€โ”€ transformer.ts # Content transformation โ”‚ โ””โ”€โ”€ stream-parser.ts # Main parsing engine โ”œโ”€โ”€ llm-stream-parser.ts # User-facing API โ””โ”€โ”€ index.ts # Main exports ``` ### Design Principles - **๐Ÿ”จ Single Responsibility**: Each module has one focused purpose - **๐Ÿ“ฆ Modular**: Files kept under 300 lines for maintainability - **๐Ÿ›ก๏ธ Type Safety**: Full TypeScript coverage with strict typing - **โšก Performance**: Optimized for high-throughput streaming - **๐Ÿ”„ Extensibility**: Easy to extend with new functionality - **๐Ÿงช Testability**: Isolated modules for comprehensive testing ## ๐Ÿš€ Performance ### ๐Ÿ“Š **Benchmarks** <table> <tr> <td width="50%"> #### **๐ŸŽฏ Processing Speed** - **Throughput**: >10,000 tags/second - **Latency**: <1ms average completion - **Scalability**: Linear with content size - **Memory**: O(1) with buffer limits </td> <td width="50%"> #### **โšก Real-World Performance** - **OpenAI Streams**: 15-25ms processing - **Large Documents**: 500MB+ supported - **Concurrent Parsers**: 100+ instances - **Memory Usage**: <10MB typical </td> </tr> </table> ### ๐Ÿ”ง **Optimization Strategies** <details> <summary><strong>โšก High-Throughput Configuration</strong></summary> ```typescript // For maximum performance (>50,000 tags/sec) const speedParser = new LLMStreamParser({ maxBufferSize: 8 * 1024 * 1024, // 8MB buffer batchSize: 10000, // Large batches enableStatistics: false, // Disable overhead trimWhitespace: false, // Skip processing enableNested: false, // Flat mode only caseSensitive: false, // Faster matching }); // Minimal event handling speedParser.on('tag_completed', tag => { // Direct processing without logging processTagDirectly(tag.tagName, tag.content); }); ``` </details> <details> <summary><strong>๐Ÿ’พ Memory-Optimized Configuration</strong></summary> ```typescript // For memory-constrained environments (<1MB usage) const memoryParser = new LLMStreamParser({ maxBufferSize: 32 * 1024, // 32KB buffer batchSize: 50, // Small batches enableStatistics: true, // Monitor usage trimWhitespace: true, // Clean up immediately }); // Monitor memory usage memoryParser.on('stats_updated', stats => { if (stats.bufferSize > 24 * 1024) { console.warn('High memory usage detected'); } }); ``` </details> <details> <summary><strong>๐Ÿ”„ Production Deployment</strong></summary> ```typescript // Production-ready configuration const productionParser = new LLMStreamParser({ maxBufferSize: 2 * 1024 * 1024, // 2MB buffer batchSize: 1000, // Balanced batches enableStatistics: true, // Monitor performance autoCloseUnclosed: true, // Handle malformed input maxDepth: 20, // Support complex nesting enableNested: true, // Full feature set }); // Comprehensive error handling productionParser.on('parse_error', (error, context) => { logger.error('Parser error', { error, context }); metrics.increment('parser.errors'); }); // Performance monitoring productionParser.on('stats_updated', stats => { metrics.gauge('parser.buffer_size', stats.bufferSize); metrics.gauge('parser.tags_processed', stats.totalTagsParsed); }); ``` </details> ## ๐Ÿงช Testing ### Unit Tests ```typescript import { createParser } from 'llm-stream-parser'; describe('LLMStreamParser', () => { let parser: LLMStreamParser; beforeEach(() => { parser = createParser(); }); test('should parse simple tags', () => { parser.addSimpleTag('test'); const results: any[] = []; parser.on('tag_completed', tag => results.push(tag)); parser.parseComplete('<test>Hello World</test>'); expect(results).toHaveLength(1); expect(results[0].tagName).toBe('test'); expect(results[0].content).toBe('Hello World'); }); test('should handle streaming content', () => { parser.addSimpleTag('stream'); const results: any[] = []; parser.on('tag_completed', tag => results.push(tag)); parser.parse('<stream>Part 1 '); parser.parse('Part 2 '); parser.parse('Part 3</stream>'); parser.finalize(); expect(results[0].content).toBe('Part 1 Part 2 Part 3'); }); }); ``` ### Integration Tests ```typescript describe('Integration Tests', () => { test('should handle complex nested structure', () => { const parser = createParser({ enableNested: true }); parser.addSimpleTags(['doc', 'section', 'para']); const structure: any[] = []; parser.on('tag_closed', (tag, depth, path) => { structure.push({ tag: tag.tagName, depth, path }); }); parser.parseComplete(` <doc> <section> <para>Content</para> </section> </doc> `); expect(structure).toHaveLength(3); expect(structure[0].path).toBe('doc/section/para'); }); }); ``` ## ๐Ÿค Contributing ### ๐Ÿ› ๏ธ **Development Setup** ```bash # Clone the repository git clone https://github.com/brankrts/llm-stream-parser.git cd llm-stream-parser # Install dependencies npm install # Run tests npm test # Build the library npm run build # Run linting npm run lint # Watch mode for development npm run dev ``` ### ๐Ÿ“‹ **Code Standards** - โœ… **TypeScript** - Strict mode enabled - โœ… **ESLint** - Follow configuration rules - โœ… **File Size** - Keep modules under 300 lines - โœ… **Test Coverage** - Write comprehensive tests - โœ… **Documentation** - Document public APIs --- ## ๐Ÿ“„ License This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details. ## ๐Ÿ™ Acknowledgments - ๐Ÿง  **Inspired by** the need for real-time LLM response parsing - ๐Ÿ—๏ธ **Built with** modern TypeScript best practices - ๐Ÿค– **Designed for** the era of AI-powered applications ## ๐Ÿ“ž Support & Resources <div> | Resource | Link | | -------------------- | -------------------------------------------------------------- | | ๐Ÿ“– **Documentation** | [Wiki](https://github.com/brankrts/llm-stream-parser) | | ๐Ÿ› **Bug Reports** | [Issues](https://github.com/brankrts/llm-stream-parser/issues) | | ๐Ÿ“ง **Email** | [brankrts@gmail.com](mailto:brankrts@gmail.com) | </div> --- <div> ### ๐Ÿš€ **Ready to parse your LLM streams?** ```bash npm install llm-stream-parser ``` **[โญ Star this repository](https://github.com/brankrts/llm-stream-parser)** if you find it useful! _Built with โค๏ธ for the AI community_ </div>