3gpp-mcp-server

# Architecture Guide

## System Overview

The 3GPP MCP Server is designed as a layered architecture that transforms static 3GPP specifications into an intelligent, queryable knowledge base. The system follows the Model Context Protocol (MCP) specification while providing domain-specific intelligence for telecommunications standards.

## Architecture Principles

### 1. Layered Separation

- **Interface Layer**: MCP protocol handling
- **Intelligence Layer**: 3GPP domain logic
- **Data Layer**: Specification storage and retrieval

### 2. Protocol Compliance

- Full MCP specification compliance
- Standardized tool/resource/prompt interfaces
- Transport-agnostic design (STDIO primary)

### 3. Domain Expertise

- 3GPP-specific search and analysis
- Protocol-aware information extraction
- Contextual cross-referencing

### 4. Scalability & Performance

- Lazy loading of large datasets
- Efficient caching strategies
- Concurrent query handling

## High-Level Architecture

```mermaid
graph TB
    subgraph "Client Applications"
        CD[Claude Desktop]
        CA[Custom Applications]
        API[REST API Wrapper]
    end

    subgraph "MCP Protocol Layer"
        MCP[MCP Server Core]
        TH[Tools Handler]
        RH[Resources Handler]
        PH[Prompts Handler]
    end

    subgraph "3GPP Intelligence Layer"
        SM[Specification Manager]
        SE[Search Engine]
        DP[Document Processor]
        QP[Query Processor]
    end

    subgraph "Data Access Layer"
        FC[File Cache]
        MC[Metadata Cache]
        IDX[Search Index]
    end

    subgraph "Storage Layer"
        TS[TSpec-LLM Dataset]
        CF[Configuration Files]
        LOG[Log Files]
    end

    CD --> MCP
    CA --> MCP
    API --> MCP

    MCP --> TH
    MCP --> RH
    MCP --> PH

    TH --> SM
    TH --> SE
    RH --> SM
    PH --> DP

    SM --> FC
    SE --> IDX
    DP --> MC
    QP --> FC

    FC --> TS
    MC --> TS
    IDX --> TS

    SM --> CF
    SE --> LOG
```

## Component Architecture

### MCP Server Core

**File**: `src/index.ts`

**Role**: Primary MCP protocol implementation

```typescript
class GPPMCPServer {
  private server: MCP.Server
  private specManager: GPPSpecificationManager
  private searchEngine: GPPSearchEngine
  private documentProcessor: GPPDocumentProcessor
}
```

**Responsibilities**:

- MCP protocol message handling
- Request routing to appropriate handlers
- Error handling and response formatting
- Concurrent request management

**Key Design Decisions**:

- **Single-threaded**: Node.js event loop for concurrency
- **Stateless**: Each request is independent
- **Error-first**: Comprehensive error handling
- **Typed**: Full TypeScript coverage

### Specification Manager

**File**: `src/utils/specification-manager.ts`

**Role**: 3GPP specification metadata management

```typescript
interface SpecificationManager {
  initialize(): Promise<void>
  getSpecification(id: string): Promise<GPPSpecification | null>
  getAllSpecifications(): Promise<GPPSpecification[]>
  getCatalog(): Promise<CatalogSummary>
}
```

**Architecture**:

- **Lazy Initialization**: Load specifications on first access
- **Memory-Efficient**: Stream large directories
- **Metadata Extraction**: Parse specification metadata
- **Caching Strategy**: LRU cache for frequently accessed specs

**Data Flow**:

```mermaid
sequenceDiagram
    participant Client
    participant SM as Specification Manager
    participant FS as File System
    participant Cache

    Client->>SM: getSpecification(id)
    SM->>Cache: check cache
    alt Cache Hit
        Cache-->>SM: specification
    else Cache Miss
        SM->>FS: load specification file
        FS-->>SM: file content
        SM->>SM: parse metadata
        SM->>Cache: store specification
    end
    SM-->>Client: specification
```

### Search Engine

**File**: `src/utils/search-engine.ts`

**Role**: Intelligent search across 3GPP specifications

**Search Architecture**:

```typescript
interface SearchEngine {
  search(query: SearchQuery): Promise<SearchResult[]>
  findRelatedSpecifications(topic: string): Promise<GPPSpecification[]>
  queryProtocol(name: string, type: QueryType): Promise<string>
}
```

**Multi-Stage Search Process**:

1. **Query Analysis**:

   ```typescript
   parseQuery(query: string) {
     // Extract technical terms
     // Identify 3GPP concepts
     // Expand abbreviations
     // Generate search terms
   }
   ```

2. **Filtering**:

   ```typescript
   applyFilters(specs: GPPSpecification[], filters: SearchFilters) {
     // Filter by series (24, 36, 38, etc.)
     // Filter by release (Rel-15, Rel-16, etc.)
     // Apply keyword constraints
   }
   ```

3. **Relevance Scoring**:

   ```typescript
   calculateRelevance(spec: GPPSpecification, terms: string[]): number {
     let score = 0;
     score += titleMatch * 0.4;      // Title keyword matches
     score += keywordMatch * 0.3;    // Metadata keywords
     score += abstractMatch * 0.2;   // Abstract content
     score += contextBoost * 0.1;    // Domain context
     return Math.min(score, 1.0);
   }
   ```

4. **Result Ranking**:

   ```typescript
   rankResults(results: SearchResult[]): SearchResult[] {
     return results
       .sort((a, b) => b.relevanceScore - a.relevanceScore)
       .slice(0, limit);
   }
   ```

### Document Processor

**File**: `src/utils/document-processor.ts`

**Role**: Content extraction and processing

**Processing Pipeline**:

```mermaid
graph LR
    A[Raw Document] --> B[Format Detection]
    B --> C{Format Type}
    C -->|Markdown| D[MD Parser]
    C -->|DOCX| E[DOCX Parser]
    D --> F[Content Extraction]
    E --> F
    F --> G[Metadata Parsing]
    G --> H[Section Analysis]
    H --> I[Keyword Extraction]
    I --> J[Summary Generation]
    J --> K[Processed Content]
```

**Content Analysis**:

- **Section Identification**: Extract document structure
- **Technical Term Recognition**: Identify 3GPP-specific terminology
- **Cross-Reference Extraction**: Find specification references
- **Summary Generation**: Create structured summaries

## Data Architecture

### TSpec-LLM Integration

**Dataset Structure**:

```
TSpec-LLM/
├── Rel-8/
│   ├── series-21/
│   │   ├── 21.101.md
│   │   └── 21.102.md
│   ├── series-22/
│   └── ...
├── Rel-9/
└── ...
```

**File Organization Strategy**:

- **Hierarchical**: Organized by release and series
- **Consistent Naming**: Predictable file naming patterns
- **Preserved Structure**: Original document structure maintained
- **Format Standardization**: Markdown with LaTeX formulas

### Memory Management

**Specification Loading**:

```typescript
class SpecificationCache {
  private cache: LRU<string, GPPSpecification>
  private maxSize: number = 1000

  async get(id: string): Promise<GPPSpecification | null> {
    // Check memory cache first
    let spec = this.cache.get(id);

    if (!spec) {
      // Load from disk
      spec = await this.loadFromDisk(id);
      if (spec) {
        this.cache.set(id, spec);
      }
    }

    return spec;
  }
}
```

**Memory Usage Patterns**:

- **Metadata Always**: Keep specification metadata in memory
- **Content On-Demand**: Load full content only when requested
- **LRU Eviction**: Remove least recently used content
- **Size Limits**: Configurable memory limits

## Protocol Integration

### MCP Implementation Details

**Tool Registration**:

```typescript
server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: 'search_3gpp_specs',
        description: 'Search through 3GPP specifications',
        inputSchema: SearchQuerySchema.jsonSchema()
      }
      // ... other tools
    ]
  };
});
```

**Request Handling**:

```typescript
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  try {
    switch (name) {
      case 'search_3gpp_specs':
        return await this.handleSearchSpecs(args);
      // ... other handlers
    }
  } catch (error) {
    return this.formatError(error);
  }
});
```

### Transport Layer

**STDIO Transport** (Primary):

```typescript
const transport = new StdioServerTransport();
await server.connect(transport);
```

**Alternative Transports** (Future):

- **HTTP**: REST API wrapper
- **WebSocket**: Real-time applications
- **gRPC**: High-performance integrations

## Performance Architecture

### Concurrency Model

**Request Processing**:

```mermaid
sequenceDiagram
    participant C1 as Client 1
    participant C2 as Client 2
    participant S as Server
    participant SE as Search Engine
    participant SM as Spec Manager

    C1->>S: search_3gpp_specs
    C2->>S: get_specification_details

    par Concurrent Processing
        S->>SE: search(query1)
    and
        S->>SM: getSpec(id)
    end

    SE-->>S: search results
    SM-->>S: specification

    S-->>C1: search response
    S-->>C2: spec details
```

**Async Processing**:

- **Non-blocking I/O**: File operations are asynchronous
- **Promise-based**: All operations return Promises
- **Concurrent Limits**: Configurable concurrent operation limits
- **Backpressure**: Automatic throttling under load

### Caching Strategy

**Multi-Level Caching**:

```typescript
interface CacheArchitecture {
  L1: InMemoryCache<string, GPPSpecification>    // Hot specifications
  L2: InMemoryCache<string, SearchResult[]>      // Search results
  L3: FileSystemCache<string, ProcessedContent>  // Processed content
}
```

**Cache Policies**:

- **TTL**: Time-to-live for search results (configurable)
- **LRU**: Least Recently Used for specification cache
- **Write-through**: Immediate persistence for processed content
- **Cache Warming**: Preload frequently accessed specifications

### Indexing Strategy

**Search Index Architecture**:

```typescript
interface SearchIndex {
  metadata: Map<string, SpecMetadata>      // Fast metadata lookup
  keywords: Map<string, string[]>          // Keyword to spec mapping
  fullText: Map<string, DocumentChunk[]>   // Full-text search index (future)
  relationships: Graph<string, string>     // Spec relationships (future)
}
```

## Security Architecture

### Data Access Security

**File System Access**:

- **Read-only**: No modification of source specifications
- **Path Validation**: Prevent directory traversal attacks
- **Sandboxing**: Restrict file access to dataset directory

**Input Validation**:

```typescript
const SearchQuerySchema = z.object({
  query: z.string().min(1).max(1000),    // Prevent DoS
  series: z.array(z.string()).max(10),   // Limit array size
  limit: z.number().min(1).max(50)       // Reasonable limits
});
```

### Resource Protection

**Memory Limits**:

- **Query Complexity**: Limit concurrent complex searches
- **Response Size**: Cap maximum response size
- **Timeout Protection**: Timeout long-running operations

**Rate Limiting** (Future):

```typescript
interface RateLimiter {
  checkRate(clientId: string, operation: string): boolean
  incrementCount(clientId: string, operation: string): void
}
```

## Error Handling Architecture

### Error Categories

**Validation Errors**:

```typescript
class ValidationError extends Error {
  constructor(field: string, message: string) {
    super(`Validation error in ${field}: ${message}`);
    this.name = 'ValidationError';
  }
}
```

**System Errors**:

```typescript
class SystemError extends Error {
  constructor(operation: string, cause: Error) {
    super(`System error in ${operation}: ${cause.message}`);
    this.name = 'SystemError';
    this.cause = cause;
  }
}
```

### Error Recovery

**Graceful Degradation**:

- **Mock Data Fallback**: Use mock data when dataset unavailable
- **Partial Results**: Return available results even if some operations fail
- **Retry Logic**: Automatic retry for transient failures

**Error Reporting**:

```typescript
class ErrorReporter {
  report(error: Error, context: RequestContext): void {
    // Log error with context
    // Update metrics
    // Alert if necessary
  }
}
```

## Monitoring Architecture

### Metrics Collection

**Performance Metrics**:

```typescript
interface ServerMetrics {
  requestCount: Counter
  responseTime: Histogram
  errorRate: Counter
  activeConnections: Gauge
  memoryUsage: Gauge
  cacheHitRate: Gauge
}
```

**Business Metrics**:

```typescript
interface BusinessMetrics {
  searchQueries: Counter
  specificationsAccessed: Counter
  popularSeries: Counter
  queryComplexity: Histogram
}
```

### Health Monitoring

**Health Checks**:

```typescript
interface HealthCheck {
  checkDatasetAccess(): Promise<boolean>
  checkMemoryUsage(): Promise<boolean>
  checkResponseTime(): Promise<boolean>
  checkCacheStatus(): Promise<boolean>
}
```

This architecture provides a robust, scalable foundation for the 3GPP MCP Server while maintaining the flexibility to evolve and add new capabilities as requirements grow.
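
The relevance scoring and result ranking stages of the Search Engine leave `titleMatch`, `keywordMatch`, `abstractMatch`, `contextBoost`, and `limit` undefined. The following is a minimal, self-contained sketch of one way those values might be computed, keeping the 0.4 / 0.3 / 0.2 / 0.1 weights from the guide. The `SpecLike` shape, the `matchFraction` helper, and the `preferredSeries` parameter are assumptions made for illustration and are not taken from the server's source.

```typescript
// Hypothetical minimal shape; the real GPPSpecification type is not shown in this guide.
interface SpecLike {
  title: string;
  keywords: string[];
  abstract: string;
  series: string;
}

// Assumed helper: fraction of search terms found in the text (case-insensitive substring match).
function matchFraction(text: string, terms: string[]): number {
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((t) => haystack.includes(t.toLowerCase())).length;
  return hits / terms.length;
}

// Weighted sum using the weights listed in the guide, capped at 1.0.
function calculateRelevance(
  spec: SpecLike,
  terms: string[],
  preferredSeries: string[] = [] // assumption: "domain context" modeled as a preferred-series hint
): number {
  const titleMatch = matchFraction(spec.title, terms);
  const keywordMatch = matchFraction(spec.keywords.join(" "), terms);
  const abstractMatch = matchFraction(spec.abstract, terms);
  const contextBoost = preferredSeries.includes(spec.series) ? 1 : 0;

  const score =
    titleMatch * 0.4 +
    keywordMatch * 0.3 +
    abstractMatch * 0.2 +
    contextBoost * 0.1;
  return Math.min(score, 1.0);
}

// Sort a copy by descending score and keep the top `limit` entries.
function rankResults<T extends { relevanceScore: number }>(results: T[], limit: number): T[] {
  return [...results]
    .sort((a, b) => b.relevanceScore - a.relevanceScore)
    .slice(0, limit);
}

// Usage: score two illustrative specs for the term "rrc" and keep the best one.
const exampleSpecs: SpecLike[] = [
  { title: "NR; Radio Resource Control (RRC)", keywords: ["RRC", "NR"], abstract: "RRC protocol for NR.", series: "38" },
  { title: "Vocabulary for 3GPP Specifications", keywords: ["vocabulary"], abstract: "Terms and abbreviations.", series: "21" },
];
const exampleRanked = rankResults(
  exampleSpecs.map((spec) => ({ spec, relevanceScore: calculateRelevance(spec, ["rrc"], ["38"]) })),
  1
);
console.log(exampleRanked[0].spec.title);
```

Copying the array before sorting keeps `rankResults` free of side effects on the caller's list, and passing `limit` explicitly avoids relying on surrounding state. Substring matching is only a stand-in here; a real implementation would likely tokenize text and expand abbreviations as described in the Query Analysis stage.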