# Architecture Guide
## System Overview
The 3GPP MCP Server uses a layered architecture to turn static 3GPP specification documents into a searchable knowledge base. It implements the Model Context Protocol (MCP) and adds domain-specific logic for telecommunications standards on top of it.
## Architecture Principles
### 1. Layered Separation
- **Interface Layer**: MCP protocol handling
- **Intelligence Layer**: 3GPP domain logic
- **Data Layer**: Specification storage and retrieval
### 2. Protocol Compliance
- Full MCP specification compliance
- Standardized tool/resource/prompt interfaces
- Transport-agnostic design (STDIO primary)
### 3. Domain Expertise
- 3GPP-specific search and analysis
- Protocol-aware information extraction
- Contextual cross-referencing
### 4. Scalability & Performance
- Lazy loading of large datasets
- Efficient caching strategies
- Concurrent query handling
## High-Level Architecture
```mermaid
graph TB
    subgraph "Client Applications"
        CD[Claude Desktop]
        CA[Custom Applications]
        API[REST API Wrapper]
    end
    subgraph "MCP Protocol Layer"
        MCP[MCP Server Core]
        TH[Tools Handler]
        RH[Resources Handler]
        PH[Prompts Handler]
    end
    subgraph "3GPP Intelligence Layer"
        SM[Specification Manager]
        SE[Search Engine]
        DP[Document Processor]
        QP[Query Processor]
    end
    subgraph "Data Access Layer"
        FC[File Cache]
        MC[Metadata Cache]
        IDX[Search Index]
    end
    subgraph "Storage Layer"
        TS[TSpec-LLM Dataset]
        CF[Configuration Files]
        LOG[Log Files]
    end
    CD --> MCP
    CA --> MCP
    API --> MCP
    MCP --> TH
    MCP --> RH
    MCP --> PH
    TH --> SM
    TH --> SE
    RH --> SM
    PH --> DP
    SM --> FC
    SE --> IDX
    DP --> MC
    QP --> FC
    FC --> TS
    MC --> TS
    IDX --> TS
    SM --> CF
    SE --> LOG
```
## Component Architecture
### MCP Server Core
**File**: `src/index.ts`
**Role**: Primary MCP protocol implementation
```typescript
class GPPMCPServer {
  private server: MCP.Server
  private specManager: GPPSpecificationManager
  private searchEngine: GPPSearchEngine
  private documentProcessor: GPPDocumentProcessor
}
```
**Responsibilities**:
- MCP protocol message handling
- Request routing to appropriate handlers
- Error handling and response formatting
- Concurrent request management
**Key Design Decisions**:
- **Single-threaded**: Node.js event loop for concurrency
- **Stateless**: Each request is independent
- **Error-first**: Comprehensive error handling
- **Typed**: Full TypeScript coverage
### Specification Manager
**File**: `src/utils/specification-manager.ts`
**Role**: 3GPP specification metadata management
```typescript
interface SpecificationManager {
  initialize(): Promise<void>
  getSpecification(id: string): Promise<GPPSpecification | null>
  getAllSpecifications(): Promise<GPPSpecification[]>
  getCatalog(): Promise<CatalogSummary>
}
```
**Architecture**:
- **Lazy Initialization**: Load specifications on first access
- **Memory-Efficient**: Stream large directories
- **Metadata Extraction**: Parse specification metadata
- **Caching Strategy**: LRU cache for frequently accessed specs
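One way to realize lazy initialization is to memoize the initialization promise, so the dataset is scanned at most once even when several requests arrive before loading finishes. The following is a minimal sketch, not the server's actual code: `SpecStub`, `LazySpecManager`, and the injected `loadAll` loader are hypothetical names introduced for illustration.

```typescript
// Hypothetical minimal record; the real GPPSpecification fields are not shown in this guide.
interface SpecStub {
  id: string;
  title: string;
}

class LazySpecManager {
  private initPromise: Promise<void> | null = null;
  private specs = new Map<string, SpecStub>();
  private loadAll: () => Promise<SpecStub[]>;

  // `loadAll` stands in for scanning the dataset directory.
  constructor(loadAll: () => Promise<SpecStub[]>) {
    this.loadAll = loadAll;
  }

  // Start loading on the first call; later calls reuse the same promise,
  // so concurrent callers never trigger a second load.
  initialize(): Promise<void> {
    if (this.initPromise === null) {
      this.initPromise = this.loadAll().then((loaded) => {
        for (const spec of loaded) this.specs.set(spec.id, spec);
      });
    }
    return this.initPromise;
  }

  async getSpecification(id: string): Promise<SpecStub | null> {
    await this.initialize();
    return this.specs.get(id) ?? null;
  }
}

// Usage: two initialize() calls share one load.
let loadCalls = 0;
const lazyManager = new LazySpecManager(async () => {
  loadCalls += 1;
  return [{ id: "23.501", title: "System architecture for the 5G System (5GS)" }];
});
const firstInit = lazyManager.initialize();
const secondInit = lazyManager.initialize();
```

A caveat of this pattern: if the loader rejects, the rejected promise stays cached, so a production version would probably reset `initPromise` on failure to allow a retry.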
**Data Flow**:
```mermaid
sequenceDiagram
    participant Client
    participant SM as Specification Manager
    participant FS as File System
    participant Cache
    Client->>SM: getSpecification(id)
    SM->>Cache: check cache
    alt Cache Hit
        Cache-->>SM: specification
    else Cache Miss
        SM->>FS: load specification file
        FS-->>SM: file content
        SM->>SM: parse metadata
        SM->>Cache: store specification
    end
    SM-->>Client: specification
```
### Search Engine
**File**: `src/utils/search-engine.ts`
**Role**: Intelligent search across 3GPP specifications
**Search Architecture**:
```typescript
interface SearchEngine {
  search(query: SearchQuery): Promise<SearchResult[]>
  findRelatedSpecifications(topic: string): Promise<GPPSpecification[]>
  queryProtocol(name: string, type: QueryType): Promise<string>
}
```
**Multi-Stage Search Process**:
1. **Query Analysis**:
```typescript
parseQuery(query: string) {
  // Extract technical terms
  // Identify 3GPP concepts
  // Expand abbreviations
  // Generate search terms
}
```
2. **Filtering**:
```typescript
applyFilters(specs: GPPSpecification[], filters: SearchFilters) {
  // Filter by series (24, 36, 38, etc.)
  // Filter by release (Rel-15, Rel-16, etc.)
  // Apply keyword constraints
}
```
3. **Relevance Scoring**:
```typescript
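calculateRelevance(spec: GPPSpecification, terms: string[]): number {
  // titleMatch, keywordMatch, abstractMatch, and contextBoost are per-field
  // match scores derived from `spec` and `terms` (computation not shown);
  // they are assumed to lie in [0, 1], so the weights sum to at most 1.0
  let score = 0;
  score += titleMatch * 0.4;     // Title keyword matches
  score += keywordMatch * 0.3;   // Metadata keywords
  score += abstractMatch * 0.2;  // Abstract content
  score += contextBoost * 0.1;   // Domain context
  return Math.min(score, 1.0);
}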
```
4. **Result Ranking**:
```typescript
rankResults(results: SearchResult[], limit: number): SearchResult[] {
  // Copy before sorting so the caller's array is not reordered in place
  return [...results]
    .sort((a, b) => b.relevanceScore - a.relevanceScore)
    .slice(0, limit);
}
```
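The four stages above can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the server's implementation: `SpecRecord`, the tiny abbreviation table, and the substring-based `matchFraction` helper are all hypothetical, and the 0.1 context boost is left out because the guide does not define how it is computed.

```typescript
// Hypothetical minimal record; the guide does not define GPPSpecification's fields.
interface SpecRecord {
  id: string;
  series: string;
  title: string;
  keywords: string[];
  abstract: string;
}

// Tiny illustrative abbreviation table, not a complete 3GPP glossary.
const ABBREVIATIONS = new Map<string, string[]>([
  ["nas", ["non-access", "stratum"]],
  ["rrc", ["radio", "resource", "control"]],
]);

// Stage 1: lowercase, tokenize, and expand known abbreviations.
function parseQueryTerms(query: string): string[] {
  const terms = query.toLowerCase().split(/[^a-z0-9.-]+/).filter((t) => t.length > 0);
  const expanded: string[] = [];
  for (const term of terms) {
    expanded.push(term, ...(ABBREVIATIONS.get(term) ?? []));
  }
  return Array.from(new Set(expanded));
}

// Stage 2: keep only specs in the requested series (no filter keeps everything).
function filterBySeries(specs: SpecRecord[], series?: string[]): SpecRecord[] {
  if (!series || series.length === 0) return specs;
  return specs.filter((s) => series.includes(s.series));
}

// Fraction of query terms that occur in a piece of text, in [0, 1].
function matchFraction(text: string, terms: string[]): number {
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  return terms.filter((t) => haystack.includes(t)).length / terms.length;
}

// Stage 3: weighted sum using the 0.4 / 0.3 / 0.2 weights from the guide.
function scoreSpec(spec: SpecRecord, terms: string[]): number {
  const score =
    matchFraction(spec.title, terms) * 0.4 +
    matchFraction(spec.keywords.join(" "), terms) * 0.3 +
    matchFraction(spec.abstract, terms) * 0.2;
  return Math.min(score, 1.0);
}

// Stage 4: drop non-matches, sort by descending score, truncate to `limit`.
function searchSpecs(specs: SpecRecord[], query: string, series?: string[], limit = 10) {
  const terms = parseQueryTerms(query);
  return filterBySeries(specs, series)
    .map((spec) => ({ spec, relevanceScore: scoreSpec(spec, terms) }))
    .filter((r) => r.relevanceScore > 0)
    .sort((a, b) => b.relevanceScore - a.relevanceScore)
    .slice(0, limit);
}

const sampleSpecs: SpecRecord[] = [
  {
    id: "38.331",
    series: "38",
    title: "NR; Radio Resource Control (RRC); Protocol specification",
    keywords: ["rrc", "nr"],
    abstract: "RRC protocol for the NR radio interface.",
  },
  {
    id: "24.501",
    series: "24",
    title: "Non-Access-Stratum (NAS) protocol for 5G System (5GS); Stage 3",
    keywords: ["nas", "5gs"],
    abstract: "NAS procedures for 5GS.",
  },
];
const rrcHits = searchSpecs(sampleSpecs, "RRC");
```

Substring matching keeps the sketch short, but it can produce false positives on short terms; a tokenized or stemmed comparison would be a natural refinement.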
### Document Processor
**File**: `src/utils/document-processor.ts`
**Role**: Content extraction and processing
**Processing Pipeline**:
```mermaid
graph LR
A[Raw Document] --> B[Format Detection]
B --> C{Format Type}
C -->|Markdown| D[MD Parser]
C -->|DOCX| E[DOCX Parser]
D --> F[Content Extraction]
E --> F
F --> G[Metadata Parsing]
G --> H[Section Analysis]
H --> I[Keyword Extraction]
I --> J[Summary Generation]
J --> K[Processed Content]
```
**Content Analysis**:
- **Section Identification**: Extract document structure
- **Technical Term Recognition**: Identify 3GPP-specific terminology
- **Cross-Reference Extraction**: Find specification references
- **Summary Generation**: Create structured summaries
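Two of the analysis steps above, section identification and cross-reference extraction, lend themselves to a short regex-based sketch. The function names and patterns here are assumptions made for illustration: the reference pattern only recognizes the `TS NN.NNN` / `TR NN.NNN` citation form, and the heading scan does not skip `#` lines inside fenced code blocks.

```typescript
// Matches references such as "TS 23.501", "TR 38.901", or "TS 36.101-1".
const SPEC_REFERENCE = /\b(?:TS|TR)\s+\d{2}\.\d{3}(?:-\d+)?/g;

// Return the distinct specification references in a text, sorted alphabetically.
function extractSpecReferences(text: string): string[] {
  const matches = text.match(SPEC_REFERENCE) ?? [];
  // Collapse runs of whitespace so "TS  23.501" and "TS 23.501" count as one reference.
  const normalized = matches.map((m) => m.replace(/\s+/g, " "));
  return Array.from(new Set(normalized)).sort();
}

interface SectionHeading {
  level: number; // 1 for "#", 2 for "##", and so on
  title: string;
}

// Collect Markdown ATX headings in document order.
function extractHeadings(markdown: string): SectionHeading[] {
  const headings: SectionHeading[] = [];
  for (const line of markdown.split("\n")) {
    const m = /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (m) headings.push({ level: m[1].length, title: m[2] });
  }
  return headings;
}

const sampleDoc = [
  "# 5 Procedures",
  "## 5.1 Registration",
  "See 3GPP TS 23.501 and TS 23.502. TR 38.901 is informative; TS  23.501 again.",
].join("\n");
const sampleRefs = extractSpecReferences(sampleDoc);
const sampleHeadings = extractHeadings(sampleDoc);
```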
## Data Architecture
### TSpec-LLM Integration
**Dataset Structure**:
```
TSpec-LLM/
├── Rel-8/
│ ├── series-21/
│ │ ├── 21.101.md
│ │ └── 21.102.md
│ ├── series-22/
│ └── ...
├── Rel-9/
└── ...
```
**File Organization Strategy**:
- **Hierarchical**: Organized by release and series
- **Consistent Naming**: Predictable file naming patterns
- **Preserved Structure**: Original document structure maintained
- **Format Standardization**: Markdown with LaTeX formulas
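Because the naming is predictable, release, series, and specification number can be recovered from a file's relative path alone. The sketch below assumes exactly the layout shown in the tree above (`<release>/series-<NN>/<NN.NNN>.md`); the real dataset's naming may differ, and `parseSpecPath` and `SpecLocation` are hypothetical names.

```typescript
interface SpecLocation {
  release: string; // e.g. "Rel-8"
  series: string;  // e.g. "21"
  specId: string;  // e.g. "21.101"
}

// Assumed layout: <release>/series-<NN>/<NN.NNN>.md
const SPEC_PATH = /^(Rel-\d+)\/series-(\d{2})\/(\d{2}\.\d{3}(?:-\d+)?)\.md$/;

function parseSpecPath(relativePath: string): SpecLocation | null {
  // Accept Windows-style separators by normalizing them first.
  const m = SPEC_PATH.exec(relativePath.replace(/\\/g, "/"));
  if (!m) return null;
  const [, release, series, specId] = m;
  // Reject files stored under a series directory that does not match their number.
  if (!specId.startsWith(`${series}.`)) return null;
  return { release, series, specId };
}

const parsedLocation = parseSpecPath("Rel-8/series-21/21.101.md");
```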
### Memory Management
**Specification Loading**:
```typescript
class SpecificationCache {
  // LRU is assumed to be an LRU map type whose get() returns undefined on a miss
  private cache: LRU<string, GPPSpecification>
  private maxSize: number = 1000

  async get(id: string): Promise<GPPSpecification | null> {
    // Check memory cache first; normalize a miss to null to match the return type
    let spec = this.cache.get(id) ?? null;
    if (!spec) {
      // Load from disk
      spec = await this.loadFromDisk(id);
      if (spec) {
        this.cache.set(id, spec);
      }
    }
    return spec;
  }
}
```
**Memory Usage Patterns**:
- **Metadata Always**: Keep specification metadata in memory
- **Content On-Demand**: Load full content only when requested
- **LRU Eviction**: Remove least recently used content
- **Size Limits**: Configurable memory limits
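LRU eviction with a size limit can be built on a plain `Map`, which iterates keys in insertion order. The `LruMap` class below is a minimal sketch for illustration, not the cache the server uses; it bounds the number of entries rather than bytes of memory.

```typescript
class LruMap<K, V> {
  private entries = new Map<K, V>();
  private maxSize: number;

  constructor(maxSize: number) {
    if (maxSize < 1) throw new RangeError("maxSize must be at least 1");
    this.maxSize = maxSize;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}

const contentCache = new LruMap<string, string>(2);
contentCache.set("23.501", "content of 23.501");
contentCache.set("23.502", "content of 23.502");
contentCache.get("23.501");                      // touch: 23.502 is now the oldest entry
contentCache.set("38.331", "content of 38.331"); // exceeds the limit, evicts 23.502
```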
## Protocol Integration
### MCP Implementation Details
**Tool Registration**:
```typescript
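server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: 'search_3gpp_specs',
        description: 'Search through 3GPP specifications',
        // Zod schemas have no jsonSchema() method; zodToJsonSchema is assumed
        // to be imported from the zod-to-json-schema package
        inputSchema: zodToJsonSchema(SearchQuerySchema)
      }
      // ... other tools
    ]
  };
});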
```
**Request Handling**:
```typescript
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  try {
    switch (name) {
      case 'search_3gpp_specs':
        return await this.handleSearchSpecs(args);
      // ... other handlers
      default:
        // Without a default case, an unknown tool name would resolve to undefined
        throw new Error(`Unknown tool: ${name}`);
    }
  } catch (error) {
    return this.formatError(error);
  }
});
```
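The routing and error-formatting logic can be exercised without the MCP SDK. The sketch below uses a handler table instead of a `switch` and returns errors in the MCP tool-result shape (text content plus `isError: true`). `dispatchTool`, `formatToolError`, and the placeholder handler body are hypothetical, and the handlers are synchronous here only to keep the example short; real handlers would be `async`.

```typescript
// Shape of an MCP tool result: content blocks plus an optional isError flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

type ToolHandler = (args: Record<string, unknown>) => ToolResult;

// Hypothetical handler table; only search_3gpp_specs is named in this guide.
const toolHandlers = new Map<string, ToolHandler>();
toolHandlers.set("search_3gpp_specs", (args) => ({
  content: [{ type: "text", text: `Searching for: ${String(args["query"])}` }],
}));

// Convert any thrown value into a tool result the client can display.
function formatToolError(error: unknown): ToolResult {
  const message = error instanceof Error ? error.message : String(error);
  return { content: [{ type: "text", text: `Error: ${message}` }], isError: true };
}

// Route a call by tool name; unknown tools and handler failures become error results.
function dispatchTool(toolName: string, args: Record<string, unknown> = {}): ToolResult {
  try {
    const handler = toolHandlers.get(toolName);
    if (!handler) throw new Error(`Unknown tool: ${toolName}`);
    return handler(args);
  } catch (error) {
    return formatToolError(error);
  }
}

const okResult = dispatchTool("search_3gpp_specs", { query: "NAS" });
const unknownResult = dispatchTool("no_such_tool");
```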
### Transport Layer
**STDIO Transport** (Primary):
```typescript
const transport = new StdioServerTransport();
await server.connect(transport);
```
**Alternative Transports** (Future):
- **HTTP**: REST API wrapper
- **WebSocket**: Real-time applications
- **gRPC**: High-performance integrations
## Performance Architecture
### Concurrency Model
**Request Processing**:
```mermaid
sequenceDiagram
    participant C1 as Client 1
    participant C2 as Client 2
    participant S as Server
    participant SE as Search Engine
    participant SM as Spec Manager
    C1->>S: search_3gpp_specs
    C2->>S: get_specification_details
    par Concurrent Processing
        S->>SE: search(query1)
    and
        S->>SM: getSpec(id)
    end
    SE-->>S: search results
    SM-->>S: specification
    S-->>C1: search response
    S-->>C2: spec details
```
**Async Processing**:
- **Non-blocking I/O**: File operations are asynchronous
- **Promise-based**: All operations return Promises
- **Concurrent Limits**: Configurable concurrent operation limits
- **Backpressure**: Automatic throttling under load
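A concurrency limit with simple backpressure can be sketched as a small promise queue: at most `maxConcurrent` tasks run at once, and the rest wait in FIFO order. `ConcurrencyLimiter` is a hypothetical class written for illustration; the guide does not say how the server enforces its limits.

```typescript
class ConcurrencyLimiter {
  private active = 0;
  private waiting: Array<() => void> = [];
  private maxConcurrent: number;

  constructor(maxConcurrent: number) {
    if (maxConcurrent < 1) throw new RangeError("maxConcurrent must be at least 1");
    this.maxConcurrent = maxConcurrent;
  }

  get activeCount(): number {
    return this.active;
  }

  get queuedCount(): number {
    return this.waiting.length;
  }

  // Run `task` now if a slot is free, otherwise queue it until one frees up.
  run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        this.active += 1;
        // Promise.resolve().then(task) also captures synchronous throws from `task`.
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .finally(() => {
            this.active -= 1;
            const next = this.waiting.shift();
            if (next) next();
          });
      };
      if (this.active < this.maxConcurrent) start();
      else this.waiting.push(start);
    });
  }
}

// Usage sketch: allow at most two searches to touch the disk at once.
const searchLimiter = new ConcurrencyLimiter(2);
```

A call such as `searchLimiter.run(() => searchEngine.search(query))` would then wait its turn whenever two searches are already in flight (`searchEngine` standing in for the server's search component).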
### Caching Strategy
**Multi-Level Caching**:
```typescript
interface CacheArchitecture {
  L1: InMemoryCache<string, GPPSpecification>     // Hot specifications
  L2: InMemoryCache<string, SearchResult[]>       // Search results
  L3: FileSystemCache<string, ProcessedContent>   // Processed content
}
```
**Cache Policies**:
- **TTL**: Time-to-live for search results (configurable)
- **LRU**: Least Recently Used for specification cache
- **Write-through**: Immediate persistence for processed content
- **Cache Warming**: Preload frequently accessed specifications
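The TTL policy for search results can be sketched as a map whose entries carry an expiry timestamp and are dropped lazily on read. `TtlCache` and the `"rrc|38"` key format are assumptions for illustration; the clock is injectable so that expiry can be demonstrated without waiting.

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop lazily on read
      return undefined;
    }
    return entry.value;
  }
}

// Usage with a fake clock: results stay cached for 60 seconds.
let fakeNow = 0;
const resultCache = new TtlCache<string[]>(60000, () => fakeNow);
resultCache.set("rrc|38", ["38.331"]);
fakeNow = 59999; // still within the TTL
const freshHit = resultCache.get("rrc|38");
fakeNow = 60000; // TTL elapsed
const staleHit = resultCache.get("rrc|38");
```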
### Indexing Strategy
**Search Index Architecture**:
```typescript
interface SearchIndex {
  metadata: Map<string, SpecMetadata>      // Fast metadata lookup
  keywords: Map<string, string[]>          // Keyword to spec mapping
  fullText: Map<string, DocumentChunk[]>   // Full-text search index (future)
  relationships: Graph<string, string>     // Spec relationships (future)
}
```
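The `keywords` map is an inverted index from keyword to specification IDs. A minimal sketch of building and querying it follows; `IndexedSpec`, `buildKeywordIndex`, and `lookupAll` are hypothetical names, and the AND semantics of the lookup are an assumption rather than documented behavior.

```typescript
interface IndexedSpec {
  id: string;
  keywords: string[];
}

// Build keyword -> spec IDs, normalizing keywords to trimmed lowercase.
function buildKeywordIndex(specs: IndexedSpec[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const spec of specs) {
    for (const raw of spec.keywords) {
      const keyword = raw.trim().toLowerCase();
      if (keyword === "") continue;
      const ids = index.get(keyword) ?? [];
      if (!ids.includes(spec.id)) ids.push(spec.id);
      index.set(keyword, ids);
    }
  }
  return index;
}

// Return the IDs of specs that carry every requested keyword (AND semantics).
function lookupAll(index: Map<string, string[]>, keywords: string[]): string[] {
  if (keywords.length === 0) return [];
  const lists = keywords.map((k) => index.get(k.trim().toLowerCase()) ?? []);
  return lists.reduce((acc, ids) => acc.filter((id) => ids.includes(id)));
}

const keywordIndex = buildKeywordIndex([
  { id: "38.331", keywords: ["RRC", "NR"] },
  { id: "36.331", keywords: ["RRC", "LTE"] },
  { id: "24.501", keywords: ["NAS", "5GS"] },
]);
const rrcSpecs = lookupAll(keywordIndex, ["rrc"]);
const nrRrcSpecs = lookupAll(keywordIndex, ["RRC", "NR"]);
```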
## Security Architecture
### Data Access Security
**File System Access**:
- **Read-only**: No modification of source specifications
- **Path Validation**: Prevent directory traversal attacks
- **Sandboxing**: Restrict file access to dataset directory
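A first line of defense against directory traversal is to normalize a requested path lexically and refuse anything absolute or containing `..` before it reaches the file system. The helper below is a hedged sketch with a hypothetical name (`toSafeRelativePath`); it does not follow symlinks, so a production check would likely also resolve the real path and confirm it stays under the dataset root.

```typescript
// Returns a normalized dataset-relative path, or null if the request is unsafe.
function toSafeRelativePath(requested: string): string | null {
  if (requested.includes("\0")) return null; // NUL bytes can truncate paths in native APIs
  const segments = requested.split(/[\\/]+/);
  // A leading separator or a drive letter means the path is absolute.
  if (segments[0] === "" || /^[A-Za-z]:$/.test(segments[0])) return null;
  const safe: string[] = [];
  for (const segment of segments) {
    if (segment === "" || segment === ".") continue; // harmless no-op segments
    if (segment === "..") return null;               // refuse any parent reference
    safe.push(segment);
  }
  return safe.length > 0 ? safe.join("/") : null;
}

const acceptedPath = toSafeRelativePath("Rel-8/./series-21//21.101.md");
const rejectedPath = toSafeRelativePath("Rel-8/../../etc/passwd");
```

Rejecting every `..` segment outright is stricter than resolving it, but it keeps the rule easy to audit: no accepted path can ever point above the directory it is joined to.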
**Input Validation**:
```typescript
const SearchQuerySchema = z.object({
  query: z.string().min(1).max(1000),    // Prevent DoS
  series: z.array(z.string()).max(10),   // Limit array size
  limit: z.number().min(1).max(50)       // Reasonable limits
});
```
### Resource Protection
**Resource Limits**:
- **Query Complexity**: Limit concurrent complex searches
- **Response Size**: Cap maximum response size
- **Timeout Protection**: Timeout long-running operations
**Rate Limiting** (Future):
```typescript
interface RateLimiter {
  checkRate(clientId: string, operation: string): boolean
  incrementCount(clientId: string, operation: string): void
}
```
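Since rate limiting is listed as future work, the following is only one possible shape for it: a fixed-window counter that implements the `RateLimiter` interface above. The class name, the per-window limit, and the injectable clock are all assumptions made for this sketch.

```typescript
interface RateLimiter {
  checkRate(clientId: string, operation: string): boolean;
  incrementCount(clientId: string, operation: string): void;
}

class FixedWindowRateLimiter implements RateLimiter {
  private windows = new Map<string, { windowStart: number; count: number }>();
  private maxPerWindow: number;
  private windowMs: number;
  private now: () => number;

  constructor(maxPerWindow: number, windowMs: number, now: () => number = () => Date.now()) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Return the live window for a key, starting a fresh one if the old one has ended.
  private currentWindow(key: string): { windowStart: number; count: number } {
    const time = this.now();
    const existing = this.windows.get(key);
    if (existing && time - existing.windowStart < this.windowMs) return existing;
    const fresh = { windowStart: time, count: 0 };
    this.windows.set(key, fresh);
    return fresh;
  }

  checkRate(clientId: string, operation: string): boolean {
    return this.currentWindow(`${clientId}:${operation}`).count < this.maxPerWindow;
  }

  incrementCount(clientId: string, operation: string): void {
    this.currentWindow(`${clientId}:${operation}`).count += 1;
  }
}

// Usage with a fake clock: two searches per second for each client.
let fakeClockMs = 0;
const searchRateLimiter = new FixedWindowRateLimiter(2, 1000, () => fakeClockMs);
const allowed: boolean[] = [];
for (let i = 0; i < 3; i++) {
  const ok = searchRateLimiter.checkRate("client-a", "search");
  allowed.push(ok);
  if (ok) searchRateLimiter.incrementCount("client-a", "search");
}
fakeClockMs = 1000; // a new window starts
allowed.push(searchRateLimiter.checkRate("client-a", "search"));
// allowed is [true, true, false, true]
```

Fixed windows allow short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of a little more state.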
## Error Handling Architecture
### Error Categories
**Validation Errors**:
```typescript
class ValidationError extends Error {
  constructor(field: string, message: string) {
    super(`Validation error in ${field}: ${message}`);
    this.name = 'ValidationError';
  }
}
```
**System Errors**:
```typescript
class SystemError extends Error {
  constructor(operation: string, cause: Error) {
    super(`System error in ${operation}: ${cause.message}`);
    this.name = 'SystemError';
    this.cause = cause;
  }
}
```
### Error Recovery
**Graceful Degradation**:
- **Mock Data Fallback**: Use mock data when dataset unavailable
- **Partial Results**: Return available results even if some operations fail
- **Retry Logic**: Automatic retry for transient failures
**Error Reporting**:
```typescript
class ErrorReporter {
  report(error: Error, context: RequestContext): void {
    // Log error with context
    // Update metrics
    // Alert if necessary
  }
}
```
## Monitoring Architecture
### Metrics Collection
**Performance Metrics**:
```typescript
interface ServerMetrics {
  requestCount: Counter
  responseTime: Histogram
  errorRate: Counter
  activeConnections: Gauge
  memoryUsage: Gauge
  cacheHitRate: Gauge
}
```
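The `Counter`, `Histogram`, and `Gauge` types are not defined in this guide. As a rough sketch of what they might look like in memory, the example below implements a counter, a bucketed histogram, and a derived cache hit rate; all class and function names are hypothetical, and a real deployment would more likely use an existing metrics library.

```typescript
class SimpleCounter {
  private total = 0;

  inc(by = 1): void {
    this.total += by;
  }

  get value(): number {
    return this.total;
  }
}

// Counts observations per bucket; the last bucket catches values above every bound.
class BucketHistogram {
  private bounds: number[];
  private counts: number[];

  constructor(upperBounds: number[]) {
    this.bounds = [...upperBounds].sort((a, b) => a - b);
    this.counts = new Array<number>(this.bounds.length + 1).fill(0);
  }

  observe(value: number): void {
    let bucket = this.bounds.findIndex((bound) => value <= bound);
    if (bucket === -1) bucket = this.bounds.length; // overflow bucket
    this.counts[bucket] += 1;
  }

  snapshot(): number[] {
    return [...this.counts];
  }
}

// A gauge-style value derived from two counters; 0 when there were no lookups.
function cacheHitRate(hits: SimpleCounter, misses: SimpleCounter): number {
  const lookups = hits.value + misses.value;
  return lookups === 0 ? 0 : hits.value / lookups;
}

const cacheHits = new SimpleCounter();
const cacheMisses = new SimpleCounter();
cacheHits.inc(3);
cacheMisses.inc();

const responseTimeMs = new BucketHistogram([10, 100, 1000]);
for (const ms of [5, 50, 50, 5000]) responseTimeMs.observe(ms);
```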
**Business Metrics**:
```typescript
interface BusinessMetrics {
  searchQueries: Counter
  specificationsAccessed: Counter
  popularSeries: Counter
  queryComplexity: Histogram
}
```
### Health Monitoring
**Health Checks**:
```typescript
interface HealthCheck {
  checkDatasetAccess(): Promise<boolean>
  checkMemoryUsage(): Promise<boolean>
  checkResponseTime(): Promise<boolean>
  checkCacheStatus(): Promise<boolean>
}
```
This layered design keeps MCP protocol handling, 3GPP domain logic, and data access separate, so each layer can evolve independently as new capabilities such as full-text search, alternative transports, and rate limiting are added.