@physics91/openrouter-mcp
Version: 
A Model Context Protocol (MCP) server for OpenRouter API with Collective Intelligence - Multi-model consensus, ensemble reasoning, and collaborative problem solving
310 lines (253 loc) • 8.82 kB
Markdown
# System Architecture Overview
This document describes the technical architecture of the OpenRouter MCP Server, designed with TDD principles and modular components.
## Table of Contents
- [System Overview](#system-overview)
- [Component Architecture](#component-architecture)
- [Data Flow](#data-flow)
- [Technology Stack](#technology-stack)
- [Design Patterns](#design-patterns)
- [Security Considerations](#security-considerations)
## System Overview
The OpenRouter MCP Server is a hybrid Node.js/Python application that implements the Model Context Protocol (MCP) to provide unified access to 200+ AI models through OpenRouter's API.
```
┌─────────────────────────────────────────────────┐
│                  Client Layer                    │
│  (Claude Desktop / Claude Code CLI / Custom)     │
└─────────────────┬───────────────────────────────┘
                  │ MCP Protocol
┌─────────────────▼───────────────────────────────┐
│              MCP Server Layer                    │
│            (FastMCP Framework)                   │
├──────────────────────────────────────────────────┤
│                Tool Handlers                     │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐        │
│  │   Chat   │ │Multimodal│ │Benchmark │        │
│  └──────────┘ └──────────┘ └──────────┘        │
├──────────────────────────────────────────────────┤
│              Service Layer                       │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐        │
│  │  Client  │ │  Cache   │ │ Metadata │        │
│  └──────────┘ └──────────┘ └──────────┘        │
├──────────────────────────────────────────────────┤
│            OpenRouter API Layer                  │
│         (External API Integration)               │
└──────────────────────────────────────────────────┘
```
## Component Architecture
### 1. CLI Interface (Node.js)
**Location**: `bin/`
- **openrouter-mcp.js**: Main CLI entry point
- **check-python.js**: Python environment validator
**Responsibilities**:
- User interaction and configuration
- Python server process management
- Claude Desktop/Code integration setup
### 2. MCP Server Core (Python)
**Location**: `src/openrouter_mcp/`
- **server.py**: FastMCP server initialization and tool registration
**Key Features**:
- Asynchronous request handling
- Tool registration and dispatch
- Error handling and logging
### 3. Tool Handlers
**Location**: `src/openrouter_mcp/handlers/`
#### Chat Handler (`chat.py`)
- Text-based chat completions
- Streaming response support
- Token usage tracking
#### Multimodal Handler (`multimodal.py`)
- Image processing with Pillow
- Base64 encoding/decoding
- Vision model selection
#### Benchmark Handler (`benchmark.py`)
- Parallel model execution
- Performance metric collection
- Report generation (MD/CSV/JSON)
### 4. Client Layer
**Location**: `src/openrouter_mcp/client/`
- **openrouter.py**: OpenRouter API client implementation
**Features**:
- HTTP/HTTPS request handling
- Rate limiting and retry logic
- Response parsing and validation
### 5. Models & Cache
**Location**: `src/openrouter_mcp/models/`
- **cache.py**: Dual-layer caching system
**Cache Architecture**:
```python
Memory Cache (LRU)
    ├── Model List
    ├── Model Metadata
    └── Recent Responses
File Cache (JSON)
    ├── openrouter_model_cache.json
    └── Timestamped entries
```
### 6. Configuration
**Location**: `src/openrouter_mcp/config/`
- **providers.json**: Provider metadata and categories
- **providers.py**: Configuration loader
## Data Flow
### Request Flow
```
1. Client Request → MCP Server
2. MCP Server → Tool Handler Selection
3. Tool Handler → Parameter Validation
4. Cache Check → Hit/Miss Decision
5. If Miss → OpenRouter API Call
6. Response Processing → Cache Update
7. Format Response → Client
```
### Benchmark Flow
```
1. Receive Benchmark Request
2. Parse Model List & Parameters
3. Initialize Benchmark Session
4. Parallel Execution:
   ├── Model 1 → Multiple Runs
   ├── Model 2 → Multiple Runs
   └── Model N → Multiple Runs
5. Collect Metrics
6. Generate Rankings
7. Create Reports (MD/CSV/JSON)
8. Return Results
```
## Technology Stack
### Core Technologies
- **Python 3.9+**: Server implementation
- **Node.js 16+**: CLI and package management
- **FastMCP**: MCP framework
- **httpx**: Async HTTP client
- **Pillow**: Image processing
### Development Stack
- **pytest**: Test framework (TDD)
- **ESLint**: JavaScript linting
- **Black**: Python code formatting
- **mypy**: Type checking
### Dependencies
```
Production:
├── fastmcp>=0.1.0
├── httpx>=0.24.0
├── Pillow>=10.0.0
├── python-dotenv>=1.0.0
└── pydantic>=2.0.0
Development:
├── pytest>=7.0.0
├── pytest-asyncio>=0.21.0
├── pytest-cov>=4.0.0
└── black>=23.0.0
```
## Design Patterns
### 1. Strategy Pattern
Used for different handler implementations:
```python
class BaseHandler(ABC):
    @abstractmethod
    async def handle(self, params): pass
class ChatHandler(BaseHandler):
    async def handle(self, params): ...
class MultimodalHandler(BaseHandler):
    async def handle(self, params): ...
```
### 2. Singleton Pattern
Cache manager implementation:
```python
class CacheManager:
    _instance = None
    
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
```
### 3. Factory Pattern
Model metadata creation:
```python
class ModelMetadataFactory:
    @staticmethod
    def create(model_id: str) -> ModelMetadata:
        # Enhanced metadata generation
        return ModelMetadata(...)
```
### 4. Observer Pattern
Benchmark progress tracking:
```python
class BenchmarkObserver:
    def update(self, event: BenchmarkEvent):
        # Handle progress updates
```
## Security Considerations
### API Key Management
- Environment variable storage
- No hardcoded credentials
- Secure transmission over HTTPS
### Input Validation
- Parameter type checking with Pydantic
- Size limits for image uploads
- Prompt injection prevention
### Rate Limiting
- Request throttling
- Exponential backoff
- Circuit breaker pattern
### Error Handling
- Sanitized error messages
- No sensitive data in logs
- Graceful degradation
## Performance Optimizations
### Caching Strategy
- Memory cache for frequent requests
- File cache for model metadata
- TTL-based invalidation
- LRU eviction policy
### Async Processing
- Non-blocking I/O operations
- Concurrent request handling
- Stream processing for large responses
### Resource Management
- Connection pooling
- Memory usage monitoring
- Automatic cleanup
## Testing Architecture
### Test Structure
```
tests/
├── unit/           # Isolated component tests
├── integration/    # Component interaction tests
├── e2e/           # End-to-end scenarios
└── fixtures/      # Test data and mocks
```
### TDD Workflow
1. Write failing test
2. Implement minimal code
3. Refactor with confidence
4. Maintain >80% coverage
## Deployment Architecture
### Local Development
```bash
python -m venv venv
pip install -r requirements-dev.txt
npm install
npm run dev
```
### Production Deployment
```bash
npm install -g openrouter-mcp
openrouter-mcp init
openrouter-mcp start --production
```
## Future Considerations
### Scalability
- Horizontal scaling with load balancing
- Distributed caching with Redis
- Queue-based job processing
### Monitoring
- Performance metrics collection
- Error tracking integration
- Usage analytics
### Extensions
- Plugin system for custom handlers
- WebSocket support for real-time updates
- GraphQL API layer
---
**Last Updated**: 2025-01-12
**Version**: 1.0.0