@physics91/openrouter-mcp

Version:

A Model Context Protocol (MCP) server for OpenRouter API with Collective Intelligence - Multi-model consensus, ensemble reasoning, and collaborative problem solving

github.com/yourusername/openrouter-mcp

yourusername/openrouter-mcp

310 lines (253 loc) • 8.82 kB

Markdown

# System Architecture Overview This document describes the technical architecture of the OpenRouter MCP Server, designed with TDD principles and modular components. ## Table of Contents - [System Overview](#system-overview) - [Component Architecture](#component-architecture) - [Data Flow](#data-flow) - [Technology Stack](#technology-stack) - [Design Patterns](#design-patterns) - [Security Considerations](#security-considerations) ## System Overview The OpenRouter MCP Server is a hybrid Node.js/Python application that implements the Model Context Protocol (MCP) to provide unified access to 200+ AI models through OpenRouter's API. ``` ┌─────────────────────────────────────────────────┐ │ Client Layer │ │ (Claude Desktop / Claude Code CLI / Custom) │ └─────────────────┬───────────────────────────────┘ │ MCP Protocol ┌─────────────────▼───────────────────────────────┐ │ MCP Server Layer │ │ (FastMCP Framework) │ ├──────────────────────────────────────────────────┤ │ Tool Handlers │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │ Chat │ │Multimodal│ │Benchmark │ │ │ └──────────┘ └──────────┘ └──────────┘ │ ├──────────────────────────────────────────────────┤ │ Service Layer │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │ Client │ │ Cache │ │ Metadata │ │ │ └──────────┘ └──────────┘ └──────────┘ │ ├──────────────────────────────────────────────────┤ │ OpenRouter API Layer │ │ (External API Integration) │ └──────────────────────────────────────────────────┘ ``` ## Component Architecture ### 1. CLI Interface (Node.js) **Location**: `bin/` - **openrouter-mcp.js**: Main CLI entry point - **check-python.js**: Python environment validator **Responsibilities**: - User interaction and configuration - Python server process management - Claude Desktop/Code integration setup ### 2. MCP Server Core (Python) **Location**: `src/openrouter_mcp/` - **server.py**: FastMCP server initialization and tool registration **Key Features**: - Asynchronous request handling - Tool registration and dispatch - Error handling and logging ### 3. Tool Handlers **Location**: `src/openrouter_mcp/handlers/` #### Chat Handler (`chat.py`) - Text-based chat completions - Streaming response support - Token usage tracking #### Multimodal Handler (`multimodal.py`) - Image processing with Pillow - Base64 encoding/decoding - Vision model selection #### Benchmark Handler (`benchmark.py`) - Parallel model execution - Performance metric collection - Report generation (MD/CSV/JSON) ### 4. Client Layer **Location**: `src/openrouter_mcp/client/` - **openrouter.py**: OpenRouter API client implementation **Features**: - HTTP/HTTPS request handling - Rate limiting and retry logic - Response parsing and validation ### 5. Models & Cache **Location**: `src/openrouter_mcp/models/` - **cache.py**: Dual-layer caching system **Cache Architecture**: ```python Memory Cache (LRU) ├── Model List ├── Model Metadata └── Recent Responses File Cache (JSON) ├── openrouter_model_cache.json └── Timestamped entries ``` ### 6. Configuration **Location**: `src/openrouter_mcp/config/` - **providers.json**: Provider metadata and categories - **providers.py**: Configuration loader ## Data Flow ### Request Flow ``` 1. Client Request → MCP Server 2. MCP Server → Tool Handler Selection 3. Tool Handler → Parameter Validation 4. Cache Check → Hit/Miss Decision 5. If Miss → OpenRouter API Call 6. Response Processing → Cache Update 7. Format Response → Client ``` ### Benchmark Flow ``` 1. Receive Benchmark Request 2. Parse Model List & Parameters 3. Initialize Benchmark Session 4. Parallel Execution: ├── Model 1 → Multiple Runs ├── Model 2 → Multiple Runs └── Model N → Multiple Runs 5. Collect Metrics 6. Generate Rankings 7. Create Reports (MD/CSV/JSON) 8. Return Results ``` ## Technology Stack ### Core Technologies - **Python 3.9+**: Server implementation - **Node.js 16+**: CLI and package management - **FastMCP**: MCP framework - **httpx**: Async HTTP client - **Pillow**: Image processing ### Development Stack - **pytest**: Test framework (TDD) - **ESLint**: JavaScript linting - **Black**: Python code formatting - **mypy**: Type checking ### Dependencies ``` Production: ├── fastmcp>=0.1.0 ├── httpx>=0.24.0 ├── Pillow>=10.0.0 ├── python-dotenv>=1.0.0 └── pydantic>=2.0.0 Development: ├── pytest>=7.0.0 ├── pytest-asyncio>=0.21.0 ├── pytest-cov>=4.0.0 └── black>=23.0.0 ``` ## Design Patterns ### 1. Strategy Pattern Used for different handler implementations: ```python class BaseHandler(ABC): @abstractmethod async def handle(self, params): pass class ChatHandler(BaseHandler): async def handle(self, params): ... class MultimodalHandler(BaseHandler): async def handle(self, params): ... ``` ### 2. Singleton Pattern Cache manager implementation: ```python class CacheManager: _instance = None def __new__(cls): if cls._instance is None: cls._instance = super().__new__(cls) return cls._instance ``` ### 3. Factory Pattern Model metadata creation: ```python class ModelMetadataFactory: @staticmethod def create(model_id: str) -> ModelMetadata: # Enhanced metadata generation return ModelMetadata(...) ``` ### 4. Observer Pattern Benchmark progress tracking: ```python class BenchmarkObserver: def update(self, event: BenchmarkEvent): # Handle progress updates ``` ## Security Considerations ### API Key Management - Environment variable storage - No hardcoded credentials - Secure transmission over HTTPS ### Input Validation - Parameter type checking with Pydantic - Size limits for image uploads - Prompt injection prevention ### Rate Limiting - Request throttling - Exponential backoff - Circuit breaker pattern ### Error Handling - Sanitized error messages - No sensitive data in logs - Graceful degradation ## Performance Optimizations ### Caching Strategy - Memory cache for frequent requests - File cache for model metadata - TTL-based invalidation - LRU eviction policy ### Async Processing - Non-blocking I/O operations - Concurrent request handling - Stream processing for large responses ### Resource Management - Connection pooling - Memory usage monitoring - Automatic cleanup ## Testing Architecture ### Test Structure ``` tests/ ├── unit/ # Isolated component tests ├── integration/ # Component interaction tests ├── e2e/ # End-to-end scenarios └── fixtures/ # Test data and mocks ``` ### TDD Workflow 1. Write failing test 2. Implement minimal code 3. Refactor with confidence 4. Maintain >80% coverage ## Deployment Architecture ### Local Development ```bash python -m venv venv pip install -r requirements-dev.txt npm install npm run dev ``` ### Production Deployment ```bash npm install -g openrouter-mcp openrouter-mcp init openrouter-mcp start --production ``` ## Future Considerations ### Scalability - Horizontal scaling with load balancing - Distributed caching with Redis - Queue-based job processing ### Monitoring - Performance metrics collection - Error tracking integration - Usage analytics ### Extensions - Plugin system for custom handlers - WebSocket support for real-time updates - GraphQL API layer --- **Last Updated**: 2025-01-12 **Version**: 1.0.0