observability-analyzer

# Observability Dashboard Analyzer

Production-ready MCP Server for intelligent Loki log analysis and Grafana dashboard generation based on industry-standard monitoring methodologies (RED Method).

## 🎯 Key Features

- **Smart Analysis**: Analyzes Loki logs to recommend optimal dashboard configurations
- **RED Method Implementation**: Industry-standard Rate, Errors, Duration monitoring dashboards
- **Service Discovery**: Automatic detection of services from log data
- **Query Optimization**: Performance optimization suggestions for LogQL queries
- **Production Ready**: Handles authentication, errors, and enterprise deployment scenarios

## 🚀 Quick Start

### Installation

```bash
npm install -g observability-dashboard-analyzer
```

### Configuration

The tool uses multi-tier configuration (environment variables > config file > defaults):

1. **Environment Variables** (highest priority):

   ```bash
   export LOKI_URL=http://localhost:3100
   export LOKI_USERNAME=your-username
   export LOKI_PASSWORD=your-password
   ```

2. **Config File** (medium priority): Edit `~/.observability-analyzer/config.json`:

   ```json
   {
     "loki": {
       "url": "http://localhost:3100",
       "auth": {
         "type": "basic",
         "username": "your-username",
         "password": "your-password"
       }
     }
   }
   ```

### Usage with Claude Desktop

Add to your Claude Desktop MCP configuration:

```json
{
  "mcpServers": {
    "observability-analyzer": {
      "command": "npx",
      "args": ["observability-dashboard-analyzer"]
    }
  }
}
```

## 🔧 MCP Tools

### Core Analysis Tools

#### `analyze_loki_stack`

Analyzes your Loki setup and discovers services with log structure analysis.

```typescript
// Example usage in Claude
"Analyze my Loki logs for the last 24 hours"
```

**Returns:**

- Service discovery from log data
- Log structure quality assessment
- Available labels and error patterns
- Dashboard recommendations based on data structure

#### `generate_loki_dashboard`

Creates a production-ready Loki monitoring dashboard with service-specific panels.

```typescript
"Generate a dashboard for my payment-api and user-service"
```

**Features:**

- Log volume monitoring by service
- Error rate tracking with thresholds
- Log level distribution analysis
- Service health overview
- Error pattern detection

#### `validate_loki_queries`

Tests LogQL queries against a real Loki API to validate query performance.

```typescript
"Validate these LogQL queries for performance"
```

**Features:**

- Query syntax validation
- Performance optimization suggestions
- Success rate analysis
- Query execution testing

#### `export_loki_dashboard`

Exports a Loki dashboard to a Grafana-compatible JSON file.

```typescript
"Export dashboard for my services to a JSON file"
```

**Features:**

- Grafana-compatible JSON export
- Service-specific configurations
- Optimized LogQL queries
- Production-ready dashboard structure

## 🏗️ Architecture

### Research-Driven Design

The tool implements monitoring methodologies based on industry research:

- **RED Method** (Request rate, Error rate, Duration) - Universal microservices standard
- **Service Discovery** - Automatic detection from log labels and content
- **Log Structure Analysis** - Quality assessment for dashboard feasibility

### Log Analysis Algorithm

Dashboards are recommended based on log data analysis:

- **Service Discovery**: Automatic detection from service labels and JSON content
- **Log Structure Quality**: Assessment of structured logs and error patterns
- **Volume Analysis**: Log throughput and service activity measurement

### Query Optimization

The tool provides performance optimization suggestions:

- **LogQL Optimization**: Avoid regex wildcards, use exact string matching first
- **Label Filtering**: Use specific label selectors for better performance
- **Query Patterns**: Efficient time-based and service-based queries

## 🔒 Security & Authentication

Supported authentication methods:

- **Basic Auth**: Username/password authentication (most common)
- **No Auth**: For local development setups

## 📊 Dashboard Features

### Generated Dashboard Includes:

1. **Log Volume Monitoring**
   - Service-based log volume tracking
   - Request rate analysis from log data
   - Time-based volume patterns

2. **Error Rate Tracking**
   - Error detection from log patterns
   - Industry-standard thresholds (1% yellow, 5% red)
   - Error pattern analysis

3. **Service Health Overview**
   - Multi-service comparison
   - Log level distribution
   - Service activity monitoring

## 🧪 Testing

Run the comprehensive test suite:

```bash
# Run all tests
npm test

# Run with coverage
npm run test:coverage

# Run specific test suites
npm test -- red-method.test.ts
```

**Test Coverage:**

- Unit tests for dashboard generators
- Integration tests with mocked Loki APIs
- Query validation and optimization testing
- Configuration and authentication testing

## 🚀 Development

### Project Structure

```
src/
├── index.ts                    # MCP server entry point
├── config/
│   ├── ConfigManager.ts        # Multi-tier configuration
│   └── AuthHandler.ts          # Authentication handling
├── analyzers/
│   └── LokiAnalyzer.ts         # LogQL query generation & analysis
├── dashboards/
│   ├── REDMethodGenerator.ts   # RED method dashboards
│   └── GrafanaExporter.ts      # Dashboard export utilities
├── types/
│   ├── loki-api.ts             # Loki API types
│   ├── grafana-config.ts       # Grafana dashboard types
│   ├── monitoring-methods.ts   # Monitoring methodology types
│   └── config.ts               # Configuration types
└── __tests__/                  # Test suite
```

### Building

```bash
npm run build      # TypeScript compilation
npm run lint       # ESLint checking
npm run typecheck  # TypeScript type checking
```

## 📈 Success Metrics

- **Functional**: Generates working dashboards for common Loki setups
- **Performance**: Analyzes Loki logs in <30 seconds
- **Usability**: `npm install -g` → working in Claude within 5 minutes
- **Professional**: 90%+ test coverage + comprehensive TypeScript types

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Run tests: `npm test`
4. Commit changes: `git commit -m 'Add amazing feature'`
5. Push to branch: `git push origin feature/amazing-feature`
6. Open a Pull Request

## 📄 License

MIT License - see LICENSE file for details.

## 🔗 Links

- [RED Method Documentation](https://www.weave.works/blog/the-red-method-key-metrics-for-microservices-architecture/)
- [Loki Documentation](https://grafana.com/docs/loki/)
- [LogQL Documentation](https://grafana.com/docs/loki/latest/query/)

---

**Built for production observability teams who need immediate value from their Loki setup.** 🚀
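The configuration precedence described under Configuration (environment variables, then config file, then defaults) can be sketched roughly as below. This is a minimal illustration, not the project's actual `ConfigManager`: the `resolveLokiConfig` function, the `LokiConfig` and `FileConfig` types, and the default values are assumptions made for this example. Only the `LOKI_*` variable names and the config file shape come from this README.

```typescript
// Hypothetical sketch of "environment variables > config file > defaults".
// None of these names are taken from the project's source code.

interface LokiAuth {
  type: "basic" | "none";
  username?: string;
  password?: string;
}

interface LokiConfig {
  url: string;
  auth: LokiAuth;
}

// Shape of the optional config file shown in this README.
interface FileConfig {
  loki?: { url?: string; auth?: LokiAuth };
}

// Assumed defaults: a local Loki instance without authentication.
const DEFAULTS: LokiConfig = {
  url: "http://localhost:3100",
  auth: { type: "none" },
};

type Env = Record<string, string | undefined>;

function resolveLokiConfig(env: Env, file: FileConfig = {}): LokiConfig {
  // URL: the first defined source wins.
  const url = env.LOKI_URL ?? file.loki?.url ?? DEFAULTS.url;

  // Credentials from the environment override any file-based auth.
  if (env.LOKI_USERNAME !== undefined && env.LOKI_PASSWORD !== undefined) {
    return {
      url,
      auth: {
        type: "basic",
        username: env.LOKI_USERNAME,
        password: env.LOKI_PASSWORD,
      },
    };
  }

  // Otherwise fall back to the file, then to the defaults.
  return { url, auth: file.loki?.auth ?? DEFAULTS.auth };
}

// Usage: the environment supplies the URL, the file supplies credentials.
const resolved = resolveLokiConfig(
  { LOKI_URL: "http://loki.internal:3100" },
  {
    loki: {
      url: "http://localhost:3100",
      auth: { type: "basic", username: "u", password: "p" },
    },
  },
);
console.log(resolved.url); // http://loki.internal:3100
```

Resolving each field independently means a deployment can override just the URL through the environment while still reading credentials from the config file. In a real process, `env` would typically be `process.env`.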