apple-hig-mcp

High-performance MCP server providing instant access to Apple's Human Interface Guidelines via hybrid static/dynamic content delivery

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is an Apple Human Interface Guidelines MCP (Model Context Protocol) server that provides AI-powered access to Apple's design guidelines. It scrapes content from Apple's HIG website and serves it through MCP resources and tools for AI assistants like Claude.

## Development Commands

### Build and Test

- `npm run build` - Compile TypeScript to JavaScript in `dist/`
- `npm run clean:build` - Clean and rebuild the project
- `npm test` - Run Jest test suite
- `npm run test:watch` - Run tests in watch mode
- `npm run lint` - Run ESLint on TypeScript files
- `npm run lint:fix` - Fix linting issues automatically

### Development

- `npm run dev` - Start development server using tsx
- `npm start` - Run compiled server from `dist/`
- `npm run health-check` - Test scraper functionality
- `npm run generate-content` - Generate static HIG content files (full discovery + enhanced keyword search)
- `npm run generate-content:offline` - Fast offline generation (14 core sections, keyword search only)
- `npm run validate-content` - Validate generated content

### Testing with MCP Inspector

```bash
npx @modelcontextprotocol/inspector dist/server.js
```

## Architecture Overview

The project uses a hybrid static/dynamic architecture with static content generation and live scraping fallback:

### Core Components

1. **AppleHIGMCPServer** (`src/server.ts`) - Main MCP server entry point
   - Coordinates all components and handles MCP protocol communication
   - Sets up request handlers for resources and tools
   - Manages graceful startup/shutdown
   - Initializes static content provider with fallback to scraping

2. **HIGStaticContentProvider** (`src/static-content.ts`) - Primary content source
   - Loads pre-generated markdown files from `content/` directory
   - Provides instant responses (no scraping delays)
   - Uses pre-built search indices for fast queries
   - Falls back to scraper if static content unavailable

3. **HIGScraper** (`src/scraper.ts`) - Fallback web scraping engine
   - Respectful scraping with rate limiting (1 second delays)
   - Intelligent fallback content when Apple's SPA fails to load
   - Maintains curated list of known HIG sections (~65 URLs)
   - Converts HTML to clean markdown format

4. **HIGCache** (`src/cache.ts`) - Smart caching layer (for scraping)
   - TTL-based caching with graceful degradation
   - Backup cache entries for offline resilience
   - Two-tier caching: fresh data + stale fallback data
   - Methods: `getWithGracefulFallback()`, `setWithGracefulDegradation()`

5. **HIGResourceProvider** (`src/resources.ts`) - MCP Resources implementation
   - Serves structured content via URIs like `hig://ios`, `hig://ios/buttons`
   - Platform-specific and category-specific resource organization
   - Prefers static content, falls back to scraping
   - Generates comprehensive content with proper Apple attribution

6. **HIGToolsService** (`src/services/tools.service.ts`) - MCP Tools implementation with enhanced keyword search
   - Interactive search with advanced keyword matching and intent recognition
   - Four main tools: `search_guidelines`, `get_component_spec`, `get_design_tokens`, `get_accessibility_requirements`
   - Multi-factor relevance scoring (keyword + structure + context + synonym expansion)
   - Enhanced keyword search with synonym expansion and intelligent matching
   - Optimized for fast response times without external model dependencies

7. **EnhancedKeywordSearchService** (`src/services/enhanced-keyword-search.service.ts`) - Advanced search capabilities
   - Sophisticated keyword matching with synonym expansion and stemming
   - Query analysis with intent recognition and entity extraction
   - Multi-dimensional relevance scoring with configurable weights
   - Support for contextual search across Apple platform design patterns without external dependencies

8. **ContentProcessor** (`src/services/content-processor.service.ts`) - Content processing pipeline
   - HTML to markdown conversion using Turndown.js (images removed for MCP efficiency)
   - Structured content extraction (overview, guidelines, examples, specifications)
   - Quality validation with comprehensive scoring and SLA monitoring
   - Apple-specific content pattern recognition and enhancement

### Data Flow

```
MCP Client → AppleHIGMCPServer → HIGResourceProvider/HIGToolsService
                                        ↓
                          HIGStaticContentProvider (primary)
                                        ↓ (fallback)
                          HIGScraper → HIGCache → Apple's Website

Search Flow:
Query → EnhancedKeywordSearchService → Advanced Keyword Matching + Synonym Expansion
                                        ↓
              Multi-factor Scoring (keyword + synonym + structure + context)
                                        ↓
              Ranked Results (with intent recognition and boost factors)
```

### Content Generation and Processing

```
GitHub Action (every 4 months) → ContentGenerator → Enhanced Content Processing Pipeline
                                        ↓
              ContentProcessor (Turndown.js + Structure Extraction)
                                        ↓
                    Quality Validation + SLA Monitoring
                                        ↓
          Markdown Files + Search Indices + Enhanced Keyword Indexes
                                        ↓
                              content/ directory
                                        ↓
          HIGStaticContentProvider + EnhancedKeywordSearchService
```

### Key Patterns

**Static-First with Fallback**: The system prioritizes pre-generated static content for performance and reliability, falling back to live scraping only when static content is unavailable.

**Graceful Degradation**: Multiple fallback layers ensure availability - static content → cached scraping → live scraping → contextual fallback content.

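
The layered fallback described above can be pictured as an ordered list of content sources tried one after another. The following is a minimal sketch under stated assumptions, not the project's actual code: `ContentSource`, `firstAvailable`, and the stub layers are hypothetical names introduced only for illustration.

```typescript
// Hypothetical sketch: none of these names come from the repository.
type ContentSource = {
  name: string;
  // Resolves to markdown content, or null when this layer has nothing.
  load: (sectionId: string) => Promise<string | null>;
};

// Try each layer in order and return the first non-empty result.
async function firstAvailable(
  sources: ContentSource[],
  sectionId: string,
): Promise<{ source: string; content: string }> {
  for (const { name, load } of sources) {
    try {
      const content = await load(sectionId);
      if (content !== null && content.trim() !== "") {
        return { source: name, content };
      }
    } catch {
      // A failing layer is skipped so the next layer can answer.
    }
  }
  throw new Error(`No content available for "${sectionId}"`);
}

// Stub layers in the order named above: static, cache, scraper, fallback.
const layers: ContentSource[] = [
  { name: "static", load: async () => null }, // no pre-generated file
  { name: "cache", load: async () => null }, // cache miss
  {
    name: "scraper",
    load: async () => {
      throw new Error("SPA placeholder returned");
    },
  },
  { name: "fallback", load: async (id) => `# ${id}\n\nFallback guidance.` },
];
```

Calling `firstAvailable(layers, "buttons")` with these stubs resolves from the last layer, because the first two return `null` and the third throws. Keeping the order in a plain array makes it easy to see and test which layer answered.
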
**Performance Optimization**: Static content provides instant responses (no scraping delays) and scales to unlimited concurrent users.

**Enhanced Keyword Search**: Multi-factor relevance scoring combines advanced keyword matching with synonym expansion, content structure analysis, and contextual relevance for superior search results.

**Intent Recognition**: Query analysis extracts user intent (find_component, find_guideline, compare_platforms, etc.) and entities (components, platforms, properties) for more accurate results.

**Optimized Performance**: The system uses fast keyword-based search with intelligent synonym expansion and relevance scoring, providing consistent performance without external model dependencies.

**Respectful Scraping**: Rate limiting, appropriate user agents, and fallback to known URLs when Apple's SPA architecture prevents dynamic discovery.

**Attribution Compliance**: All content includes proper Apple attribution and fair use notices.

## Platform Support

The server supports all Apple platforms with specific categories:

- **Platforms**: iOS, macOS, watchOS, tvOS, visionOS, universal
- **Categories**: foundations, layout, navigation, presentation, selection-and-input, status, system-capabilities, visual-design, icons-and-images, color-and-materials, typography, motion, technologies

## Testing Strategy

### Unit Tests Structure

- `__tests__/cache.test.ts` - Cache functionality and TTL behavior
- `__tests__/scraper.test.ts` - Web scraping and content parsing
- `__tests__/resources.test.ts` - MCP resource generation
- `__tests__/tools.test.ts` - MCP tool functionality
- `__tests__/server.test.ts` - Integration testing

### Mocking

- `__mocks__/node-fetch.ts` - HTTP request mocking for tests
- Tests should mock external dependencies and focus on business logic

## Content Management

### Static Content Generation

The system uses GitHub Actions to generate optimized static content:

**Content Structure:**

```
content/
├── platforms/           # Platform-specific markdown files
│   ├── ios/
│   ├── macos/
│   └── ...
├── metadata/            # Search indices and metadata
│   ├── search-index.json
│   ├── cross-references.json
│   └── generation-info.json
```

**Generation Process:**

1. **Scrape all known HIG URLs** (~65 sections across all platforms)
2. **Process to AI-friendly markdown** with front matter metadata
3. **Generate search indices** for fast querying
4. **Create cross-references** between related sections
5. **Download and optimize images**

**Scheduled Updates:**

- **Every 4 months** via GitHub Action
- **Manual triggers** for immediate updates
- **Content validation** ensures quality and completeness

### Fallback Content Strategy (for scraping)

When Apple's website returns JavaScript placeholders, the scraper uses contextual fallback content:

- Button guidelines → `getButtonFallbackContent()`
- Navigation → `getNavigationFallbackContent()`
- Color → `getColorFallbackContent()`
- Typography → `getTypographyFallbackContent()`
- Layout → `getLayoutFallbackContent()`
- General → `getFallbackContent()`

### Known Sections Management

The content generator maintains a curated list of ~65 core HIG sections in `discoverSections()`. When adding new sections:

1. Add to the `knownSections` array with proper platform/category classification
2. Test the URL accessibility
3. Regenerate static content with `npm run generate-content`

## Error Handling

The system uses multiple layers of error resilience:

1. **Graceful cache degradation** - serves stale content when fresh fetches fail
2. **Fallback content** - contextual content when scraping fails completely
3. **MCP error wrapping** - proper error codes for the MCP protocol
4. **Retry logic** - 3 attempts with exponential backoff for network requests

## Configuration

### Scraping Configuration

- Rate limiting: 1000ms between requests
- Timeout: 10 seconds per request
- Retry attempts: 3 with exponential backoff
- User agent: Educational/development purpose identification

### Cache Configuration

- Default TTL: 1 hour for normal content
- Resource list cache: 2 hours
- Section content cache: 2 hours
- Graceful degradation: 24x longer TTL for backup entries

## Static Content vs Live Scraping

### Performance Comparison

- **Static Content**: Instant responses, unlimited concurrency
- **Live Scraping**: 1-10 second delays, rate limited to 30 req/min

### Reliability Comparison

- **Static Content**: 99.9% availability, immune to Apple website changes
- **Live Scraping**: Dependent on Apple website availability and structure

### Content Freshness

- **Static Content**: Updated every 4 months (sufficient for HIG changes)
- **Live Scraping**: Real-time but often returns stale cached content

### When to Use Each

- **Static Content**: Default for all production use
- **Live Scraping**: Fallback when static content unavailable
- **Manual Generation**: When Apple announces major design updates

## Maintenance Notes

### Expected Maintenance

- **Static content updates**: Automatic every 4 months via GitHub Actions
- **Manual content updates**: When Apple announces major design changes
- **Scraper updates**: Only needed when static content fails (rare)
- **New platform support**: Add new platforms and regenerate content

### Content Generation System

Run `npm run generate-content` to:

- Scrape all current HIG content
- Generate optimized markdown files
- Create search indices and metadata
- Validate content completeness

**GitHub Action Triggers:**

- **Scheduled**: Every 4 months on the 1st at 2 AM UTC
- **Manual**: `workflow_dispatch` for immediate updates
- **Auto-PR**: Creates pull request for review when content changes

### Health Check System

Run `npm run health-check` to verify:

- Static content availability and freshness
- Fallback scraper functionality
- MCP server integration
- Content validation

When issues occur:

1. Check if static content exists and is current
2. Regenerate content with `npm run generate-content`
3. For scraper issues: update selectors in `cleanContent()` method
4. Test with `npm run health-check`
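
When debugging stale-content behavior, it can help to have a mental model of the two-tier cache from the Cache Configuration section (1 hour default TTL, backup entries kept 24x longer). The sketch below is one possible reading of that description, not the real `HIGCache`: the class name `TwoTierCache`, its method names, and the injectable clock are all assumptions made for illustration.

```typescript
// Hypothetical sketch: TwoTierCache is NOT the project's HIGCache.
type Entry<T> = { value: T; freshUntil: number; staleUntil: number };

class TwoTierCache<T> {
  private entries = new Map<string, Entry<T>>();
  private ttlMs: number;
  private backupMultiplier: number;
  private now: () => number;

  constructor(
    ttlMs: number = 60 * 60 * 1000, // 1 hour default TTL
    backupMultiplier: number = 24, // backup entries live 24x longer
    now: () => number = () => Date.now(), // injectable clock (assumption, for tests)
  ) {
    this.ttlMs = ttlMs;
    this.backupMultiplier = backupMultiplier;
    this.now = now;
  }

  // Store a value with both a fresh deadline and a longer stale deadline.
  set(key: string, value: T): void {
    const t = this.now();
    this.entries.set(key, {
      value,
      freshUntil: t + this.ttlMs,
      staleUntil: t + this.ttlMs * this.backupMultiplier,
    });
  }

  // Return the value only while it is still fresh.
  getFresh(key: string): T | undefined {
    const entry = this.entries.get(key);
    return entry && this.now() < entry.freshUntil ? entry.value : undefined;
  }

  // Return fresh data, or stale data flagged as such inside the backup window.
  getWithFallback(key: string): { value: T; stale: boolean } | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    const t = this.now();
    if (t < entry.freshUntil) return { value: entry.value, stale: false };
    if (t < entry.staleUntil) return { value: entry.value, stale: true };
    this.entries.delete(key); // past both deadlines: evict
    return undefined;
  }
}
```

In this reading, a caller would try `getFresh()` first, attempt a live fetch on a miss, and fall back to `getWithFallback()` only if the fetch fails, so stale content is served only when nothing fresher can be obtained.
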