apple-hig-mcp
High-performance MCP server providing instant access to Apple's Human Interface Guidelines via hybrid static/dynamic content delivery
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
This is an Apple Human Interface Guidelines MCP (Model Context Protocol) server that provides AI-powered access to Apple's design guidelines. It scrapes content from Apple's HIG website and serves it through MCP resources and tools for AI assistants like Claude.
## Development Commands
### Build and Test
- `npm run build` - Compile TypeScript to JavaScript in `dist/`
- `npm run clean:build` - Clean and rebuild the project
- `npm test` - Run Jest test suite
- `npm run test:watch` - Run tests in watch mode
- `npm run lint` - Run ESLint on TypeScript files
- `npm run lint:fix` - Fix linting issues automatically
### Development
- `npm run dev` - Start development server using tsx
- `npm start` - Run compiled server from `dist/`
- `npm run health-check` - Test scraper functionality
- `npm run generate-content` - Generate static HIG content files (full discovery + enhanced keyword search)
- `npm run generate-content:offline` - Fast offline generation (14 core sections, keyword search only)
- `npm run validate-content` - Validate generated content
### Testing with MCP Inspector
```bash
# Build first (npm run build) so that dist/server.js exists
npx @modelcontextprotocol/inspector node dist/server.js
```
## Architecture Overview
The project uses a hybrid static/dynamic architecture with static content generation and live scraping fallback:
### Core Components
1. **AppleHIGMCPServer** (`src/server.ts`) - Main MCP server entry point
   - Coordinates all components and handles MCP protocol communication
   - Sets up request handlers for resources and tools
   - Manages graceful startup/shutdown
   - Initializes static content provider with fallback to scraping
2. **HIGStaticContentProvider** (`src/static-content.ts`) - Primary content source
   - Loads pre-generated markdown files from `content/` directory
   - Provides instant responses (no scraping delays)
   - Uses pre-built search indices for fast queries
   - Falls back to scraper if static content unavailable
3. **HIGScraper** (`src/scraper.ts`) - Fallback web scraping engine
   - Respectful scraping with rate limiting (1 second delays)
   - Intelligent fallback content when Apple's SPA fails to load
   - Maintains curated list of known HIG sections (~65 URLs)
   - Converts HTML to clean markdown format
4. **HIGCache** (`src/cache.ts`) - Smart caching layer (for scraping)
   - TTL-based caching with graceful degradation
   - Backup cache entries for offline resilience
   - Two-tier caching: fresh data + stale fallback data
   - Methods: `getWithGracefulFallback()`, `setWithGracefulDegradation()`
5. **HIGResourceProvider** (`src/resources.ts`) - MCP Resources implementation
   - Serves structured content via URIs like `hig://ios`, `hig://ios/buttons`
   - Platform-specific and category-specific resource organization
   - Prefers static content, falls back to scraping
   - Generates comprehensive content with proper Apple attribution
6. **HIGToolsService** (`src/services/tools.service.ts`) - MCP Tools implementation with enhanced keyword search
   - Interactive search with advanced keyword matching and intent recognition
   - Four main tools: `search_guidelines`, `get_component_spec`, `get_design_tokens`, `get_accessibility_requirements`
   - Multi-factor relevance scoring (keyword + structure + context + synonym expansion)
   - Enhanced keyword search with synonym expansion and intelligent matching
   - Optimized for fast response times without external model dependencies
7. **EnhancedKeywordSearchService** (`src/services/enhanced-keyword-search.service.ts`) - Advanced search capabilities
   - Sophisticated keyword matching with synonym expansion and stemming
   - Query analysis with intent recognition and entity extraction
   - Multi-dimensional relevance scoring with configurable weights
   - Support for contextual search across Apple platform design patterns without external dependencies
8. **ContentProcessor** (`src/services/content-processor.service.ts`) - Content processing pipeline
   - HTML to markdown conversion using Turndown.js (images removed for MCP efficiency)
   - Structured content extraction (overview, guidelines, examples, specifications)
   - Quality validation with comprehensive scoring and SLA monitoring
   - Apple-specific content pattern recognition and enhancement
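
To make the resource URIs served by `HIGResourceProvider` concrete, here is a minimal sketch of parsing them. It assumes every URI has the shape `hig://<platform>` or `hig://<platform>/<section>`, which is inferred from the two examples above rather than taken from `src/resources.ts`. The names `parseHigUri` and `ParsedHigUri` are hypothetical.

```typescript
// Hypothetical helper; the real parsing in src/resources.ts may differ.
const PLATFORMS = ["ios", "macos", "watchos", "tvos", "visionos", "universal"] as const;
type Platform = (typeof PLATFORMS)[number];

interface ParsedHigUri {
  platform: Platform;
  section?: string; // e.g. "buttons"; absent for platform-level URIs such as hig://ios
}

function parseHigUri(uri: string): ParsedHigUri | null {
  // Assumed shape: hig://<platform> or hig://<platform>/<section>
  const match = /^hig:\/\/([a-z]+)(?:\/([a-z0-9-]+))?$/.exec(uri.trim().toLowerCase());
  if (!match) return null;
  const platform = PLATFORMS.find((p) => p === match[1]);
  if (platform === undefined) return null;
  return match[2] ? { platform, section: match[2] } : { platform };
}
```

Returning `null` for unknown platforms or malformed URIs lets the caller map the failure to an MCP error instead of throwing from the parser.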
### Data Flow
```
MCP Client → AppleHIGMCPServer → HIGResourceProvider/HIGToolsService
                                        ↓
                         HIGStaticContentProvider (primary)
                                        ↓ (fallback)
                         HIGScraper → HIGCache → Apple's Website

Search Flow:
Query → EnhancedKeywordSearchService → Advanced Keyword Matching + Synonym Expansion
                                        ↓
                  Multi-factor Scoring (keyword + synonym + structure + context)
                                        ↓
                  Ranked Results (with intent recognition and boost factors)
```
### Content Generation and Processing
```
GitHub Action (every 4 months) → ContentGenerator → Enhanced Content Processing Pipeline
                                        ↓
                  ContentProcessor (Turndown.js + Structure Extraction)
                                        ↓
                  Quality Validation + SLA Monitoring
                                        ↓
                  Markdown Files + Search Indices + Enhanced Keyword Indexes
                                        ↓
                  content/ directory
                                        ↓
                  HIGStaticContentProvider + EnhancedKeywordSearchService
```
### Key Patterns
**Static-First with Fallback**: The system prioritizes pre-generated static content for performance and reliability, falling back to live scraping only when static content is unavailable.
**Graceful Degradation**: Multiple fallback layers ensure availability - static content → cached scraping → live scraping → contextual fallback content.
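
The fallback order described above can be sketched as an ordered list of content sources that are tried in turn. This is only an illustration of the pattern: `ContentSource` and `resolveContent` are hypothetical names, and it assumes a source signals a miss by returning `null`.

```typescript
// Hypothetical types; the real providers expose their own interfaces.
type ContentSource = {
  name: string;
  // Resolves to content, or null when this layer has nothing for the path.
  load: (path: string) => Promise<string | null>;
};

async function resolveContent(
  path: string,
  sources: ContentSource[],
): Promise<{ source: string; content: string }> {
  for (const { name, load } of sources) {
    try {
      const content = await load(path);
      if (content !== null) return { source: name, content };
    } catch {
      // A failing layer is treated like a miss; fall through to the next one.
    }
  }
  throw new Error(`No content available for ${path}`);
}
```

Passing the sources in the order static content, cached scraping, live scraping, contextual fallback reproduces the chain. Returning the name of the layer that answered also makes it easy to log when a request was served from a fallback.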
**Performance Optimization**: Static content is served from local files, so responses involve no scraping delay and concurrency is not constrained by the scraper's rate limit.
**Enhanced Keyword Search**: Multi-factor relevance scoring combines keyword matching with synonym expansion, content structure analysis, and contextual relevance.
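
As a rough illustration of multi-factor scoring, the sketch below computes four factors in the range 0 to 1 and combines them as a weighted sum. The weights, the synonym table, and the names `relevance`, `Factors`, and `Doc` are all assumptions made for this example; they are not the values or API used by `EnhancedKeywordSearchService`, and stemming is omitted.

```typescript
interface Factors { keyword: number; synonym: number; structure: number; context: number }
interface Doc { title: string; body: string; platform: string }

// Illustrative weights and synonyms only; the service's real values are not documented here.
const WEIGHTS: Factors = { keyword: 0.5, synonym: 0.2, structure: 0.2, context: 0.1 };
const SYNONYMS = new Map<string, string[]>([
  ["button", ["btn", "control"]],
  ["color", ["colour", "tint"]],
]);

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Fraction of `needles` that occur in `haystack`, in [0, 1].
function fractionMatched(needles: string[], haystack: Set<string>): number {
  if (needles.length === 0) return 0;
  return needles.filter((t) => haystack.has(t)).length / needles.length;
}

function relevance(query: string, doc: Doc, platform?: string): number {
  const terms = tokenize(query);
  const synonyms: string[] = [];
  for (const term of terms) synonyms.push(...(SYNONYMS.get(term) ?? []));
  const body = new Set(tokenize(doc.body));
  const title = new Set(tokenize(doc.title));
  const factors: Factors = {
    keyword: fractionMatched(terms, body),      // query terms found in the body
    synonym: fractionMatched(synonyms, body),   // expanded synonyms found in the body
    structure: fractionMatched(terms, title),   // query terms found in the title
    context: platform === doc.platform ? 1 : 0, // requested platform matches the document
  };
  return (
    WEIGHTS.keyword * factors.keyword +
    WEIGHTS.synonym * factors.synonym +
    WEIGHTS.structure * factors.structure +
    WEIGHTS.context * factors.context
  );
}
```

Because the weights sum to 1 and each factor is bounded by 1, the score stays in the range 0 to 1, which keeps results comparable across queries.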
**Intent Recognition**: Query analysis extracts user intent (find_component, find_guideline, compare_platforms, etc.) and entities (components, platforms, properties) for more accurate results.
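
A rule-based version of this query analysis might look like the following sketch. The intent names `find_component`, `find_guideline`, and `compare_platforms` come from the description above; the `unknown` intent, the keyword lists, and the function name `analyzeQuery` are hypothetical and chosen only to keep the example small.

```typescript
type Intent = "find_component" | "find_guideline" | "compare_platforms" | "unknown";

// Illustrative vocabularies; the real service presumably uses larger ones.
const PLATFORM_NAMES = ["ios", "macos", "watchos", "tvos", "visionos"];
const COMPONENT_WORDS = ["button", "picker", "slider", "toggle", "menu"];
const GUIDELINE_WORDS = ["guideline", "guidelines", "recommendation", "recommendations"];

interface QueryAnalysis { intent: Intent; platforms: string[]; components: string[] }

function analyzeQuery(query: string): QueryAnalysis {
  const tokens = query.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  // Entity extraction: platforms and components mentioned in the query.
  const platforms = PLATFORM_NAMES.filter((p) => tokens.includes(p));
  const components = COMPONENT_WORDS.filter(
    (c) => tokens.includes(c) || tokens.includes(c + "s"), // naive plural handling
  );

  // Intent rules are checked from most to least specific.
  let intent: Intent = "unknown";
  if (platforms.length >= 2 || tokens.includes("vs") || tokens.includes("compare")) {
    intent = "compare_platforms";
  } else if (components.length > 0) {
    intent = "find_component";
  } else if (tokens.some((t) => GUIDELINE_WORDS.includes(t))) {
    intent = "find_guideline";
  }
  return { intent, platforms, components };
}
```

The extracted entities can then feed boost factors, for example by raising the score of sections whose platform matches one named in the query.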
**Optimized Performance**: The system uses fast keyword-based search with intelligent synonym expansion and relevance scoring, providing consistent performance without external model dependencies.
**Respectful Scraping**: Rate limiting, appropriate user agents, and fallback to known URLs when Apple's SPA architecture prevents dynamic discovery.
**Attribution Compliance**: All content includes proper Apple attribution and fair use notices.
## Platform Support
The server supports all Apple platforms with specific categories:
- **Platforms**: iOS, macOS, watchOS, tvOS, visionOS, universal
- **Categories**: foundations, layout, navigation, presentation, selection-and-input, status, system-capabilities, visual-design, icons-and-images, color-and-materials, typography, motion, technologies
## Testing Strategy
### Unit Tests Structure
- `__tests__/cache.test.ts` - Cache functionality and TTL behavior
- `__tests__/scraper.test.ts` - Web scraping and content parsing
- `__tests__/resources.test.ts` - MCP resource generation
- `__tests__/tools.test.ts` - MCP tool functionality
- `__tests__/server.test.ts` - Integration testing
### Mocking
- `__mocks__/node-fetch.ts` - HTTP request mocking for tests
- Tests should mock external dependencies and focus on business logic
## Content Management
### Static Content Generation
The system uses GitHub Actions to generate optimized static content:
**Content Structure:**
```
content/
├── platforms/ # Platform-specific markdown files
│ ├── ios/
│ ├── macos/
│ └── ...
├── metadata/ # Search indices and metadata
│ ├── search-index.json
│ ├── cross-references.json
│ └── generation-info.json
```
**Generation Process:**
1. **Scrape all known HIG URLs** (~65 sections across all platforms)
2. **Process to AI-friendly markdown** with front matter metadata
3. **Generate search indices** for fast querying
4. **Create cross-references** between related sections
5. **Download and optimize images**
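
Step 3 above can be pictured as building an inverted index from tokens to section identifiers. The sketch below is an assumption about the general technique, not a description of the actual `search-index.json` format; `buildIndex`, `lookup`, and the section `id` values are hypothetical.

```typescript
interface Section { id: string; text: string }

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Maps each token to the ids of the sections that contain it.
function buildIndex(sections: Section[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const { id, text } of sections) {
    // A Set ensures each section is listed once per token.
    for (const token of new Set(tokenize(text))) {
      const ids = index.get(token) ?? [];
      ids.push(id);
      index.set(token, ids);
    }
  }
  return index;
}

// Returns the ids of sections that contain every token of the query.
function lookup(index: Map<string, string[]>, query: string): string[] {
  const tokens = tokenize(query);
  if (tokens.length === 0) return [];
  return tokens
    .map((t) => index.get(t) ?? [])
    .reduce((acc, ids) => acc.filter((id) => ids.includes(id)));
}
```

Building the index at generation time means a query only touches the token lists it needs, instead of scanning every markdown file.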
**Scheduled Updates:**
- **Every 4 months** via GitHub Action
- **Manual triggers** for immediate updates
- **Content validation** ensures quality and completeness
### Fallback Content Strategy (for scraping)
When Apple's website returns JavaScript placeholders, the scraper uses contextual fallback content:
- Button guidelines → `getButtonFallbackContent()`
- Navigation → `getNavigationFallbackContent()`
- Color → `getColorFallbackContent()`
- Typography → `getTypographyFallbackContent()`
- Layout → `getLayoutFallbackContent()`
- General → `getFallbackContent()`
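
The mapping above amounts to a small dispatch table. The sketch below selects the name of the fallback method for a given URL. Only the method names are taken from the list above; the matching rules, the decision to match on the last path segment, and the helper name `selectFallbackMethod` are assumptions.

```typescript
// Ordered rules: the first pattern that matches the URL slug wins.
const FALLBACK_RULES: ReadonlyArray<[RegExp, string]> = [
  [/button/i, "getButtonFallbackContent"],
  [/navigation/i, "getNavigationFallbackContent"],
  [/color/i, "getColorFallbackContent"],
  [/typography/i, "getTypographyFallbackContent"],
  [/layout/i, "getLayoutFallbackContent"],
];

// Hypothetical helper: returns the name of the fallback method to call.
function selectFallbackMethod(url: string): string {
  // Match on the last path segment so the base path cannot trigger a rule.
  const slug = url.split("/").filter(Boolean).pop() ?? "";
  const rule = FALLBACK_RULES.find(([pattern]) => pattern.test(slug));
  return rule ? rule[1] : "getFallbackContent";
}
```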
### Known Sections Management
The content generator maintains a curated list of ~65 core HIG sections in `discoverSections()`. When adding new sections:
1. Add to the `knownSections` array with proper platform/category classification
2. Test the URL accessibility
3. Regenerate static content with `npm run generate-content`
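
Before step 1, it can help to check a new entry's classification against the supported platforms and categories. The sketch below is hypothetical: the field names (`title`, `url`, `platform`, `category`) and the function `validateSection` are assumptions about what a `knownSections` entry might look like, and the URL check only verifies that the entry lives under Apple's HIG base path.

```typescript
const PLATFORMS = ["iOS", "macOS", "watchOS", "tvOS", "visionOS", "universal"];
const CATEGORIES = [
  "foundations", "layout", "navigation", "presentation", "selection-and-input",
  "status", "system-capabilities", "visual-design", "icons-and-images",
  "color-and-materials", "typography", "motion", "technologies",
];
const HIG_BASE = "https://developer.apple.com/design/human-interface-guidelines/";

// Assumed entry shape; check the real array in the content generator before relying on it.
interface SectionEntry { title: string; url: string; platform: string; category: string }

// Returns a list of problems; an empty list means the entry looks well-formed.
function validateSection(entry: SectionEntry): string[] {
  const problems: string[] = [];
  if (entry.title.trim() === "") problems.push("title is empty");
  if (!entry.url.startsWith(HIG_BASE)) problems.push(`url is outside the HIG: ${entry.url}`);
  if (!PLATFORMS.includes(entry.platform)) problems.push(`unknown platform: ${entry.platform}`);
  if (!CATEGORIES.includes(entry.category)) problems.push(`unknown category: ${entry.category}`);
  return problems;
}
```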
## Error Handling
The system uses multiple layers of error resilience:
1. **Graceful cache degradation** - serves stale content when fresh fetches fail
2. **Fallback content** - contextual content when scraping fails completely
3. **MCP error wrapping** - proper error codes for the MCP protocol
4. **Retry logic** - 3 attempts with exponential backoff for network requests
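
The retry behavior in item 4 can be sketched as follows. The attempt count of 3 comes from the list above; the 1000 ms base delay, the doubling factor, and the names `withRetry` and `backoffDelayMs` are assumptions for illustration.

```typescript
// Delay before the next try, doubling after each failed attempt (attempt is 1-based).
function backoffDelayMs(attempt: number, baseMs = 1000): number {
  return baseMs * 2 ** (attempt - 1);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task` up to `attempts` times, sleeping between failures.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  baseMs = 1000,
  sleep: (ms: number) => Promise<void> = defaultSleep, // injectable for tests
): Promise<T> {
  if (attempts < 1) throw new RangeError("attempts must be at least 1");
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // No sleep after the final attempt; the error is rethrown instead.
      if (attempt < attempts) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}
```

With these defaults a request that keeps failing waits 1 second, then 2 seconds, and gives up after the third attempt, at which point the caller can fall back to stale cache entries or fallback content.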
## Configuration
### Scraping Configuration
- Rate limiting: 1000ms between requests
- Timeout: 10 seconds per request
- Retry attempts: 3 with exponential backoff
- User agent: Educational/development purpose identification
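
One way to enforce the 1000 ms spacing is a limiter that hands out time slots. This is a minimal sketch under that assumption, not the scraper's actual implementation; `RateLimiter` and `reserve` are hypothetical names, and the clock is injectable so the behavior can be tested without real delays.

```typescript
class RateLimiter {
  private nextSlot = 0;
  private readonly minIntervalMs: number;
  private readonly now: () => number;

  constructor(minIntervalMs = 1000, now: () => number = Date.now) {
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Reserves the next slot and returns how many milliseconds the caller should wait.
  reserve(): number {
    const current = this.now();
    const start = Math.max(current, this.nextSlot);
    this.nextSlot = start + this.minIntervalMs;
    return start - current;
  }
}
```

A caller would sleep for the returned number of milliseconds before issuing its request. Reserving slots up front keeps concurrent callers spaced correctly even when several ask at the same moment.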
### Cache Configuration
- Default TTL: 1 hour for normal content
- Resource list cache: 2 hours
- Section content cache: 2 hours
- Graceful degradation: 24x longer TTL for backup entries
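
The two-tier behavior implied by these settings can be sketched as a cache whose entries have a fresh window and a longer backup window. The 1 hour default and the 24x factor come from the list above; the class `TwoTierCache`, its method names, and the `stale` flag are assumptions and do not mirror the signatures of `getWithGracefulFallback()` or `setWithGracefulDegradation()`.

```typescript
interface Entry<T> { value: T; freshUntil: number; backupUntil: number }

class TwoTierCache<T> {
  private readonly entries = new Map<string, Entry<T>>();
  private readonly ttlMs: number;
  private readonly backupFactor: number;
  private readonly now: () => number;

  constructor(ttlMs = 60 * 60 * 1000, backupFactor = 24, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.backupFactor = backupFactor;
    this.now = now;
  }

  set(key: string, value: T): void {
    const t = this.now();
    this.entries.set(key, {
      value,
      freshUntil: t + this.ttlMs,
      backupUntil: t + this.ttlMs * this.backupFactor,
    });
  }

  // Fresh hit, stale backup hit, or undefined once even the backup window has passed.
  get(key: string): { value: T; stale: boolean } | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    const t = this.now();
    if (t < entry.freshUntil) return { value: entry.value, stale: false };
    if (t < entry.backupUntil) return { value: entry.value, stale: true };
    this.entries.delete(key);
    return undefined;
  }
}
```

A caller would normally refetch when `stale` is true and serve the stale value only if that refetch fails, which is the graceful degradation described above.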
## Static Content vs Live Scraping
### Performance Comparison
- **Static Content**: Instant responses, unlimited concurrency
- **Live Scraping**: 1-10 second delays, rate limited to 30 req/min
### Reliability Comparison
- **Static Content**: 99.9% availability, immune to Apple website changes
- **Live Scraping**: Dependent on Apple website availability and structure
### Content Freshness
- **Static Content**: Updated every 4 months (sufficient for HIG changes)
- **Live Scraping**: Fetched on demand, although responses are often served from cache and may be stale
### When to Use Each
- **Static Content**: Default for all production use
- **Live Scraping**: Fallback when static content unavailable
- **Manual Generation**: When Apple announces major design updates
## Maintenance Notes
### Expected Maintenance
- **Static content updates**: Automatic every 4 months via GitHub Actions
- **Manual content updates**: When Apple announces major design changes
- **Scraper updates**: Only needed when static content fails (rare)
- **New platform support**: Add new platforms and regenerate content
### Content Generation System
Run `npm run generate-content` to:
- Scrape all current HIG content
- Generate optimized markdown files
- Create search indices and metadata
- Validate content completeness
**GitHub Action Triggers:**
- **Scheduled**: Every 4 months, on the 1st of the month at 2 AM UTC
- **Manual**: `workflow_dispatch` for immediate updates
- **Auto-PR**: Creates pull request for review when content changes
### Health Check System
Run `npm run health-check` to verify:
- Static content availability and freshness
- Fallback scraper functionality
- MCP server integration
- Content validation
When issues occur:
1. Check if static content exists and is current
2. Regenerate content with `npm run generate-content`
3. For scraper issues: update selectors in `cleanContent()` method
4. Test with `npm run health-check`