# Testing Strategy

## Overview

The testing strategy is designed to ensure reliability, correctness, and maintainability of the MCP server with minimal reliance on live GitHub API calls.

## Test Categorization

### 1. Unit Tests

**Purpose**: Test individual functions and classes in isolation

**Scope**:
- Utility functions (parsers, formatters, validators)
- Log parsers for different test frameworks (sketched after the example below)
- Comment categorization logic
- Command generation logic
- Data transformations

**Requirements**:
- No external dependencies
- Fast execution (<1s for entire suite)
- High coverage target: >90%
- Linked to requirements with comments (per user preference)

**Example**:
```typescript
// tests/utils/parser.test.ts
import { parsePRIdentifier } from '../../src/utils/parser';

describe('parsePRIdentifier', () => {
  // Test: Validates that PR identifier parsing handles standard format
  // Requirement: API Design - PR Identifier Parsing
  it('should parse standard format "owner/repo#123"', () => {
    const result = parsePRIdentifier('octocat/hello-world#42');
    expect(result).toEqual({
      owner: 'octocat',
      repo: 'hello-world',
      number: 42
    });
  });

  // Test: Validates that PR identifier parsing handles URL format
  // Requirement: API Design - PR Identifier Parsing (multiple formats)
  it('should parse GitHub URL format', () => {
    const result = parsePRIdentifier(
      'https://github.com/octocat/hello-world/pull/42'
    );
    expect(result).toEqual({
      owner: 'octocat',
      repo: 'hello-world',
      number: 42
    });
  });

  // Test: Validates error handling for invalid PR identifiers
  // Requirement: Error Handling - User Input Errors
  it('should throw error for invalid format', () => {
    expect(() => parsePRIdentifier('invalid')).toThrow();
  });
});
```
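The log parsers in scope follow the same pattern, driven by the fixture logs listed under Test Fixtures below. A minimal sketch, assuming a hypothetical `parsePytestLog` parser in `src/parsers/pytest.ts` and an assumed failure shape; adjust the names to match the actual parser modules:

```typescript
// tests/parsers/pytest.test.ts
// Sketch only: `parsePytestLog` and the returned failure shape are
// assumptions for illustration, not a documented API.
import { readFileSync } from 'fs';
import { parsePytestLog } from '../../src/parsers/pytest';

describe('parsePytestLog', () => {
  // Test: Validates that pytest failure logs parse into structured failures
  // Requirement: Unit Tests - Log parsers for different test frameworks
  it('should extract failed tests from a pytest log fixture', () => {
    const log = readFileSync('tests/fixtures/logs/pytest-failures.log', 'utf8');
    const failures = parsePytestLog(log);

    expect(failures.length).toBeGreaterThan(0);
    for (const failure of failures) {
      expect(failure.test_name).toBeTruthy();
      expect(failure.message).toBeTruthy();
    }
  });
});
```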
### 2. Integration Tests

**Purpose**: Test interactions between components with real GitHub API behavior using @octokit/fixtures

**Scope**:
- Tool handlers with real GitHub API responses (recorded/playback)
- Complete workflows from input to output
- Multi-step workflows
- Pagination logic
- Error propagation
- Real API behavior without network dependency

**Requirements**:
- Use @octokit/fixtures for recording/playback
- Test error scenarios with real API responses
- Validate data flow through layers
- Execution time <10s for suite (playback mode)
- Record fixtures once, play back many times

**Example**:
```typescript
// tests/integration/tools/get-failing-tests.integration.test.ts
import { describe, it, expect, beforeAll } from 'vitest';
import { handleGetFailingTests } from '../../../src/tools/get-failing-tests/handler.js';
import { integrationManager } from '../setup.js';

describe('get_failing_tests integration', () => {
  const TEST_PR = process.env.TEST_PR || 'jmalicki/resolve-pr-mcp#2';

  beforeAll(async () => {
    // Load fixture for this test scenario
    const fixture = await integrationManager.loadFixture('get-failing-tests/basic-pr');
    if (fixture) {
      console.log('✓ Using recorded fixture for get-failing-tests');
    } else {
      console.log('✓ Recording new fixture for get-failing-tests');
    }
  });

  // Test: Validates that get_failing_tests returns real PR data
  // Requirement: get_failing_tests tool - Real API behavior
  it('should fetch real PR data from GitHub', async () => {
    const client = integrationManager.getClient();
    const result = await handleGetFailingTests(client, {
      pr: TEST_PR,
      wait: false,
      bail_on_first: false,
      page: 1,
      page_size: 10
    });

    // Verify we got real data back
    expect(result.pr).toContain('#');
    expect(result.status).toMatch(/passed|failed|running|unknown/);
    expect(result.nextCursor).toBeDefined();
    expect(result.instructions).toBeDefined();

    // Save fixture if in record mode
    await integrationManager.saveFixture('get-failing-tests/basic-pr', result);
  });

  // Test: Validates pagination with real data
  // Requirement: get_failing_tests - Pagination support
  it('should handle pagination with real data', async () => {
    const client = integrationManager.getClient();
    const page1 = await handleGetFailingTests(client, {
      pr: TEST_PR,
      wait: false,
      bail_on_first: false,
      page: 1,
      page_size: 5
    });

    expect(page1.pagination.page).toBe(1);
    expect(page1.pagination.page_size).toBe(5);

    // Save fixture if in record mode
    await integrationManager.saveFixture('get-failing-tests/pagination', page1);
  });
});
```

### 3. E2E Tests (Integrated with Integration Tests)

**Purpose**: Complete end-to-end testing with real GitHub API behavior using @octokit/fixtures

**Scope**:
- Complete workflows from input to output
- Real GitHub API interactions (recorded/playback)
- Rate limiting behavior (simulated)
- Error handling with real API errors
- All tool functionality with real API responses

**Requirements**:
- Use integration test infrastructure with @octokit/fixtures
- Record fixtures once, play back many times
- Fast execution in playback mode
- Deterministic results

**Modes**:
```bash
# Record mode: first-time setup
export RECORD_INTEGRATION_FIXTURES=true
export RUN_INTEGRATION_TESTS=true
npm run test:integration:record

# Playback mode: default (fast, offline)
export RUN_INTEGRATION_TESTS=true
npm run test:integration:playback
```

**Benefits**:
- ✅ Real API behavior without network dependency
- ✅ Fast execution (no API calls in playback mode)
- ✅ Deterministic results (recorded responses)
- ✅ Rate limit friendly (no API consumption)
- ✅ Offline capability
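The integration and E2E examples above import an `integrationManager` from `tests/integration/setup.ts` that is not shown in this document. A minimal sketch of what that setup module could look like, persisting fixtures as plain JSON files; the file layout and method names are assumptions, and a real implementation would route the client through @octokit/fixtures rather than hand-rolled persistence:

```typescript
// tests/integration/setup.ts
// Illustrative sketch, not the actual implementation.
import { existsSync, mkdirSync, readFileSync, writeFileSync } from 'fs';
import { dirname, join } from 'path';
import { Octokit } from '@octokit/rest';

const FIXTURE_DIR = 'tests/integration/fixtures';
const RECORDING = process.env.RECORD_INTEGRATION_FIXTURES === 'true';

class IntegrationManager {
  private client = new Octokit({ auth: process.env.GITHUB_TOKEN });

  getClient(): Octokit {
    return this.client;
  }

  /** Load a previously recorded fixture, or null when none exists yet. */
  async loadFixture(name: string): Promise<unknown> {
    const file = join(FIXTURE_DIR, `${name}.json`);
    if (!existsSync(file)) return null;
    return JSON.parse(readFileSync(file, 'utf8'));
  }

  /** Persist a fixture, but only when running in record mode. */
  async saveFixture(name: string, data: unknown): Promise<void> {
    if (!RECORDING) return;
    const file = join(FIXTURE_DIR, `${name}.json`);
    mkdirSync(dirname(file), { recursive: true });
    writeFileSync(file, JSON.stringify(data, null, 2));
  }
}

export const integrationManager = new IntegrationManager();
```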
### 4. CLI Tests

**Purpose**: Test CLI mode functionality

**Scope**:
- Command parsing and validation
- Output formatting (JSON vs. human-readable)
- Error messages and exit codes
- Shell integration

**Example**:
```typescript
// tests/cli/commands.test.ts
import { exec } from 'child_process';
import { promisify } from 'util';

const execAsync = promisify(exec);

describe('CLI get-failing-tests command', () => {
  // Test: CLI properly parses arguments
  // Requirement: CLI Mode - Argument parsing
  it('should parse command-line arguments correctly', async () => {
    const { stdout } = await execAsync(
      'node dist/cli.js get-failing-tests --pr "owner/repo#123" --json'
    );
    const result = JSON.parse(stdout);
    expect(result).toHaveProperty('status');
    expect(result).toHaveProperty('failures');
  });

  // Test: CLI returns proper exit codes
  // Requirement: CLI Mode - Error handling
  it('should exit with code 1 on error', async () => {
    try {
      await execAsync('node dist/cli.js get-failing-tests --pr "invalid"');
      throw new Error('Expected command to fail');
    } catch (error: any) {
      expect(error.code).toBe(1);
      expect(error.stderr).toContain('Invalid PR format');
    }
  });

  // Test: CLI formats human-readable output
  // Requirement: CLI Mode - Output formatting
  it('should format output for human consumption', async () => {
    const { stdout } = await execAsync(
      'node dist/cli.js get-failing-tests --pr "owner/repo#123"'
    );
    expect(stdout).toMatch(/Status:/);
    expect(stdout).toMatch(/Failures:/);
    expect(stdout).not.toContain('{'); // Not JSON
  });
});
```

### 5. Snapshot Tests

**Purpose**: Ensure consistent output formats for AI consumption

**Scope**:
- Tool output formats
- Instruction generation
- Command generation
- Error messages

**Example**:
```typescript
// tests/snapshots/instructions.test.ts
import { generateInstructions } from '../../src/tools/get-failing-tests/instructions';

describe('instruction generation snapshots', () => {
  // Test: Ensures instruction format remains consistent for AI parsing
  // Requirement: Architecture - Token Efficiency
  it('should generate consistent instruction format', () => {
    // loadFixture: shared test helper that reads JSON from tests/fixtures
    const failures = loadFixture('sample-failures.json');
    const instructions = generateInstructions(failures);
    expect(instructions).toMatchSnapshot();
  });
});
```

---

## Test Fixtures

### Fixture Organization

```
tests/
├── fixtures/
│   ├── pull-requests/
│   │   ├── pr-simple.json
│   │   ├── pr-with-failures.json
│   │   ├── pr-stacked.json
│   │   └── pr-draft.json
│   ├── check-runs/
│   │   ├── check-runs-passing.json
│   │   ├── check-runs-failed.json
│   │   └── check-runs-pending.json
│   ├── logs/
│   │   ├── pytest-failures.log
│   │   ├── jest-failures.log
│   │   ├── go-test-failures.log
│   │   └── rspec-failures.log
│   ├── comments/
│   │   ├── unresolved-comments.json
│   │   ├── bot-comments.json
│   │   └── resolved-threads.json
│   └── workflows/
│       ├── workflow-run-success.json
│       └── workflow-run-failed.json
```

### Fixture Creation Guidelines

1. **Real-world Based**: Create fixtures from actual GitHub API responses
2. **Anonymized**: Remove sensitive information
3. **Comprehensive**: Cover edge cases and error scenarios
4. **Documented**: Include comments explaining the scenario

**Example Fixture**:
```json
// tests/fixtures/pull-requests/pr-with-failures.json
{
  "_comment": "PR with failing CI checks - represents common failure scenario",
  "number": 123,
  "title": "Add user authentication",
  "state": "open",
  "draft": false,
  "head": {
    "ref": "feature/auth",
    "sha": "abc123def456"
  },
  "base": {
    "ref": "main",
    "sha": "def456abc123"
  },
  "user": {
    "login": "contributor",
    "type": "User"
  },
  "mergeable": true,
  "mergeable_state": "unstable"
}
```

---

## Mock Implementations

### GitHub Client Mock

```typescript
// tests/mocks/github-client.ts

/**
 * Mock GitHub client for testing
 * Simulates GitHub API responses without network calls
 */
export class MockGitHubClient {
  private responses = new Map<string, any>();
  private callCount = new Map<string, number>();

  /**
   * Configure mock response for a specific endpoint
   */
  mock(endpoint: string, response: any): void {
    this.responses.set(endpoint, response);
  }

  /**
   * Resolve a request against the configured responses, tracking call
   * counts. Falls back to the '*' wildcard (see mockRateLimit), and
   * invokes function responses per call (see mockSequence).
   */
  request(endpoint: string): any {
    this.callCount.set(endpoint, this.getCallCount(endpoint) + 1);
    const response = this.responses.get(endpoint) ?? this.responses.get('*');
    return typeof response === 'function' ? response() : response;
  }

  /**
   * Get call count for an endpoint (for verification)
   */
  getCallCount(endpoint: string): number {
    return this.callCount.get(endpoint) || 0;
  }

  /**
   * Convenience method for mocking pull request
   */
  mockPullRequest(data: any): void {
    this.mock('GET /repos/:owner/:repo/pulls/:number', { data });
  }

  /**
   * Convenience method for mocking check runs
   */
  mockCheckRuns(data: any): void {
    this.mock('GET /repos/:owner/:repo/commits/:ref/check-runs', { data });
  }

  /**
   * Simulate sequential responses (for polling scenarios)
   */
  mockSequence(endpoint: string, responses: any[]): void {
    let callIndex = 0;
    this.mock(endpoint, () => {
      const response = responses[Math.min(callIndex, responses.length - 1)];
      callIndex++;
      return response;
    });
  }

  /**
   * Simulate rate limiting
   */
  mockRateLimit(): void {
    this.mock('*', {
      status: 403,
      headers: {
        'x-ratelimit-remaining': '0',
        'x-ratelimit-reset': String(Math.floor(Date.now() / 1000) + 3600)
      }
    });
  }
}
```

### Test Utilities

```typescript
// tests/utils/helpers.ts
// Type import path is illustrative; match the project's type definitions.
import type { PRIdentifier, PaginationMeta } from '../../src/types/index.js';

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Wait for a condition with timeout
 */
export async function waitFor(
  condition: () => boolean,
  timeout: number = 5000
): Promise<void> {
  const startTime = Date.now();
  while (!condition()) {
    if (Date.now() - startTime > timeout) {
      throw new Error('Timeout waiting for condition');
    }
    await sleep(100);
  }
}

/**
 * Create a test PR identifier
 */
export function createTestPR(overrides?: Partial<PRIdentifier>): PRIdentifier {
  return {
    owner: 'test-owner',
    repo: 'test-repo',
    number: 123,
    ...overrides
  };
}

/**
 * Verify pagination metadata is correct
 */
export function expectValidPagination(
  pagination: PaginationMeta,
  expected?: Partial<PaginationMeta>
): void {
  expect(pagination.page).toBeGreaterThanOrEqual(1);
  expect(pagination.page_size).toBeGreaterThanOrEqual(1);
  expect(pagination.total_pages).toBeGreaterThanOrEqual(0);
  expect(pagination.has_next).toBe(pagination.page < pagination.total_pages);
  if (expected) {
    expect(pagination).toMatchObject(expected);
  }
}
```

---

## Test Coverage Requirements

### Coverage Targets

| Component | Target Coverage | Priority |
|-----------|-----------------|----------|
| Utilities | 95% | Critical |
| Tool Handlers | 90% | Critical |
| GitHub Integration | 85% | High |
| Log Parsers | 95% | Critical |
| Error Handling | 100% | Critical |
| Command Generation | 90% | High |

### Critical Paths (100% Coverage Required)

1. **Input Validation**: All tool inputs must be validated
2. **Error Handling**: All error paths must be tested
3. **Authentication**: Token validation and permission checks
4. **Rate Limiting**: Rate limit detection and backoff logic (see the sketch below)
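Rate-limit handling is the easiest of these critical paths to leave untested. A minimal sketch of how it could be exercised with the `MockGitHubClient` above, assuming a hypothetical `isRateLimited` helper in `src/github/rate-limit.ts`:

```typescript
// tests/github/rate-limit.test.ts
// Sketch only: `isRateLimited` is a hypothetical helper, not a documented API.
import { MockGitHubClient } from '../mocks/github-client';
import { isRateLimited } from '../../src/github/rate-limit';

describe('rate limit detection', () => {
  // Test: A 403 with x-ratelimit-remaining: 0 is treated as a rate limit
  // Requirement: Critical Paths - Rate limit detection and backoff logic
  it('should detect a rate-limited response', () => {
    const client = new MockGitHubClient();
    client.mockRateLimit();

    const response = client.request('GET /rate_limit');
    expect(isRateLimited(response)).toBe(true);
    // The reset header tells the backoff logic how long to wait
    expect(Number(response.headers['x-ratelimit-reset'])).toBeGreaterThan(0);
  });
});
```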
---

## Testing Tools

### Test Framework: Vitest

**Why Vitest**:
- Fast execution with ESM support
- Built-in TypeScript support
- Jest-compatible API
- Excellent watch mode

**Configuration**:
```typescript
// vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'node',
    coverage: {
      provider: 'v8',
      // 'json-summary' emits coverage-summary.json, used by the CI threshold check
      reporter: ['text', 'json', 'json-summary', 'html'],
      exclude: [
        'node_modules/',
        'tests/',
        '**/*.test.ts',
        '**/*.d.ts'
      ],
      thresholds: {
        lines: 85,
        functions: 85,
        branches: 80,
        statements: 85
      }
    },
    setupFiles: ['./tests/setup.ts']
  }
});
```

### Additional Tools

- **@octokit/fixtures**: GitHub API recording/playback for integration tests
- **Zod**: Schema validation (also used in production)
- **Faker**: Generate realistic test data

---

## Test Scenarios by Tool

### get_failing_tests Scenarios

1. ✅ No CI configured
2. ✅ CI pending/queued
3. ✅ CI in progress
4. ✅ CI passed (no failures)
5. ✅ CI failed (single test)
6. ✅ CI failed (multiple tests)
7. ✅ CI failed (50+ tests, pagination)
8. ✅ Wait mode with completion
9. ✅ Wait mode with timeout (see the sketch after these lists)
10. ✅ Bail on first failure
11. ✅ Unknown test framework
12. ✅ Malformed logs
13. ✅ Rate limiting during poll
14. ✅ PR not found
15. ✅ Invalid PR format

### find_unresolved_comments Scenarios

1. ✅ No comments
2. ✅ All comments resolved (via heuristics)
3. ✅ Mix of resolved and unresolved
4. ✅ Bot comments excluded (via exclude_authors)
5. ✅ Bot comments included (default)
6. ✅ Comments on deleted lines
7. ✅ Multi-line comment threads
8. ✅ Review vs issue comments
9. ✅ Comments with reactions
10. ✅ Pagination with 100+ comments
11. ✅ Sort by file
12. ✅ Sort by author
13. ✅ Sort chronologically
14. ✅ Thread building and analysis
15. ✅ Summary statistics generation

### manage_stacked_prs Scenarios

1. ✅ Valid stack (no changes needed)
2. ✅ Valid stack (rebase needed)
3. ✅ Invalid stack (not related)
4. ✅ PRs in different repos
5. ✅ Base PR merged
6. ✅ Dependent PR up to date
7. ✅ Potential conflicts detected
8. ✅ Command generation
9. ✅ Multi-step automation
10. ✅ Automation failure handling
11. ✅ Rollback scenario
12. ✅ Risk assessment
13. ✅ Visualization generation
14. ✅ Pagination of commands
15. ✅ Circular dependency detection
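Polling scenarios such as "Wait mode with completion" and "Wait mode with timeout" can be driven deterministically with `mockSequence` from the mock client above. A minimal sketch; the check-run payload shape here is an assumption for illustration:

```typescript
// tests/tools/get-failing-tests.wait.test.ts
// Sketch only: payload shapes are assumed, not the documented API.
import { MockGitHubClient } from '../mocks/github-client';

describe('get_failing_tests wait mode', () => {
  // Test: Polling sees "in progress" until the final poll reports completion
  // Scenario: Wait mode with completion/timeout
  it('should observe CI completing across successive polls', () => {
    const client = new MockGitHubClient();
    const endpoint = 'GET /repos/:owner/:repo/commits/:ref/check-runs';
    client.mockSequence(endpoint, [
      { data: { check_runs: [{ status: 'in_progress' }] } },
      { data: { check_runs: [{ status: 'in_progress' }] } },
      { data: { check_runs: [{ status: 'completed', conclusion: 'failure' }] } }
    ]);

    // Each simulated poll consumes the next response in the sequence
    expect(client.request(endpoint).data.check_runs[0].status).toBe('in_progress');
    expect(client.request(endpoint).data.check_runs[0].status).toBe('in_progress');
    expect(client.request(endpoint).data.check_runs[0].status).toBe('completed');
    expect(client.getCallCount(endpoint)).toBe(3);
  });
});
```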
---

## Continuous Integration

### GitHub Actions Workflow

```yaml
# .github/workflows/test.yml
name: Test

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - run: npm ci
      - run: npm run build
      - run: npm test -- --coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/coverage-final.json

      - name: Check coverage thresholds
        run: |
          if [ "$(jq '.total.lines.pct < 85' coverage/coverage-summary.json)" = "true" ]; then
            echo "Coverage below 85%"
            exit 1
          fi
```

### Pre-commit Hooks

```json
// package.json
{
  "scripts": {
    "test": "vitest",
    "test:unit": "vitest run --coverage",
    "test:watch": "vitest watch",
    "test:integration": "vitest run --config vitest.integration.config.ts",
    "test:integration:record": "RECORD_INTEGRATION_FIXTURES=true RUN_INTEGRATION_TESTS=true vitest run --config vitest.integration.config.ts",
    "test:integration:playback": "RUN_INTEGRATION_TESTS=true vitest run --config vitest.integration.config.ts",
    "lint": "eslint src tests",
    "type-check": "tsc --noEmit"
  },
  "husky": {
    "hooks": {
      "pre-commit": "npm run lint && npm run type-check",
      "pre-push": "npm run test:unit"
    }
  }
}
```

---

## Testing Best Practices

### 1. Test Naming Convention

Format: `should [expected behavior] when [condition]`

```typescript
it('should return failures when CI has completed with errors', ...);
it('should throw error when PR identifier is invalid', ...);
it('should paginate results when total exceeds page size', ...);
```

### 2. Arrange-Act-Assert Pattern

```typescript
it('should categorize blocking comments correctly', () => {
  // Arrange
  const comment = createMockComment({
    body: 'This must be fixed before merge'
  });

  // Act
  const category = categorizeComment(comment);

  // Assert
  expect(category).toBe('blocking');
});
```

### 3. Test Data Builders

```typescript
// tests/builders/comment-builder.ts
export class CommentBuilder {
  private comment: Partial<Comment> = {};

  withBody(body: string): this {
    this.comment.body = body;
    return this;
  }

  fromBot(botName: string = 'coderabbitai'): this {
    this.comment.author = botName;
    this.comment.is_bot = true;
    return this;
  }

  unresolved(): this {
    this.comment.is_resolved = false;
    return this;
  }

  build(): Comment {
    return {
      id: 1,
      type: 'review_comment',
      author: 'user',
      is_bot: false,
      created_at: new Date().toISOString(),
      updated_at: new Date().toISOString(),
      body: 'Default comment',
      is_resolved: false,
      ...this.comment
    } as Comment;
  }
}

// Usage
const comment = new CommentBuilder()
  .fromBot('coderabbitai')
  .withBody('Nit: add space here')
  .unresolved()
  .build();
```

### 4. Avoid Test Interdependence

Each test should be independent and not rely on state from other tests.

```typescript
// ❌ Bad: Tests depend on shared state
let sharedClient: GitHubClient;
beforeAll(() => {
  sharedClient = new GitHubClient();
});

// ✅ Good: Each test creates its own instance
beforeEach(() => {
  const client = new MockGitHubClient();
  // Use client in test
});
```

This comprehensive testing strategy ensures the MCP server is robust, reliable, and maintainable.