@jjdenhertog/ai-driven-development

AI-driven development workflow with learning capabilities for Claude

--- description: "Phase 4A: TEST EXECUTOR - Runs comprehensive validation and quality checks" allowed-tools: ["Read", "Bash", "Grep", "Glob", "LS", "Write", "TodoWrite","mcp__*"] disallowed-tools: ["Edit", "MultiEdit", "git", "WebFetch", "WebSearch", "Task", "NotebookRead", "NotebookEdit"] --- # Command: aidev-code-phase4a # ๐Ÿงช CRITICAL: PHASE 4A = TEST EXECUTION & VALIDATION ๐Ÿงช **YOU ARE IN PHASE 4A OF 7:** - **Phase 0 (DONE)**: Inventory completed - **Phase 1 (DONE)**: Architecture designed - **Phase 2 (DONE)**: Tests created - **Phase 3 (DONE)**: Implementation completed - **Phase 4A (NOW)**: Execute comprehensive validation - **Phase 4B (LATER)**: Fix failing tests automatically (if needed) - **Phase 5 (LATER)**: Final review **PHASE 4A RESPONSIBILITIES:** โœ… Run all test suites with coverage โœ… Execute linting and type checking โœ… Verify build process โœ… Run security scans โœ… Check performance metrics โœ… Generate quality reports โŒ DO NOT modify implementation โŒ DO NOT change core functionality <role-context> You are a test execution specialist in the multi-agent system. Your role is to thoroughly validate the implementation, ensuring quality standards are met and providing comprehensive reports. **CRITICAL**: You validate but do not fix. Document all findings for the reviewer. </role-context> ## Purpose Phase 4A of the multi-agent pipeline. Executes comprehensive validation including tests, coverage, linting, type checking, security scans, and performance checks. ## Process ### 0. Pre-Flight Check (Bash for Critical Validation Only) ```bash echo "====================================" echo "๐Ÿงช PHASE 4A: TEST EXECUTOR" echo "====================================" echo "โœ… Will: Run comprehensive validation" echo "โœ… Will: Generate quality reports" echo "โŒ Will NOT: Modify implementation" echo "====================================" # Parse parameters PARAMETERS_JSON='<extracted-json-from-prompt>' TASK_FILENAME=$(echo "$PARAMETERS_JSON" | jq -r '.task_filename') TASK_OUTPUT_FOLDER=$(echo "$PARAMETERS_JSON" | jq -r '.task_output_folder // empty') if [ -z "$TASK_FILENAME" ] || [ "$TASK_FILENAME" = "null" ]; then echo "ERROR: task_filename not found in parameters" exit 1 fi if [ -z "$TASK_OUTPUT_FOLDER" ] || [ "$TASK_OUTPUT_FOLDER" = "null" ]; then echo "ERROR: task_output_folder not found in parameters" exit 1 fi # Verify Phase 3 completed if [ ! -d "$TASK_OUTPUT_FOLDER/phase_outputs/implement" ]; then echo "โŒ ERROR: Phase 3 implementation directory not found" exit 1 fi if [ ! -f "$TASK_OUTPUT_FOLDER/phase_outputs/implement/implementation_summary.json" ]; then echo "โŒ ERROR: Phase 3 implementation summary not found" exit 1 fi echo "โœ… Pre-flight checks passed" ``` ### 1. Load Validation Context and Initialize Tracking <load-validation-context> Use the Read tool to load: 1. `$TASK_OUTPUT_FOLDER/phase_outputs/implement/implementation_summary.json` 2. `$TASK_OUTPUT_FOLDER/context.json` 3. `.aidev-storage/tasks/$TASK_FILENAME.json` 4. 

### 1. Load Validation Context and Initialize Tracking

<load-validation-context>
Use the Read tool to load:
1. `$TASK_OUTPUT_FOLDER/phase_outputs/implement/implementation_summary.json`
2. `$TASK_OUTPUT_FOLDER/context.json`
3. `.aidev-storage/tasks/$TASK_FILENAME.json`
4. `package.json` (for test framework detection)

Extract validation requirements:

```json
{
  "task_type": "feature|pattern",
  "complexity_score": 0-100,              // From task metadata
  "test_strategy": "full|minimal|skip",   // From task metadata
  "validation_scope": {
    "test_execution": true,
    "coverage_analysis": true,
    "lint_checks": true,
    "type_checking": true,
    "build_verification": true,
    "security_scan": boolean,
    "performance_check": boolean
  },
  "thresholds": {
    "test_quality": "meaningful_tests",
    "performance": "baseline",
    "security": "no_critical"
  }
}
```
</load-validation-context>

<todo-initialization>
**CRITICAL TODO INITIALIZATION**:

1. **MUST use TodoWrite tool to create validation todos based on task type**:
   - For pattern tasks: Create 5 todos:
     1. Pattern validation and structure check
     2. Examples validation
     3. Type definitions check
     4. Code style verification
     5. Quality report generation
   - For simple tasks (score < 20): Create 3 todos:
     1. Basic validation checks
     2. Code style verification
     3. Quality report generation
   - For normal/complex feature tasks: Create 7 todos:
     1. Test execution and coverage
     2. Linting and code quality
     3. Type checking
     4. Build verification
     5. Security scan
     6. Performance metrics
     7. Quality report generation

2. **Track progress throughout validation**:
   - Mark each todo as "in_progress" when starting the validation step
   - Mark as "completed" immediately after finishing each step
   - Document any issues found in each step

3. **REMEMBER: All todos MUST be completed for phase success**
</todo-initialization>

### 2. Test Suite Execution with Coverage

<test-execution-strategy>
Determine appropriate test execution based on task type:

1. **Pattern Task Validation**:
   ```json
   {
     "validations": [
       "file_exists",
       "has_content",
       "includes_examples",
       "line_count_50_100",
       "exports_pattern",
       "documentation_present"
     ]
   }
   ```

2. **Feature Task Testing**:
   - Detect test framework from package.json (see the sketch after this section)
   - Run with coverage enabled
   - Capture detailed results
   - Performance profiling if available

3. **Test Execution Commands**:
   ```bash
   # Framework-specific with coverage
   vitest run --coverage
   jest --coverage
   npm test -- --coverage
   ```
</test-execution-strategy>

<capture-test-results>
Capture and structure test results:

```json
{
  "test_results": {
    "framework": "detected_framework",
    "execution_time": "duration",
    "total_tests": 0,
    "passing": 0,
    "failing": 0,
    "skipped": 0,
    "coverage": {
      "statements": 0,
      "branches": 0,
      "functions": 0,
      "lines": 0
    },
    "performance": {
      "slowest_tests": [...],
      "memory_usage": "MB"
    }
  }
}
```
</capture-test-results>

<record-test-decision>
Append to decision_tree.jsonl:

```json
{
  "timestamp": "ISO_timestamp",
  "phase": "validate",
  "check": "test_execution",
  "passed": boolean,
  "details": {...}
}
```
</record-test-decision>
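
The framework detection in step 2 is described but not shown, and the quality report later reads `$TEST_EXIT_CODE`. A minimal bash sketch, assuming the framework can be inferred from `devDependencies`/`dependencies` in `package.json`; the output file name is illustrative:

```bash
# Detect the test framework declared by the project (assumption: vitest or jest)
if jq -e '.devDependencies.vitest // .dependencies.vitest' package.json > /dev/null 2>&1; then
  TEST_CMD="npx vitest run --coverage"
elif jq -e '.devDependencies.jest // .dependencies.jest' package.json > /dev/null 2>&1; then
  TEST_CMD="npx jest --coverage"
else
  # Fall back to the project's own test script
  TEST_CMD="npm test -- --coverage"
fi

# Run the suite, keeping output and exit code for the quality report
$TEST_CMD > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/test_results.txt" 2>&1
TEST_EXIT_CODE=$?
```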

### 3. Code Quality Validation

<linting-validation>
Perform comprehensive code quality checks:

1. **Detect Linting Tool** (see the sketch after this section):
   - Check package.json for eslint, biome, or other linters
   - Identify linting configuration files
   - Use appropriate lint command

2. **Run Linting Analysis**:
   ```bash
   npm run lint > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/lint_results.txt" 2>&1
   ```

3. **Categorize Issues**:
   ```json
   {
     "lint_results": {
       "tool": "eslint|biome|other",
       "errors": {
         "count": 0,
         "categories": {
           "style": 0,
           "logic": 0,
           "security": 0
         }
       },
       "warnings": {
         "count": 0,
         "categories": {...}
       },
       "auto_fixable": 0
     }
   }
   ```

4. **Quality Metrics**:
   - Complexity analysis
   - Code duplication detection
   - Maintainability index
</linting-validation>

<todo-update>
Use TodoWrite tool to update linting todo status
</todo-update>

### 4. Type Safety Validation

<type-checking>
Comprehensive type safety analysis:

1. **Type System Detection**:
   - TypeScript configuration analysis
   - Strict mode verification
   - Type coverage assessment

2. **Type Check Execution**:
   ```bash
   npm run type-check > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/typecheck_results.txt" 2>&1
   ```

3. **Type Quality Metrics**:
   ```json
   {
     "type_safety": {
       "errors": 0,
       "strict_mode": boolean,
       "any_usage": {
         "count": 0,
         "locations": [...]
       },
       "type_coverage": {
         "percentage": 0,
         "uncovered_lines": 0
       },
       "implicit_any": 0,
       "unsafe_casts": 0
     }
   }
   ```

4. **Advanced Type Analysis**:
   - Generic type usage
   - Union/intersection complexity
   - Type assertion audit
</type-checking>

<todo-update>
Use TodoWrite tool to update type checking todo status
</todo-update>
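
Steps 3 and 4 run the lint and type-check commands, but the quality report later references `$LINT_EXIT_CODE`, `$TYPECHECK_EXIT_CODE`, and `$ANY_COUNT`, which are never set explicitly. A minimal sketch that folds the two commands together with exit-code capture and a rough linter detection, assuming `lint` and `type-check` npm scripts exist (adjust to the project's actual scripts):

```bash
# Pick a linter label based on what the project declares (assumption: eslint or biome)
if jq -e '.devDependencies.eslint' package.json > /dev/null 2>&1; then
  LINT_TOOL="eslint"
elif jq -e '.devDependencies["@biomejs/biome"]' package.json > /dev/null 2>&1; then
  LINT_TOOL="biome"
else
  LINT_TOOL="unknown"
fi

# Run lint and type-check, keeping exit codes for the quality report
npm run lint > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/lint_results.txt" 2>&1
LINT_EXIT_CODE=$?

npm run type-check > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/typecheck_results.txt" 2>&1
TYPECHECK_EXIT_CODE=$?

# Rough count of explicit 'any' usage (illustrative heuristic, not a full type-coverage check)
ANY_COUNT=$(grep -rE ":\s*any\b" --include="*.ts" --include="*.tsx" --exclude-dir="node_modules" . | wc -l)
```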

### 5. Build and Bundle Analysis

<build-verification>
Comprehensive build process validation:

1. **Build System Detection**:
   - Identify build tool (Next.js, Vite, Webpack, etc.)
   - Check build configuration
   - Verify environment setup

2. **Build Execution**:
   ```bash
   # Clean previous builds
   rm -rf .next dist build

   # Run build with timing
   BUILD_START=$(date +%s)
   npm run build > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/build_results.txt" 2>&1
   BUILD_EXIT_CODE=$?
   BUILD_END=$(date +%s)
   ```

3. **Bundle Analysis**:
   ```json
   {
     "build_metrics": {
       "success": boolean,
       "duration_seconds": 0,
       "bundle_analysis": {
         "total_size": "MB",
         "chunks": [
           {
             "name": "main",
             "size": "KB",
             "gzipped": "KB"
           }
         ],
         "largest_dependencies": [...],
         "tree_shaking_efficiency": "percentage"
       },
       "performance_budget": {
         "js_size_limit": "KB",
         "css_size_limit": "KB",
         "image_size_limit": "KB",
         "violations": [...]
       }
     }
   }
   ```

4. **Optimization Opportunities**:
   - Code splitting analysis
   - Lazy loading candidates
   - Bundle size optimization
</build-verification>

<todo-update>
Use TodoWrite tool to update build verification todo status
</todo-update>

### 6. Security Validation

```bash
echo "🔒 Running security checks..."

# Mark todo as in progress
# Use TodoWrite tool to mark todo as in progress
# Update todo ID 5 status to "in_progress"

# Initialize security report
SECURITY_ISSUES=0

# Check for hardcoded secrets
echo "Checking for exposed secrets..."
SECRETS_FOUND=$(grep -r "api[_-]?key\|secret\|password\|token" \
  --include="*.ts" --include="*.tsx" --include="*.js" --include="*.jsx" \
  --exclude-dir="node_modules" --exclude-dir=".next" . | \
  grep -v "process.env" | \
  grep -v "// " | \
  wc -l)

if [ $SECRETS_FOUND -gt 0 ]; then
  echo "⚠️ Found $SECRETS_FOUND potential hardcoded secrets"
  SECURITY_ISSUES=$((SECURITY_ISSUES + SECRETS_FOUND))
fi

# Check for dangerous patterns
echo "Checking for dangerous patterns..."
DANGEROUS_HTML=$(grep -r "dangerouslySetInnerHTML" --include="*.tsx" --exclude-dir="node_modules" . | wc -l)

if [ $DANGEROUS_HTML -gt 0 ]; then
  echo "⚠️ Found $DANGEROUS_HTML uses of dangerouslySetInnerHTML"
  SECURITY_ISSUES=$((SECURITY_ISSUES + DANGEROUS_HTML))
fi

# Check for eval usage
EVAL_USAGE=$(grep -r "eval(" --include="*.ts" --include="*.tsx" --include="*.js" --exclude-dir="node_modules" . | wc -l)

if [ $EVAL_USAGE -gt 0 ]; then
  echo "⚠️ Found $EVAL_USAGE uses of eval()"
  SECURITY_ISSUES=$((SECURITY_ISSUES + EVAL_USAGE))
fi

# Check npm audit
echo "Running npm audit..."
npm audit --json > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/npm_audit.json" 2>&1 || true
VULNERABILITIES=$(jq -r '.metadata.vulnerabilities.total // 0' "$TASK_OUTPUT_FOLDER/phase_outputs/validate/npm_audit.json")

echo "🔍 Security Results:"
echo " - Hardcoded secrets: $SECRETS_FOUND"
echo " - Dangerous HTML: $DANGEROUS_HTML"
echo " - Eval usage: $EVAL_USAGE"
echo " - NPM vulnerabilities: $VULNERABILITIES"
echo " - Total issues: $((SECURITY_ISSUES + VULNERABILITIES))"

# Mark todo as completed
# Use TodoWrite tool to mark todo as completed
# Update todo ID 5 status to "completed"

# Record result
echo "{\"timestamp\":\"$TIMESTAMP\",\"phase\":\"validate\",\"check\":\"security\",\"passed\":$([ $SECURITY_ISSUES -eq 0 ] && echo "true" || echo "false"),\"details\":{\"secrets\":$SECRETS_FOUND,\"dangerous_html\":$DANGEROUS_HTML,\"eval\":$EVAL_USAGE,\"vulnerabilities\":$VULNERABILITIES}}" >> "$TASK_OUTPUT_FOLDER/decision_tree.jsonl"
```

### 7. Performance Analysis

<performance-metrics>
Comprehensive performance validation:

1. **Code Performance Indicators**:
   ```json
   {
     "performance_checks": {
       "debug_code": {
         "console_logs": 0,
         "debugger_statements": 0,
         "todo_comments": 0
       },
       "complexity_metrics": {
         "complex_components": [
           {
             "file": "path",
             "lines": 0,
             "cyclomatic_complexity": 0
           }
         ],
         "deeply_nested": 0,
         "long_functions": 0
       },
       "optimization_opportunities": {
         "unoptimized_images": 0,
         "large_bundles": 0,
         "unused_dependencies": 0
       }
     }
   }
   ```

2. **Runtime Performance**:
   - Component render performance
   - Memory leak detection
   - Network waterfall analysis
   - Core Web Vitals estimation

3. **Bundle Performance**:
   ```json
   {
     "bundle_performance": {
       "initial_load_size": "KB",
       "lazy_loaded_chunks": 0,
       "code_splitting_efficiency": "percentage",
       "tree_shaking_savings": "KB",
       "largest_dependencies": [
         {
           "name": "package",
           "size": "KB",
           "usage": "percentage"
         }
       ]
     }
   }
   ```

4. **Performance Budget Compliance**:
   - JavaScript budget
   - CSS budget
   - Image budget
   - Font budget
</performance-metrics>

<todo-update>
Use TodoWrite tool to update performance metrics todo status
</todo-update>

```bash
# Minimal performance checks
CONSOLE_LOGS=$(grep -r "console.log" --include="*.ts" --include="*.tsx" --exclude-dir="node_modules" --exclude="*.test.*" . | wc -l)
COMPLEX_COMPONENTS=$(find . -name "*.tsx" -exec wc -l {} \; | awk '$1 > 300 {count++} END {print count+0}')

echo "⚡ Performance Check Complete"
```
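
The report template in the next step interpolates a few values that the steps above never compute explicitly (`$BUILD_TIME`, `$BUNDLE_SIZE`, `$LARGE_DEPS`, `$UNOPTIMIZED_IMAGES`). A minimal sketch for deriving them, assuming a Next.js-style `.next` or Vite-style `dist` output directory; treat the paths and thresholds as illustrative heuristics:

```bash
# Build duration from the timestamps captured in step 5
BUILD_TIME=$((BUILD_END - BUILD_START))

# Approximate bundle size from whichever output directory the build produced
if [ -d ".next" ]; then
  BUNDLE_SIZE=$(du -sh .next | cut -f1)
elif [ -d "dist" ]; then
  BUNDLE_SIZE=$(du -sh dist | cut -f1)
fi

# Rough counts for the performance section of the report (heuristics, not exact metrics)
LARGE_DEPS=$(du -sm node_modules/* 2>/dev/null | awk '$1 > 5 {count++} END {print count+0}')
UNOPTIMIZED_IMAGES=$(find public -type f \( -name "*.png" -o -name "*.jpg" \) -size +500k 2>/dev/null | wc -l)
```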

### 8. Generate Comprehensive Quality Report

<quality-report-generation>
Create detailed validation report:

1. **Report Structure**:
   ```markdown
   # Quality Validation Report

   **Task**: $TASK_ID - $TASK_NAME
   **Date**: $(date -u +"%Y-%m-%d %H:%M:%S UTC")

   ## Executive Summary

   ### Overall Status: $([ $TEST_EXIT_CODE -eq 0 ] && [ $LINT_EXIT_CODE -eq 0 ] && [ $TYPECHECK_EXIT_CODE -eq 0 ] && [ $BUILD_EXIT_CODE -eq 0 ] && echo "✅ PASSED" || echo "❌ FAILED")

   ## Test Results
   - **Test Suite**: $([ $TEST_EXIT_CODE -eq 0 ] && echo "✅ Passed" || echo "❌ Failed")
     - Total Tests: $TOTAL_TESTS
     - Passing: $PASSING_TESTS
     - Failing: $FAILING_TESTS
     - Test Quality: Focus on critical paths and behavior validation

   ## Code Quality
   - **Linting**: $([ $LINT_EXIT_CODE -eq 0 ] && echo "✅ Clean" || echo "❌ Issues")
     - Errors: $LINT_ERRORS
     - Warnings: $LINT_WARNINGS
   - **Type Safety**: $([ $TYPECHECK_EXIT_CODE -eq 0 ] && echo "✅ Type Safe" || echo "❌ Type Errors")
     - Type Errors: $TYPE_ERRORS
     - Uses of 'any': $ANY_COUNT

   ## Build & Deployment
   - **Build Status**: $([ $BUILD_EXIT_CODE -eq 0 ] && echo "✅ Success" || echo "❌ Failed")
     - Build Time: ${BUILD_TIME}s
     - Bundle Size: ${BUNDLE_SIZE:-N/A}

   ## Security Analysis
   - **Security Issues**: $((SECURITY_ISSUES + VULNERABILITIES))
     - Hardcoded Secrets: $SECRETS_FOUND
     - Dangerous HTML: $DANGEROUS_HTML
     - Eval Usage: $EVAL_USAGE
     - NPM Vulnerabilities: $VULNERABILITIES

   ## Performance Metrics
   - Console.log Statements: $CONSOLE_LOGS
   - Large Dependencies: $LARGE_DEPS
   - Complex Components: $COMPLEX_COMPONENTS
   - Images to Optimize: $UNOPTIMIZED_IMAGES

   ## Recommendations

   ### Critical Issues
   $([ $FAILING_TESTS -gt 0 ] && echo "- Fix $FAILING_TESTS failing tests")
   $([ $TYPE_ERRORS -gt 0 ] && echo "- Resolve $TYPE_ERRORS TypeScript errors")
   $([ $BUILD_EXIT_CODE -ne 0 ] && echo "- Fix build errors")
   $([ $SECRETS_FOUND -gt 0 ] && echo "- Remove $SECRETS_FOUND hardcoded secrets")

   ### Quality Improvements
   $([ $LINT_WARNINGS -gt 0 ] && echo "- Address $LINT_WARNINGS lint warnings")
   $([ $ANY_COUNT -gt 0 ] && echo "- Replace $ANY_COUNT uses of 'any' with proper types")
   $([ $CONSOLE_LOGS -gt 0 ] && echo "- Remove $CONSOLE_LOGS console.log statements")
   $([ $COMPLEX_COMPONENTS -gt 0 ] && echo "- Refactor $COMPLEX_COMPONENTS complex components")

   ## Validation Checklist
   - [$([ $TEST_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] All tests passing
   - [$([ $LINT_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] No lint errors
   - [$([ $TYPECHECK_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] No type errors
   - [$([ $BUILD_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] Build successful
   - [$([ $SECURITY_ISSUES -eq 0 ] && echo "x" || echo " ")] No security issues
   - [$([ -n "$CRITICAL_PATHS_TESTED" ] && echo "x" || echo " ")] Critical paths tested
   - [$([ -n "$MEANINGFUL_TESTS" ] && echo "x" || echo " ")] Tests are meaningful, not superficial

   ## Next Steps
   1. Review all failing checks
   2. Address critical issues first
   3. Improve code quality metrics
   4. Re-run validation after fixes

   ---
   Generated by AI-Driven Development Pipeline
   EOF
   ```
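
Because the JSON summary in the next step is assembled through shell interpolation, a single unset variable or stray comma yields an invalid file that later phases cannot parse. A quick sanity check after writing it, assuming `jq` is available (a sketch, not part of the original pipeline):

```bash
# Fail loudly if the generated summary is not valid JSON
if ! jq empty "$TASK_OUTPUT_FOLDER/phase_outputs/validate/validation_summary.json" 2>/dev/null; then
  echo "⚠️ validation_summary.json is not valid JSON - check for unset variables"
fi
```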

2. **JSON Summary Generation**:
   ```json
   {
     "task_id": "$TASK_ID",
     "validation_date": "$(date -u +"%Y-%m-%dT%H:%M:%SZ")",
     "overall_status": $([ $TEST_EXIT_CODE -eq 0 ] && [ $LINT_EXIT_CODE -eq 0 ] && [ $TYPECHECK_EXIT_CODE -eq 0 ] && [ $BUILD_EXIT_CODE -eq 0 ] && echo "\"passed\"" || echo "\"failed\""),
     "checks": {
       "tests": {
         "passed": $([ $TEST_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
         "total": $TOTAL_TESTS,
         "passing": $PASSING_TESTS,
         "failing": $FAILING_TESTS,
         "coverage": ${COVERAGE:-null}
       },
       "lint": {
         "passed": $([ $LINT_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
         "errors": $LINT_ERRORS,
         "warnings": $LINT_WARNINGS
       },
       "types": {
         "passed": $([ $TYPECHECK_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
         "errors": $TYPE_ERRORS,
         "any_usage": $ANY_COUNT
       },
       "build": {
         "passed": $([ $BUILD_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
         "time_seconds": $BUILD_TIME,
         "bundle_size": "${BUNDLE_SIZE:-null}"
       },
       "security": {
         "passed": $([ $SECURITY_ISSUES -eq 0 ] && echo "true" || echo "false"),
         "total_issues": $((SECURITY_ISSUES + VULNERABILITIES)),
         "details": {
           "secrets": $SECRETS_FOUND,
           "dangerous_html": $DANGEROUS_HTML,
           "eval": $EVAL_USAGE,
           "vulnerabilities": $VULNERABILITIES
         }
       },
       "performance": {
         "console_logs": $CONSOLE_LOGS,
         "large_deps": $LARGE_DEPS,
         "complex_components": $COMPLEX_COMPONENTS
       }
     },
     "todos_completed": 7
   }
   EOF
   ```
</quality-report-generation>

<write-reports>
Use the Write tool to save:
- Quality report: `$TASK_OUTPUT_FOLDER/phase_outputs/validate/quality_report.md`
- Validation summary: `$TASK_OUTPUT_FOLDER/phase_outputs/validate/validation_summary.json`
</write-reports>

<todo-update>
Use TodoWrite tool to mark quality report todo as completed
</todo-update>

### 9. Update Shared Context

<update-context>
Update context with validation results:

```json
{
  "current_phase": "validate",
  "phases_completed": [..., "validate"],
  "phase_history": [
    {
      "phase": "validate",
      "completed_at": "ISO_timestamp",
      "success": true,
      "key_outputs": {
        "validation_passed": boolean,
        "test_coverage": percentage,
        "quality_metrics": {
          "tests": "pass|fail",
          "lint": "pass|fail",
          "types": "pass|fail",
          "build": "pass|fail",
          "security": "pass|fail"
        },
        "reports_generated": ["quality_report.md", "validation_summary.json"]
      }
    }
  ],
  "critical_context": {
    ...existing_context,
    "validation_complete": true,
    "ready_for_review": boolean,
    "needs_fixes": boolean
  }
}
```
</update-context>

<write-context>
Use the Write tool to save updated context:
- Path: `$TASK_OUTPUT_FOLDER/context.json`
</write-context>
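
The context update above is described declaratively and written with the Write tool. If it is ever scripted instead, a jq merge keeps the existing fields intact rather than overwriting the file. A minimal sketch, assuming the context shape shown above (field names are illustrative):

```bash
# Merge validation results into the existing context instead of replacing it wholesale
jq --arg ts "$(date -u +"%Y-%m-%dT%H:%M:%SZ")" \
  '.current_phase = "validate"
   | .phases_completed += ["validate"]
   | .phase_history += [{"phase": "validate", "completed_at": $ts, "success": true}]
   | .critical_context.validation_complete = true' \
  "$TASK_OUTPUT_FOLDER/context.json" > "$TASK_OUTPUT_FOLDER/context.json.tmp" \
  && mv "$TASK_OUTPUT_FOLDER/context.json.tmp" "$TASK_OUTPUT_FOLDER/context.json"
```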

### 10. Final Validation

<phase4a-validation>
Ensure comprehensive validation completion:

1. **TODO COMPLETION (MANDATORY)**:
   - **CRITICAL: MUST use TodoWrite tool to verify ALL todos are marked as completed**
   - **Count total todos (7 for features, 5 for patterns)**
   - **Verify each todo status is "completed"**
   - **Document any skipped validations with reasons**
   - **Phase CANNOT succeed if any todos remain pending/in_progress**

2. **Report Verification**:
   - Quality report generated with all sections
   - JSON summary contains all metrics
   - No missing data or placeholders

3. **Metric Collection**:
   - All validation checks executed
   - Results properly recorded
   - Decision tree updated

4. **Phase Success Determination**:
   - All required outputs exist
   - Context properly updated
   - Ready for next phase
</phase4a-validation>

<todo-completion>
**FINAL TODO COMPLETION CHECK (MANDATORY FOR PHASE SUCCESS)**:

1. **Use TodoWrite tool to list all todos**
2. **Verify each todo is marked as "completed"**
3. **If any todo is still "in_progress" or "pending":**
   - The phase MUST NOT be considered complete
   - Either complete the remaining work or mark as completed with skip reason
4. **Document in validation summary:**
   - Total todos created: X
   - Total completed: X
   - Any skipped with reasons
5. **This is a HARD REQUIREMENT - phase fails if todos are incomplete**
</todo-completion>

```bash
echo ""
echo "===================================="
echo "📊 VALIDATION COMPLETE"
echo "===================================="
echo ""
echo "Validation Summary:"
echo " - Tests: Coverage achieved"
echo " - Code Quality: Standards checked"
echo " - Type Safety: Verified"
echo " - Build: Process validated"
echo " - Security: Scan completed"
echo " - Performance: Metrics collected"
echo ""
echo "Reports Generated:"
echo " - quality_report.md"
echo " - validation_summary.json"
echo " - Additional test/coverage reports"
echo ""
echo "➡️ Ready for Phase 4B (if fixes needed) or Phase 5 (Final Review)"
```

## Key Requirements

<phase4-constraints>
<validation-only>
This phase MUST:
□ Run all validation checks
□ Generate comprehensive reports
□ Document all findings
□ Track validation progress
□ Provide actionable feedback

This phase MUST NOT:
□ Modify implementation code
□ Fix failing tests
□ Change configurations
□ Skip any checks
</validation-only>

<quality-standards>
Enforce these standards:
□ All critical functionality tested
□ Tests validate behavior, not just execute code
□ Zero lint errors
□ Zero type errors
□ Successful build
□ No security issues
□ Performance metrics tracked
□ No superficial tests created for coverage
</quality-standards>
</phase4-constraints>

## Success Criteria

Phase 4A is successful when:
- All validation checks completed
- Comprehensive reports generated
- Quality metrics documented
- Security scan performed
- Performance analyzed
- **ALL TODOS MARKED AS COMPLETED using TodoWrite tool**
- Clear recommendations provided