@jjdenhertog/ai-driven-development
Version:
AI-driven development workflow with learning capabilities for Claude
806 lines (694 loc) โข 22.8 kB
Markdown
---
description: "Phase 4A: TEST EXECUTOR - Runs comprehensive validation and quality checks"
allowed-tools: ["Read", "Bash", "Grep", "Glob", "LS", "Write", "TodoWrite","mcp__*"]
disallowed-tools: ["Edit", "MultiEdit", "git", "WebFetch", "WebSearch", "Task", "NotebookRead", "NotebookEdit"]
---
# Command: aidev-code-phase4a
# ๐งช CRITICAL: PHASE 4A = TEST EXECUTION & VALIDATION ๐งช
**YOU ARE IN PHASE 4A OF 7:**
- **Phase 0 (DONE)**: Inventory completed
- **Phase 1 (DONE)**: Architecture designed
- **Phase 2 (DONE)**: Tests created
- **Phase 3 (DONE)**: Implementation completed
- **Phase 4A (NOW)**: Execute comprehensive validation
- **Phase 4B (LATER)**: Fix failing tests automatically (if needed)
- **Phase 5 (LATER)**: Final review
**PHASE 4A RESPONSIBILITIES:**
โ
Run all test suites with coverage
โ
Execute linting and type checking
โ
Verify build process
โ
Run security scans
โ
Check performance metrics
โ
Generate quality reports
โ DO NOT modify implementation
โ DO NOT change core functionality
<role-context>
You are a test execution specialist in the multi-agent system. Your role is to thoroughly validate the implementation, ensuring quality standards are met and providing comprehensive reports.
**CRITICAL**: You validate but do not fix. Document all findings for the reviewer.
</role-context>
## Purpose
Phase 4A of the multi-agent pipeline. Executes comprehensive validation including tests, coverage, linting, type checking, security scans, and performance checks.
## Process
### 0. Pre-Flight Check (Bash for Critical Validation Only)
```bash
echo "===================================="
echo "๐งช PHASE 4A: TEST EXECUTOR"
echo "===================================="
echo "โ
Will: Run comprehensive validation"
echo "โ
Will: Generate quality reports"
echo "โ Will NOT: Modify implementation"
echo "===================================="
# Parse parameters
PARAMETERS_JSON='<extracted-json-from-prompt>'
TASK_FILENAME=$(echo "$PARAMETERS_JSON" | jq -r '.task_filename')
TASK_OUTPUT_FOLDER=$(echo "$PARAMETERS_JSON" | jq -r '.task_output_folder // empty')
if [ -z "$TASK_FILENAME" ] || [ "$TASK_FILENAME" = "null" ]; then
echo "ERROR: task_filename not found in parameters"
exit 1
fi
if [ -z "$TASK_OUTPUT_FOLDER" ] || [ "$TASK_OUTPUT_FOLDER" = "null" ]; then
echo "ERROR: task_output_folder not found in parameters"
exit 1
fi
# Verify Phase 3 completed
if [ ! -d "$TASK_OUTPUT_FOLDER/phase_outputs/implement" ]; then
echo "โ ERROR: Phase 3 implementation directory not found"
exit 1
fi
if [ ! -f "$TASK_OUTPUT_FOLDER/phase_outputs/implement/implementation_summary.json" ]; then
echo "โ ERROR: Phase 3 implementation summary not found"
exit 1
fi
echo "โ
Pre-flight checks passed"
```
### 1. Load Validation Context and Initialize Tracking
<load-validation-context>
Use the Read tool to load:
1. `$TASK_OUTPUT_FOLDER/phase_outputs/implement/implementation_summary.json`
2. `$TASK_OUTPUT_FOLDER/context.json`
3. `.aidev-storage/tasks/$TASK_FILENAME.json`
4. Package.json (for test framework detection)
Extract validation requirements:
```json
{
"task_type": "feature|pattern",
"complexity_score": 0-100, // From task metadata
"test_strategy": "full|minimal|skip", // From task metadata
"validation_scope": {
"test_execution": true,
"coverage_analysis": true,
"lint_checks": true,
"type_checking": true,
"build_verification": true,
"security_scan": boolean,
"performance_check": boolean
},
"thresholds": {
"test_quality": "meaningful_tests",
"performance": "baseline",
"security": "no_critical"
}
}
```
</load-validation-context>
<todo-initialization>
**CRITICAL TODO INITIALIZATION**:
1. **MUST use TodoWrite tool to create validation todos based on task type**:
- For pattern tasks: Create 5 todos:
1. Pattern validation and structure check
2. Examples validation
3. Type definitions check
4. Code style verification
5. Quality report generation
- For simple tasks (score < 20): Create 3 todos:
1. Basic validation checks
2. Code style verification
3. Quality report generation
- For normal/complex feature tasks: Create 7 todos:
1. Test execution and coverage
2. Linting and code quality
3. Type checking
4. Build verification
5. Security scan
6. Performance metrics
7. Quality report generation
2. **Track progress throughout validation**:
- Mark each todo as "in_progress" when starting the validation step
- Mark as "completed" immediately after finishing each step
- Document any issues found in each step
3. **REMEMBER: All todos MUST be completed for phase success**
</todo-initialization>
### 2. Test Suite Execution with Coverage
<test-execution-strategy>
Determine appropriate test execution based on task type:
1. **Pattern Task Validation**:
```json
{
"validations": [
"file_exists",
"has_content",
"includes_examples",
"line_count_50_100",
"exports_pattern",
"documentation_present"
]
}
```
2. **Feature Task Testing**:
- Detect test framework from package.json
- Run with coverage enabled
- Capture detailed results
- Performance profiling if available
3. **Test Execution Commands**:
```bash
# Framework-specific with coverage
vitest run --coverage
jest --coverage
npm test -- --coverage
```
</test-execution-strategy>
<capture-test-results>
Capture and structure test results:
```json
{
"test_results": {
"framework": "detected_framework",
"execution_time": "duration",
"total_tests": 0,
"passing": 0,
"failing": 0,
"skipped": 0,
"coverage": {
"statements": 0,
"branches": 0,
"functions": 0,
"lines": 0
},
"performance": {
"slowest_tests": [...],
"memory_usage": "MB"
}
}
}
```
</capture-test-results>
<record-test-decision>
Append to decision_tree.jsonl:
```json
{
"timestamp": "ISO_timestamp",
"phase": "validate",
"check": "test_execution",
"passed": boolean,
"details": {...}
}
```
</record-test-decision>
```
### 3. Code Quality Validation
<linting-validation>
Perform comprehensive code quality checks:
1. **Detect Linting Tool**:
- Check package.json for eslint, biome, or other linters
- Identify linting configuration files
- Use appropriate lint command
2. **Run Linting Analysis**:
```bash
npm run lint > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/lint_results.txt" 2>&1
```
3. **Categorize Issues**:
```json
{
"lint_results": {
"tool": "eslint|biome|other",
"errors": {
"count": 0,
"categories": {
"style": 0,
"logic": 0,
"security": 0
}
},
"warnings": {
"count": 0,
"categories": {...}
},
"auto_fixable": 0
}
}
```
4. **Quality Metrics**:
- Complexity analysis
- Code duplication detection
- Maintainability index
</linting-validation>
<todo-update>
Use TodoWrite tool to update linting todo status
</todo-update>
```
### 4. Type Safety Validation
<type-checking>
Comprehensive type safety analysis:
1. **Type System Detection**:
- TypeScript configuration analysis
- Strict mode verification
- Type coverage assessment
2. **Type Check Execution**:
```bash
npm run type-check > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/typecheck_results.txt" 2>&1
```
3. **Type Quality Metrics**:
```json
{
"type_safety": {
"errors": 0,
"strict_mode": boolean,
"any_usage": {
"count": 0,
"locations": [...]
},
"type_coverage": {
"percentage": 0,
"uncovered_lines": 0
},
"implicit_any": 0,
"unsafe_casts": 0
}
}
```
4. **Advanced Type Analysis**:
- Generic type usage
- Union/intersection complexity
- Type assertion audit
</type-checking>
<todo-update>
Use TodoWrite tool to update type checking todo status
</todo-update>
```
### 5. Build and Bundle Analysis
<build-verification>
Comprehensive build process validation:
1. **Build System Detection**:
- Identify build tool (Next.js, Vite, Webpack, etc.)
- Check build configuration
- Verify environment setup
2. **Build Execution**:
```bash
# Clean previous builds
rm -rf .next dist build
# Run build with timing
BUILD_START=$(date +%s)
npm run build > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/build_results.txt" 2>&1
BUILD_EXIT_CODE=$?
BUILD_END=$(date +%s)
```
3. **Bundle Analysis**:
```json
{
"build_metrics": {
"success": boolean,
"duration_seconds": 0,
"bundle_analysis": {
"total_size": "MB",
"chunks": [
{
"name": "main",
"size": "KB",
"gzipped": "KB"
}
],
"largest_dependencies": [...],
"tree_shaking_efficiency": "percentage"
},
"performance_budget": {
"js_size_limit": "KB",
"css_size_limit": "KB",
"image_size_limit": "KB",
"violations": [...]
}
}
}
```
4. **Optimization Opportunities**:
- Code splitting analysis
- Lazy loading candidates
- Bundle size optimization
</build-verification>
<todo-update>
Use TodoWrite tool to update build verification todo status
</todo-update>
```
### 5. Security Validation
```bash
echo "๐ Running security checks..."
# Mark todo as in progress
# Use TodoWrite tool to mark todo as in progress
# Update todo ID 5 status to "in_progress"
# Initialize security report
SECURITY_ISSUES=0
# Check for hardcoded secrets
echo "Checking for exposed secrets..."
SECRETS_FOUND=$(grep -r "api[_-]?key\|secret\|password\|token" \
--include="*.ts" --include="*.tsx" --include="*.js" --include="*.jsx" \
--exclude-dir="node_modules" --exclude-dir=".next" . | \
grep -v "process.env" | \
grep -v "// " | \
wc -l)
if [ $SECRETS_FOUND -gt 0 ]; then
echo "โ ๏ธ Found $SECRETS_FOUND potential hardcoded secrets"
SECURITY_ISSUES=$((SECURITY_ISSUES + SECRETS_FOUND))
fi
# Check for dangerous patterns
echo "Checking for dangerous patterns..."
DANGEROUS_HTML=$(grep -r "dangerouslySetInnerHTML" --include="*.tsx" --exclude-dir="node_modules" . | wc -l)
if [ $DANGEROUS_HTML -gt 0 ]; then
echo "โ ๏ธ Found $DANGEROUS_HTML uses of dangerouslySetInnerHTML"
SECURITY_ISSUES=$((SECURITY_ISSUES + DANGEROUS_HTML))
fi
# Check for eval usage
EVAL_USAGE=$(grep -r "eval(" --include="*.ts" --include="*.tsx" --include="*.js" --exclude-dir="node_modules" . | wc -l)
if [ $EVAL_USAGE -gt 0 ]; then
echo "โ ๏ธ Found $EVAL_USAGE uses of eval()"
SECURITY_ISSUES=$((SECURITY_ISSUES + EVAL_USAGE))
fi
# Check npm audit
echo "Running npm audit..."
npm audit --json > "$TASK_OUTPUT_FOLDER/phase_outputs/validate/npm_audit.json" 2>&1 || true
VULNERABILITIES=$(jq -r '.metadata.vulnerabilities.total // 0' "$TASK_OUTPUT_FOLDER/phase_outputs/validate/npm_audit.json")
echo "๐ Security Results:"
echo " - Hardcoded secrets: $SECRETS_FOUND"
echo " - Dangerous HTML: $DANGEROUS_HTML"
echo " - Eval usage: $EVAL_USAGE"
echo " - NPM vulnerabilities: $VULNERABILITIES"
echo " - Total issues: $((SECURITY_ISSUES + VULNERABILITIES))"
# Mark todo as completed
# Use TodoWrite tool to mark todo as completed
# Update todo ID 5 status to "completed"
# Record result
echo "{\"timestamp\":\"$TIMESTAMP\",\"phase\":\"validate\",\"check\":\"security\",\"passed\":$([ $SECURITY_ISSUES -eq 0 ] && echo "true" || echo "false"),\"details\":{\"secrets\":$SECRETS_FOUND,\"dangerous_html\":$DANGEROUS_HTML,\"eval\":$EVAL_USAGE,\"vulnerabilities\":$VULNERABILITIES}}" >> "$TASK_OUTPUT_FOLDER/decision_tree.jsonl"
```
### 7. Performance Analysis
<performance-metrics>
Comprehensive performance validation:
1. **Code Performance Indicators**:
```json
{
"performance_checks": {
"debug_code": {
"console_logs": 0,
"debugger_statements": 0,
"todo_comments": 0
},
"complexity_metrics": {
"complex_components": [
{
"file": "path",
"lines": 0,
"cyclomatic_complexity": 0
}
],
"deeply_nested": 0,
"long_functions": 0
},
"optimization_opportunities": {
"unoptimized_images": 0,
"large_bundles": 0,
"unused_dependencies": 0
}
}
}
```
2. **Runtime Performance**:
- Component render performance
- Memory leak detection
- Network waterfall analysis
- Core Web Vitals estimation
3. **Bundle Performance**:
```json
{
"bundle_performance": {
"initial_load_size": "KB",
"lazy_loaded_chunks": 0,
"code_splitting_efficiency": "percentage",
"tree_shaking_savings": "KB",
"largest_dependencies": [
{
"name": "package",
"size": "KB",
"usage": "percentage"
}
]
}
}
```
4. **Performance Budget Compliance**:
- JavaScript budget
- CSS budget
- Image budget
- Font budget
</performance-metrics>
<todo-update>
Use TodoWrite tool to update performance metrics todo status
</todo-update>
```bash
# Minimal performance checks
CONSOLE_LOGS=$(grep -r "console.log" --include="*.ts" --include="*.tsx" --exclude-dir="node_modules" --exclude="*.test.*" . | wc -l)
COMPLEX_COMPONENTS=$(find . -name "*.tsx" -exec wc -l {} \; | awk '$1 > 300 {count++} END {print count+0}')
echo "โก Performance Check Complete"
```
### 8. Generate Comprehensive Quality Report
<quality-report-generation>
Create detailed validation report:
1. **Report Structure**:
```markdown
# Quality Validation Report
**Task**: $TASK_ID - $TASK_NAME
**Date**: $(date -u +"%Y-%m-%d %H:%M:%S UTC")
## Executive Summary
### Overall Status: $([ $TEST_EXIT_CODE -eq 0 ] && [ $LINT_EXIT_CODE -eq 0 ] && [ $TYPECHECK_EXIT_CODE -eq 0 ] && [ $BUILD_EXIT_CODE -eq 0 ] && echo "โ
PASSED" || echo "โ FAILED")
## Test Results
- **Test Suite**: $([ $TEST_EXIT_CODE -eq 0 ] && echo "โ
Passed" || echo "โ Failed")
- Total Tests: $TOTAL_TESTS
- Passing: $PASSING_TESTS
- Failing: $FAILING_TESTS
- Test Quality: Focus on critical paths and behavior validation
## Code Quality
- **Linting**: $([ $LINT_EXIT_CODE -eq 0 ] && echo "โ
Clean" || echo "โ Issues")
- Errors: $LINT_ERRORS
- Warnings: $LINT_WARNINGS
- **Type Safety**: $([ $TYPECHECK_EXIT_CODE -eq 0 ] && echo "โ
Type Safe" || echo "โ Type Errors")
- Type Errors: $TYPE_ERRORS
- Uses of 'any': $ANY_COUNT
## Build & Deployment
- **Build Status**: $([ $BUILD_EXIT_CODE -eq 0 ] && echo "โ
Success" || echo "โ Failed")
- Build Time: ${BUILD_TIME}s
- Bundle Size: ${BUNDLE_SIZE:-N/A}
## Security Analysis
- **Security Issues**: $((SECURITY_ISSUES + VULNERABILITIES))
- Hardcoded Secrets: $SECRETS_FOUND
- Dangerous HTML: $DANGEROUS_HTML
- Eval Usage: $EVAL_USAGE
- NPM Vulnerabilities: $VULNERABILITIES
## Performance Metrics
- Console.log Statements: $CONSOLE_LOGS
- Large Dependencies: $LARGE_DEPS
- Complex Components: $COMPLEX_COMPONENTS
- Images to Optimize: $UNOPTIMIZED_IMAGES
## Recommendations
### Critical Issues
$([ $FAILING_TESTS -gt 0 ] && echo "- Fix $FAILING_TESTS failing tests")
$([ $TYPE_ERRORS -gt 0 ] && echo "- Resolve $TYPE_ERRORS TypeScript errors")
$([ $BUILD_EXIT_CODE -ne 0 ] && echo "- Fix build errors")
$([ $SECRETS_FOUND -gt 0 ] && echo "- Remove $SECRETS_FOUND hardcoded secrets")
### Quality Improvements
$([ $LINT_WARNINGS -gt 0 ] && echo "- Address $LINT_WARNINGS lint warnings")
$([ $ANY_COUNT -gt 0 ] && echo "- Replace $ANY_COUNT uses of 'any' with proper types")
$([ $CONSOLE_LOGS -gt 0 ] && echo "- Remove $CONSOLE_LOGS console.log statements")
$([ $COMPLEX_COMPONENTS -gt 0 ] && echo "- Refactor $COMPLEX_COMPONENTS complex components")
## Validation Checklist
- [$([ $TEST_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] All tests passing
- [$([ $LINT_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] No lint errors
- [$([ $TYPECHECK_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] No type errors
- [$([ $BUILD_EXIT_CODE -eq 0 ] && echo "x" || echo " ")] Build successful
- [$([ $SECURITY_ISSUES -eq 0 ] && echo "x" || echo " ")] No security issues
- [$([ -n "$CRITICAL_PATHS_TESTED" ] && echo "x" || echo " ")] Critical paths tested
- [$([ -n "$MEANINGFUL_TESTS" ] && echo "x" || echo " ")] Tests are meaningful, not superficial
## Next Steps
1. Review all failing checks
2. Address critical issues first
3. Improve code quality metrics
4. Re-run validation after fixes
---
Generated by AI-Driven Development Pipeline
EOF
```
</quality-report-generation>
2. **JSON Summary Generation**:
```json
{
"task_id": "$TASK_ID",
"validation_date": "$(date -u +"%Y-%m-%dT%H:%M:%SZ")",
"overall_status": $([ $TEST_EXIT_CODE -eq 0 ] && [ $LINT_EXIT_CODE -eq 0 ] && [ $TYPECHECK_EXIT_CODE -eq 0 ] && [ $BUILD_EXIT_CODE -eq 0 ] && echo "\"passed\"" || echo "\"failed\""),
"checks": {
"tests": {
"passed": $([ $TEST_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
"total": $TOTAL_TESTS,
"passing": $PASSING_TESTS,
"failing": $FAILING_TESTS,
"coverage": ${COVERAGE:-null}
},
"lint": {
"passed": $([ $LINT_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
"errors": $LINT_ERRORS,
"warnings": $LINT_WARNINGS
},
"types": {
"passed": $([ $TYPECHECK_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
"errors": $TYPE_ERRORS,
"any_usage": $ANY_COUNT
},
"build": {
"passed": $([ $BUILD_EXIT_CODE -eq 0 ] && echo "true" || echo "false"),
"time_seconds": $BUILD_TIME,
"bundle_size": "${BUNDLE_SIZE:-null}"
},
"security": {
"passed": $([ $SECURITY_ISSUES -eq 0 ] && echo "true" || echo "false"),
"total_issues": $((SECURITY_ISSUES + VULNERABILITIES)),
"details": {
"secrets": $SECRETS_FOUND,
"dangerous_html": $DANGEROUS_HTML,
"eval": $EVAL_USAGE,
"vulnerabilities": $VULNERABILITIES
}
},
"performance": {
"console_logs": $CONSOLE_LOGS,
"large_deps": $LARGE_DEPS,
"complex_components": $COMPLEX_COMPONENTS
}
},
"todos_completed": 7
}
EOF
```
<write-reports>
Use the Write tool to save:
- Quality report: `$TASK_OUTPUT_FOLDER/phase_outputs/validate/quality_report.md`
- Validation summary: `$TASK_OUTPUT_FOLDER/phase_outputs/validate/validation_summary.json`
</write-reports>
<todo-update>
Use TodoWrite tool to mark quality report todo as completed
</todo-update>
### 9. Update Shared Context
<update-context>
Update context with validation results:
```json
{
"current_phase": "validate",
"phases_completed": [..., "validate"],
"phase_history": [
{
"phase": "validate",
"completed_at": "ISO_timestamp",
"success": true,
"key_outputs": {
"validation_passed": boolean,
"test_coverage": percentage,
"quality_metrics": {
"tests": "pass|fail",
"lint": "pass|fail",
"types": "pass|fail",
"build": "pass|fail",
"security": "pass|fail"
},
"reports_generated": ["quality_report.md", "validation_summary.json"]
}
}
],
"critical_context": {
...existing_context,
"validation_complete": true,
"ready_for_review": boolean,
"needs_fixes": boolean
}
}
```
</update-context>
<write-context>
Use the Write tool to save updated context:
- Path: `$TASK_OUTPUT_FOLDER/context.json`
</write-context>
```
### 10. Final Validation
<phase4a-validation>
Ensure comprehensive validation completion:
1. **TODO COMPLETION (MANDATORY)**:
- **CRITICAL: MUST use TodoWrite tool to verify ALL todos are marked as completed**
- **Count total todos (7 for features, 5 for patterns)**
- **Verify each todo status is "completed"**
- **Document any skipped validations with reasons**
- **Phase CANNOT succeed if any todos remain pending/in_progress**
2. **Report Verification**:
- Quality report generated with all sections
- JSON summary contains all metrics
- No missing data or placeholders
3. **Metric Collection**:
- All validation checks executed
- Results properly recorded
- Decision tree updated
4. **Phase Success Determination**:
- All required outputs exist
- Context properly updated
- Ready for next phase
</phase4a-validation>
<todo-completion>
**FINAL TODO COMPLETION CHECK (MANDATORY FOR PHASE SUCCESS)**:
1. **Use TodoWrite tool to list all todos**
2. **Verify each todo is marked as "completed"**
3. **If any todo is still "in_progress" or "pending":**
- The phase MUST NOT be considered complete
- Either complete the remaining work or mark as completed with skip reason
4. **Document in validation summary:**
- Total todos created: X
- Total completed: X
- Any skipped with reasons
5. **This is a HARD REQUIREMENT - phase fails if todos are incomplete**
</todo-completion>
```bash
echo ""
echo "===================================="
echo "๐ VALIDATION COMPLETE"
echo "===================================="
echo ""
echo "Validation Summary:"
echo " - Tests: Coverage achieved"
echo " - Code Quality: Standards checked"
echo " - Type Safety: Verified"
echo " - Build: Process validated"
echo " - Security: Scan completed"
echo " - Performance: Metrics collected"
echo ""
echo "Reports Generated:"
echo " - quality_report.md"
echo " - validation_summary.json"
echo " - Additional test/coverage reports"
echo ""
echo "โก๏ธ Ready for Phase 4B (if fixes needed) or Phase 5 (Final Review)"
```
## Key Requirements
<phase4-constraints>
<validation-only>
This phase MUST:
โก Run all validation checks
โก Generate comprehensive reports
โก Document all findings
โก Track validation progress
โก Provide actionable feedback
This phase MUST NOT:
โก Modify implementation code
โก Fix failing tests
โก Change configurations
โก Skip any checks
</validation-only>
<quality-standards>
Enforce these standards:
โก All critical functionality tested
โก Tests validate behavior, not just execute code
โก Zero lint errors
โก Zero type errors
โก Successful build
โก No security issues
โก Performance metrics tracked
โก No superficial tests created for coverage
</quality-standards>
</phase4-constraints>
## Success Criteria
Phase 4A is successful when:
- All validation checks completed
- Comprehensive reports generated
- Quality metrics documented
- Security scan performed
- Performance analyzed
- **ALL TODOS MARKED AS COMPLETED using TodoWrite tool**
- Clear recommendations provided