aiwg

Cognitive architecture for AI-augmented software development with structured memory, ensemble validation, and closed-loop correction. FAIR-aligned artifacts, 84% cost reduction via human-in-the-loop, standards adopted by 100+ organizations.

aiwg.io

jmagly/aiwg

# Quality Scoring Schema
# Based on REF-019 Toolformer (perplexity-based scoring)
# Issue: #192

$schema: "https://json-schema.org/draft/2020-12/schema"
$id: "https://aiwg.io/schemas/quality-scoring/v1"
title: "Artifact Quality Scoring Schema"
description: |
  Schema for pattern-based and perplexity-inspired quality scoring
  of generated artifacts per REF-019 Toolformer.

type: object
required:
  - version
  - scoring_config
  - patterns

properties:
  version:
    type: string
    pattern: "^\\d+\\.\\d+\\.\\d+$"
    default: "1.0.0"

  scoring_config:
    $ref: "#/$defs/ScoringConfig"

  patterns:
    $ref: "#/$defs/PatternLibrary"

  reference_corpus:
    $ref: "#/$defs/ReferenceCorpus"

$defs:
  ScoringConfig:
    type: object
    description: "Quality scoring configuration"
    properties:
      enabled:
        type: boolean
        default: true
      auto_score:
        type: boolean
        default: false
        description: "Automatically score on artifact creation"
      thresholds:
        type: object
        properties:
          excellent:
            type: integer
            default: 90
          good:
            type: integer
            default: 75
          acceptable:
            type: integer
            default: 60
          needs_work:
            type: integer
            default: 0
      weights:
        type: object
        properties:
          required_sections:
            type: number
            default: 0.40
          recommended_sections:
            type: number
            default: 0.25
          antipattern_penalty:
            type: number
            default: 0.20
          domain_alignment:
            type: number
            default: 0.15
      feedback:
        type: object
        properties:
          suggest_improvements:
            type: boolean
            default: true
          track_history:
            type: boolean
            default: true
      require_for_phase_transition:
        type: boolean
        default: false

  PatternLibrary:
    type: object
    description: "Pattern definitions by artifact type"
    additionalProperties:
      $ref: "#/$defs/ArtifactPatterns"

  ArtifactPatterns:
    type: object
    required:
      - required
      - recommended
      - antipatterns
    properties:
      required:
        type: array
        items:
          type: string
        description: "Must be present for passing score"
      recommended:
        type: array
        items:
          type: string
        description: "Should be present for good score"
      antipatterns:
        type: array
        items:
          type: string
        description: "Should NOT be present"
      quality_indicators:
        type: array
        items:
          type: string
        description: "Signs of high quality"

  ReferenceCorpus:
    type: object
    description: "Reference examples for comparison"
    properties:
      path:
        type: string
        default: ".aiwg/quality/references/"
      domains:
        type: array
        items:
          type: object
          properties:
            name:
              type: string
            positive_examples:
              type: string
              description: "Path to high-quality examples"
            negative_examples:
              type: string
              description: "Path to common issues"

# Default patterns by artifact type
default_patterns:
  use_case:
    required:
      - "## Actor"
      - "## Goal"
      - "## Preconditions"
      - "## Main Success Scenario"
    recommended:
      - "## Alternative Flows"
      - "## Postconditions"
      - "## Acceptance Criteria"
      - "## Non-Functional Requirements"
    antipatterns:
      - "TODO"
      - "[TBD]"
      - "????"
      - "Lorem ipsum"
      - "[placeholder]"
      - "example.com"
    quality_indicators:
      - "testable"
      - "measurable"
      - "specific actor"

  user_story:
    required:
      - "**As a**"
      - "**I want to**"
      - "**So that**"
      - "## Acceptance Criteria"
    recommended:
      - "## Non-Functional Requirements"
      - "## Notes"
      - "## Dependencies"
    antipatterns:
      - "TODO"
      - "[TBD]"
      - "as a user"  # Too vague
      - "I want to click"  # Implementation detail
    quality_indicators:
      - "persona"
      - "business value"
      - "testable criteria"

  adr:
    required:
      - "## Status"
      - "## Context"
      - "## Decision"
      - "## Consequences"
    recommended:
      - "## Alternatives Considered"
      - "## Related Decisions"
      - "## References"
    antipatterns:
      - "TODO"
      - "[TBD]"
      - "Status: Draft"  # Should be Proposed, Accepted, etc.
      - "we decided"  # Missing rationale
    quality_indicators:
      - "trade-offs"
      - "alternatives"
      - "rationale"

  test_plan:
    required:
      - "## Scope"
      - "## Approach"
      - "## Test Cases"
    recommended:
      - "## Test Environment"
      - "## Entry/Exit Criteria"
      - "## Risk Areas"
      - "## Coverage Targets"
    antipatterns:
      - "TODO"
      - "[TBD]"
      - "test everything"  # Too vague
    quality_indicators:
      - "coverage"
      - "priority"
      - "automation"

  threat_model:
    required:
      - "## Assets"
      - "## Threats"
      - "## Mitigations"
    recommended:
      - "## Trust Boundaries"
      - "## Attack Vectors"
      - "## Risk Assessment"
      - "## Security Controls"
    antipatterns:
      - "TODO"
      - "[TBD]"
      - "no threats identified"  # Suspicious
    quality_indicators:
      - "STRIDE"
      - "DREAD"
      - "severity"
      - "likelihood"

  risk_entry:
    required:
      - "## Risk ID"
      - "## Description"
      - "## Impact"
      - "## Probability"
      - "## Mitigation Strategy"
    recommended:
      - "## Owner"
      - "## Status"
      - "## Monitoring"
      - "## Contingency"
    antipatterns:
      - "TODO"
      - "[TBD]"
      - "low risk"  # Without justification
    quality_indicators:
      - "quantified impact"
      - "specific mitigation"
      - "assigned owner"

# Quality score schema
quality_score:
  type: object
  required:
    - overall
    - breakdown
  properties:
    overall:
      type: integer
      minimum: 0
      maximum: 100
    rating:
      type: string
      enum: [excellent, good, acceptable, needs-work]
    breakdown:
      type: object
      properties:
        required_score:
          type: number
          description: "% of required sections present"
        recommended_score:
          type: number
          description: "% of recommended sections present"
        antipattern_penalty:
          type: number
          description: "Deduction for antipatterns found"
        domain_alignment:
          type: number
          description: "Alignment with domain patterns"
    issues:
      type: array
      items:
        type: object
        properties:
          type:
            type: string
            enum: [missing_required, missing_recommended, antipattern, quality_issue]
          description:
            type: string
          severity:
            type: string
            enum: [high, medium, low]
          location:
            type: string
    suggestions:
      type: array
      items:
        type: string
      description: "Improvement recommendations"
    antipatterns_found:
      type: array
      items:
        type: string
    quality_indicators_present:
      type: array
      items:
        type: string

# Score history schema
score_history:
  type: object
  properties:
    artifact_path:
      type: string
    scores:
      type: array
      items:
        type: object
        properties:
          timestamp:
            type: string
            format: date-time
          version:
            type: string
          score:
            type: integer
          rating:
            type: string
          issues_count:
            type: integer
    trend:
      type: string
      enum: [improving, stable, degrading]
    best_score:
      type: integer
    best_version:
      type: string

# Agent protocol
agent_protocol:
  score_artifact:
    description: "Calculate quality score for artifact"
    steps:
      - load_artifact_content
      - determine_artifact_type
      - load_patterns_for_type
      - check_required_sections
      - check_recommended_sections
      - detect_antipatterns
      - check_quality_indicators
      - calculate_weighted_score
      - determine_rating
      - generate_suggestions
      - return_quality_score

  score_batch:
    description: "Score multiple artifacts"
    steps:
      - list_artifacts_in_scope
      - for_each_artifact:
          - score_artifact
          - record_result
      - aggregate_results
      - identify_lowest_scores
      - return_batch_report

  quality_gate:
    description: "Enforce quality threshold for phase transition"
    triggers:
      - phase_transition_requested
    steps:
      - identify_phase_artifacts
      - score_all_artifacts
      - check_against_threshold
      - if_below_threshold:
          - block_transition
          - report_failing_artifacts
      - else:
          - allow_transition

  track_history:
    description: "Track score changes over time"
    triggers:
      - artifact_scored
    steps:
      - load_existing_history
      - append_new_score
      - calculate_trend
      - persist_history

# CLI commands
cli_commands:
  artifact_score:
    command: "aiwg artifact score <path>"
    description: "Score artifact quality"
    options:
      - name: "--verbose"
        description: "Show detailed breakdown"
      - name: "--json"
        description: "Output as JSON"

  artifact_score_batch:
    command: "aiwg artifact score --all"
    description: "Score all artifacts"
    options:
      - name: "--phase"
        description: "Filter by SDLC phase"
      - name: "--below"
        description: "Only show scores below threshold"

  quality_report:
    command: "aiwg quality report"
    description: "Generate quality report"
    options:
      - name: "--trend"
        description: "Include historical trends"

  quality_gate:
    command: "aiwg phase transition <phase> --require-quality <score>"
    description: "Transition with quality gate"

# Report template
report_template: |
  # Quality Score Report

  **Artifact:** {path}
  **Type:** {artifact_type}
  **Score:** {overall}/100 ({rating})

  ## Breakdown

  | Category | Score | Weight | Weighted |
  |----------|-------|--------|----------|
  | Required Sections | {required_score}% | 40% | {required_weighted} |
  | Recommended Sections | {recommended_score}% | 25% | {recommended_weighted} |
  | Anti-pattern Penalty | -{antipattern_penalty} | 20% | -{antipattern_weighted} |
  | Domain Alignment | {domain_score}% | 15% | {domain_weighted} |

  ## Issues Found

  {issues_list}

  ## Suggestions

  {suggestions_list}

  ## Quality Indicators

  Present: {quality_indicators_present}
  Missing: {quality_indicators_missing}

# Storage
storage:
  patterns_path: ".aiwg/quality/patterns/"
  references_path: ".aiwg/quality/references/"
  history_path: ".aiwg/quality/history/"
  reports_path: ".aiwg/reports/quality/"

# Research targets (from REF-019)
research_targets:
  quality_signal: "Self-supervised quality without human labels"
  pattern_based: "Lightweight alternative to full perplexity"
  improvement_feedback: "Actionable suggestions for artifacts"

# References
references:
  research:
    - "@.aiwg/research/findings/REF-019-toolformer.md"
  implementation:
    - "#192"
  related:
    - "@.claude/rules/best-output-selection.md"
    - "@agentic/code/frameworks/sdlc-complete/schemas/flows/sdlc-output-schemas.yaml"
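
The `score_artifact` steps in `agent_protocol` can be illustrated with a minimal Python sketch. This is not the aiwg implementation. It assumes things the schema does not specify: patterns are matched as case-insensitive substrings, each antipattern hit deducts a hypothetical 25 points (`penalty_per_hit`), the antipattern weight is applied to `100 - penalty` so that a clean artifact can reach 100, and domain alignment is approximated by the share of `quality_indicators` present. The function names, `ADR_PATTERNS`, and the sample text are illustrative only; the weights, thresholds, and ADR patterns are copied from the defaults above.

```python
# Illustrative sketch only; formula details are assumptions, not aiwg's code.

# Default weights and rating thresholds copied from the schema defaults.
WEIGHTS = {
    "required_sections": 0.40,
    "recommended_sections": 0.25,
    "antipattern_penalty": 0.20,
    "domain_alignment": 0.15,
}
THRESHOLDS = [("excellent", 90), ("good", 75), ("acceptable", 60), ("needs-work", 0)]

# The "adr" entry from default_patterns.
ADR_PATTERNS = {
    "required": ["## Status", "## Context", "## Decision", "## Consequences"],
    "recommended": ["## Alternatives Considered", "## Related Decisions", "## References"],
    "antipatterns": ["TODO", "[TBD]", "Status: Draft", "we decided"],
    "quality_indicators": ["trade-offs", "alternatives", "rationale"],
}


def percent_present(patterns, text):
    """Percentage of patterns found in text (assumed: case-insensitive substring match)."""
    if not patterns:
        return 100.0
    lowered = text.lower()
    hits = sum(1 for p in patterns if p.lower() in lowered)
    return 100.0 * hits / len(patterns)


def score_artifact(text, patterns, penalty_per_hit=25.0):
    """Return a dict shaped like quality_score (overall, rating, breakdown)."""
    lowered = text.lower()
    found = [p for p in patterns["antipatterns"] if p.lower() in lowered]
    penalty = min(100.0, penalty_per_hit * len(found))  # hypothetical deduction rule

    breakdown = {
        "required_score": percent_present(patterns["required"], text),
        "recommended_score": percent_present(patterns["recommended"], text),
        "antipattern_penalty": penalty,
        "domain_alignment": percent_present(patterns.get("quality_indicators", []), text),
    }
    # Assumed combination: the antipattern weight rewards the absence of antipatterns.
    overall = round(
        WEIGHTS["required_sections"] * breakdown["required_score"]
        + WEIGHTS["recommended_sections"] * breakdown["recommended_score"]
        + WEIGHTS["antipattern_penalty"] * (100.0 - penalty)
        + WEIGHTS["domain_alignment"] * breakdown["domain_alignment"]
    )
    # Thresholds are listed from highest to lowest, so the first match is the rating.
    rating = next(name for name, floor in THRESHOLDS if overall >= floor)
    return {
        "overall": overall,
        "rating": rating,
        "breakdown": breakdown,
        "antipatterns_found": found,
    }


if __name__ == "__main__":
    draft = "## Status\nAccepted\n## Context\nTODO\n## Decision\nUse SQLite.\n## Consequences\nNone yet.\n"
    print(score_artifact(draft, ADR_PATTERNS))
```

Under these assumptions the sample draft has every required section but no recommended sections, one antipattern (`TODO`), and no quality indicators, so it lands in the `needs-work` band. A different reading of the report template, in which the weighted penalty is subtracted outright, would cap a clean artifact at 80, which is why the sketch uses `100 - penalty` instead.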