# Module Knowledge Files (.module-knowledge.ttl)

## Overview

The AASWE RDF Generator automatically creates `.module-knowledge.ttl` files that contain semantic representations of your code modules. These files serve two purposes:

1. **Neo4j Graph Database Ingestion** - Structured data for knowledge graph storage
2. **LLM Enhancement** - Human-readable context for AI-assisted development

## File Generation

### Automatic Generation

Module knowledge files are generated in the following situations:

1. **AST Analysis Completes** - After parsing source code files
2. **Code Changes Detected** - Through file watchers and Git hooks
3. **Batch Processing** - During project-wide analysis
4. **Manual Triggers** - Via CLI commands or API calls

### File Location

Files are created in the same directory as the source code:

```
src/
├── components/
│   ├── UserService.ts
│   └── .module-knowledge.ttl   # Generated for UserService.ts
├── utils/
│   ├── helpers.ts
│   └── .module-knowledge.ttl   # Generated for helpers.ts
└── index.ts
```

### Generation Process

```typescript
// Example: How files are generated
import { RDFService } from '@aide/rdf-generator';

const rdfService = new RDFService();

// 1. Analyze source code
// (astAnalyzer is assumed to be an AST analyzer instance created elsewhere)
const astResult = await astAnalyzer.analyze('src/UserService.ts');

// 2. Generate RDF knowledge
const rdfResult = await rdfService.generateRDF(
  astResult,
  'src/UserService.ts'
);

// 3. File automatically written to: src/.module-knowledge.ttl
```

## File Structure

### Header Section

```turtle
# Metadata:
# Version: 1.0.0
# Generated: 2025-08-06T16:43:44.193Z
# Source: src/UserService.ts
# Checksum: ad666b1832f1215b
#
# Module Knowledge Graph
# Generated by AASWE RDF Generator
# Optimized for both Neo4j ingestion and LLM consumption
#
# This file contains concrete code structure information extracted from AST analysis
# and includes placeholders for business context enhancement by developers.
#
# Instructions for developers:
# 1. Replace [PLACEHOLDER] values with actual business context
# 2. Add domain-specific knowledge in the business context sections
# 3. Enhance method and class descriptions with business purpose
# 4. Maintain RDF syntax when making manual edits
#
```

### Namespace Declarations

```turtle
@prefix code: <https://aaswe.ai/ontology/code#>.
@prefix module: <https://aaswe.ai/ontology/module#>.
@prefix arch: <https://aaswe.ai/ontology/architecture#>.
@prefix business: <https://aaswe.ai/ontology/business#>.
@prefix quality: <https://aaswe.ai/ontology/quality#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
```

### Module Information

```turtle
module:UserService a code:Module ;
    code:name "UserService" ;
    code:language "typescript" ;
    code:version "1.0.0" ;
    code:hasClass code:UserService.UserManager ;
    code:hasFunction code:UserService.validateUser ;
    code:dependsOn code:dependency_lodash .
```

### Business Context Placeholders

```turtle
module:UserService
    business:belongsToDomain "[BUSINESS_DOMAIN] - Replace with actual domain" ;
    business:hasBusinessRules "[BUSINESS_RULES] - Define business rules here" ;
    business:supportsUseCases "[USE_CASES] - List supported use cases" .
```

## Developer Workflow

### 1. Initial Generation

When you first run the RDF generator on your codebase:

```bash
# Generate knowledge files for entire project
npx aide generate-knowledge --path ./src

# Generate for specific file
npx aide generate-knowledge --file ./src/UserService.ts
```

### 2. Review Generated File

Open the generated `.module-knowledge.ttl` file and review:

- ✅ **Automatically extracted code structure** (classes, methods, dependencies)
- ⚠️ **Business context placeholders** (need manual enhancement)
- ✅ **Quality metrics** (complexity, lines of code)

### 3. Enhance Business Context

Replace placeholder values with actual business information:

#### Before (Generated):

```turtle
module:UserService
    business:belongsToDomain "[BUSINESS_DOMAIN] - Replace with actual domain" ;
    business:hasBusinessRules "[BUSINESS_RULES] - Define business rules here" ;
    business:supportsUseCases "[USE_CASES] - List supported use cases" .
```

#### After (Developer Enhanced):

```turtle
module:UserService
    business:belongsToDomain "User Management - Handles user authentication, authorization, and profile management" ;
    business:hasBusinessRules "Users must verify email before activation; Password must meet complexity requirements; Failed login attempts trigger account lockout" ;
    business:supportsUseCases "User Registration; User Login; Password Reset; Profile Updates; Account Deactivation" .
```

### 4. Add Method-Level Business Context

Enhance individual methods with business purpose:

#### Before:

```turtle
code:UserService.validateUser a code:Method ;
    code:name "validateUser" ;
    code:signature "validateUser(email: string, password: string): Promise<boolean>" .
```

#### After:

```turtle
code:UserService.validateUser a code:Method ;
    code:name "validateUser" ;
    code:signature "validateUser(email: string, password: string): Promise<boolean>" ;
    code:summary "Validates user credentials against database and security policies" ;
    code:description "Performs multi-step validation including email format check, password complexity verification, account status validation, and rate limiting for security" ;
    business:implementsRule "Password complexity must meet corporate security standards" ;
    business:supportsUseCase "Secure User Authentication" .
```

### 5. Maintain During Development

The system automatically updates the code structure but preserves your business context:

```typescript
// When you modify UserService.ts, the system:
// 1. Detects file changes
// 2. Re-analyzes code structure
// 3. Updates technical triples (classes, methods, etc.)
// 4. PRESERVES your business context enhancements
// 5. Merges old business context with new technical structure
```

## Best Practices

### ✅ DO

1. **Enhance Business Context Early**

   ```turtle
   # Add meaningful business descriptions
   business:belongsToDomain "E-commerce Order Processing" ;
   business:hasBusinessRules "Orders require payment validation before fulfillment" ;
   ```

2. **Use Consistent Terminology**

   ```turtle
   # Use domain-specific language consistently
   business:belongsToDomain "Customer Relationship Management" ;
   # Not: "CRM stuff" or "user things"
   ```

3. **Document Business Rules Clearly**

   ```turtle
   # Multi-line literals need triple quotes in Turtle
   business:hasBusinessRules """
       - Orders over $500 require manager approval
       - International orders need customs documentation
       - Refunds must be processed within 30 days
   """ ;
   ```

4. **Link to Use Cases**

   ```turtle
   business:supportsUseCases """
       - Customer places order
       - Payment processing
       - Inventory allocation
       - Shipping coordination
   """ ;
   ```

5. **Add Quality Attributes**

   ```turtle
   business:satisfiesAttribute """
       - Performance: Order processing < 2 seconds
       - Reliability: 99.9% uptime requirement
       - Security: PCI DSS compliance for payments
   """ ;
   ```

### ❌ DON'T

1. **Don't Modify Technical Triples**

   ```turtle
   # DON'T manually edit these - they're auto-generated
   code:name "UserService" ;              # ❌ Will be overwritten
   code:hasMethod code:validateUser ;     # ❌ Will be overwritten
   ```

2. **Don't Break RDF Syntax**

   ```turtle
   # ❌ Invalid syntax
   business:belongsToDomain Missing quotes and semicolon

   # ✅ Valid syntax
   business:belongsToDomain "Proper quotes and semicolon" ;
   ```

3. **Don't Leave Placeholder Text**

   ```turtle
   # ❌ Placeholder left unreplaced
   business:belongsToDomain "[BUSINESS_DOMAIN] - Replace with actual domain" ;

   # ✅ Replaced with real content
   business:belongsToDomain "User Authentication and Authorization" ;
   ```

## Complete Property Reference

### ✅ SAFE TO EDIT - Business & Documentation Properties

#### **Module-Level Business Context Properties:**

```turtle
# Business domain and purpose
business:belongsToDomain "E-commerce Order Processing" ;
business:hasBusinessRules "Orders require payment validation before fulfillment" ;
business:supportsUseCases "Customer places order; Payment processing; Order fulfillment" ;
business:satisfiesAttribute "Performance: < 2 seconds; Reliability: 99.9% uptime" ;
business:subjectToConstraint "PCI DSS compliance; GDPR data protection" ;
business:hasStakeholder "Customers; Sales Team; Fulfillment Team" ;
```

#### **Method/Class Documentation Properties:**

```turtle
# Documentation and examples
code:summary "Validates user credentials against security policies" ;
code:description "Multi-step validation with rate limiting and account lockout protection" ;
code:example "validateUser('user@example.com', 'password123')" ;
code:seeAlso "https://docs.company.com/auth-guide" ;
code:deprecated "Use validateUserV2 instead - deprecated in v2.0" ;
```

#### **Business Rule Implementation Properties:**

```turtle
# Link code to business requirements
business:implementsRule "Corporate password complexity standards" ;
business:supportsUseCase "Secure User Authentication" ;
```

#### **Custom Domain Properties:**

```turtle
# Add your own domain-specific properties
@prefix ecommerce: <https://yourcompany.com/ontology/ecommerce#> .

ecommerce:handlesPaymentTypes "Credit Card, PayPal, Bank Transfer" ;
ecommerce:supportsCurrencies "USD, EUR, GBP" ;
```

### ❌ NEVER EDIT - Auto-Generated Properties

#### **Core Structure Properties (Auto-Generated from Code):**

```turtle
# These are automatically extracted from your source code
code:name "UserService" ;                       # ❌ Class/method names
code:signature "validateUser(email, pass)" ;    # ❌ Method signatures
code:language "typescript" ;                    # ❌ Programming language
code:version "1.0.0" ;                          # ❌ Module version
code:hasClass code:UserService.User ;           # ❌ Contains classes
code:hasMethod code:User.authenticate ;         # ❌ Contains methods
code:hasProperty code:User.email ;              # ❌ Contains properties
code:extends code:BaseService ;                 # ❌ Class inheritance
code:implements code:IUserService ;             # ❌ Interface implementations
code:dependsOn code:dependency_lodash ;         # ❌ Dependencies
```

#### **Technical Metadata Properties:**

```turtle
# Technical details extracted during analysis
code:fullyQualifiedName "UserService.User.authenticate" ;   # ❌ Full names
code:visibility "public" ;                                  # ❌ public/private/protected
code:isStatic false ;                                       # ❌ Static method flag
code:isAsync true ;                                         # ❌ Async method flag
code:isAbstract false ;                                     # ❌ Abstract class flag
code:isOptional false ;                                     # ❌ Optional parameter flag
code:type "string" ;                                        # ❌ Parameter/property types
code:returnType "Promise<boolean>" ;                        # ❌ Method return types
```

#### **Source Location Properties:**

```turtle
# File location information
code:sourceFile "/src/UserService.ts" ;   # ❌ Source file path
code:startLine 45 ;                       # ❌ Starting line number
code:endLine 67 ;                         # ❌ Ending line number
code:startColumn 2 ;                      # ❌ Starting column
code:endColumn 4 ;                        # ❌ Ending column
```

#### **Quality Metrics Properties (Auto-Calculated):**

```turtle
# Automatically calculated code quality metrics
quality:cyclomaticComplexity 5 ;      # ❌ Complexity metrics
quality:cognitiveComplexity 3 ;       # ❌ Cognitive complexity
quality:linesOfCode 23 ;              # ❌ Lines of code count
quality:maintainabilityIndex 78 ;     # ❌ Maintainability score
```

#### **File Metadata Properties:**

```turtle
# System-generated metadata
code:generatedAt "2025-08-06T19:27:09Z" ;   # ❌ Generation timestamp
code:checksum "ad666b1832f1215b" ;          # ❌ File checksum
code:createdAt "2025-08-06T19:27:09Z" ;     # ❌ Creation timestamp
code:modifiedAt "2025-08-06T19:27:09Z" ;    # ❌ Last modification
```

### 🔄 **What Happens When You Edit Auto-Generated Properties**

If you accidentally edit auto-generated properties:

1. **Next Code Update**: Your changes will be **overwritten** when the system re-analyzes your code
2. **Validation Errors**: The system may detect inconsistencies and warn you
3. **Data Loss**: Manual edits to technical properties are **not preserved**

### 🚨 **Critical Rule**

**ONLY edit properties that start with:**

- `business:*` (business context)
- `code:summary` (method/class summaries)
- `code:description` (detailed descriptions)
- `code:example` (usage examples)
- `code:seeAlso` (references)
- `code:deprecated` (deprecation notes)
- Custom namespace properties (e.g., `yourcompany:*`)

**Everything else is auto-generated and will be overwritten!**

## Integration with Development Tools

### Git Hooks

Automatically update knowledge files on commits:

```bash
#!/bin/bash
# .git/hooks/pre-commit
npx aide update-knowledge --changed-files
```

### CI/CD Pipeline

```yaml
# .github/workflows/knowledge-update.yml
name: Update Knowledge Files
on: [push, pull_request]
jobs:
  update-knowledge:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Update Module Knowledge
        run: npx aide update-knowledge --validate
```

### IDE Integration

```json
// .vscode/tasks.json
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Update Knowledge File",
      "type": "shell",
      "command": "npx aide update-knowledge --file ${file}",
      "group": "build"
    }
  ]
}
```

## Validation and Quality

### Automatic Validation

The system validates your enhancements:

```bash
# Validate all knowledge files
npx aide validate-knowledge

# Validate specific file
npx aide validate-knowledge --file .module-knowledge.ttl
```

### Quality Checks

- ✅ **RDF Syntax Validation** - Ensures proper Turtle format
- ✅ **Ontology Compliance** - Validates against AASWE schema
- ✅ **Business Context Completeness** - Checks for placeholder removal
- ✅ **URI Format Validation** - Ensures valid resource identifiers

### Error Examples

```text
# Common validation errors and fixes

❌ Error: "Invalid RDF syntax on line 45"
   Fix: Check for missing semicolons, quotes, or periods

❌ Error: "Missing required property: business:belongsToDomain"
   Fix: Add business domain information

❌ Error: "Placeholder text detected: [BUSINESS_DOMAIN]"
   Fix: Replace placeholder with actual business context
```

## Advanced Usage

### Custom Business Properties

Add domain-specific properties:

```turtle
@prefix ecommerce: <https://yourcompany.com/ontology/ecommerce#> .

module:OrderService
    ecommerce:handlesPaymentTypes "Credit Card, PayPal, Bank Transfer" ;
    ecommerce:supportsCurrencies "USD, EUR, GBP" ;
    ecommerce:integratesWith "Stripe, PayPal, Square" .
```

### Linking Modules

Connect related modules:

```turtle
module:UserService
    arch:dependsOn module:DatabaseService ;
    arch:providesServiceTo module:OrderService ;
    arch:followsPattern "Repository Pattern" .
```

### Performance Annotations

Add performance characteristics:

```turtle
code:UserService.validateUser
    quality:expectedResponseTime "< 100ms" ;
    quality:cacheable true ;
    quality:rateLimited "1000 requests/minute" .
```

## Troubleshooting

### Common Issues

1. **File Not Generated**
   - Check that the source file is in a supported language (TS, JS, Python, Java, Go, Rust, C++)
   - Verify file permissions in the target directory
   - Check the AST analyzer logs for parsing errors

2. **Business Context Lost After Update**
   - Ensure proper RDF syntax in your enhancements
   - Check for syntax errors that prevent parsing
   - Verify backup files in `.aide/backups/`

3. **Validation Errors**
   - Run `npx aide validate-knowledge --verbose` for detailed errors
   - Check the ontology documentation for required properties
   - Validate RDF syntax using online TTL validators

### Recovery

```bash
# Restore from backup
npx aide restore-knowledge --from-backup

# Force regeneration (loses business context)
npx aide generate-knowledge --force --file UserService.ts

# Merge business context from backup
npx aide merge-knowledge --source backup.ttl --target .module-knowledge.ttl
```

## Summary

Module knowledge files are living documents that combine:

- 🤖 **Automated code analysis** (structure, metrics, dependencies)
- 👨‍💻 **Developer business context** (domain knowledge, rules, use cases)
- 🔄 **Continuous synchronization** (code changes update technical aspects)
- 🛡️ **Context preservation** (business enhancements are maintained)

This creates a comprehensive knowledge base that enhances both AI assistance and team understanding of your codebase.
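
The "Business Context Completeness" check mentioned under Quality Checks can be approximated locally before running the validator. The sketch below is only an illustration and is not part of the AASWE tooling: `findPlaceholders` and `PlaceholderHit` are hypothetical names, and it assumes placeholders always look like `[UPPER_CASE_TOKEN]`, as in the generated examples. Comment lines are skipped so that the instruction text in the generated header (which itself mentions `[PLACEHOLDER]`) is not reported.

```typescript
// Hypothetical helper for illustration; not part of the AASWE CLI.
interface PlaceholderHit {
  line: number;  // 1-based line number within the .ttl text
  token: string; // the placeholder found, e.g. "[BUSINESS_DOMAIN]"
}

// Scans Turtle text for unreplaced placeholders such as [BUSINESS_DOMAIN].
function findPlaceholders(ttl: string): PlaceholderHit[] {
  const found: PlaceholderHit[] = [];
  const pattern = /\[[A-Z][A-Z_]*\]/g;

  ttl.split(/\r?\n/).forEach((text, index) => {
    // Lines that start with '#' are Turtle comments (e.g. the generated header).
    if (/^\s*#/.test(text)) {
      return;
    }
    for (const token of text.match(pattern) || []) {
      found.push({ line: index + 1, token });
    }
  });

  return found;
}

// Example input modelled on the generated placeholders shown earlier.
const sample = [
  "# 1. Replace [PLACEHOLDER] values with actual business context",
  "module:UserService",
  '    business:belongsToDomain "[BUSINESS_DOMAIN] - Replace with actual domain" ;',
  '    business:supportsUseCases "User Registration; User Login" .',
].join("\n");

for (const hit of findPlaceholders(sample)) {
  console.log(`line ${hit.line}: ${hit.token} still needs real business context`);
}
```

For the sample above, only the `[BUSINESS_DOMAIN]` entry on line 3 is reported. A real project would read each `.module-knowledge.ttl` from disk and would still rely on `npx aide validate-knowledge` for syntax and ontology checks.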