arela

# Arela Codebase Overview & Architecture ## Executive Summary Arela v3.7.0 is an **AI-powered CTO platform** currently in Phase 1 (Foundation) with a vision to transform into a complete codebase analysis and autonomous refactoring system. The codebase is well-structured with clear separation of concerns, using TypeScript, SQLite for graph storage, and a modular architecture supporting multi-agent orchestration. **Current Status:** Phase 1 Complete ✅ | Phase 2 (v3.8.0) In Progress - Slice Detection --- ## 1. Overall Structure of src/ Directory ``` src/ ├── analyze/ # Phase 1 - Architecture analysis (multi-repo, coupling, cohesion) │ ├── architecture.ts # Main architecture analyzer │ ├── coupling.ts # Coupling score calculation │ ├── cohesion.ts # Cohesion score calculation │ ├── multi-repo.ts # Cross-repo analysis │ ├── patterns.ts # Pattern detection │ ├── types.ts # Architecture types & interfaces │ └── reporter.ts # Output formatting │ ├── ingest/ # Phase 1 - Codebase ingestion (files → graph) │ ├── index.ts # Main orchestrator │ ├── types.ts # Data structures for ingestion │ ├── file-scanner.ts # Find files in repo │ ├── static-analyzer.ts # TypeScript AST analysis │ ├── universal-analyzer.ts # Language-agnostic regex parsing │ ├── graph-builder.ts # Build dependency graph │ └── storage.ts # SQLite GraphDB interface │ ├── memory/ # Phase 1 - Tri-Memory System │ ├── index.ts # Unified memory interface │ ├── types.ts # Memory data structures │ ├── graph.ts # Graph memory (queries dependency graph) │ ├── vector.ts # Vector memory (RAG wrapper) │ ├── audit.ts # Governance log (SQLite) │ └── cli.ts # Memory CLI commands │ ├── agents/ # Agent orchestration & discovery │ ├── discovery.ts # Find installed AI agents │ ├── orchestrate.ts # Run tickets with multi-agent support │ ├── dispatch.ts # Route work to appropriate agent │ └── status.ts # Show ticket execution status │ ├── flow/ # Flow analysis & tracing (v3.5.0) │ ├── analyzer.ts # Analyze function call flows │ ├── tracer.ts # Trace execution flows │ ├── standards.ts # Flow patterns │ └── reporter.ts # Flow analysis reporting │ ├── analyze/ # Old analysis module (legacy) │ └── [files from Phase 1] │ ├── tickets/ # Ticket parsing & management │ ├── parser.ts # Parse .md/.yaml tickets │ ├── schema.ts # Ticket validation │ └── auto-generate.ts # Generate tickets from prompts │ ├── rag/ # Semantic search (vector DB wrapper) │ ├── index.ts # Main RAG interface │ └── server.ts # RAG server │ ├── utils/ # Utilities │ ├── progress.ts # Progress bars │ ├── schema.ts # Zod schemas │ ├── cli-personality.ts # CLI messaging styles │ ├── structure-validator.ts # Project validation │ ├── auto-index.ts # Incremental RAG indexing │ └── ragignore.ts # RAG ignore patterns │ ├── mcp/ # Model Context Protocol (Windsurf integration) │ └── server.ts # MCP server implementation │ ├── persona/ # AI persona/rules system │ ├── loader.ts # Load persona rules │ └── templates/ # Rule templates │ ├── run/ # Test execution (web & mobile) │ ├── web.ts # Web app testing │ ├── mobile.ts # Mobile app testing │ ├── flows.ts # User flow definitions │ ├── reporter.ts # Test reporting │ └── pilot.ts # AI pilot (autonomous testing) │ ├── generate/ # Code generation │ └── flow-generator.ts # Generate test flows from AI │ ├── cli.ts # Main CLI (Commander.js) ├── index.ts # Programmatic exports └── types.ts # Core type definitions ``` ### Directory Organization Summary | Directory | Purpose | Phase | Status | |-----------|---------|-------|--------| | `ingest/` | Parse files → Graph DB | Phase 1 | ✅ Complete | | `memory/` | Tri-Memory System (Vector/Graph/Audit) | Phase 1 | ✅ Complete | | `analyze/` | Architecture analysis (coupling/cohesion) | Phase 1 | ✅ Complete | | `flow/` | Function call flow analysis | Phase 1 | ✅ Complete | | `agents/` | Multi-agent orchestration | Phase 1 | ✅ Complete | | `run/` | Test execution (web/mobile) | Phase 1 | ✅ Complete | | `detect/` | **Slice detection (NEW)** | Phase 2 | 🚧 To Build | --- ## 2. Graph DB Implementation (SQLite) ### Current Status ✅ **Fully Implemented** in `src/ingest/storage.ts` ### Database Location ``` .arela/memory/graph.db # SQLite database (WAL mode) ``` ### Database Schema (8 Tables) ```sql 1. files - id, path, repo, type, lines - index: repo, type - tracks: every file in the codebase 2. functions - id, name, file_id, line_start, line_end, is_exported - index: file_id, name - tracks: function definitions 3. imports - id, from_file_id, to_file_id, to_module, import_type, imported_names, line_number - index: from_file_id, to_file_id - tracks: import relationships 4. function_calls - id, caller_function_id, callee_function_id, callee_name, line_number - index: caller_function_id, callee_function_id - tracks: function→function calls 5. api_endpoints - id, method, path, file_id, function_id, line_number - index: file_id, method+path - tracks: REST endpoints defined 6. api_calls - id, method, url, file_id, line_number - index: file_id, method+url - tracks: API calls made 7. audit_log (governance) - id, timestamp, agent, action, result, metadata - tracks: decision audit trail 8. vector_index (wrapped by vector.ts) - Semantic embeddings for RAG ``` ### GraphDB Class Interface ```typescript class GraphDB { // Initialization constructor(dbPath: string) private initSchema() // File operations addFile(file: FileNode): number getFileId(filePath: string): number | null // Function operations addFunction(fileId: number, func: FunctionNode): number getFunctionId(fileId: number, functionName: string): number | null // Dependency tracking addImport(fromFileId, toFileId, toModule, importType, names, line) addFunctionCall(callerFunctionId, calleeId, calleeName, line) // API tracking addApiEndpoint(endpoint: ApiEndpoint) addApiCall(fileId: number, call: ApiCall) // Querying query(sql: string, params: any[]): any[] getSummary(): { filesCount, functionsCount, importsCount, ... } // Transactions beginTransaction() commit() rollback() // Utilities close() clear() } ``` ### Example Queries Available ```typescript // Find files that depend on a file SELECT f.path FROM imports i JOIN files f ON f.id = i.from_file_id WHERE i.to_file_id = ? // Find functions called by a function SELECT f.name, f2.path FROM function_calls fc JOIN functions f ON f.id = fc.caller_function_id JOIN functions f2 ON f2.id = fc.callee_function_id WHERE fc.caller_function_id = ? // Find API endpoints in a file SELECT method, path FROM api_endpoints WHERE file_id = ? ``` ### Data Types Stored ```typescript // From src/ingest/types.ts FileNode { path: string // relative path repoPath: string // absolute repo path type: FileType // component|service|controller|util|hook|type|config|other lines: number imports?: ImportInfo[] exports?: ExportInfo[] functions?: FunctionNode[] apiCalls?: ApiCall[] } ImportInfo { from: string // module/file names: string[] // [imports] type: 'default'|'named'|'namespace' line: number } FunctionNode { name: string fileId: number lineStart: number lineEnd: number isExported: boolean calls?: number[] // IDs of called functions calledBy?: number[] } ApiCall { method: string // GET, POST, etc. url: string line: number } ``` --- ## 3. CLI Structure & Command Registration ### Framework - **Library:** `commander.js` v12.0.0 - **File:** `src/cli.ts` (700+ lines) - **Pattern:** Commander.js program with subcommands ### CLI Commands Registered ``` arela agents # List discovered AI agents arela init # Initialize project setup arela orchestrate # Run tickets with multi-agent arela run [web|mobile] # Test execution arela status # Show ticket status arela doctor # Validate project structure arela index # Build RAG index arela auto-index # Incremental RAG update arela install-hook # Install git auto-index hook arela uninstall-hook # Remove git hook arela analyze [flow|architecture] # Code analysis arela ingest [codebase] # Ingest files → graph arela mcp # Start MCP server arela memory [init|query|impact|audit|status] # Tri-Memory commands ``` ### Command Registration Pattern ```typescript // Example from cli.ts program .command("agents") .description("Discover and list available AI agents") .option("--verbose", "Show detailed information", false) .action(async (opts) => { const agents = await discoverAgents(); // Handle command }); // For complex commands with dynamic handlers: export class UnsupportedPlatformError extends Error { constructor(public readonly platform: string) { super(`Platform "${platform}" not supported.`); } } export function buildRunCommand( programInstance: Command, handler: RunCommandHandler = handleRunCommand ): Command { return programInstance .command("run") .description("Run and test your app") .argument("<platform>", "Platform: web or mobile") .option("--url <url>", "URL for web apps") .action(async (platformArg, opts) => { try { await handler(platformArg, opts); } catch (error) { if (error instanceof UnsupportedPlatformError) { console.error(error.message); process.exit(1); } } }); } ``` ### Programmatic Usage All CLI commands are exported for programmatic use: ```typescript // Exported from src/index.ts export * from './agents/discovery.js' export * from './agents/orchestrate.js' export * from './tickets/parser.js' export * from './memory/index.js' ``` Usage example: ```typescript import { discoverAgents, orchestrate } from 'arela'; const agents = await discoverAgents(); await orchestrate({ cwd: process.cwd(), parallel: true, tickets: ['CODEX-001', 'CLAUDE-002'] }); ``` --- ## 4. Types & Interfaces ### Core Domain Types ```typescript // src/types.ts - Core types export type AgentName = 'codex' | 'claude' | 'deepseek' | 'ollama' | 'cascade' export type TicketPriority = 'low' | 'medium' | 'high' | 'critical' export type TicketComplexity = 'simple' | 'medium' | 'complex' export type TicketStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'blocked' export interface Ticket { id: string title: string description: string agent: AgentName priority: TicketPriority complexity: TicketComplexity status: TicketStatus estimatedTime?: string estimatedCost?: number dependencies?: string[] tags?: string[] files?: Array<{ path: string; action: 'create'|'modify'|'delete' }> acceptance?: Array<{ id: string; description: string; status: 'pending'|'passed'|'failed' }> } export interface DiscoveredAgent { name: string command: string available: boolean type: 'cloud' | 'local' | 'ide' version?: string } ``` ### Graph & Import Types ```typescript // src/ingest/types.ts export interface FileNode { path: string repoPath: string type: FileType // component|service|controller|util|hook|type|config|other lines: number imports?: ImportInfo[] exports?: ExportInfo[] functions?: FunctionNode[] apiCalls?: ApiCall[] } export interface ImportInfo { from: string names: string[] type: 'default' | 'named' | 'namespace' line: number } export interface FunctionNode { id?: number name: string fileId?: number filePath?: string isExported: boolean lineStart: number lineEnd: number calls?: number[] calledBy?: number[] } export interface FunctionCall { callerId: number calleeId: number line: number } export interface ApiEndpoint { method: string // GET, POST, PUT, DELETE path: string fileId: number functionId?: number line: number } export interface ApiCall { method: string url: string line: number filePath?: string } export interface CodebaseMap { summary: { filesScanned: number importsFound: number functionsDefined: number apiCallsFound: number } stats: { modules: number components: number services: number apiEndpoints: number } dbPath: string duration: number } ``` ### Memory System Types ```typescript // src/memory/types.ts export interface DependencyEdge { file: string reason: string // e.g., "imports, default export" weight: number // number of import relationships } export interface ImpactAnalysis { file: string exists: boolean upstream: DependencyEdge[] // files that import this downstream: DependencyEdge[] // files this imports fanIn: number // total upstream dependency weight fanOut: number // total downstream dependency weight } export interface GraphStats { ready: boolean files: number imports: number functions: number functionCalls: number apiEndpoints: number apiCalls: number dbPath: string } export interface AuditEntry { timestamp?: string agent: string action: string inputHash?: string outputHash?: string result: 'success' | 'failure' | 'pending' metadata?: Record<string, unknown> commitHash?: string ticketId?: string policyViolations?: unknown[] } export interface MemoryQueryResult { question: string semantic: SemanticResult[] // from vector DB (RAG) relatedFiles: string[] // from graph DB timestamp: string } export interface TriMemoryStats { vector: VectorStats graph: GraphStats audit: AuditStats } ``` ### Architecture Analysis Types ```typescript // src/analyze/types.ts export type ArchitectureType = 'horizontal' | 'vertical' | 'hybrid' export interface ArchitectureScore { horizontal: number // 0-100 vertical: number // 0-100 } export interface CouplingCohesionScores { coupling: number // 0-100 (lower is better) cohesion: number // 0-100 (higher is better) } export interface RepoAnalysis { name: string path: string architecture: ArchitectureType scores: ArchitectureScore metrics: CouplingCohesionScores directories: DirectoryAnalysis[] issues: ArchitectureIssue[] } export interface DirectoryAnalysis { path: string type: DirectoryType // layer|feature|module|other fileCount: number internalImports: number // within directory externalImports: number // from outside importedBy: number // reverse imports } export interface ArchitectureIssue { severity: 'critical' | 'warning' | 'info' title: string description: string affectedFiles?: number affectedDirs?: string[] recommendation?: string } export interface ArchitectureReport { timestamp: string repositories: RepoAnalysis[] overallArchitecture: ArchitectureType overallScores: ArchitectureScore globalMetrics: CouplingCohesionScores issues: ArchitectureIssue[] apiDrift: ApiDrift[] recommendations: string[] effort?: { estimated: string // "8-12 weeks" breakeven: string // "14 months" roi3Year: number // 380 for 380% } } ``` --- ## 5. Database System & Storage ### Primary Database: SQLite with better-sqlite3 **Package:** `better-sqlite3@^11.0.0` **Location:** `.arela/memory/graph.db` **Configuration:** ```typescript this.db.pragma("journal_mode = WAL"); // Write-Ahead Logging (faster) this.db.pragma("foreign_keys = ON"); // Enable foreign keys ``` ### Storage Layers ``` ┌─────────────────────────────────────────┐ │ Arela Memory System │ ├─────────────────────────────────────────┤ │ │ │ TriMemory (src/memory/index.ts) │ │ ├── VectorMemory → RAG index │ │ ├── GraphMemory → graph.db │ │ └── AuditMemory → audit.db │ │ │ ├─────────────────────────────────────────┤ │ │ │ .arela/memory/ │ │ ├── graph.db (SQLite) │ │ ├── audit.db (SQLite) │ │ └── vector/ (RAG embeddings) │ │ │ └─────────────────────────────────────────┘ ``` ### Ingest Pipeline ``` Codebase ↓ scanDirectory (find files) ↓ analyzeFile (universal-analyzer or static-analyzer) ├── Extract imports ├── Extract functions ├── Extract API calls └── Extract exports ↓ buildGraph (graph-builder.ts) ├── Add files to DB ├── Add functions to DB ├── Resolve imports ├── Resolve function calls └── Add API endpoints/calls ↓ GraphDB (storage.ts) └── .arela/memory/graph.db ``` ### Query Interface ```typescript // Example: Find all files that import a specific file const dependents = await queryGraph( repoPath, `SELECT f.path FROM imports i JOIN files f ON f.id = i.from_file_id WHERE i.to_file_id = (SELECT id FROM files WHERE path = ?)`, ['src/auth/login.ts'] ); // Example: Analyze impact of a file const impact = await triMemory.impact('src/auth/login.ts'); // Returns: { // upstream: [{file: 'x', weight: 3, reason: 'imports'}, ...], // downstream: [{file: 'y', weight: 2, reason: 'imports'}, ...], // fanIn: 5, fanOut: 2 // } ``` --- ## 6. Phase 1 Implementation Status ✅ ### Completed Features #### 1.1 Multi-Repo Architecture Analyzer - **File:** `src/analyze/architecture.ts` - **Status:** ✅ Complete - **Capabilities:** - Analyzes single & multiple repositories - Detects architecture type (horizontal/vertical/hybrid) - Calculates coupling/cohesion scores (0-100) - Identifies architectural issues - Recommends VSA migration path - Calculates effort/ROI estimates - **CLI:** ```bash arela analyze architecture /mobile /backend ``` #### 1.2 Universal Codebase Ingestion - **Files:** `src/ingest/*` - **Status:** ✅ Complete - **Capabilities:** - Supports 15+ programming languages - Language-agnostic regex parsing - Extracts imports, functions, API calls - Builds complete dependency graph - Stores in SQLite GraphDB - Fast: 3,585 files in 3.91s - **Languages:** JS/TS, Python, Go, Rust, Ruby, PHP, Java, C#, C++, Swift, Kotlin - **CLI:** ```bash arela ingest codebase arela ingest codebase --refresh --analyze ``` #### 1.3 Tri-Memory System - **Files:** `src/memory/*` - **Status:** ✅ Complete - **Three Types:** 1. **Vector Memory** - Semantic RAG search 2. **Graph Memory** - Structural dependencies 3. **Audit Memory** - Governance log - **CLI:** ```bash arela memory init arela memory query "authentication logic" arela memory impact src/auth/login.ts arela memory audit --ticket CODEX-001 arela memory status ``` #### 1.4 Flow Analysis - **Files:** `src/flow/*` - **Status:** ✅ Complete - **Capabilities:** - Analyze function call flows - Trace execution paths - Identify flow patterns - **CLI:** ```bash arela analyze flow main --verbose ``` #### 1.5 Agent Discovery & Orchestration - **Files:** `src/agents/*` - **Status:** ✅ Complete - **Capabilities:** - Discover installed AI agents (Claude, Codex, Ollama, etc.) - Route work to appropriate agent - Execute tickets with multi-agent orchestration - Show execution status - **CLI:** ```bash arela agents --verbose arela orchestrate --parallel --max-parallel 5 ``` --- ## 7. Phase 2 Development Plan (v3.8.0) 🚧 ### Feature 2.1: Autonomous Slice Boundary Detection (YOUR TASK) **Location:** `src/detect/` (to be created) **Ticket:** `.arela/tickets/claude/CLAUDE-001-v3.8.0-slice-detection.md` **Algorithm:** Louvain Community Detection - O(n log n) complexity - Maximizes modularity - Produces hierarchical communities - Works on weighted graphs **Expected Files:** ``` src/detect/ ├── slices.ts # Main detector interface ├── clustering.ts # Louvain algorithm implementation ├── scoring.ts # Cohesion/coupling scoring └── review.ts # Human approval workflow ``` **Expected CLI:** ```bash arela detect slices arela detect slices --repo /path/to/repo arela detect slices /path/to/mobile /path/to/backend arela detect slices --json slices.json arela review slices ``` **Expected Output:** ``` Detected 4 optimal vertical slices: 1. authentication (23 files, cohesion: 87%) 2. workout (45 files, cohesion: 82%) 3. nutrition (31 files, cohesion: 79%) 4. social (28 files, cohesion: 75%) ``` **Data Required from Phase 1:** - ✅ Graph DB (imports, functions, dependencies) - ✅ File nodes and relationships - ✅ Coupling/cohesion calculation helpers ### Feature 2.2: API Contract Generator **Location:** `src/generate/contract.ts` (to be created) **Capabilities:** - Generate OpenAPI specs from code - Validate specs against implementation - Detect schema drift **CLI:** ```bash arela generate contract --from-code src/api/ arela validate contract openapi.yaml --against src/api/ ``` --- ## 8. Existing Infrastructure Supporting Slice Detection ### Available Graph Data The Graph DB already contains: - ✅ All file nodes with types (component, service, util, etc.) - ✅ Import relationships with weights - ✅ Function definitions and calls - ✅ API endpoints and calls - ✅ File-to-file dependency paths ### Available Analysis Functions ```typescript // From src/memory/graph.ts - Can be reused: await graphMemory.impact(filePath) // Gets fan-in/fan-out await graphMemory.findSlice(identifier) // Finds related files // From src/ingest/storage.ts - Can be reused: db.query(sql, params) // Direct SQL access db.getSummary() // Graph statistics // From src/analyze/coupling.ts - Can be reused: calculateCoupling() // Coupling metrics calculateCohesion() // Cohesion metrics ``` ### Utility Functions ```typescript // Available utilities: - picocolors for CLI output - better-sqlite3 for database - ts-morph for type extraction (already imported) - fast-glob for file scanning - fs-extra for file operations ``` ### Pattern from Existing Code The ingest module shows the pattern for handling graph data: ```typescript // From src/ingest/graph-builder.ts function buildGraph(analyses: FileAnalysis[], repoPath: string, db: GraphDB) { // Phase 1: Add all files // Phase 2: Resolve imports // Phase 3: Resolve function calls // Phase 4: Analyze relationships } ``` Similar pattern should be used for slice detection. --- ## 9. Key Dependencies ```json { "better-sqlite3": "^11.0.0", // Graph DB "ts-morph": "^21.0.0", // TypeScript AST "commander": "^12.0.0", // CLI "picocolors": "^1.1.0", // CLI colors "fast-glob": "^3.3.0", // File scanning "fs-extra": "^11.2.0", // File operations "zod": "^3.23.0", // Schema validation "yaml": "^2.4.0" // YAML parsing } ``` No external ML libraries are used—implement Louvain yourself or use a lightweight algorithm. --- ## 10. Development Checklist ### For Phase 2 (Slice Detection Implementation) - [ ] **Create slice detection module** (`src/detect/`) - [ ] `slices.ts` - Main public API - [ ] `clustering.ts` - Louvain algorithm - [ ] `scoring.ts` - Slice quality metrics - [ ] `types.ts` - Slice data structures - [ ] **Extend CLI** (`src/cli.ts`) - [ ] `arela detect slices` command - [ ] `arela review slices` command - [ ] Support for `--json` export - [ ] Multi-repo support - [ ] **Extend types** (`src/types.ts`) - [ ] Slice interfaces - [ ] Detection options - [ ] Recommendation types - [ ] **Add integration tests** - [ ] Test on real codebase - [ ] Test multi-repo detection - [ ] Validate clustering quality - [ ] **Update exports** (`src/index.ts`) - [ ] Export slice detection functions ### Building on Phase 1 The slice detection feature can directly leverage: 1. **GraphDB queries** - Import/dependency data already available 2. **Impact analysis** - Fan-in/fan-out calculations 3. **File nodes** - Types and metadata 4. **Coupling/cohesion** - Scoring algorithms from analyze/ 5. **CLI patterns** - Command registration in cli.ts --- ## Summary Arela is a well-structured Node.js/TypeScript application with: 1. **Clear Module Separation:** - Ingestion (files → graph) - Memory (3-tier: vector/graph/audit) - Analysis (architecture, coupling, cohesion) - CLI (Commander.js) 2. **Solid Foundation (Phase 1 Complete):** - SQLite Graph DB with 8 tables - Universal codebase ingestion (15+ languages) - Tri-Memory system with 3 storage types - Multi-repo architecture analysis 3. **Ready for Phase 2:** - Graph data available for clustering - Coupling/cohesion algorithms exist - CLI patterns established - Type system in place 4. **Technology Stack:** - TypeScript with strict types - SQLite (better-sqlite3) for persistence - Commander.js for CLI - ts-morph for AST analysis - Zod for validation **The codebase is production-ready and well-organized for the slice detection feature development.**