# codemodctl

CLI tool and utilities for workflow engine operations, file sharding, and codeowner analysis.

## Installation

```bash
npm install codemodctl
```

## Usage

### As a CLI Tool

```bash
# Analyze CODEOWNERS and generate sharding configuration
codemodctl codeowner --shard-size 20 --state-prop shards --rule ./rule.yaml
```

### As a Library

#### Deterministic File Sharding

```typescript
import {
  getShardForFilename,
  fitsInShard,
  distributeFilesAcrossShards,
  analyzeShardScaling
} from 'codemodctl/sharding';

// Get the shard index for a specific file - always deterministic!
const shardIndex = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });

// Same file + same shard count = same result, every time
const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
console.log(shard1 === shard2); // always true

// Check if a file belongs to a specific shard
const belongsToShard = fitsInShard('src/components/Button.tsx', { shardCount: 5, shardIndex: 2 });

// Distribute all files across shards with consistent hashing
const files = ['file1.ts', 'file2.ts', 'file3.ts'];
const distribution = distributeFilesAcrossShards(files, 5);

// Check scaling behavior - minimal reassignment when growing
const scalingAnalysis = analyzeShardScaling(files, 5, 6);
console.log(`${scalingAnalysis.stableFiles} files stay in same shard`);
console.log(`${scalingAnalysis.reassignmentPercentage}% reassignment`); // Much less than 100%
```

#### Codeowner Analysis

```typescript
import { analyzeCodeowners, findCodeownersFile } from 'codemodctl/codeowners';

// Analyze codeowners and generate shard configuration
const result = await analyzeCodeowners({
  shardSize: 20,
  rulePath: './rule.yaml',
  projectRoot: process.cwd()
});

console.log(`Generated ${result.shards.length} shards for ${result.totalFiles} files`);
result.teams.forEach(team => {
  console.log(`Team "${team.team}" owns ${team.fileCount} files`);
});
```

#### Complete API

```typescript
import codemodctl from 'codemodctl';

// Same options shape as in the Codeowner Analysis example above
const options = { shardSize: 20, rulePath: './rule.yaml', projectRoot: process.cwd() };

// Access all utilities through the default export
const shardIndex = await codemodctl.sharding.getShardForFilename('file.ts', { shardCount: 5 });
const analysis = await codemodctl.codeowners.analyzeCodeowners(options);
```

## Key Features

### Consistent File Sharding

The sharding algorithm uses **consistent hashing** to ensure:

- **Perfect consistency**: Same file + same shard count = same result, always
- **No external state**: The result depends only on the filename and the shard count
- **Minimal reassignment**: When scaling up, only ~20-40% of files move (not 100%)
- **Stable scaling**: Adding shards leaves most existing file assignments in place
- **Simple API**: No complex parameters or configuration needed
- **Team-aware sharding**: Works with codeowner boundaries

### Codeowner Analysis

- **Automatic CODEOWNERS detection**: Searches common locations (root, .github/, docs/)
- **AST-grep integration**: Analyzes files using custom rules
- **Team-based grouping**: Groups files by their assigned teams
- **Shard generation**: Creates optimal shard configuration based on team ownership

## API Reference

### Sharding Functions

- `getShardForFilename(filename, { shardCount })` - Get shard index for a file
- `fitsInShard(filename, { shardCount, shardIndex })` - Check shard membership
- `distributeFilesAcrossShards(files, shardCount)` - Distribute files across shards
- `calculateOptimalShardCount(totalFiles, targetShardSize)` - Calculate optimal shard count
- `getFileHashPosition(filename)` - Get consistent hash position for a file
- `analyzeShardScaling(files, oldCount, newCount)` - Analyze reassignment when scaling

All functions are deterministic: the same input always produces the same output.

**Scaling behavior**: When going from N to N+1 shards, typically only 20-40% of files are reassigned, which makes the scheme well suited to incremental scaling.
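To make the scaling behavior above concrete, here is a minimal, self-contained sketch of hash-based shard assignment. It is not codemodctl's implementation: it swaps in rendezvous (highest-random-weight) hashing as a simple stand-in for the consistent-hash ring, and the names `fnv1a`, `sketchShardFor`, and `sketchCountReassigned` are hypothetical. It only illustrates why a file's shard can depend on nothing but its name and the shard count, and why adding a shard moves only some files.

```typescript
// Illustrative sketch only; NOT the algorithm codemodctl ships.

// FNV-1a-style 32-bit hash over UTF-16 code units (hypothetical helper).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Rendezvous hashing: score every (shard, filename) pair and pick the
// highest score. A score never depends on shardCount, so a file only
// changes shard when a newly added shard outscores its current one.
function sketchShardFor(filename: string, shardCount: number): number {
  let best = 0;
  let bestScore = -1;
  for (let shard = 0; shard < shardCount; shard++) {
    const score = fnv1a(`${shard}:${filename}`);
    if (score > bestScore) {
      bestScore = score;
      best = shard;
    }
  }
  return best;
}

// Count how many files change shard when the shard count changes.
function sketchCountReassigned(files: string[], oldCount: number, newCount: number): number {
  return files.filter(f => sketchShardFor(f, oldCount) !== sketchShardFor(f, newCount)).length;
}

const sampleFiles = Array.from({ length: 100 }, (_, i) => `src/module${i}.ts`);
const movedCount = sketchCountReassigned(sampleFiles, 5, 6);
console.log(`${movedCount} of ${sampleFiles.length} files moved when going from 5 to 6 shards`);
```

In this sketch every file that moves lands on the newly added shard, so existing shards never trade files with each other. The exact fraction that moves depends on the hash and the file names, and it need not match the 20-40% figure quoted for the real library.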
### Codeowner Functions

- `analyzeCodeowners(options)` - Complete analysis with shard generation
- `findCodeownersFile(projectRoot?, explicitPath?)` - Locate CODEOWNERS file
- `loadAstGrepRule(rulePath)` - Parse AST-grep rule from YAML
- `analyzeFilesByOwner(codeownersPath, rule, projectRoot?)` - Group files by owner
- `generateShards(filesByOwner, shardSize)` - Generate shard configuration
- `normalizeOwnerName(owner)` - Normalize owner names

## Usage Examples

### Simple Deterministic Sharding

```typescript
import {
  getShardForFilename,
  distributeFilesAcrossShards,
  analyzeShardScaling
} from 'codemodctl/sharding';

// Get shard for a file - always deterministic
const shard = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });

// Same input always gives same output
const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
console.log(shard1 === shard2); // always true

// Different shard counts give different results (expected behavior)
const shard5 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard10 = getShardForFilename('src/components/Button.tsx', { shardCount: 10 });
// shard5 and shard10 will likely be different, but each is consistent

// Distribute files with consistent hashing for stable scaling
const files = ['file1.ts', 'file2.ts', 'file3.ts'];
const distribution = distributeFilesAcrossShards(files, 5);

// When you add a shard for more capacity, most files stay in place
const analysis = analyzeShardScaling(files, 5, 6);
// Only ~20-40% of files get reassigned, not all of them!
```

### Key Benefits

- **No complex parameters**: Just filename and shard count
- **Perfectly deterministic**: Same input = same output, always
- **Stable scaling**: When adding shards, most files stay in their original shards
- **Minimal reassignment**: Only ~20-40% of files move when scaling up
- **Fast and simple**: Hash-based assignment with consistent ring placement
- **Works across runs**: A file gets the same shard regardless of changes elsewhere in the filesystem

## CLI Commands

### `codeowner`

Analyze CODEOWNERS file and generate sharding configuration.

```bash
codemodctl codeowner [options]

Options:
  -s, --shard-size <size>   Number of files per shard (required)
  -p, --state-prop <prop>   Property name for state output (required)
  -c, --codeowners <path>   Path to CODEOWNERS file (optional)
  -r, --rule <path>         Path to AST-grep rule file (required)
```

Environment variables:

- `STATE_OUTPUTS`: Path to write state output file

## License

MIT