commit-analyzer
Version:
Analyze git commits and generate categories, summaries, and descriptions for each commit. Optionally generate a yearly breakdown report of your commit history.
357 lines (268 loc) • 9.5 kB
Markdown
# Git Commit Analyzer
A TypeScript/Node.js program that analyzes git commits and generates categorized
summaries using Claude CLI.
## Features
- Extract commit details (message, date, diff) from git repositories
- Categorize commits using LLM analysis into:
`tweak`, `feature`, or `process`
- Generate CSV reports with timestamp, category, summary, and description
- Generate condensed markdown reports from CSV data for stakeholder
communication
- Support for multiple LLM models (Claude, Gemini, OpenAI) with automatic
detection
- Support for batch processing multiple commits
- Automatically filters out merge commits for cleaner analysis
- Robust error handling and validation
## Usage
### Default Behavior
When run without arguments, the program analyzes all commits authored by the
current user:
```bash
# Analyze all your commits in the current repository
npx commit-analyzer
# Analyze your last 10 commits
npx commit-analyzer --limit 10
# Analyze commits by a specific user
npx commit-analyzer --author user@example.com
```
### Command Line Arguments
```bash
# Analyze specific commits
npx commit-analyzer abc123 def456 ghi789
# Specify output file with default behavior
npx commit-analyzer --output analysis.csv --limit 20
# Generate markdown report from existing CSV
npx commit-analyzer --report --input-csv analysis.csv
# Analyze commits and generate both CSV and markdown report
npx commit-analyzer --report --limit 50
# Use specific LLM model
npx commit-analyzer --llm claude --limit 10
```
### Options
- `-o, --output <file>`:
Output file (default:
`results/commits.csv` for analysis, `results/report.md` for reports)
- `--output-dir <dir>`:
Output directory for CSV and report files (default:
current directory)
- `-a, --author <email>`:
Filter commits by author email (defaults to current user)
- `-l, --limit <number>`:
Limit number of commits to analyze
- `--llm <model>`:
LLM model to use (claude, gemini, openai)
- `-r, --resume`:
Resume from last checkpoint if available
- `-c, --clear`:
Clear any existing progress checkpoint
- `--report`:
Generate condensed markdown report from existing CSV
- `--input-csv <file>`:
Input CSV file to read for report generation
- `-v, --verbose`:
Enable verbose logging (shows detailed error information)
- `--since <date>`:
Only analyze commits since this date (YYYY-MM-DD, '1 week ago', '2024-01-01')
- `--until <date>`:
Only analyze commits until this date (YYYY-MM-DD, '1 day ago', '2024-12-31')
- `--no-cache`:
Disable caching of analysis results
- `--batch-size <number>`:
Number of commits to process per batch (default:
1 for sequential processing)
- `-h, --help`:
Display help
- `-V, --version`:
Display version
## Output Formats
### CSV Output
The program generates a CSV file with the following columns:
- `timestamp`:
ISO 8601 timestamp of the commit (e.g., `2025-08-28T11:14:40.000Z`)
- `category`:
One of `tweak`, `feature`, or `process`
- `summary`:
One-line description (max 80 characters)
- `description`:
Detailed explanation (2-3 sentences)
### Markdown Report Output
When using the `--report` option, the program generates a condensed markdown
report that:
- Groups commits by year (most recent first)
- Organizes by categories:
Features, Processes, Tweaks & Bug Fixes
- Consolidates similar items for stakeholder readability
- Includes commit count statistics
- Uses professional language suitable for both technical and non-technical
audiences
## Requirements
- Node.js 18+ with TypeScript support (Bun runtime recommended)
- Git repository (must be run within a git repository)
- At least one supported LLM CLI tool:
- Claude CLI (`claude`) - recommended, defaults to Sonnet model
- Gemini CLI (`gemini`)
- OpenAI CLI (`codex`)
- Valid git commit hashes (when specifying commits manually)
## Categories
- **tweak**:
Minor adjustments, bug fixes, small improvements
- **feature**:
New functionality, major additions
- **process**:
Build system, CI/CD, tooling, configuration changes
## Error Handling
The program includes comprehensive error handling for:
- Invalid commit hashes
- Git repository validation
- LLM analysis failures with automatic retry
- File I/O errors
- Network connectivity issues
### Resume Capability
The tool automatically:
- Saves progress checkpoints every 10 commits
- Saves immediately when a failure occurs
- **Stops processing after a commit fails all retry attempts**
- Exports partial results to the CSV file before exiting
If the process stops (e.g., after 139 commits due to API failure), you can
resume from where it left off:
```bash
# Resume from last checkpoint
npx commit-analyzer --resume
# Clear checkpoint and start fresh
npx commit-analyzer --clear
# View checkpoint status (it will prompt you)
npx commit-analyzer --resume
```
The checkpoint file (`.commit-analyzer/progress.json`) contains:
- List of all commits to process
- Successfully processed commits (including failed ones to skip on resume)
- Analyzed commit data (only successful ones)
- Output file location
### Application Data Directory
The tool creates a `.commit-analyzer/` directory to store internal files:
```
.commit-analyzer/
├── progress.json # Progress checkpoint data
└── cache/ # Cached analysis results
├── commit-abc123.json
├── commit-def456.json
└── ...
```
- **Progress checkpoint**:
Enables resuming interrupted analysis sessions
- **Analysis cache**:
Stores LLM analysis results to avoid re-processing the same commits (TTL:
30 days)
Use `--no-cache` to disable caching if needed.
Use `--clear` to clear the cache and progress checkpoint.
### Date Filtering
The tool supports flexible date filtering using natural language or specific
dates:
```bash
# Analyze commits from the last week
npx commit-analyzer --since "1 week ago"
# Analyze commits from a specific date range
npx commit-analyzer --since "2024-01-01" --until "2024-12-31"
# Analyze commits from the beginning of the year
npx commit-analyzer --since "2024-01-01"
# Analyze commits up to a specific date
npx commit-analyzer --until "2024-06-30"
```
Date formats supported:
- Relative dates:
`"1 week ago"`, `"2 months ago"`, `"3 days ago"`
- ISO dates:
`"2024-01-01"`, `"2024-12-31"`
- Git-style dates:
Any format accepted by `git log --since` and `git log --until`
### Batch Processing
Control processing speed and resource usage with batch size options:
```bash
# Process commits one at a time (default, safest for rate limits)
npx commit-analyzer --batch-size 1
# Process multiple commits in parallel (faster but may hit rate limits)
npx commit-analyzer --batch-size 5 --limit 100
# Sequential processing for large datasets
npx commit-analyzer --batch-size 1 --limit 500
```
### Retry Logic
The tool includes automatic retry logic with exponential backoff for handling
API failures when processing many commits.
This is especially useful when analyzing large numbers of commits that might
trigger rate limits.
#### Configuration
You can configure the retry behavior using environment variables:
- `LLM_MAX_RETRIES`:
Maximum number of retry attempts (default:
3)
- `LLM_INITIAL_RETRY_DELAY`:
Initial delay between retries in milliseconds (default:
5000)
- `LLM_MAX_RETRY_DELAY`:
Maximum delay between retries in milliseconds (default:
30000)
- `LLM_RETRY_MULTIPLIER`:
Multiplier for exponential backoff (default:
2)
#### Examples
```bash
# More aggressive retries for large batches (e.g., 139+ commits)
LLM_MAX_RETRIES=5 LLM_INITIAL_RETRY_DELAY=10000 npx commit-analyzer --limit 200
# Faster retries for testing
LLM_MAX_RETRIES=2 LLM_INITIAL_RETRY_DELAY=2000 npx commit-analyzer
# Conservative approach for rate-limited APIs
LLM_MAX_RETRIES=4 LLM_INITIAL_RETRY_DELAY=15000 LLM_MAX_RETRY_DELAY=60000 npx commit-analyzer
```
The retry mechanism automatically:
- Retries failed API calls with increasing delays
- Shows progress and retry attempts in the console
- Continues processing remaining commits even if some fail
- Reports the total number of successful and failed commits at the end
## Development
```bash
# Install dependencies
bun install
# Run in development mode
bun run dev
# Build for production
bun run build
# Run linting
bun run lint
# Type checking
bun run typecheck
```
## Examples
```bash
# Analyze all your commits in the current repository
npx commit-analyzer
# Analyze your last 20 commits and save to custom file
npx commit-analyzer --limit 20 --output my_analysis.csv
# Analyze commits by a specific team member
npx commit-analyzer --author teammate@company.com --limit 50
# Quick analysis of your recent work
npx commit-analyzer --limit 10
# Generate both CSV and markdown report from analysis
npx commit-analyzer --report --limit 100 --output yearly_analysis.csv
# Generate only a markdown report from existing CSV
npx commit-analyzer --report --input-csv existing_analysis.csv --output team_report.md
# Use specific LLM model for analysis
npx commit-analyzer --llm gemini --limit 25
# Resume interrupted analysis with progress tracking
npx commit-analyzer --resume
```
## Development
This tool requires the Bun runtime.
Install it globally:
```bash
# Install bun globally
curl -fsSL https://bun.sh/install | bash
# or
npm install -g bun
```
## Installation
```bash
bun install
bun build
bun link
```
After linking, you can use `commit-analyzer` command globally.