aiwg
Version:
Cognitive architecture for AI-augmented software development with structured memory, ensemble validation, and closed-loop correction. FAIR-aligned artifacts, 84% cost reduction via human-in-the-loop, standards adopted by 100+ organizations.
367 lines (262 loc) • 15.8 kB
Markdown
# Measurement Plan Template
## Purpose
Define the metrics, collection methods, and reporting cadence used to monitor project health and product quality.
## Ownership & Collaboration
- Document Owner: Project Manager
- Contributor Roles: Test Architect, System Analyst, Configuration Manager
- Automation Inputs: Metric definitions, data sources, reporting cadence
- Automation Outputs: `measurement-plan.md` including inventory table
## Completion Checklist
- Metrics mapped to project objectives and risks
- Data collection methods and tools specified
- Reporting cadence and stakeholders defined
- Automation strategy documented
- Alerting thresholds established
## Document Sections
1. **Measurement Objectives**
- State why measurement is necessary and which decisions it supports.
2. **Metric Inventory**
- List metrics with definitions, formulas, and targets.
| Metric | Purpose | Data Source | Frequency | Target/Threshold | Owner |
| --- | --- | --- | --- | --- | --- |
| Example Metric | Explain why it matters. | Tool or artifact. | Weekly | Target value. | Person/role. |
3. **Data Collection Process**
- Describe how data is gathered, validated, and stored.
### 3.1 Collection Method
For each metric, specify:
- **Manual or Automated**: Prefer automated (see `metrics-collection-automation.md` if available)
- **Data Source**: API, database, log files, analytics tool
- **Collection Frequency**: Real-time, per event, hourly, daily, weekly
- **Responsible System/Person**: Who/what collects the data
### 3.2 Automation Strategy
**Recommended**: Automate 90%+ of metric collection
**Automation Options**:
- **CI/CD Pipeline**: Metrics pushed during build/deploy (build duration, test results)
- **Scheduled Jobs**: Cron or Kubernetes CronJob queries APIs (Jira velocity, GitHub stats)
- **Event-Driven**: Webhooks trigger metric collection (deployment events, user signups)
- **Streaming**: Real-time event processing (user activity, system performance)
**Example Automation**:
```python
# Daily Jira velocity collection
# Runs via cron: 0 9 * * *
def collect_jira_velocity():
jira = JIRA(server=JIRA_URL, token_auth=JIRA_TOKEN)
sprint = jira.sprints(board_id, state='active')[0]
velocity = calculate_velocity(sprint)
push_to_influxdb('jira_metrics', {'velocity': velocity})
```
**Reference**: See `docs/sdlc/metrics/` catalogs for detailed integration examples
### 3.3 Data Validation
Validate metrics before storing to catch anomalies:
| Validation Type | Example | Action if Failed |
|-----------------|---------|------------------|
| Range Check | Velocity 0-200 | Alert + store with flag |
| Logic Check | DAU ≤ MAU | Alert + investigate |
| Trend Check | No 50%+ single-day change | Alert + confirm source |
### 3.4 Data Storage
**Time-Series Metrics** (most SDLC metrics):
- Tool: InfluxDB, Prometheus, CloudWatch
- Retention: 1 year for tactical metrics, 3+ years for strategic
**Event-Level Data** (audit trail):
- Tool: PostgreSQL, MySQL, data warehouse
- Retention: 3-7 years for compliance
**Dashboard Data**:
- Tool: Grafana, Tableau, Looker
- Source: Query time-series or relational database
4. **Reporting and Visualization**
### 4.1 Dashboard Strategy
**Principle**: Different audiences need different views
| Audience | Metrics | Update Frequency | Format |
|----------|---------|------------------|--------|
| Team (Daily) | Velocity, WIP, build status, deployment frequency | Real-time | Grafana dashboard, CI/CD status board |
| Stakeholders (Weekly) | Velocity trend, quality metrics, risk status | Daily refresh | Status assessment report |
| Executives (Monthly) | Product metrics, ROI, milestone progress | Weekly refresh | Executive summary dashboard |
### 4.2 Dashboard Layout
**Recommended Structure**:
```
┌─────────────────────────────────────────────────┐
│ PROJECT DASHBOARD │
├─────────────────────────────────────────────────┤
│ Overall Status: GREEN │
│ Last Updated: 2025-10-15 14:30 UTC │
├──────────────────┬──────────────────────────────┤
│ DELIVERY │ QUALITY │
├──────────────────┼──────────────────────────────┤
│ Velocity: 38 │ Test Coverage: 84% │
│ Target: 35 ✓ │ Target: 80% ✓ │
│ Trend: ↗ │ Trend: ↗ │
│ │ │
│ Deploy Freq: │ Defect Rate: │
│ 5/week ✓ │ 8% ✓ │
├──────────────────┼──────────────────────────────┤
│ PRODUCT │ RISKS │
├──────────────────┼──────────────────────────────┤
│ DAU/MAU: 28% │ Critical: 0 │
│ Target: > 20% ✓ │ High: 2 │
│ │ Trend: Decreasing ✓ │
│ NPS: 42 (Good) │ │
└──────────────────┴──────────────────────────────┘
```
### 4.3 Visualization Best Practices
**Trends over Snapshots**: Show sparklines (last 7 data points) not just current value
**Thresholds**: Use color coding (green, yellow, red) based on thresholds
**Context**: Include target/threshold alongside current value
**Actionability**: Link metrics to actions ("Velocity dropped 30% → Review WIP limits")
### 4.4 Report Distribution
**Daily**:
- Team dashboard (auto-refresh)
- Build/deploy status (Slack notifications)
**Weekly**:
- Status Assessment report (email to stakeholders)
- Iteration metrics summary (stand-up/demo)
**Monthly**:
- Iteration Assessment (includes metrics section)
- Executive dashboard (business metrics + delivery health)
### 4.5 Tooling Recommendations
| Use Case | Tool | Pros | Cons |
|----------|------|------|------|
| Time-series visualization | Grafana | Free, powerful, integrates with InfluxDB/Prometheus | Steep learning curve |
| Business intelligence | Tableau | Executive-friendly, drag-drop | Expensive |
| Lightweight BI | Metabase | Open-source, easy setup | Limited advanced features |
| Custom dashboards | React + Recharts | Full control | Requires development |
**Reference**: See example dashboards in `delivery-metrics-catalog.md` and `product-metrics-catalog.md`
5. **Roles and Responsibilities**
- Define who collects, analyzes, and acts on metrics.
| Role | Responsibilities |
|------|------------------|
| Metrics Analyst | Define metrics, build collection scripts, generate reports |
| Project Manager | Review metrics weekly, escalate anomalies, track action items |
| Test Architect | Define quality metrics, set coverage targets |
| DevOps Engineer | Implement CI/CD metrics collection, maintain monitoring |
| Product Strategist | Define product metrics, track user engagement |
| Team Members | Provide context for metric anomalies, implement improvements |
6. **Review Cadence**
- Specify when metrics are reviewed (stand-ups, demos, governance boards).
| Cadence | Audience | Metrics Reviewed | Format |
|---------|----------|------------------|--------|
| Daily | Team | WIP, build status, deployment frequency | Standup dashboard review |
| Weekly | Team + Stakeholders | Velocity, quality, risks | Status assessment |
| Monthly | Executives | Product metrics, ROI, milestones | Iteration assessment |
| Quarterly | Leadership | Strategic metrics, DORA benchmarks | Roadmap planning |
7. **Continuous Improvement**
- Outline how metrics will be refined or retired.
**Metric Review Process**:
- **Quarterly**: Review metric inventory for relevance
- **Retire**: Metrics no longer informing decisions
- **Add**: New metrics for emerging risks or objectives
- **Refine**: Adjust thresholds based on team maturity
**Improvement Triggers**:
- Metric anomaly: Investigate root cause, update process
- Metric target missed: Adjust threshold or improve process
- Metric no longer reviewed: Candidate for retirement
8. **Alerting and Thresholds**
**Objective**: Define when metrics trigger action, not just passive observation
### 8.1 Threshold Levels
| Level | Color | Meaning | Response |
|-------|-------|---------|----------|
| Green | 🟢 | Healthy | Continue monitoring |
| Yellow | 🟡 | Caution | Investigate, prepare action |
| Red | 🔴 | Alert | Immediate action required |
### 8.2 Threshold Matrix
| Metric | Green (Good) | Yellow (Caution) | Red (Alert) | Action Owner |
|--------|--------------|------------------|-------------|--------------|
| Velocity variance | ±10% | ±10-20% | > ±20% | Project Manager: Review estimation |
| Defect escape rate | < 5% | 5-10% | > 10% | Test Architect: Increase coverage |
| Build duration | < 10 min | 10-15 min | > 15 min | Build Engineer: Optimize pipeline |
| Deployment success rate | > 95% | 90-95% | < 90% | Deployment Manager: Investigate failures |
| SLO compliance | > 99.9% | 99-99.9% | < 99% | Reliability Engineer: Page on-call |
| Change failure rate | < 15% | 15-25% | > 25% | DevOps Engineer: Review deploy process |
| WIP count | < limit | At limit | > limit | Project Manager: Finish work, don't start new |
### 8.3 Automated Alert Configuration
**Immediate Alerts** (real-time):
- SLO breach (< 99%)
- Production deployment failure
- Critical defect opened
**Daily Digest** (morning):
- Build duration trend (if > 20% increase in 7 days)
- WIP exceeds limit for 2+ days
- Code review queue > 5 PRs
**Weekly Report**:
- Velocity variance > 20%
- Defect escape rate trend (if increasing 2 consecutive weeks)
- Test coverage dropped below 80%
### 8.4 Alert Delivery
| Urgency | Channel | Recipients |
|---------|---------|------------|
| Critical (Red, production-impacting) | PagerDuty | On-call engineer, incident commander |
| High (Red, non-production) | Slack #alerts | Team channel, Project Manager |
| Medium (Yellow) | Email | Daily digest to team |
| Low (Green trending yellow) | Dashboard | Passive monitoring |
### 8.5 Alert Fatigue Prevention
**Rules**:
1. **Deduplication**: Group similar alerts
2. **Quiet Hours**: No non-critical alerts outside business hours
3. **Escalation**: If alert not acknowledged in 15 min, escalate
4. **Tuning**: Review alert false positive rate monthly, adjust thresholds
5. **Retirement**: Disable alerts that haven't fired in 90 days
**Target**: < 5% false positive rate on alerts
9. **Risks and Assumptions**
- Capture measurement risks (data quality, tooling limits) and mitigation plans.
| Risk | Impact | Mitigation |
|------|--------|------------|
| Data source unavailable | Metrics gaps | Multiple collection methods, cached data |
| Metric gaming (inflating values) | False sense of health | Cross-validate metrics, audit sample |
| Tool limitations | Missing data | Manual fallback process |
| Team doesn't review metrics | Metrics waste | Embed in rituals (standup, demo) |
| Alert fatigue | Ignored alerts | Tune thresholds, reduce noise |
## Appendix A: Starter Metrics Catalog
**Purpose**: Pre-defined metrics teams can adopt immediately
**Usage**: Select 5-7 metrics appropriate to project phase and objectives
### Inception Phase
Focus: Validate feasibility and planning assumptions
| Metric | Purpose | Data Source | Frequency | Target | Owner |
|--------|---------|-------------|-----------|--------|-------|
| Risk Identification Rate | Ensure comprehensive risk capture | Risk list | Weekly | Plateau by end Inception | Project Manager |
| Stakeholder Alignment Score | Confirm shared vision | Vision review feedback | After each review | 90% agreement | Vision Owner |
| Estimation Accuracy | Validate initial estimates | Iteration plan vs actual | Per iteration | ±30% (calibrating) | Project Manager |
### Elaboration Phase
Focus: Validate architecture and refine estimates
| Metric | Purpose | Data Source | Frequency | Target | Owner |
|--------|---------|-------------|-----------|--------|-------|
| Velocity (Baseline) | Establish delivery rate | Jira/Linear | Per iteration | Stable for 3 iterations | Project Manager |
| Architecture Coverage | Ensure significant scenarios addressed | Use case realization count | Per iteration | 100% by end Elaboration | Architecture Designer |
| Technical Debt Ratio | Monitor architecture quality | Static analysis | Weekly | < 5% | Build Engineer |
| Defect Discovery Rate | Validate quality processes | Defect tracking | Per iteration | < 5 defects per story | Test Architect |
### Construction Phase
Focus: Deliver features and maintain quality
| Metric | Purpose | Data Source | Frequency | Target | Owner |
|--------|---------|-------------|-----------|--------|-------|
| Deployment Frequency | Track delivery capability | CI/CD logs | Daily | ≥ 1 per week | DevOps Engineer |
| Lead Time for Changes | Measure pipeline efficiency | Git + CI/CD | Per deploy | < 2 days | DevOps Engineer |
| Change Failure Rate | Measure release quality | Deploys + incidents | Per deploy | < 15% | Deployment Manager |
| Velocity | Predict capacity | Jira/Linear | Per iteration | ±10% variance | Project Manager |
| Test Coverage | Ensure quality | Coverage tool | Per commit | ≥ 80% | Test Architect |
| Build Duration | Monitor CI health | CI/CD system | Per build | < 15 min | Build Engineer |
### Transition Phase
Focus: Production readiness and user adoption
| Metric | Purpose | Data Source | Frequency | Target | Owner |
|--------|---------|-------------|-----------|--------|-------|
| MTTR | Operational resilience | Incident logs | Per incident | < 1 hour | Reliability Engineer |
| SLO Compliance | Production performance | Monitoring | Continuous | ≥ 99% | Reliability Engineer |
| User Adoption Rate | Rollout success | Analytics | Daily | 80% target users in 30d | Product Strategist |
| NPS | User satisfaction | Survey | Weekly | ≥ 4.0 / 5.0 | Support Lead |
| Churn Rate | User retention | Activity logs | Monthly | < 5% | Product Strategist |
**Selection Guidance**: Choose metrics that:
1. Align with project objectives and risks
2. Are measurable with available data sources
3. Will inform specific decisions
4. Can be collected with reasonable effort (prefer automated)
**Metric Catalog References**:
- Delivery Metrics: `docs/sdlc/metrics/delivery-metrics-catalog.md`
- Product Metrics: `docs/sdlc/metrics/product-metrics-catalog.md`
- Quality Metrics: `docs/sdlc/metrics/quality-metrics-catalog.md`
- Operational Metrics: `docs/sdlc/metrics/operational-metrics-catalog.md`
- DORA Quickstart: `docs/sdlc/metrics/dora-metrics-quickstart.md`
## Agent Notes
- Ensure metric definitions align with Status Assessments and Test Strategy.
- Automate data collection whenever possible; link scripts or queries used.
- Revisit metrics when project goals or risks evolve.
- Verify the Automation Outputs entry is satisfied before signaling completion.
- Populate metric inventory from Appendix A or metric catalogs in `docs/sdlc/metrics/`
- Set up automated collection for DORA metrics (see DORA quickstart guide)
- Implement alerting thresholds to enable proactive issue detection