# Database Design Template
## Metadata
- ID: DES-DB-`id`
- Owner: `name/role/team`
- Contributors: `list`
- Reviewers: `list`
- Team: `team`
- Stakeholders: `list`
- Status: `draft/in-progress/blocked/approved/done`
- Dates: created `YYYY-MM-DD` / updated `YYYY-MM-DD` / due `YYYY-MM-DD`
- Related: REQ-`id`, DES-`id`, IFC-`id`, ADR-`id`, TEST-`id`
- Links: `paths/urls`
## Related Templates
- docs/sdlc/templates/requirements/data-contract-template.md
- docs/sdlc/templates/analysis-design/interface-contract-card.md
- docs/sdlc/templates/analysis-design/component-design-card.md
- docs/sdlc/templates/analysis-design/architecture-decision-record.md
## Database Identity
### System Context
- System Name: `name of system/application`
- Database Purpose: `primary business function supported`
- Database Type: `operational/analytical/hybrid`
- Data Criticality: `tier-1/tier-2/tier-3`
- Regulatory Scope: `GDPR/HIPAA/PCI/SOX/none`
### Database Profile
- Logical Name: `business-aligned name`
- Physical Name: `technical database name`
- Environment: `dev/test/staging/prod`
- Version: `schema version`
## Data Architecture Decisions
### Storage Pattern
**Decision**: `single-database/multi-database/federated/distributed`
**Rationale**:
- Scalability considerations
- Data isolation requirements
- Consistency requirements
- Performance objectives
- Operational complexity tolerance
### Data Model Paradigm
**Primary Model**: `relational/document/key-value/graph/time-series/hybrid`
**Justification**:
- Query patterns alignment
- Data relationship complexity
- Schema flexibility needs
- Consistency guarantees required
- Performance characteristics
### Consistency Model
**Selected Model**: `strong/eventual/causal/session`
**Trade-offs Accepted**:
- CAP theorem position: `CP/AP` (`CA` applies only where network partitions cannot occur, e.g. a single node)
- Acceptable staleness window: `0ms/100ms/1s/minutes`
- Conflict resolution strategy: `last-write-wins/vector-clocks/CRDT/custom`
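
As an illustration of the `last-write-wins` option, the following is a minimal sketch and not a prescribed implementation. The `Versioned` type, its field names, and the replica-id tie-breaker are assumptions introduced for the example. Note that last-write-wins silently discards one of two concurrent writes, which is the trade-off being accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    """Hypothetical record version carried by each replica."""
    value: str
    updated_at_ms: int  # writer's timestamp in milliseconds
    replica_id: str     # tie-breaker when timestamps collide (assumption)


def resolve_lww(a: Versioned, b: Versioned) -> Versioned:
    """Keep the version with the newest timestamp; break ties by replica id."""
    return max(a, b, key=lambda v: (v.updated_at_ms, v.replica_id))
```

The deterministic tie-breaker matters: without it, two replicas could each keep a different winner for the same pair of writes.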
## Conceptual Data Model
### Business Entities
| Entity | Description | Expected Volume | Criticality |
| ------ | ----------- | ----------- | ----------- |
| Entity1 | Business purpose | Expected volume | Critical/Important/Standard |
| Entity2 | Business purpose | Expected volume | Critical/Important/Standard |
### Entity Relationships
```mermaid
erDiagram
CUSTOMER ||--o{ ORDER : places
ORDER ||--|{ ORDER_ITEM : contains
ORDER_ITEM }o--|| PRODUCT : includes
CUSTOMER }o--o{ ADDRESS : has
CUSTOMER {
string id PK
string email UK
string name
datetime created_at
}
ORDER {
string id PK
string customer_id FK
string status
decimal total_amount
datetime order_date
}
```
### Business Rules
1. **Integrity Rules**:
- Rule: `description of business constraint`
- Enforcement: `database/application/both`
- Validation: `trigger/check/foreign-key/unique`
2. **Derivation Rules**:
- Calculated fields and their formulas
- Aggregation rules
- Default value logic
3. **Temporal Rules**:
- Data retention policies
- Archive strategies
- Audit requirements
## Logical Data Model
### Entity Specifications
#### Entity: `EntityName`
**Business Description**: `what this entity represents`
**Attributes**:
| Attribute | Type | Constraints | Description |
| --------- | ---- | ----------- | ----------- |
| id | identifier | PK, not null | Unique identifier |
| email | string(255) | unique, not null | User email |
| status | enum | not null, default='active' | Record status |
| created_at | timestamp | not null, auto | Creation timestamp |
| updated_at | timestamp | not null, auto | Last update timestamp |
**Natural Keys**: `business-meaningful unique identifiers`
**Surrogate Keys**: `system-generated identifiers`
**Candidate Keys**: `potential unique identifiers`
### Relationship Specifications
#### Relationship: `Entity1-Entity2`
- **Type**: `one-to-one/one-to-many/many-to-many`
- **Optionality**: `mandatory/optional on each side`
- **Cardinality**: `min..max on each side`
- **Cascade Rules**: `delete: cascade/restrict/set-null`
- **Referential Actions**: `update: cascade/restrict`
- **Business Meaning**: `what this relationship represents`
### Data Integrity Constraints
#### Domain Constraints
| Domain | Base Type | Constraints | Examples |
| ------ | --------- | ----------- | -------- |
| Email | string | RFC 5322 compliant | user@example.com |
| Phone | string | E.164 format | +1234567890 |
| Currency | decimal(19,4) | >= 0 | 1234.5678 |
| Status | enum | predefined values | active/inactive/pending |
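
The domain rules in the table above can be sketched as application-side validators. This is a hedged example: the function names are hypothetical, the status set is taken from the example column, and full RFC 5322 email validation is deliberately left out because it is better delegated to a dedicated library.

```python
import re
from decimal import Decimal, InvalidOperation

# E.164: '+' followed by up to 15 digits, first digit non-zero.
E164 = re.compile(r"\+[1-9]\d{1,14}")
STATUSES = {"active", "inactive", "pending"}


def is_phone(value: str) -> bool:
    return E164.fullmatch(value) is not None


def is_status(value: str) -> bool:
    return value in STATUSES


def is_currency(value: str) -> bool:
    """decimal(19,4), >= 0: at most 4 fractional and 15 integer digits."""
    try:
        d = Decimal(value)
    except InvalidOperation:
        return False
    if not d.is_finite() or d < 0:
        return False
    _sign, digits, exponent = d.as_tuple()
    scale = max(-exponent, 0)                        # digits after the point
    integer_digits = max(len(digits) + exponent, 0)  # digits before the point
    return scale <= 4 and integer_digits <= 15
```

The currency check works on the value as written, so `"1.50000"` is rejected for having five fractional digits; normalize inputs first if that is too strict.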
#### Complex Constraints
```text
Constraint: Order total must equal sum of line items
Expression: order.total_amount = SUM(order_item.quantity * order_item.unit_price)
Enforcement: trigger/application/both
```
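
One way to verify this constraint after the fact is a reconciliation query. The sketch below uses an in-memory SQLite database purely for illustration; the table names `orders` and `order_item`, the sample rows, and the choice to store amounts as integer minor units (for example cents) are assumptions, not part of the template.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id TEXT PRIMARY KEY, total_amount INTEGER NOT NULL);
CREATE TABLE order_item (
    order_id   TEXT NOT NULL REFERENCES orders(id),
    quantity   INTEGER NOT NULL,
    unit_price INTEGER NOT NULL
);
INSERT INTO orders VALUES ('o1', 2500), ('o2', 999);
INSERT INTO order_item VALUES ('o1', 2, 1000), ('o1', 1, 500), ('o2', 1, 1000);
""")


def find_total_mismatches(conn):
    """Return (order id, stored total, computed total) for violating orders."""
    return conn.execute("""
        SELECT o.id,
               o.total_amount,
               COALESCE(SUM(i.quantity * i.unit_price), 0)
        FROM orders o
        LEFT JOIN order_item i ON i.order_id = o.id
        GROUP BY o.id, o.total_amount
        HAVING o.total_amount <> COALESCE(SUM(i.quantity * i.unit_price), 0)
        ORDER BY o.id
    """).fetchall()


mismatches = find_total_mismatches(conn)  # only order 'o2' violates the rule
```

A check like this suits the `application` or `both` enforcement options; trigger-based enforcement would reject the write instead of reporting it later.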
## Physical Design Decisions
### Indexing Strategy
#### Index Planning Matrix
| Index Name | Type | Columns | Purpose | Expected Impact |
| ---------- | ---- | ------- | ------- | --------------- |
| idx_user_email | unique | (email) | Login queries | Lookup: O(log n) B-tree seek |
| idx_order_customer_date | composite | (customer_id, order_date DESC) | Customer history | Range scan improvement |
| idx_product_category | bitmap/btree | (category_id) | Category browsing | Low-cardinality filter |
#### Indexing Principles
1. **Primary Access Paths**: Indexes supporting main query patterns
2. **Join Optimization**: Foreign key indexes for join performance
3. **Sort Optimization**: Indexes matching ORDER BY clauses
4. **Covering Indexes**: Include columns to avoid table lookups
5. **Selective Indexing**: Focus on high-cardinality columns
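
Whether a planned index actually serves its access path can be checked from the query plan. The following sketch uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for the target engine; the `orders` schema and the `query_plan` helper are assumptions, and plan wording differs between engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id          TEXT PRIMARY KEY,
    customer_id TEXT NOT NULL,
    order_date  TEXT NOT NULL
);
CREATE INDEX idx_order_customer_date ON orders (customer_id, order_date DESC);
""")


def query_plan(conn, sql, params=()):
    """Return the human-readable detail of each plan step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column is the detail text


plan = query_plan(
    conn,
    "SELECT id FROM orders WHERE customer_id = ? "
    "ORDER BY order_date DESC LIMIT 20",
    ("c1",),
)
uses_index = any("idx_order_customer_date" in step for step in plan)
needs_sort = any("TEMP B-TREE" in step for step in plan)
```

Here the index covers both the filter and the sort order, so the plan should reference the index and need no separate sort step.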
### Partitioning Strategy
**Partitioning Decision**: `none/range/list/hash/composite`
**Partitioning Design**:
- Partition Key: `column(s) used for partitioning`
- Partition Scheme: `monthly/yearly/by-region/by-tenant`
- Partition Count: `initial and growth strategy`
- Maintenance: `automated/manual partition management`
**Benefits Expected**:
- Query performance improvement
- Maintenance window reduction
- Archive efficiency
- Parallel processing enablement
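
For a `monthly` range scheme, partition names and bounds can be derived from the partition key. This is a small sketch under assumed conventions: the `<table>_YYYY_MM` naming and both helper names are hypothetical, and bounds are half-open (start inclusive, end exclusive).

```python
from datetime import date


def monthly_partition(table: str, day: date) -> str:
    """Name of the monthly partition that holds rows dated `day`."""
    return f"{table}_{day:%Y_%m}"


def partition_bounds(day: date) -> tuple[date, date]:
    """Half-open range [first of month, first of next month)."""
    start = day.replace(day=1)
    # Roll over to January of the next year after December.
    end = date(start.year + (start.month == 12), start.month % 12 + 1, 1)
    return start, end
```

An automated maintenance job could call these helpers ahead of time to create the next partition before any rows arrive for it.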
### Storage Optimization
#### Denormalization Decisions
| Denormalization | Type | Justification | Update Strategy |
| --------------- | ---- | ------------- | --------------- |
| user.post_count | Derived aggregate | Avoid COUNT(*) queries | Trigger on post insert/delete |
| order.customer_name | Duplicated data | Avoid join for reports | Sync on customer update |
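
The trigger-based update strategy for a denormalized counter could look like the sketch below. It uses SQLite for illustration only; the table names `users` and `posts` and the trigger names are assumptions, and trigger syntax varies by engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    post_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE posts (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id)
);
-- Keep users.post_count in step with the rows in posts.
CREATE TRIGGER posts_after_insert AFTER INSERT ON posts BEGIN
    UPDATE users SET post_count = post_count + 1 WHERE id = NEW.user_id;
END;
CREATE TRIGGER posts_after_delete AFTER DELETE ON posts BEGIN
    UPDATE users SET post_count = post_count - 1 WHERE id = OLD.user_id;
END;
""")

conn.execute("INSERT INTO users (id) VALUES (1)")
conn.executemany("INSERT INTO posts (user_id) VALUES (?)", [(1,), (1,), (1,)])
conn.execute("DELETE FROM posts WHERE id = 1")

post_count = conn.execute(
    "SELECT post_count FROM users WHERE id = 1"
).fetchone()[0]
actual = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE user_id = 1"
).fetchone()[0]
```

Comparing `post_count` against the real `COUNT(*)` is a useful periodic reconciliation check, since denormalized values can drift if rows are changed while triggers are disabled.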
#### Compression Strategy
- Compression Type: `row/column/page/none`
- Target Tables: `large historical tables`
- Expected Ratio: `3:1 typical for text data`
- Trade-offs: `CPU vs storage vs query performance`
## Data Access Patterns
### Query Patterns
#### Pattern: `GetUserByEmail`
- **Frequency**: `1000/second`
- **Latency Target**: `<10ms`
- **Access Path**: `unique index on email`
- **Result Size**: `single row`
- **Caching Strategy**: `application cache 5 minutes`
#### Pattern: `GetRecentOrdersByCustomer`
- **Frequency**: `100/second`
- **Latency Target**: `<50ms`
- **Access Path**: `composite index (customer_id, order_date DESC)`
- **Result Size**: `10-50 rows typical`
- **Pagination**: `cursor-based using order_date`
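
Cursor-based pagination on `order_date` can be sketched as follows. The example adds `id` as a tie-breaker, which is an assumption beyond the pattern as written: `order_date` alone may not be unique, and a non-unique cursor can skip or repeat rows. The schema, sample data, and `recent_orders` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id TEXT, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders (id, customer_id, order_date) VALUES (?, ?, ?)",
    [(i, "c1", f"2024-01-{(i + 1) // 2:02d}") for i in range(1, 8)],
)


def recent_orders(conn, customer_id, page_size, cursor=None):
    """Return (rows, next_cursor); cursor is (order_date, id) of the last row."""
    sql = "SELECT id, order_date FROM orders WHERE customer_id = ?"
    params = [customer_id]
    if cursor is not None:
        # Resume strictly after the last row seen, in DESC order.
        sql += " AND (order_date < ? OR (order_date = ? AND id < ?))"
        params += [cursor[0], cursor[0], cursor[1]]
    sql += " ORDER BY order_date DESC, id DESC LIMIT ?"
    params.append(page_size)
    rows = conn.execute(sql, params).fetchall()
    next_cursor = (rows[-1][1], rows[-1][0]) if len(rows) == page_size else None
    return rows, next_cursor


# Walk all pages, three rows at a time.
seen, cursor = [], None
while True:
    rows, cursor = recent_orders(conn, "c1", 3, cursor)
    seen.extend(row[0] for row in rows)
    if cursor is None:
        break
```

Unlike `OFFSET`, this approach does not rescan skipped rows, so later pages cost roughly the same as the first one.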
### Write Patterns
#### Pattern: `CreateOrder`
- **Frequency**: `10/second peak`
- **Consistency**: `strong required`
- **Transaction Scope**: `order + order_items`
- **Locking Strategy**: `row-level pessimistic`
- **Retry Policy**: `exponential backoff on deadlock`
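
The retry policy can be sketched as a small wrapper around the transaction. `DeadlockError` is a hypothetical stand-in for whatever deadlock or serialization error the chosen driver raises, and the attempt count, base delay, and use of random jitter are assumptions for illustration.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock error."""


def run_with_retry(txn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Run txn(); on deadlock, wait with exponential backoff and retry."""
    for attempt in range(max_attempts):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Jitter: random wait up to base_delay * 2^attempt seconds.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The transaction passed in must be safe to re-run from the start, which is why the whole `order + order_items` scope belongs inside `txn` rather than only the failing statement.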
### Aggregation Patterns
#### Pattern: `DailySalesReport`
- **Frequency**: `once per day`
- **Processing Window**: `02:00-04:00 UTC`
- **Source Data**: `orders, order_items, products`
- **Optimization**: `materialized view/summary table`
- **Incremental Strategy**: `process changed records only`
## Data Governance
### Data Classification
| Data Element | Classification | Handling Requirements |
| ------------ | -------------- | -------------------- |
| user.email | PII | Encryption required, audit access |
| user.password_hash | Secret | Never expose or log; rehash when the hashing policy changes |
| order.total | Confidential | Restrict access by role |
| product.description | Public | No restrictions |
### Data Quality Rules
1. **Completeness**: Required fields and coverage expectations
2. **Accuracy**: Validation rules and acceptable ranges
3. **Consistency**: Cross-field and cross-entity rules
4. **Timeliness**: Freshness requirements and SLAs
5. **Uniqueness**: Duplicate prevention strategies
### Data Lineage
```mermaid
graph LR
A[Source System] --> B[ETL Process]
B --> C[Staging Table]
C --> D[Validation]
D --> E[Target Table]
E --> F[Reporting Views]
E --> G[API Cache]
```
## Migration Strategy
### Schema Evolution Approach
**Versioning Scheme**: `major.minor.patch`
**Backward Compatibility**: `maintain 2 versions back`
**Migration Types**:
- **Additive**: New columns/tables (backward compatible)
- **Transformative**: Data type changes (requires migration)
- **Destructive**: Removal of elements (requires deprecation)
### Migration Patterns
1. **Dual-Write Pattern**: For gradual migrations
2. **Blue-Green Deployment**: For schema swaps
3. **Expand-Contract**: For column modifications
4. **Shadow Tables**: For testing migrations
### Rollback Strategy
- Point-in-time recovery capability
- Migration checkpoint creation
- Automated rollback triggers
- Data reconciliation procedures
## Performance Baselines
### Expected Volumes
| Metric | Current | Year 1 | Year 3 | Year 5 |
| ------ | ------- | ------ | ------ | ------ |
| Total Records | 1M | 10M | 100M | 500M |
| Daily Transactions | 10K | 100K | 1M | 5M |
| Storage Size | 10GB | 100GB | 1TB | 5TB |
| Concurrent Users | 100 | 1000 | 10000 | 50000 |
### Performance Targets
| Operation | Target Latency | Target Throughput |
| --------- | -------------- | ----------------- |
| Single row lookup | <5ms | 10000/sec |
| Range scan (100 rows) | <50ms | 1000/sec |
| Insert transaction | <10ms | 1000/sec |
| Bulk load (1M rows) | <60sec | N/A |
| Complex aggregation | <1sec | 100/sec |
## Operational Considerations
### Backup Strategy
- **RPO (Recovery Point Objective)**: `15 minutes`
- **RTO (Recovery Time Objective)**: `1 hour`
- **Backup Type**: `full weekly, incremental daily, log every 15min`
- **Retention**: `30 days hot, 1 year cold`
- **Testing**: `monthly restore verification`
### Monitoring Requirements
1. **Health Metrics**:
- Connection pool utilization
- Query response times
- Lock wait times
- Deadlock frequency
- Replication lag
2. **Growth Metrics**:
- Table size growth rate
- Index fragmentation
- Storage utilization
- Row count trends
3. **Alert Thresholds**:
- Query time >1s: Warning
- Connection pool >80%: Warning
- Replication lag >1min: Critical
- Storage >90%: Critical
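
The alert thresholds above can be expressed as data and evaluated against a metrics snapshot. This is a minimal sketch; the metric key names and the `evaluate` function are assumptions, and a real deployment would normally configure these rules in its monitoring system instead.

```python
# (metric name, limit, severity); an alert fires when value > limit.
THRESHOLDS = [
    ("query_time_s", 1.0, "warning"),
    ("connection_pool_pct", 80.0, "warning"),
    ("replication_lag_s", 60.0, "critical"),
    ("storage_pct", 90.0, "critical"),
]


def evaluate(metrics: dict) -> list:
    """Return (metric, severity) for every threshold that is exceeded."""
    return [
        (name, severity)
        for name, limit, severity in THRESHOLDS
        if metrics.get(name, 0) > limit
    ]


alerts = evaluate({
    "query_time_s": 0.2,
    "connection_pool_pct": 85,
    "replication_lag_s": 61,
    "storage_pct": 90,  # exactly at the limit, so no alert
})
```

Missing metrics are treated as zero here, which hides collection failures; a stricter variant could raise an alert when a metric is absent.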
### Maintenance Windows
- **Scheduled**: `Sunday 02:00-06:00 UTC`
- **Activities**: `index rebuild, statistics update, cleanup`
- **Zero-downtime Requirements**: `rolling updates, read replicas`
## Security Specifications
### Access Control Matrix
| Role | Create | Read | Update | Delete | Notes |
| ---- | ------ | ---- | ------ | ------ | ----- |
| admin | Yes | Yes | Yes | Yes | Full access |
| app_service | Yes | Yes | Yes | Soft only | Via API |
| analyst | No | Yes | No | No | Read replicas only |
| auditor | No | Yes | No | No | Audit tables only |
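
The matrix can be encoded as a default-deny lookup for tests or application-level checks. The sketch below is illustrative only: the action names (including `soft_delete`) and the `is_allowed` helper are assumptions, and the scope limits in the Notes column (read replicas only, audit tables only) are not modeled.

```python
# Role -> allowed actions, mirroring the access control matrix.
MATRIX = {
    "admin": {"create", "read", "update", "delete"},
    "app_service": {"create", "read", "update", "soft_delete"},
    "analyst": {"read"},
    "auditor": {"read"},
}


def is_allowed(role: str, action: str) -> bool:
    """Default deny: unknown roles and unlisted actions are rejected."""
    return action in MATRIX.get(role, set())
```

Keeping the matrix as data makes it straightforward to assert in a test that the granted database privileges match the documented design.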
### Encryption Requirements
- **At Rest**: `AES-256 transparent encryption`
- **In Transit**: `TLS 1.3 minimum`
- **Key Management**: `HSM/KMS integration`
- **Field-Level**: `PII fields individually encrypted`
### Audit Requirements
- **What to Audit**: `all write operations, sensitive reads`
- **Audit Storage**: `separate audit database`
- **Retention**: `7 years for compliance`
- **Immutability**: `write-once, tamper-evident`
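
One common way to make an audit trail tamper-evident is to chain each entry to the hash of the previous one. The sketch below shows the idea with an in-memory list; the entry layout, the `GENESIS` value, and the function names are assumptions, and it does not by itself provide write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes identically.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, event),
    })


def verify(log: list) -> bool:
    """Recompute the chain; any edited or removed entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Truncating the tail of the log still verifies, so the latest hash should also be recorded somewhere outside the audit database.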
## Testing Strategy
### Test Data Requirements
- **Volume**: `10% of production size minimum`
- **Variety**: `cover all edge cases and data types`
- **Veracity**: `maintain referential integrity`
- **Velocity**: `simulate production write rates`
### Test Scenarios
1. **Functional Tests**: CRUD operations, constraints, triggers
2. **Performance Tests**: Load testing, stress testing, endurance
3. **Recovery Tests**: Backup/restore, failover, corruption handling
4. **Security Tests**: Injection prevention, access control, encryption
## Documentation and Maintenance
### Schema Documentation
- Entity-relationship diagrams
- Data dictionary maintenance
- Change history tracking
- Business glossary alignment
### Knowledge Transfer
- Developer guides for data access
- DBA runbooks for operations
- Data governance procedures
- Troubleshooting playbooks
## Validation Checklist
- [ ] All entities have primary keys defined
- [ ] Foreign key relationships are complete
- [ ] Indexes support all primary access patterns
- [ ] Constraints enforce business rules
- [ ] Audit requirements are met
- [ ] Security controls are specified
- [ ] Performance targets are achievable
- [ ] Migration strategy is defined
- [ ] Backup/recovery meets RTO/RPO
- [ ] Monitoring covers key metrics
- [ ] Test scenarios are comprehensive
- [ ] Documentation is complete
## Appendices
### A. Data Dictionary
[Detailed field-level documentation]
### B. Sample Queries
[Common query patterns with examples]
### C. Migration Scripts
[Template migration scripts]
### D. Monitoring Queries
[Health check and metric queries]