aiwg
Version:
Deployment tool and support utility for AI context. Copies agents, skills, commands, rules, and behaviors into the paths each AI platform reads (Claude Code, Codex, Copilot, Cursor, Warp, OpenClaw, and 6 more) so one source of truth works across 10 platfo
298 lines (233 loc) • 8.81 kB
Markdown
# ops-complete Extensions Guide
The four ops-complete extensions add domain-specific agents, templates, rules, and skills on top of the base framework. Each extension targets a specific type of operational repository. This guide covers what each extension provides and when to use it.
## Extension Basics
Extensions require ops-complete and cannot run standalone. Install one or more alongside the base framework:
```bash
aiwg use ops --ext sys
aiwg use ops --ext sys,it
aiwg use ops --ext sys,it,dev,stream
```
Each extension introduces artifacts with its own `apiVersion` prefix (e.g., `sys.ops.aiwg.io/v1`) and extends the base kind vocabulary with domain-specific kinds.
## sys — System Operations
**Scope**: Per-host hardware documentation, OS configuration, boot chains, fleet-wide inventory.
Install when your repository contains machine-specific documentation: hardware specs, OS setup procedures, boot configuration, or a host catalog.
### What sys Adds
**Agents**:
- `sys-fleet-doc` — Generate and maintain per-host documentation from live system introspection
- `sys-hardware-auditor` — Audit hardware configuration against expected specs
**Templates**:
- `host-profile.yaml` — Single host: hardware, OS, roles, network interfaces, installed services
- `fleet-inventory.yaml` — Group of hosts: topology, group variables, site information
- `boot-chain.md` — Boot procedure documentation with verification steps
**Rules**:
- `sys-hardware-safety` — Prevent operations that can damage hardware (write to raw block devices, firmware flashes without explicit gates)
- `sys-fleet-scope` — Enforce that sys commands operate within declared fleet scope
### Example: Host Profile
```yaml
apiVersion: sys.ops.aiwg.io/v1
kind: HostProfile
metadata:
name: web-01
namespace: production
spec:
hardware:
cpu: "AMD EPYC 7443P, 24 cores"
memory: "128 GB ECC DDR4"
storage:
- device: nvme0n1
size: "2 TB NVMe"
role: os
- device: nvme1n1
size: "4 TB NVMe"
role: data
os:
distribution: Ubuntu
version: "24.04 LTS"
kernel: "6.8.0-41-generic"
roles:
- web
- reverse-proxy
network:
interfaces:
- name: eth0
address: "10.0.1.10/24"
role: management
```
### Example Usage
```
Document the hardware configuration for all hosts in the web tier
```
```
Create a fleet inventory for the database cluster
```
## it — IT Operations
**Scope**: Asset management, configuration management database (CMDB), service deployments, disaster recovery.
Install when your repository manages infrastructure assets beyond individual hosts: service catalogs, DR plans, provisioning procedures, or network state.
### What it Adds
**Agents**:
- `it-asset-manager` — Track and reconcile IT assets against CMDB
- `it-dr-planner` — Generate and validate disaster recovery runbooks
**Templates**:
- `asset-record.yaml` — CMDB record for hardware or software asset
- `dr-runbook.yaml` — Disaster recovery procedure with RTO/RPO targets
- `provisioning-playbook.yaml` — New service or host provisioning procedure
- `network-state.yaml` — Network topology snapshot
**Rules**:
- `it-change-control` — Require change record reference for production changes
- `it-dr-coverage` — Verify that all critical services have DR runbooks
### Example: DR Runbook
```yaml
apiVersion: it.ops.aiwg.io/v1
kind: DisasterRecoveryRunbook
metadata:
name: postgres-primary-failover
namespace: production
spec:
service: postgres-primary
rto: "15 minutes"
rpo: "5 minutes"
triggers:
- "Primary host unreachable for > 2 minutes"
- "Replication lag > 60 seconds"
steps:
- name: verify-primary-down
description: Confirm primary is not responding
run: pg_isready -h postgres-primary -p 5432
verify:
exitCode: 1
- name: promote-replica
description: Promote replica to primary
run: pg_ctl promote -D /var/lib/postgresql/data
verify:
run: pg_isready -h postgres-replica -p 5432
exitCode: 0
```
### Example Usage
```
Generate a DR runbook for the authentication service
```
```
Audit IT assets for the web tier and flag anything not in the CMDB
```
## dev — DevOps / Build Operations
**Scope**: CI/CD pipelines, build automation, fleet-wide tooling, release workflows.
Install when your repository manages CI/CD infrastructure: pipeline definitions, build scripts, deployment automation, or developer tooling that operates across the fleet.
### What dev Adds
**Agents**:
- `dev-pipeline-builder` — Generate CI/CD pipeline definitions from workflow descriptions
- `dev-release-coordinator` — Coordinate multi-stage release procedures
**Templates**:
- `ci-pipeline.yaml` — CI pipeline with build, test, and deploy stages
- `build-playbook.yaml` — Build automation procedure
- `release-runbook.yaml` — Release procedure with gates and rollback
**Rules**:
- `dev-pipeline-safety` — No pipeline should deploy to production without a gate step
- `dev-secret-hygiene` — Secrets must come from vault references, not environment literals
- `dev-idempotent-builds` — Build procedures must be idempotent and produce stable artifacts
### Example: CI Pipeline
```yaml
apiVersion: dev.ops.aiwg.io/v1
kind: CIPipeline
metadata:
name: api-gateway-build
namespace: ci
spec:
trigger:
branches: ["main", "release/*"]
stages:
- name: build
steps:
- run: npm ci
- run: npm run build
artifacts:
- dist/
- name: test
steps:
- run: npm test
- run: npx tsc --noEmit
gates:
coverage: 80
- name: deploy-staging
environment: staging
steps:
- run: ./deploy.sh staging
verify:
- run: ./healthcheck.sh staging
- name: production-gate
type: OpsGate
description: "Manual approval required before production deployment"
- name: deploy-production
environment: production
steps:
- run: ./deploy.sh production
```
### Example Usage
```
Create a CI pipeline for the API service with coverage gate at 80%
```
```
Generate a release runbook for v2.3.0 including staging verification and production gate
```
## stream — Streaming Infrastructure Operations
**Scope**: Streaming infrastructure, transcoders, platform integrations, stream pipeline health.
Install when your repository manages live streaming or media pipeline infrastructure: encoder configurations, CDN integrations, stream health monitoring, or key rotation procedures.
### What stream Adds
**Agents**:
- `stream-deployer` — Deploy and configure streaming service components
- `stream-health-monitor` — Diagnose and remediate stream pipeline health issues
**Templates**:
- `stream-service.yaml` — Streaming service configuration and deployment spec
- `encoder-config.yaml` — Encoder/transcoder configuration
- `stream-pipeline.yaml` — End-to-end pipeline: ingest → transcode → deliver
- `key-rotation-runbook.yaml` — Stream key rotation with zero-downtime procedure
**Rules**:
- `stream-key-safety` — Stream keys must be stored in vault; never in plaintext in ops documents
- `stream-pipeline-gates` — Pipeline changes require health verification before traffic shift
### Example: Stream Service
```yaml
apiVersion: stream.ops.aiwg.io/v1
kind: StreamService
metadata:
name: live-encoder-primary
namespace: production
spec:
role: ingest
upstream: rtmp://ingest.example.com/live
downstream:
- protocol: HLS
output: s3://media-bucket/live/
segmentDuration: 4
redundancy:
enabled: true
failoverTarget: live-encoder-backup
healthCheck:
endpoint: http://localhost:9090/metrics
interval: 10s
```
### Example Usage
```
Create a stream key rotation runbook for the primary live encoder
```
```
Diagnose why the stream pipeline health check is failing
```
## Using Multiple Extensions Together
Extensions compose cleanly. A `sys` `HostProfile` can be referenced by an `it` `DisasterRecoveryRunbook`, which can reference a `dev` `CIPipeline` to rebuild the recovered service. Cross-extension references follow the same `from:` pattern as the base framework.
```bash
# Deploy all four extensions
aiwg use ops --ext sys,it,dev,stream
# Verify
aiwg list
# ops-complete installed
# ops-complete/sys installed
# ops-complete/it installed
# ops-complete/dev installed
# ops-complete/stream installed
```
## Creating a Custom Extension
See `@$AIWG_ROOT/agentic/code/frameworks/ops-complete/docs/overview.md` for the minimal `ADDON.yaml` manifest. Drop the manifest plus any templates, rules, or skills into `agentic/code/extensions/<name>/` and run `aiwg use ops --ext <name>`.