# Security Incident Runbook

**Incident ID**: INC-{id}
**Date Declared**: {YYYY-MM-DD HH:MM UTC}
**Declared By**: {operator}
**Incident Commander**: {operator}
**Status**: {active|contained|remediated|closed}

---

## Incident Classification

| Field | Value |
|-------|-------|
| Severity | {P1-Critical \| P2-High \| P3-Medium \| P4-Low} |
| Type | {unauthorized-access \| data-breach \| ransomware \| key-compromise \| service-disruption \| insider-threat \| supply-chain} |
| Affected Systems | {comma-separated list of hostnames, services, or data stores} |
| Data Classification | {public \| internal \| confidential \| regulated} |
| Estimated Scope | {number of users, records, or systems potentially affected} |
| Regulatory Trigger | {GDPR \| HIPAA \| PCI-DSS \| SOC2 \| none} |

### Severity Definitions

| Severity | Definition | Response Time |
|----------|-----------|---------------|
| P1-Critical | Active breach, data exfiltration in progress, ransomware executing, CA key compromise | Immediate, all hands |
| P2-High | Confirmed unauthorized access, suspected data access, credential compromise | Within 1 hour |
| P3-Medium | Suspicious activity, failed intrusion attempt, policy violation | Within 4 hours |
| P4-Low | Anomalous behavior, informational finding, near-miss | Next business day |

---

## Containment

### Immediate Actions (complete within first 30 minutes for P1/P2)

- [ ] Incident declared and incident commander assigned
- [ ] Affected systems identified
- [ ] Stakeholders notified (see Communication section)
- [ ] Evidence preservation started BEFORE containment actions (see Evidence Collection)
- [ ] Containment actions initiated

### Network Isolation (if needed)

**CAUTION**: Network isolation may destroy volatile evidence. Capture memory and network state BEFORE isolating.

```bash
# Capture current network connections and routes before isolation
# (one UTC timestamp shared by both files; sudo lets ss -p show all processes)
TIMESTAMP=$(date -u +%Y%m%dT%H%M%SZ)
sudo ss -tnp > /tmp/incident-network-state-${TIMESTAMP}.txt
netstat -rn > /tmp/incident-routes-${TIMESTAMP}.txt

# Block outbound from compromised host (adjust interface and target)
# Present to operator: requires root and knowledge of network topology
# WARNING: these rules drop ALL outbound and forwarded traffic,
# including replies for remote (SSH) sessions to this host
sudo iptables -I OUTPUT -j DROP
sudo iptables -I FORWARD -j DROP

# Or use firewall zone change (firewalld)
sudo firewall-cmd --zone=drop --change-interface={eth0}
```

### Credential Containment (if credential compromise suspected)

1. Rotate affected SSH keys or certificates immediately:
   - Revoke SSH certificates via SSH CA KRL (see SSH CA Runbook Procedure 4)
   - Remove `authorized_keys` entries for compromised keys from all hosts
   - Rotate service account API keys and tokens
2. Revoke TLS certificates if key material is suspected compromised:
   - Follow CA Operations Runbook Procedure 3 (Revoke Certificate)
3. Disable compromised operator accounts in IdP (Keycloak, LDAP, AD)
4. Rotate LUKS passphrase on affected host (do NOT remove TPM2 slot; follow LUKS Enrollment Runbook protocol)

---

## Evidence Collection

Collect evidence before any containment action that would destroy it, and record chain of custody for every item collected.

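
Recording chain of custody means capturing, for each item, when it was collected, by whom, where it is stored, and its SHA-256 hash. The following is a minimal sketch of how one row of the Chain of Custody Record could be generated. The function name `custody_row` and its argument order are hypothetical and not part of this runbook; it assumes `sha256sum` from GNU coreutils is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the runbook): print one Markdown row
# for the Chain of Custody Record table.
# Usage: custody_row <file> <collector> [description]
custody_row() {
  local file="$1" collector="$2"
  local description="${3:-$(basename "$1")}"
  local hash collected_at

  # Refuse to emit a row for evidence that cannot be read
  if [ ! -r "$file" ]; then
    echo "custody_row: cannot read $file" >&2
    return 1
  fi

  # Hash from stdin so the output is always "<hash>  -"; keep only the hash
  hash=$(sha256sum < "$file") || return 1
  hash=${hash%% *}

  # Collection time in UTC, matching the table's YYYY-MM-DD HH:MM format
  collected_at=$(date -u +"%Y-%m-%d %H:%M")

  printf '| %s | %s | %s | %s | %s |\n' \
    "$description" "$collected_at" "$collector" "$file" "$hash"
}

# Example (placeholders as used elsewhere in this template):
# custody_row "/secure/incident-{INC-id}/logs/auth-${TIMESTAMP}.log" "{operator}" "auth log"
```

Hashing at collection time, rather than later, lets the recorded hash show that the stored copy has not changed since it was collected.
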
### Chain of Custody Record

| Item | Collection Time (UTC) | Collector | Storage Location | Hash (SHA256) |
|------|----------------------|-----------|-----------------|---------------|
| {description} | {YYYY-MM-DD HH:MM} | {operator} | {path or location} | {hash} |

### Log Preservation

```bash
# Collect and hash system logs before rotation
TIMESTAMP=$(date -u +%Y%m%dT%H%M%SZ)
mkdir -p /secure/incident-{INC-id}/logs

# System logs
sudo journalctl -a --since="24 hours ago" > /secure/incident-{INC-id}/logs/journal-${TIMESTAMP}.log
sha256sum /secure/incident-{INC-id}/logs/journal-${TIMESTAMP}.log

# Auth log (Debian/Ubuntu path; RHEL-family systems use /var/log/secure)
sudo cp /var/log/auth.log /secure/incident-{INC-id}/logs/auth-${TIMESTAMP}.log
sha256sum /secure/incident-{INC-id}/logs/auth-${TIMESTAMP}.log

# Syslog (Debian/Ubuntu path; RHEL-family systems use /var/log/messages)
sudo cp /var/log/syslog /secure/incident-{INC-id}/logs/syslog-${TIMESTAMP}.log
sha256sum /secure/incident-{INC-id}/logs/syslog-${TIMESTAMP}.log

# Application logs (adjust paths)
sudo cp -r /var/log/{nginx,apache2,postgresql} /secure/incident-{INC-id}/logs/
```

### Memory Capture (P1 incidents: do before any reboot or shutdown)

```bash
# Reuse TIMESTAMP from Log Preservation, or set it if this runs in a new shell
TIMESTAMP=${TIMESTAMP:-$(date -u +%Y%m%dT%H%M%SZ)}

# Capture running process list
ps auxf > /secure/incident-{INC-id}/processes-${TIMESTAMP}.txt

# Capture open network connections
ss -tnp > /secure/incident-{INC-id}/network-connections-${TIMESTAMP}.txt

# Capture loaded kernel modules
lsmod > /secure/incident-{INC-id}/kernel-modules-${TIMESTAMP}.txt

# Full memory dump (requires LiME or equivalent; large, requires planning)
# sudo insmod lime-$(uname -r).ko "path=/secure/incident-{INC-id}/memory.lime format=lime"
```

### Disk Image (P1 incidents with suspected persistent malware)

```bash
# Create forensic image of affected disk
# (do NOT image the running OS disk while it is mounted read-write)
# Boot from forensic live environment, then:
# sudo dcfldd if=/dev/{device} of=/secure/incident-{INC-id}/disk-image.dd hash=sha256 hashlog=/secure/incident-{INC-id}/disk-image.hash
```

---

## Analysis

### Indicators of Compromise (IOC) Extraction

Document all IOCs discovered during investigation:

| IOC Type | Value | Context | Confidence |
|----------|-------|---------|-----------|
| IP Address | {ip} | {source of finding} | {high\|medium\|low} |
| Domain | {domain} | {source of finding} | {high\|medium\|low} |
| File Hash | {sha256} | {file path} | {high\|medium\|low} |
| User Account | {username} | {context} | {high\|medium\|low} |
| SSH Key Fingerprint | {SHA256:...} | {context} | {high\|medium\|low} |

### Timeline

| Time (UTC) | Event | Evidence Source |
|-----------|-------|-----------------|
| {YYYY-MM-DD HH:MM} | {description of event} | {log file, alert, operator observation} |

### Root Cause Analysis

**Initial Access Vector**: {phishing \| stolen-credential \| vulnerability-exploit \| insider \| supply-chain \| unknown}

**Attack Path**:

1. {Step 1 of observed attack chain}
2. {Step 2}
3. {Step 3}

**Root Cause**: {description of the fundamental vulnerability, misconfiguration, or gap that enabled the incident}

---

## Remediation

### Immediate Remediation (within 24 hours for P1/P2)

- [ ] {Specific action: patch vulnerability CVE-XXXX-YYYY on affected hosts}
- [ ] {Rotate all credentials that may have been exposed}
- [ ] {Remove persistence mechanisms identified during analysis}
- [ ] {Update access controls to prevent recurrence}

### Short-Term Remediation (within 1 week)

- [ ] {Deploy detection rule for IOCs to SIEM}
- [ ] {Harden affected system configurations}
- [ ] {Update incident response playbooks based on lessons learned}

### Long-Term Remediation (within 30 days)

- [ ] {Architectural or process change to address root cause}
- [ ] {Security training for affected team members}
- [ ] {Additional monitoring or alerting}

---

## Communication

### Internal Notification

| Role | Person | Notified At (UTC) | Method |
|------|--------|------------------|--------|
| On-Call | {name} | {YYYY-MM-DD HH:MM} | {PagerDuty\|Slack\|phone} |
| Security Lead | {name} | {YYYY-MM-DD HH:MM} | {method} |
| Engineering Lead | {name} | {YYYY-MM-DD HH:MM} | {method} |
| Legal / Compliance | {name} | {YYYY-MM-DD HH:MM} | {method} |
| Executive | {name} | {YYYY-MM-DD HH:MM} | {method} |

### External Disclosure (if required)

**Regulatory Notification Deadlines**:

| Framework | Trigger | Deadline | Notified |
|-----------|---------|----------|---------|
| GDPR | Personal data of EU residents affected | 72 hours from discovery | {yes\|no\|not-applicable} |
| HIPAA | PHI affected | 60 days from discovery | {yes\|no\|not-applicable} |
| PCI-DSS | Cardholder data affected | Immediately to acquirer + card brands | {yes\|no\|not-applicable} |
| State breach laws | PII of state residents affected | Varies by state (30-90 days) | {yes\|no\|not-applicable} |

**Affected Users / Customers**: {yes \| no \| under-investigation}

If yes, notification plan: {description}

---

## Postmortem

**Postmortem Date**: {YYYY-MM-DD}
**Facilitator**: {operator}
**Participants**: {list}

### Timeline (Final)

| Time (UTC) | Event |
|-----------|-------|
| {YYYY-MM-DD HH:MM} | {event} |

### What Went Well

- {detection was fast because ...}
- {containment procedure worked because ...}

### What Could Have Been Better

- {detection gap: ...}
- {response gap: ...}

### Contributing Factors

| Factor | Category | Explanation |
|--------|----------|-------------|
| {factor} | {technical\|process\|people\|tooling} | {explanation} |

### Prevention Measures

| Measure | Owner | Target Date | Status |
|---------|-------|-------------|--------|
| {specific preventive action} | {operator} | {YYYY-MM-DD} | {planned\|in-progress\|done} |

### Incident Metrics

| Metric | Value |
|--------|-------|
| Time to Detection | {N} minutes/hours |
| Time to Containment | {N} minutes/hours |
| Time to Remediation | {N} hours/days |
| Time to Closure | {N} hours/days |
| Systems Affected | {N} |
| Users / Records Affected | {N} or unknown |
| Total Incident Cost | {estimate or TBD} |
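
The time-based values in the Incident Metrics table can be derived from timestamps already recorded in the timeline. The following is a minimal sketch, not a prescribed method. The function name `elapsed` is hypothetical, and it assumes GNU `date` (for the `-d` option) and timestamps in the template's `YYYY-MM-DD HH:MM` UTC format.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the runbook): elapsed time between two
# UTC timestamps, printed as "<days>d <hours>h <minutes>m".
# Assumes GNU date; with -u, the input timestamps are parsed as UTC.
# Usage: elapsed "<start YYYY-MM-DD HH:MM>" "<end YYYY-MM-DD HH:MM>"
elapsed() {
  local start end secs

  # Convert both timestamps to seconds since the epoch
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1

  secs=$(( end - start ))
  if [ "$secs" -lt 0 ]; then
    echo "elapsed: end precedes start" >&2
    return 1
  fi

  # Split the difference into whole days, hours, and minutes
  printf '%dd %dh %dm\n' \
    $(( secs / 86400 )) $(( secs % 86400 / 3600 )) $(( secs % 3600 / 60 ))
}

# Example: Time to Containment = containment time minus declaration time
# elapsed "{YYYY-MM-DD HH:MM}" "{YYYY-MM-DD HH:MM}"
```

Working from epoch seconds avoids hand-calculating durations that cross midnight or month boundaries.
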