# RuVector Security Remediation Guide
## Overview
This guide provides specific code changes to remediate the 6 critical/high security findings identified in the RuVector init system audit.
---
## REMEDIATION #1: Implement Backup Before Reset
**File:** `.claude/skills/cfn-local-ruvector-accelerator/src/cli/reset.rs`
**Current Risk:** Complete data loss without backup
**Severity:** CRITICAL
### Current Code
```rust
pub fn execute(&self) -> Result<()> {
    let ruvector_dir = self.project_dir.join(".ruvector");

    if !self.confirm {
        eprintln!("⚠️  This will delete all indexed data!");
        eprintln!("To proceed, run with --confirm");
        return Ok(());
    }

    if ruvector_dir.exists() {
        fs::remove_dir_all(&ruvector_dir)?; // UNSAFE: No backup!
        info!("Reset complete: removed .ruvector directory");
    } else {
        info!("No RuVector data found to reset");
    }

    Ok(())
}
```
### Recommended Fix
```rust
use chrono::Local;
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

pub fn execute(&self) -> Result<()> {
    let ruvector_dir = self.project_dir.join(".ruvector");

    if !self.confirm {
        eprintln!("⚠️  This will delete all indexed data!");
        eprintln!("To proceed, run with --confirm");
        return Ok(());
    }

    if ruvector_dir.exists() {
        // STEP 1: Create timestamped backup FIRST
        let backup_dir = self.create_timestamped_backup(&ruvector_dir)?;
        info!("Created backup at: {}", backup_dir.display());

        // STEP 2: Then proceed with deletion
        fs::remove_dir_all(&ruvector_dir)?;
        info!("Reset complete: removed .ruvector directory");
        info!("Backup preserved at: {}", backup_dir.display());

        // STEP 3: Log the operation
        self.log_deletion_event(&backup_dir)?;
    } else {
        info!("No RuVector data found to reset");
    }

    Ok(())
}

fn create_timestamped_backup(&self, source_dir: &Path) -> Result<PathBuf> {
    let timestamp = Local::now().format("%Y%m%d_%H%M%S").to_string();
    let backup_dir = self.project_dir
        .join(".ruvector_backups")
        .join(format!("backup_{}", timestamp));

    fs::create_dir_all(&backup_dir)?;

    // Copy entire directory
    copy_dir_recursive(source_dir, &backup_dir)?;

    debug!("Backup created: {}", backup_dir.display());
    Ok(backup_dir)
}

fn log_deletion_event(&self, backup_location: &Path) -> Result<()> {
    let log_entry = format!(
        "[{}] Reset command executed. Backup: {}",
        Local::now().to_rfc3339(),
        backup_location.display()
    );

    // Write to audit log
    let audit_log = self.project_dir.join(".ruvector_audit.log");
    std::fs::OpenOptions::new()
        .create(true)
        .append(true)
        .open(&audit_log)?
        .write_all(format!("{}\n", log_entry).as_bytes())?;

    Ok(())
}

fn copy_dir_recursive(src: &Path, dst: &Path) -> Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let ty = entry.file_type()?;
        let path = entry.path();
        let new_path = dst.join(entry.file_name());
        if ty.is_dir() {
            copy_dir_recursive(&path, &new_path)?;
        } else {
            fs::copy(&path, &new_path)?;
        }
    }
    Ok(())
}
```
### Verification
```rust
#[test]
fn test_reset_creates_backup_before_deletion() {
    let temp_dir = tempdir().unwrap();
    let test_dir = temp_dir.path();

    // Create test data
    let ruvector = test_dir.join(".ruvector");
    fs::create_dir(&ruvector).unwrap();
    fs::write(ruvector.join("test.txt"), "important data").unwrap();

    // Execute reset with confirm
    let cmd = ResetCommand::new(test_dir, true);
    cmd.execute().unwrap();

    // Verify a timestamped backup exists. Path::join does not expand globs,
    // so list the directory instead of testing a literal "backup_*" path.
    let backups = test_dir.join(".ruvector_backups");
    assert!(backups.exists());
    let has_backup = fs::read_dir(&backups)
        .unwrap()
        .filter_map(|e| e.ok())
        .any(|e| e.file_name().to_string_lossy().starts_with("backup_"));
    assert!(has_backup);

    // Verify original was deleted
    assert!(!ruvector.exists());
}
```
---
## REMEDIATION #2: Change CASCADE to RESTRICT
**File:** `.claude/skills/cfn-local-ruvector-accelerator/src/schema_v2.rs`
**Current Risk:** Uncontrolled cascading deletes
**Severity:** CRITICAL
### Current Code (Lines 232-283)
```sql
CREATE TABLE IF NOT EXISTS entities (
    ...
    parent_id INTEGER,
    ...
    FOREIGN KEY (parent_id) REFERENCES entities(id) ON DELETE CASCADE  -- UNSAFE!
);

CREATE TABLE IF NOT EXISTS refs (
    ...
    FOREIGN KEY (source_entity_id) REFERENCES entities(id) ON DELETE CASCADE  -- UNSAFE!
);

CREATE TABLE IF NOT EXISTS entity_embeddings (
    entity_id INTEGER PRIMARY KEY,
    ...
    FOREIGN KEY (entity_id) REFERENCES entities(id) ON DELETE CASCADE  -- UNSAFE!
);
```
### Recommended Fix
```sql
-- Create the audit log table first, so the trigger below has a valid target
CREATE TABLE IF NOT EXISTS deletion_audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    table_name TEXT NOT NULL,
    entity_id INTEGER,
    entity_kind TEXT,
    deleted_at INTEGER NOT NULL,
    deletion_method TEXT,
    created_at INTEGER DEFAULT (strftime('%s', 'now'))
);

CREATE INDEX IF NOT EXISTS idx_deletion_audit_timestamp
    ON deletion_audit_log(deleted_at);

CREATE TABLE IF NOT EXISTS entities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    kind TEXT NOT NULL,
    name TEXT NOT NULL,
    ...
    parent_id INTEGER,
    ...
    -- Change CASCADE to RESTRICT to prevent silent cascades
    FOREIGN KEY (parent_id) REFERENCES entities(id) ON DELETE RESTRICT
);

-- Create audit trigger to log deletions
CREATE TRIGGER IF NOT EXISTS log_entity_deletion
BEFORE DELETE ON entities
FOR EACH ROW
BEGIN
    INSERT INTO deletion_audit_log (
        table_name,
        entity_id,
        entity_kind,
        deleted_at,
        deletion_method
    ) VALUES (
        'entities',
        OLD.id,
        OLD.kind,
        strftime('%s', 'now'),
        'direct_delete'
    );
END;

CREATE TABLE IF NOT EXISTS refs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_entity_id INTEGER NOT NULL,
    ...
    -- RESTRICT prevents cascading deletes
    FOREIGN KEY (source_entity_id) REFERENCES entities(id) ON DELETE RESTRICT
);

CREATE TABLE IF NOT EXISTS entity_embeddings (
    entity_id INTEGER PRIMARY KEY,
    ...
    -- RESTRICT prevents orphaned embeddings
    FOREIGN KEY (entity_id) REFERENCES entities(id) ON DELETE RESTRICT
);
```
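One caveat worth calling out: SQLite does not enforce foreign keys unless each connection runs `PRAGMA foreign_keys = ON`, so the `ON DELETE RESTRICT` clauses above are inert on a default connection. A minimal sketch demonstrates the difference, using Python's stdlib `sqlite3` and a simplified two-table schema for brevity (not the project's actual code):

```python
import sqlite3

SCHEMA = """
CREATE TABLE entities (id INTEGER PRIMARY KEY, kind TEXT);
CREATE TABLE refs (
    id INTEGER PRIMARY KEY,
    source_entity_id INTEGER NOT NULL,
    FOREIGN KEY (source_entity_id) REFERENCES entities(id) ON DELETE RESTRICT
);
INSERT INTO entities VALUES (1, 'function');
INSERT INTO refs VALUES (1, 1);
"""

def try_delete(enforce: bool) -> bool:
    """Return True if deleting the referenced parent row was blocked."""
    conn = sqlite3.connect(":memory:")
    if enforce:
        # Must be issued per connection, before any data manipulation
        conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    try:
        conn.execute("DELETE FROM entities WHERE id = 1")
        return False
    except sqlite3.IntegrityError:
        return True

print(try_delete(enforce=False))  # False: RESTRICT silently ignored
print(try_delete(enforce=True))   # True: delete rejected as intended
```

In the Rust code this means enabling the pragma right after opening every connection (rusqlite exposes `pragma_update` for this) or the RESTRICT constraints will never fire.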
### Handling RESTRICT Violations
```rust
pub fn safe_delete_entity(&self, entity_id: i64) -> Result<()> {
    // First check for dependent records
    let ref_count: i64 = self.conn.query_row(
        "SELECT COUNT(*) FROM refs WHERE source_entity_id = ?",
        [entity_id],
        |row| row.get(0)
    )?;

    if ref_count > 0 {
        return Err(anyhow!(
            "Cannot delete entity: {} references depend on this entity. \
             Use force_delete_with_cascade() to remove all dependent data.",
            ref_count
        ));
    }

    // Safe to delete
    self.conn.execute("DELETE FROM entities WHERE id = ?", [entity_id])?;
    Ok(())
}

// Takes &mut self: rusqlite's Connection::transaction() requires a
// mutable borrow of the connection.
pub fn force_delete_with_cascade(&mut self, entity_id: i64) -> Result<()> {
    // Only called with explicit user approval
    let tx = self.conn.transaction()?;

    // Delete in dependency order (children first)
    tx.execute("DELETE FROM entity_embeddings WHERE entity_id = ?", [entity_id])?;
    tx.execute("DELETE FROM type_usage WHERE entity_id = ?", [entity_id])?;
    tx.execute("DELETE FROM refs WHERE source_entity_id = ?", [entity_id])?;
    tx.execute("DELETE FROM entities WHERE id = ?", [entity_id])?;

    // Log the cascade (deleted_at is NOT NULL, so it must be supplied here)
    tx.execute(
        "INSERT INTO deletion_audit_log (table_name, entity_id, deleted_at, deletion_method) \
         VALUES ('entities', ?, strftime('%s', 'now'), 'force_cascade_delete')",
        [entity_id]
    )?;

    tx.commit()?;
    Ok(())
}
```
---
## REMEDIATION #3: Add Preview Mode to Cleanup
**File:** `.claude/skills/cfn-local-ruvector-accelerator/src/cli/cleanup.rs`
**Current Risk:** Deletion without visibility into impact
**Severity:** HIGH
### Current Code
```rust
fn remove_old_embeddings(&self, store: &SqliteStore, days: u32) -> Result<()> {
    info!("Removing embeddings older than {} days", days);

    let cutoff = SystemTime::now()
        .duration_since(UNIX_EPOCH)?
        .as_secs() - (days as u64 * 86400);

    let removed = if self.dry_run {
        store.count_old_embeddings(cutoff)?
    } else {
        store.remove_old_embeddings(cutoff)? // UNSAFE: No preview!
    };

    if self.dry_run {
        info!("Would remove {} old embeddings", removed);
    } else {
        info!("Removed {} old embeddings", removed);
    }

    Ok(())
}
```
### Recommended Fix
```rust
pub struct CleanupCommand {
    // ... existing fields ...
    preview: bool,              // NEW: Add preview flag
    backup_before_delete: bool, // NEW: Backup deleted records
}

impl CleanupCommand {
    pub fn execute(&self) -> Result<()> {
        info!("Starting cleanup process");

        if self.dry_run {
            info!("Running in dry-run mode - no changes will be made");
        }

        // STEP 1: Preview what will be deleted
        let preview = self.preview_cleanup()?;
        println!("\n{}", self.format_cleanup_preview(&preview));

        if self.preview {
            info!("Preview mode: showing what would be deleted");
            return Ok(());
        }

        // STEP 2: Get confirmation unless forced
        if !self.force && !self.dry_run {
            eprintln!("\n⚠️  This will permanently delete the records shown above.");
            eprintln!("Run with --force to proceed, or --preview to inspect without changes.");
            return Ok(());
        }

        // STEP 3: Backup deleted records if requested
        if self.backup_before_delete {
            self.export_cleanup_records(&preview)?;
        }

        // STEP 4: Execute cleanup
        self.execute_cleanup(&preview)?;

        Ok(())
    }

    fn preview_cleanup(&self) -> Result<CleanupPreview> {
        let store = SqliteStore::new(&self.project_dir.join(".ruvector").join("index.db"))?;
        let mut preview = CleanupPreview::default();

        if let Some(days) = self.older_than {
            let cutoff = SystemTime::now()
                .duration_since(UNIX_EPOCH)?
                .as_secs() - (days as u64 * 86400);
            // The preview fields are Option<_>, so wrap the counts in Some(...)
            preview.old_embeddings_count = Some(store.count_old_embeddings(cutoff)?);
            preview.oldest_embedding_date = store.find_oldest_embedding_before(cutoff)?;
        }

        if self.remove_orphans {
            preview.orphaned_embeddings_count = Some(store.count_orphaned_embeddings()?);
        }

        Ok(preview)
    }

    fn format_cleanup_preview(&self, preview: &CleanupPreview) -> String {
        let mut output = String::from("\n=== Cleanup Preview ===\n");

        if let Some(count) = preview.old_embeddings_count {
            output.push_str(&format!(
                "  Old embeddings (>{} days): {}\n",
                self.older_than.unwrap_or(30),
                count
            ));
            if let Some(date) = &preview.oldest_embedding_date {
                output.push_str(&format!("  Oldest embedding from: {}\n", date));
            }
        }

        if let Some(count) = preview.orphaned_embeddings_count {
            output.push_str(&format!("  Orphaned embeddings: {}\n", count));
        }

        output
    }

    fn export_cleanup_records(&self, preview: &CleanupPreview) -> Result<()> {
        let timestamp = Local::now().format("%Y%m%d_%H%M%S").to_string();
        let export_path = self.project_dir
            .join(".ruvector_backups")
            .join(format!("cleanup_export_{}.json", timestamp));

        fs::create_dir_all(export_path.parent().unwrap())?;

        // Export records before deletion
        let records = self.collect_records_for_deletion()?;
        let json = serde_json::to_string_pretty(&records)?;
        fs::write(&export_path, json)?;

        info!("Exported {} records to: {}", records.len(), export_path.display());
        Ok(())
    }
}

#[derive(Debug, Default)]
struct CleanupPreview {
    old_embeddings_count: Option<usize>,
    oldest_embedding_date: Option<String>,
    orphaned_embeddings_count: Option<usize>,
}
```
---
## REMEDIATION #4: Preserve Migration Backups
**File:** `.claude/skills/cfn-local-ruvector-accelerator/src/migration.rs`
**Current Risk:** Immediate loss of backup after migration
**Severity:** CRITICAL
### Current Code
```rust
fn cleanup_after_migration(&self, old_version: u32) -> Result<()> {
    // ... validation code ...

    // Drop backup tables after successful migration
    self.conn.execute_batch(
        r#"
        DROP TABLE IF EXISTS embeddings_v1_backup;
        DROP TABLE IF EXISTS files_v1_backup;
        "#
    )?;

    self.conn.execute("VACUUM", [])?;
    Ok(())
}
```
### Recommended Fix
```rust
const BACKUP_RETENTION_DAYS: u32 = 7;

fn cleanup_after_migration(&self, old_version: u32) -> Result<()> {
    info!("Cleaning up after migration from version {}", old_version);

    // Verify migration was successful
    let new_entities_count: i64 = self.conn.query_row(
        "SELECT COUNT(*) FROM entities",
        [],
        |row| row.get(0)
    )?;

    if new_entities_count == 0 && old_version > 0 {
        warn!("No entities found after migration, keeping backup tables");
        return Ok(());
    }

    // Create recovery record BEFORE dropping backups
    self.create_backup_recovery_record(old_version)?;

    // Check backup retention policy
    if self.should_keep_backup(old_version)? {
        info!("Keeping backup tables for {} days (recovery period)", BACKUP_RETENTION_DAYS);
        return Ok(());
    }

    // Safe to drop backups - but create export first
    info!("Exporting backup data before cleanup");
    self.export_backup_tables()?;

    // ONLY NOW drop backup tables
    self.conn.execute_batch(
        r#"
        DROP TABLE IF EXISTS embeddings_v1_backup;
        DROP TABLE IF EXISTS files_v1_backup;
        "#
    )?;
    info!("Backup tables dropped successfully");

    // Run VACUUM to reclaim space
    debug!("Running VACUUM to reclaim database space");
    self.conn.execute("VACUUM", [])?;

    Ok(())
}

fn should_keep_backup(&self, old_version: u32) -> Result<bool> {
    // Check when migration was done
    let migration_time: i64 = self.conn.query_row(
        "SELECT applied_at FROM schema_version WHERE version = ? LIMIT 1",
        [old_version],
        |row| row.get(0)
    )?;

    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)?
        .as_secs() as i64;
    let age_days = (now - migration_time) / (24 * 3600);

    Ok(age_days < BACKUP_RETENTION_DAYS as i64)
}

fn create_backup_recovery_record(&self, old_version: u32) -> Result<()> {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)?
        .as_secs();

    self.conn.execute(
        "INSERT INTO migration_recovery (
            source_version, target_version, backup_created_at,
            backup_expires_at, status
        ) VALUES (?, ?, ?, ?, 'active')",
        rusqlite::params![
            old_version,
            2,
            now,
            now + (BACKUP_RETENTION_DAYS as u64 * 24 * 3600)
        ]
    )?;
    Ok(())
}

fn export_backup_tables(&self) -> Result<()> {
    use chrono::Local;
    use std::io;

    let timestamp = Local::now().format("%Y%m%d_%H%M%S").to_string();
    let backup_dir = self.db_path
        .parent()
        .unwrap()
        .join(format!("migration_backup_{}", timestamp));
    fs::create_dir_all(&backup_dir)?;

    // Export v1 embeddings
    let mut stmt = self.conn.prepare(
        "SELECT pattern, embedding, metadata FROM embeddings_v1_backup"
    )?;
    let embeddings_file = fs::File::create(backup_dir.join("embeddings.json"))?;
    let writer = io::BufWriter::new(embeddings_file);
    // ... write JSON records ...

    info!("Backup exported to: {}", backup_dir.display());
    Ok(())
}
```
---
## REMEDIATION #5: Protect index_all.sh
**File:** `.claude/skills/cfn-local-ruvector-accelerator/index_all.sh`
**Current Risk:** Unconditional index deletion
**Severity:** CRITICAL
### Current Code
```bash
#!/bin/bash
# Index all files in the project
echo "Starting comprehensive indexing of all files..."
cd .claude/skills/cfn-local-ruvector-accelerator
# Clear existing index
rm -rf index/ # UNSAFE: No confirmation, no backup!
```
### Recommended Fix
```bash
#!/bin/bash
# Index all files in the project
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
INDEX_DIR="$SCRIPT_DIR/index"
PRESERVE_INDEX="${PRESERVE_INDEX:-false}"
BACKUP_INDEX="${BACKUP_INDEX:-true}"

log_info() {
    echo "[INFO] $1"
}

log_error() {
    echo "[ERROR] $1" >&2
}

# Function to backup index
backup_index() {
    if [[ ! -d "$INDEX_DIR" ]]; then
        return 0
    fi
    local timestamp
    timestamp=$(date +%Y%m%d_%H%M%S)
    local backup_dir="${INDEX_DIR}_backup_${timestamp}"
    log_info "Creating backup: $backup_dir"
    cp -r "$INDEX_DIR" "$backup_dir"
    log_info "Backup created successfully"
}

# Function to clear index
clear_index() {
    if [[ ! -d "$INDEX_DIR" ]]; then
        log_info "No existing index to clear"
        return 0
    fi
    if [[ "$BACKUP_INDEX" == "true" ]]; then
        backup_index
    fi
    log_info "Clearing index directory"
    rm -rf "$INDEX_DIR"
    # Log the action
    echo "$(date '+%Y-%m-%d %H:%M:%S') - Index cleared" >> "$SCRIPT_DIR/.index_audit.log"
}

# Show usage
show_usage() {
    cat << 'EOF'
Usage: ./index_all.sh [OPTIONS]

Options:
  --preserve-index   Keep existing index (incremental update)
  --no-backup        Don't backup before clearing
  --force            Force re-indexing all files

Environment:
  PRESERVE_INDEX=true ./index_all.sh
  BACKUP_INDEX=false ./index_all.sh
EOF
}

# Parse arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        --preserve-index)
            PRESERVE_INDEX=true
            shift
            ;;
        --no-backup)
            BACKUP_INDEX=false
            shift
            ;;
        --force)
            # Force reindexing (default behavior after clear)
            shift
            ;;
        --help)
            show_usage
            exit 0
            ;;
        *)
            log_error "Unknown option: $1"
            show_usage
            exit 1
            ;;
    esac
done

log_info "Starting comprehensive indexing of all files..."
cd "$SCRIPT_DIR"

# Clear or preserve index
if [[ "$PRESERVE_INDEX" == "true" ]]; then
    log_info "Preserving existing index (incremental mode)"
else
    log_info "Index will be cleared and rebuilt"
    clear_index
fi

# ... rest of indexing code ...
```
---
## REMEDIATION #6: Protect Test Script Cleanup
**File:** `.claude/skills/cfn-local-ruvector-accelerator/test-local-ruvector.sh`
**Current Risk:** Unprotected directory deletion via variables
**Severity:** MEDIUM
### Current Code
```bash
# Clean up previous test
rm -rf "$STORAGE_PATH" "$TEST_DIR"
```
### Recommended Fix
```bash
#!/bin/bash
# test-local-ruvector.sh - Test Local RuVector implementation
set -euo pipefail

# Use mktemp for safer temporary directories
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
STORAGE_PATH="$(mktemp -d -t ruvector-test-storage-XXXXXX)"
TEST_DIR="$(mktemp -d -t ruvector-test-project-XXXXXX)"

# Cleanup trap - only runs on exit
cleanup() {
    local exit_code=$?
    if [[ -d "$STORAGE_PATH" ]]; then
        echo "Cleaning up test storage: $STORAGE_PATH"
        rm -rf "$STORAGE_PATH"
    fi
    if [[ -d "$TEST_DIR" ]]; then
        echo "Cleaning up test directory: $TEST_DIR"
        rm -rf "$TEST_DIR"
    fi
    exit $exit_code
}
trap cleanup EXIT

# Verify paths are safe (sanity checks). mktemp honors $TMPDIR, so match
# against it instead of hard-coding /tmp.
if [[ "$STORAGE_PATH" != "${TMPDIR:-/tmp}"*ruvector-test-storage* ]]; then
    echo "ERROR: Invalid storage path: $STORAGE_PATH" >&2
    exit 1
fi
if [[ "$TEST_DIR" != "${TMPDIR:-/tmp}"*ruvector-test-project* ]]; then
    echo "ERROR: Invalid test directory: $TEST_DIR" >&2
    exit 1
fi

# Verify the freshly created storage directory is empty
if [[ -d "$STORAGE_PATH" && -n $(find "$STORAGE_PATH" -type f 2>/dev/null | head -1) ]]; then
    echo "ERROR: Storage path not empty: $STORAGE_PATH" >&2
    exit 1
fi

echo "🧪 Testing Local RuVector Accelerator..."
echo "Storage: $STORAGE_PATH"
echo "Test Dir: $TEST_DIR"

mkdir -p "$TEST_DIR"

# ... rest of test code ...

# Note: cleanup happens automatically via trap on exit
```
---
## Implementation Checklist
- [ ] Remediation #1: Reset backup mechanism implemented
- [ ] Remediation #2: CASCADE changed to RESTRICT
- [ ] Remediation #3: Cleanup preview mode added
- [ ] Remediation #4: Migration backups retained 7 days
- [ ] Remediation #5: index_all.sh protected
- [ ] Remediation #6: Test script cleanup protected
- [ ] Unit tests added for each remediation
- [ ] Integration tests verify fixes
- [ ] Backward compatibility verified
- [ ] Performance impact assessed
- [ ] Deployment plan documented
- [ ] Team review completed
- [ ] Production deployment approved
---
## Testing Strategy
Each remediation should include:
1. Unit test for the specific fix
2. Integration test with real data
3. Edge case testing (empty dirs, permissions, etc.)
4. Concurrent operation testing
5. Recovery/rollback testing
---
## Deployment Order
1. First: Implement backups (Remediation #1, #4)
2. Second: Add constraints (Remediation #2)
3. Third: Enhance UX (Remediation #3)
4. Fourth: Script fixes (Remediation #5, #6)
5. Finally: Full integration test and production deployment
---
## Success Criteria
- All destructive operations have backups
- No silent cascading deletes
- All dangerous operations require explicit confirmation
- Audit trail exists for all deletions
- Recovery mechanism available for 7+ days
- Tests pass 100%
- No performance regression
- All findings marked REMEDIATED