================================================================================
RUVECTOR QUERY ISOLATION AUDIT - EXECUTIVE SUMMARY
================================================================================
ASSESSMENT DATE: 2025-12-11
AUDIT STATUS: CRITICAL VULNERABILITIES IDENTIFIED
DATABASE: ~/.local/share/ruvector/index_v2.db (Centralized, Multi-Project)
ENTITIES AT RISK: 783,891+ rows (Project A + Project B + others)
================================================================================
CRITICAL FINDING
================================================================================
The RuVector centralized database provides ZERO cross-project isolation.
Any query from Project B can access Project A's entire codebase.
No database-level filtering exists.
No WHERE clause project restrictions in core queries.
No project identifier column in schema.
Result: 100% DATA LEAKAGE in multi-project scenarios.
================================================================================
ISOLATION AUDIT RESULTS
================================================================================
QUERY ANALYSIS
--------------
Function                                Filtering   WHERE Clause           Risk Level
═════════════════════════════════════════════════════════════════════════════════════
QueryV2::search()                       NONE        NO                     CRITICAL
QueryV2::search_similar_entities()      NONE        NO                     CRITICAL
StoreV2::find_entities_by_name()        NONE        NO                     CRITICAL
StoreV2::find_entities_by_kind()        NONE        NO                     CRITICAL
StoreV2::search_entities()              NONE        NO                     CRITICAL
StoreV2::find_references_to_entity()    NONE        NO                     CRITICAL
StoreV2::find_references_from_entity()  NONE        NO                     CRITICAL
StoreV2::find_entities_using_type()     NONE        NO                     CRITICAL
QueryCommand::execute()                 OPTIONAL    STRING (client-side)   HIGH
StoreV2::find_entities_in_file()        PATH ONLY   NO VALIDATION          MEDIUM
Total methods analyzed: 10
Unfiltered global results: 8/10 (80%)
Client-side filtering: 1/10 (insufficient)
Properly isolated: 0/10
================================================================================
PATH-BASED IDENTIFICATION ASSESSMENT
================================================================================
Implementation Type: File path string as sole isolation mechanism
Example Paths:
/home/user/project-a/src/auth.rs
/home/user/project-b/src/auth.rs
/home/user/project-c/src/oauth.ts
Strengths:
✓ Path is recorded in database
✓ Can enable retrospective filtering
✓ Supports detection of cross-project leakage
Weaknesses:
✗ No validation that queries use current project's path
✗ No path canonicalization (symlinks, relative paths unreliable)
✗ No directory traversal protection (../../../)
✗ Substring matching vulnerable (file_filter.contains() easily bypassed)
✗ Case sensitivity issues on case-insensitive filesystems
✗ No structured project ID column
✗ No database-level constraint enforcement
Edge Cases:
• Symlink attacks: Link to other project accessible via symlink
• Relative path traversal: ../../project-a/src/ not validated
• Path case issues: PROJECT-A vs project-a mismatch
• Network paths: NFS, Samba mounts may resolve differently
CONCLUSION: Path-based isolation INSUFFICIENT for production use.
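The substring weakness listed above can be illustrated with a short sketch. It assumes the client-side check behaves like str::contains on the stored path, as the name file_filter.contains() suggests. The helper names substring_filter and component_filter are hypothetical and are not part of RuVector.

```rust
use std::path::Path;

/// Hypothetical stand-in for the client-side check: plain substring match.
fn substring_filter(file_path: &str, file_filter: &str) -> bool {
    file_path.contains(file_filter)
}

/// Component-wise prefix check: `Path::starts_with` only matches whole
/// path components, so "project-a" does not match "project-a-fork".
fn component_filter(file_path: &str, project_root: &str) -> bool {
    Path::new(file_path).starts_with(Path::new(project_root))
}

fn main() {
    let root = "/home/user/project-a";
    let paths = [
        "/home/user/project-a/src/auth.rs",          // inside the project
        "/home/user/project-a-fork/src/auth.rs",     // sibling with a shared prefix
        "/tmp/x/home/user/project-a/src/auth.rs",    // root string embedded deeper
    ];
    for path in paths {
        println!(
            "{path}: substring={} component={}",
            substring_filter(path, root),
            component_filter(path, root)
        );
    }
}
```

The component-wise check is purely lexical. It does not resolve symlinks or ".." segments, so it addresses only the substring weakness and none of the other items in the list.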
================================================================================
CROSS-PROJECT LEAKAGE RISKS (SEVERITY LEVELS)
================================================================================
RISK #1: SEMANTIC SEARCH LEAKAGE [CRITICAL]
────────────────────────────────────────────
Vector: query.search("authentication", 10, 0.5)
Impact: Returns auth code from ALL projects
Exposure: 100% of matching entities across all projects
Exploit: 0-click, runs on every search
Fix Time: Add WHERE project_root = ? clause (30 minutes)
Status: EXPLOITABLE NOW
RISK #2: ENTITY ENUMERATION BY KIND [CRITICAL]
────────────────────────────────────────────
Vector: store.find_entities_by_kind(Class, 500)
Impact: Gets 500 classes from Project A, B, C...
Exposure: All class definitions across all projects
Exploit: Loop over EntityKind enum (8 types total)
Fix Time: Add WHERE project_root = ? clause (30 minutes)
Status: EXPLOITABLE NOW
RISK #3: NAME-BASED ENTITY DISCOVERY [CRITICAL]
────────────────────────────────────────────
Vector: store.find_entities_by_name("authenticate", 500)
Impact: Returns all authenticate() functions from all projects
Exposure: Function signatures, implementations, patterns
Exploit: Guess common function names
Fix Time: Add WHERE project_root = ? clause (30 minutes)
Status: EXPLOITABLE NOW
RISK #4: SIMILAR ENTITY MAPPING [CRITICAL]
────────────────────────────────────────────
Vector: query.search_similar_entities(entity_id, 10, 0.5)
Impact: Maps similar code across all projects
Exposure: Reveals architecture, design patterns, variable naming
Exploit: Brute force entity IDs (1-9B range)
Fix Time: Add project_root param and WHERE filter (1 hour)
Status: EXPLOITABLE NOW
RISK #5: DIRECTORY TRAVERSAL [CRITICAL]
────────────────────────────────────────
Vector: store.find_entities_in_file("/home/user/project-a/src/auth.rs")
Impact: Direct access to any project's file
Exposure: Complete file entity lists for any project
Exploit: Guess project paths (easy with standard naming)
Fix Time: Add path validation and canonicalization (2 hours)
Status: EXPLOITABLE NOW
RISK #6: BATCH QUERY UNFILTERING [HIGH]
────────────────────────────────────────
Vector: BatchQueryCommand reads queries from external file
Impact: No per-line project scoping in batch mode
Exposure: All queries return global results
Exploit: Create batch file with cross-project queries
Fix Time: Apply same fixes as Risk #1
Status: EXPLOITABLE NOW
================================================================================
VULNERABILITY TEST SCENARIO
================================================================================
SETUP:
Centralized DB contains:
Project A: /home/user/project-a/ (783,891 entities)
Includes: auth.ts, database.ts, crypto.ts
Project B: /home/user/project-b/ (empty, just initialized)
ACTION (from Project B context):
query.search("authentication", 10, 0.5)
EXPECTED RESULT:
Query returns 0 results (Project B empty)
OR
Query returns only Project B results if any match
ACTUAL RESULT:
Query returns results from Project A:
[1] authenticate() from /home/user/project-a/src/auth.ts
[2] OAuth provider from /home/user/project-a/src/oauth.ts
[3] SessionManager from /home/user/project-a/src/session.ts
[4] CredentialManager from /home/user/project-a/src/crypto.ts
... (up to 10 results from Project A)
ASSESSMENT: 100% LEAKAGE CONFIRMED
No filtering mechanism prevents Project B from accessing Project A's code.
================================================================================
DATABASE SCHEMA ISSUES
================================================================================
MISSING COLUMNS:
□ project_root TEXT NOT NULL - Primary isolation mechanism
□ project_id INTEGER - Explicit project reference
□ access_level - Could support future fine-grained access control
MISSING CONSTRAINTS:
□ UNIQUE(project_root, file_path) - Prevent cross-project dupes
□ CHECK(project_root LIKE '/home/%') - Validate path format
□ FK constraint on refs ensuring same-project references
MISSING INDEXES:
□ idx_entities_project_kind ON entities(project_root, kind)
□ idx_entities_project_name ON entities(project_root, name)
□ idx_refs_project_source ON refs(project_root, source_entity_id)
□ idx_type_usage_project_entity ON type_usage(project_root, entity_id)
CURRENT INDEXES (UNUSED):
✓ idx_entities_file_path - Created but queries don't use WHERE file_path
✓ idx_refs_file_path - Created but queries don't filter on file_path
SCHEMA RISK LEVEL: CRITICAL
Requires migration to add isolation guarantees.
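As a rough sketch, the missing column and indexes could be emitted as migration statements from one place. The table list (entities, refs, type_usage, modules) and the DEFAULT '' placeholder are taken from elsewhere in this audit. The function name migration_statements is hypothetical, and the SQL is assumed to be accepted by the engine in use, which should be verified before running it.

```rust
/// Tables the audit names as needing a project_root column.
const TABLES: [&str; 4] = ["entities", "refs", "type_usage", "modules"];

/// (index name, table, columns), copied from the MISSING INDEXES list.
const INDEXES: [(&str, &str, &str); 4] = [
    ("idx_entities_project_kind", "entities", "project_root, kind"),
    ("idx_entities_project_name", "entities", "project_root, name"),
    ("idx_refs_project_source", "refs", "project_root, source_entity_id"),
    ("idx_type_usage_project_entity", "type_usage", "project_root, entity_id"),
];

/// Builds the statements in order: columns first, then the indexes that use them.
fn migration_statements() -> Vec<String> {
    let mut stmts = Vec::new();
    for table in TABLES {
        stmts.push(format!(
            "ALTER TABLE {table} ADD COLUMN project_root TEXT NOT NULL DEFAULT ''"
        ));
    }
    for (name, table, cols) in INDEXES {
        stmts.push(format!("CREATE INDEX IF NOT EXISTS {name} ON {table}({cols})"));
    }
    stmts
}

fn main() {
    for stmt in migration_statements() {
        println!("{stmt};");
    }
}
```

The placeholder default still has to be backfilled with real project roots before any query relies on the column.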
================================================================================
CENTRALIZED DB ISOLATION GUARANTEES
================================================================================
Database-Level Enforcement: NONE
Application-Level Filtering: OPTIONAL (file_filter parameter)
Code-Level Validation: NONE
Current Flow:
CLI: QueryCommand.execute()
├─ Captures project_dir
├─ Calls QueryV2.search() ← NO project_root passed
│ ├─ SQL: "SELECT * FROM entities e JOIN entity_embeddings ee ON ..."
│ │ ↑ NO WHERE clause
│ ├─ Returns ALL rows matching similarity
│ └─ No project filtering
├─ Optional file_filter.contains() check ← Client-side, insufficient
└─ Output results
Problems:
1. Project context lost between CLI and query layer
2. No WHERE clause enforces isolation in SQL
3. Client-side filtering optional and easily bypassed
4. file_filter uses substring match (false positives)
5. Defense-in-depth violated (no DB-level security)
Fix Requires:
1. Add project_root column to schema
2. Update ALL 10 query methods to accept project_root parameter
3. Add WHERE project_root = ? to every query
4. Pass project_root from CLI to query layer
5. Add path validation helper
6. Remove optional file_filter (enforce at DB level)
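One way to picture fix steps 2 to 4 is a query method that cannot be called without a project scope. The sketch below is an in-memory model only: Entity, ProjectScope, Store, and sample_store are hypothetical names, and the iterator filter stands in for the SQL predicate WHERE project_root = ?.

```rust
/// Hypothetical, simplified entity row; the real schema is not shown in this summary.
#[derive(Debug, Clone, PartialEq)]
struct Entity {
    name: String,
    file_path: String,
    project_root: String,
}

/// Newtype for the caller's project, so a query cannot be issued without a scope.
struct ProjectScope(String);

impl ProjectScope {
    /// Rejects an empty root, which would otherwise match placeholder rows.
    fn new(project_root: &str) -> Option<Self> {
        if project_root.is_empty() {
            None
        } else {
            Some(Self(project_root.to_string()))
        }
    }
}

/// In-memory stand-in for the centralized store.
struct Store {
    entities: Vec<Entity>,
}

impl Store {
    /// Analogue of `SELECT ... WHERE project_root = ? AND name = ? LIMIT ?`.
    fn find_entities_by_name(&self, scope: &ProjectScope, name: &str, limit: usize) -> Vec<&Entity> {
        self.entities
            .iter()
            .filter(|e| e.project_root == scope.0) // isolation predicate, always applied
            .filter(|e| e.name == name)
            .take(limit)
            .collect()
    }
}

fn entity(name: &str, file_path: &str, project_root: &str) -> Entity {
    Entity {
        name: name.into(),
        file_path: file_path.into(),
        project_root: project_root.into(),
    }
}

fn sample_store() -> Store {
    Store {
        entities: vec![
            entity("authenticate", "/home/user/project-a/src/auth.rs", "/home/user/project-a"),
            entity("authenticate", "/home/user/project-b/src/auth.rs", "/home/user/project-b"),
        ],
    }
}

fn main() {
    let store = sample_store();
    let scope = ProjectScope::new("/home/user/project-b").expect("non-empty project root");
    for e in store.find_entities_by_name(&scope, "authenticate", 500) {
        println!("{} in {}", e.name, e.file_path);
    }
}
```

Making the scope a required argument moves the isolation decision out of the caller's hands, which is the point of step 6: an optional filter can be forgotten, a required parameter cannot.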
================================================================================
QUERY METHOD FILTERING COVERAGE
================================================================================
Method                          File:Line           Status
────────────────────────────────────────────────────────────────────────────────
search()                        query_v2.rs:42      ✗ UNFILTERED
search_similar_entities()       query_v2.rs:136     ✗ UNFILTERED
find_entities_by_name()         store_v2.rs:143     ✗ UNFILTERED
find_entities_by_kind()         store_v2.rs:158     ✗ UNFILTERED
find_entities_in_file()         store_v2.rs:173     ⚠ PATH ONLY (no validation)
search_entities()               store_v2.rs:187     ✗ UNFILTERED
find_entities_using_type()      store_v2.rs:285     ✗ UNFILTERED
find_references_to_entity()     store_v2.rs:235     ✗ UNFILTERED
find_references_from_entity()   store_v2.rs:249     ✗ UNFILTERED
find_module_by_file()           store_v2.rs:321     ⚠ EXACT MATCH (if validated)
Coverage: 0/10 properly isolated
Unfiltered: 8/10 methods
Partially safe: 2/10 methods (if input validated, which isn't)
================================================================================
RECOMMENDATIONS - PRIORITY ORDER
================================================================================
CRITICAL (Week 1 - Must Fix Before Any Production Use):
────────────────────────────────────────────────────────
[1] Add project_root column to entities, refs, type_usage, modules tables
Effort: 2 hours
Impact: Enables database-level isolation
[2] Update QueryV2::search() to accept project_root parameter
Add WHERE e.project_root = ? clause
Effort: 1 hour
Impact: Fixes semantic search leakage
[3] Update QueryV2::search_similar_entities() with project_root filter
Effort: 1 hour
Impact: Fixes similarity-based leakage
[4] Update the 6 unfiltered StoreV2 methods with project_root filtering
Effort: 2 hours (repeated pattern)
Impact: Together with [2] and [3], closes all 8 unfiltered methods (80% of query methods)
[5] Add path validation helper (canonicalize, traverse check)
Effort: 1 hour
Impact: Prevents directory traversal attacks
[6] Update QueryCommand to pass project_root to all query methods
Effort: 1 hour
Impact: Connects CLI context to database queries
[7] Create comprehensive test suite for isolation
Effort: 2 hours
Impact: Prevents regression
Total P0 Effort: ~10 hours (1-2 developer days)
HIGH (Week 2-3 - Important for Robustness):
──────────────────────────────────────────
[8] Add project consistency check constraints (FKs within project)
Effort: 2 hours
[9] Create audit logging table for query operations
Effort: 2 hours
[10] Create composite indexes on (project_root, kind), (project_root, name)
Effort: 1 hour
[11] Add documentation on isolation assumptions and API contracts
Effort: 2 hours
MEDIUM (Week 3-4 - Nice to Have):
────────────────────────────────
[12] Performance tuning for project-scoped queries
[13] Add rate limiting per project
[14] Create project-scoped access control layer
================================================================================
RECOMMENDATIONS - CODE CHANGES
================================================================================
SCHEMA CHANGE:
──────────────
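ALTER TABLE entities ADD COLUMN project_root TEXT NOT NULL DEFAULT '';
-- Rows whose file_path lacks '/src/' keep '' and need a manual backfill.
UPDATE entities SET project_root = SUBSTR(file_path, 1, INSTR(file_path, '/src/') - 1) WHERE INSTR(file_path, '/src/') > 0;
-- If the engine is SQLite (suggested by the .db path, not confirmed here), ADD CONSTRAINT is unsupported; rebuild the table to add the CHECK.
ALTER TABLE entities ADD CONSTRAINT entities_project_check CHECK(project_root != '');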
API CHANGES:
────────────
OLD: pub fn search(&self, query: &str, max_results: usize, threshold: f32)
NEW: pub fn search(&self, query: &str, max_results: usize, threshold: f32, project_root: &str)
OLD SQL: "SELECT ... FROM entities e JOIN entity_embeddings ee ON e.id = ee.entity_id"
NEW SQL: "SELECT ... FROM entities e JOIN entity_embeddings ee ON e.id = ee.entity_id WHERE e.project_root = ?"
CLI CHANGES:
────────────
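let canonical_root = self.project_dir.canonicalize()?; // keep the PathBuf alive while borrowing from it
let project_root = canonical_root.to_str().ok_or_else(|| anyhow!("project path is not valid UTF-8"))?;
let results = self.query_v2.search(query, max_results, threshold, project_root)?;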
VALIDATION CHANGES:
───────────────────
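/// Returns an error unless `file_path` resolves to a location inside `project_root`.
/// Both paths must exist on disk, because `canonicalize` resolves symlinks and `..` via the filesystem.
fn validate_project_path(file_path: &str, project_root: &str) -> Result<()> {
    let canonical_file = std::fs::canonicalize(file_path)?;
    let canonical_project = std::fs::canonicalize(project_root)?;
    // `Path::starts_with` compares whole components, not raw substrings.
    if !canonical_file.starts_with(&canonical_project) {
        return Err(anyhow!("Path traversal detected"));
    }
    Ok(())
}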
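The helper above relies on std::fs::canonicalize, which returns an error for paths that do not exist on disk, for example entries that were indexed and later deleted. For that case, a purely lexical check is one possible fallback. The following is a minimal sketch assuming absolute Unix-style paths. The names normalize_lexically and is_inside_project are hypothetical, and the approach does not resolve symlinks, so it complements canonicalization and does not replace it.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `.` and `..` segments without touching the filesystem.
/// Assumption: the input is an absolute path; symlinks are NOT resolved.
fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                // `pop` leaves the root in place, so "/.." stays "/".
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// True when the normalized file path sits under the normalized project root.
fn is_inside_project(file_path: &str, project_root: &str) -> bool {
    let file = normalize_lexically(Path::new(file_path));
    let root = normalize_lexically(Path::new(project_root));
    file.starts_with(&root)
}

fn main() {
    let root = "/home/user/project-b";
    let candidates = [
        "/home/user/project-b/src/./main.rs",
        "/home/user/project-b/../project-a/src/auth.rs",
    ];
    for path in candidates {
        println!("{path}: inside={}", is_inside_project(path, root));
    }
}
```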
================================================================================
TESTING VERIFICATION
================================================================================
Before accepting ANY fix, create and run these tests:
TEST 1: test_cross_project_search_isolation
Setup: Add 100 entities to Project A, 10 to Project B in same DB
Action: Search from Project B context
Assert: Results contain ONLY Project B entities
Status: MUST PASS to consider fix valid
TEST 2: test_cross_project_find_by_name_isolation
Setup: Same 100/10 entity distribution
Action: find_entities_by_name from Project B context
Assert: Results contain ONLY Project B entities with that name
Status: MUST PASS
TEST 3: test_cross_project_find_by_kind_isolation
Setup: Same distribution
Action: find_entities_by_kind(Class) from Project B
Assert: Results contain ONLY Project B classes
Status: MUST PASS
TEST 4: test_directory_traversal_blocked
Setup: Create Project A and B
Action: Query with path containing "../../../project-a"
Assert: Error or blocked traversal
Status: MUST PASS
TEST 5: test_symlink_attack_blocked
Setup: Create symlink from Project B to Project A
Action: Query through symlink path
Assert: Path validation catches it OR canonicalizes correctly
Status: MUST PASS
All tests MUST PASS in Standard mode (95%+ pass rate).
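The shape of TEST 3 can be sketched against an in-memory stand-in before the real store is wired in. Everything here is hypothetical scaffolding (SharedIndex, fixture, leaked_ids, and the two find_by_kind_* methods). The unscoped method models the behaviour described in this audit, and the scoped one models the intended fix, so the same assertion helper can be seen to fail on the first and pass on the second.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum EntityKind {
    Class,
    Function,
}

/// Hypothetical minimal row; only the fields the assertions need.
struct Entity {
    id: u64,
    kind: EntityKind,
    project_root: &'static str,
}

const PROJECT_A: &str = "/home/user/project-a";
const PROJECT_B: &str = "/home/user/project-b";

/// In-memory stand-in for the shared database.
struct SharedIndex {
    entities: Vec<Entity>,
}

impl SharedIndex {
    /// Models the audited behaviour: no project predicate.
    fn find_by_kind_unscoped(&self, kind: EntityKind, limit: usize) -> Vec<&Entity> {
        self.entities.iter().filter(|e| e.kind == kind).take(limit).collect()
    }

    /// Models the fixed behaviour: project predicate applied before the limit.
    fn find_by_kind_scoped(&self, project_root: &str, kind: EntityKind, limit: usize) -> Vec<&Entity> {
        self.entities
            .iter()
            .filter(|e| e.project_root == project_root && e.kind == kind)
            .take(limit)
            .collect()
    }
}

/// 100 entities for Project A (classes and functions alternating) and 10 classes for Project B.
fn fixture() -> SharedIndex {
    let a = (0..100u64).map(|id| Entity {
        id,
        kind: if id % 2 == 0 { EntityKind::Class } else { EntityKind::Function },
        project_root: PROJECT_A,
    });
    let b = (100..110u64).map(|id| Entity { id, kind: EntityKind::Class, project_root: PROJECT_B });
    SharedIndex { entities: a.chain(b).collect() }
}

/// IDs of results that do not belong to `project_root`; an empty list means isolated.
fn leaked_ids(results: &[&Entity], project_root: &str) -> Vec<u64> {
    results.iter().filter(|e| e.project_root != project_root).map(|e| e.id).collect()
}

fn main() {
    let index = fixture();
    let unscoped = index.find_by_kind_unscoped(EntityKind::Class, 500);
    let scoped = index.find_by_kind_scoped(PROJECT_B, EntityKind::Class, 500);
    println!("unscoped leaks: {}", leaked_ids(&unscoped, PROJECT_B).len());
    println!("scoped leaks: {}", leaked_ids(&scoped, PROJECT_B).len());
}
```

Collecting the leaked IDs, instead of only asserting a boolean, gives a failing test something concrete to report.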
================================================================================
OVERALL ASSESSMENT
================================================================================
VULNERABILITY SEVERITY: CRITICAL
EXPLOITABILITY: TRIVIAL (no special knowledge required)
BLAST RADIUS: ALL projects using centralized DB
DETECTION DIFFICULTY: HARD (appears as normal search results)
DATA AT RISK: ALL entities in centralized database
BEFORE FIX:
Status: UNSAFE for multi-project use
Safe for: Single project, public code, research only
Risk: 100% code leakage between projects
AFTER FIX (assuming all P0 completed):
Status: SAFE for production multi-project use
Added isolation: Database-level + application-level
Testing: Comprehensive isolation test suite
Maintenance: Required documentation updates
RECOMMENDATION: DO NOT DEPLOY to production multi-project environments.
Implement critical fixes immediately (1-2 days).
Re-audit after implementation.
================================================================================
SIGN-OFF
================================================================================
Audit Completed: 2025-12-11
Auditor: Code Security Review
Status: CRITICAL ISSUES IDENTIFIED
Recommendation: URGENT FIX REQUIRED
Next Steps:
1. Review this audit with development team
2. Prioritize P0 fixes (Week 1)
3. Create test suite before implementation
4. Implement fixes following recommendations
5. Run tests to verify isolation
6. Request re-audit after implementation
7. Update CHANGELOG with security fixes
Documentation:
- Full audit: docs/RUVECTOR_ISOLATION_AUDIT.md (835 lines)
- Quick ref: docs/RUVECTOR_QUICK_REFERENCE.md (287 lines)
- This summary: docs/RUVECTOR_ISOLATION_SUMMARY.txt
================================================================================