UNPKG

cntx-ui

Version:

File context management tool with web UI and MCP server for AI development workflows - bundle project files for LLM consumption

135 lines (109 loc) 4.32 kB
# Vector Search Capabilities ## Overview Vector search provides semantic code discovery through embedding-based similarity matching. This is the most efficient method for conceptual code exploration. ## When Available - Look for vector database endpoints: `/api/vector-db/*` - Check database status: `GET /api/vector-db/status` - Verify chunk count and model information ## Core Capabilities ### Semantic Search **Endpoint**: `POST /api/vector-db/search` **Use for**: - "Find functions that handle user authentication" - "Locate error handling patterns" - "Discover similar implementations" - "Explore feature-related code" **Query Format**: ```json { "query": "semantic description of what you're looking for", "limit": 5, "minSimilarity": 0.2 } ``` **Best Practices**: - Use 3-5 descriptive words: "user authentication login session" - Be conceptual, not literal: "form validation" not "validateForm function" - Lower similarity thresholds (0.1-0.2) for broader discovery - Higher limits (5-10) for comprehensive exploration ### Type-Based Search **Endpoint**: `POST /api/vector-db/search-by-type` **Use for**: - Finding all React components: `{"type": "react_component"}` - Locating API handlers: `{"type": "api_integration"}` - Discovering utility functions: `{"type": "utility"}` ### Domain-Based Search **Endpoint**: `POST /api/vector-db/search-by-domain` **Use for**: - Business domain exploration: `{"domain": "authentication"}` - Technical domain focus: `{"domain": "user-interface"}` - Architectural layer discovery: `{"domain": "api"}` ## Response Interpretation ### Result Structure ```json { "results": [ { "id": "functionName", "similarity": 0.85, "metadata": { "content": "actual code", "semanticType": "react_component", "businessDomain": ["authentication"], "technicalPatterns": ["hooks", "async_operations"], "purpose": "User login form", "files": ["src/auth/LoginForm.tsx"], "complexity": {"score": 5, "level": "medium"} } } ] } ``` ### Similarity Scores - **0.8+**: Highly relevant, direct match - **0.6-0.8**: Good relevance, related functionality - **0.4-0.6**: Potentially relevant, worth investigating - **0.2-0.4**: Loosely related, may provide context - **<0.2**: Probably not relevant ### Metadata Usage - **semanticType**: Code classification (component, function, etc.) - **businessDomain**: Functional area (auth, ui, api, etc.) - **technicalPatterns**: Implementation patterns (async, hooks, etc.) - **purpose**: Human-readable function description - **complexity**: Cognitive complexity metrics ## Query Optimization Strategies ### Progressive Refinement 1. Start broad: "user management" 2. Refine: "user authentication login" 3. Specific: "user login form validation" ### Multiple Approaches - Try different phrasings: "data fetching" vs "API calls" vs "HTTP requests" - Use synonyms: "configuration" vs "settings" vs "options" - Vary abstraction level: "React hooks" vs "state management" vs "useState" ### Fallback Strategies 1. **Lower similarity**: Try minSimilarity: 0.1 for broader results 2. **Increase limit**: Get more results to find relevant matches 3. **Different endpoints**: Try search-by-type or search-by-domain 4. **Simpler queries**: Use fewer, more basic terms ## Performance Characteristics ### Speed - **Typical response**: 20-50ms - **Token efficiency**: ~5k tokens vs 50k+ for file reading - **Real-time updates**: Automatically stays current with code changes ### Accuracy - **Best for**: Conceptual discovery and pattern matching - **Less effective for**: Exact string matching, specific error messages - **Coverage**: Function-level granularity across entire codebase ## Integration Patterns ### Vector-First Workflow ``` User Query → Semantic Search → [Optional: Type/Domain Refinement] → Traditional Search (if needed) ``` ### Hybrid Discovery 1. **Vector search** for broad discovery 2. **Bundle system** for structural context 3. **Direct file access** for specific implementation details ### Error Handling - **Empty results**: Try broader terms or lower similarity threshold - **Server error**: Fall back to bundle system or traditional search - **Offline**: Use available cached information and traditional methods