@buger/probe-chat

# Probe Chat A command-line and web interface for interacting with Probe code search using AI models through the Vercel AI SDK. ## Features - Interactive CLI chat interface - Web-based chat interface with Markdown and syntax highlighting - Support for Anthropic Claude, OpenAI, and Google Gemini models - Force provider option to specify which AI provider to use - Semantic code search using Probe's search capabilities - AST-based code querying for finding specific code structures - Code extraction for viewing complete context - Session-based search caching for improved performance - Token usage tracking - Colorized output for better readability (CLI mode) - Diagram generation with Mermaid.js (Web mode) ## Prerequisites - Node.js 18 or higher - Probe CLI installed and available in your PATH - An API key for Anthropic Claude, OpenAI, or Google Gemini ## Installation 1. Clone the repository 2. Navigate to the `examples/chat` directory 3. Install dependencies: ```bash npm install ``` 4. Create a `.env` file with your API keys: ``` # API Keys (uncomment and add your key) ANTHROPIC_API_KEY=your_anthropic_api_key # OPENAI_API_KEY=your_openai_api_key # GOOGLE_API_KEY=your_google_api_key # Force a specific provider (optional) # FORCE_PROVIDER=anthropic # Options: anthropic, openai, google # Debug mode (set to true for verbose logging) DEBUG=false # Default model (optional) # For Anthropic: MODEL_NAME=claude-3-7-sonnet-latest # For OpenAI: MODEL_NAME=gpt-4o-2024-05-13 # For Google: MODEL_NAME=gemini-2.0-flash # API URL configuration (optional) # Generic base URL for all providers (if provider-specific URL not set) # LLM_BASE_URL=https://your-custom-endpoint.com # Provider-specific URLs (override LLM_BASE_URL) # ANTHROPIC_API_URL=https://your-anthropic-endpoint.com # OPENAI_API_URL=https://your-openai-endpoint.com # GOOGLE_API_URL=https://your-google-endpoint.com # Folders to search (comma-separated list of paths) # If not specified, the current directory will be used by default # ALLOWED_FOLDERS=/path/to/folder1,/path/to/folder2 # Web interface settings (optional) # PORT=8080 # AUTH_ENABLED=false # AUTH_USERNAME=admin # AUTH_PASSWORD=password ``` ## Usage ### CLI Mode Start the chat interface in CLI mode: ```bash node index.js ``` Or with npm: ```bash npm start ``` ### Web Mode Start the chat interface in web mode: ```bash node index.js --web ``` Or with npm: ```bash npm run web ``` You can specify a custom port: ```bash node index.js --web --port 3000 ``` You can also specify a path to the codebase you want to search: ```bash node index.js /path/to/codebase ``` For example, to search in a repository located at ../../tyk: ```bash node index.js ../../tyk ``` This will override any ALLOWED_FOLDERS setting in your .env file. ### Command-line Options - `-d, --debug`: Enable debug mode for verbose logging - `-m, --model <model>`: Specify the model to use (e.g., `claude-3-7-sonnet-latest`, `gpt-4o-2024-05-13`, `gemini-2.0-flash`) - `-f, --force-provider <provider>`: Force a specific provider (options: `anthropic`, `openai`, `google`) - `-w, --web`: Run in web interface mode - `-p, --port <port>`: Port to run web server on (default: 8080) - `[path]`: Path to the codebase to search (overrides ALLOWED_FOLDERS) ### Special Commands During the chat, you can use these special commands: - `exit` or `quit`: End the chat session - `usage`: Display token usage statistics - `clear`: Clear the chat history and start a new session ## How It Works This CLI tool uses the Vercel AI SDK to interact with AI models and provides them with tools to search and analyze your codebase: 1. **search**: Searches code using Elasticsearch-like query syntax 2. **query**: Searches code using AST-based pattern matching 3. **extract**: Extracts code blocks from files with context The AI is instructed to use these tools to answer your questions about the codebase, providing relevant code snippets and explanations. ### Search Caching The tool automatically generates a unique session ID for each chat session and passes it to the Probe CLI commands using the `--session` parameter. This enables caching of search results within a session, which can significantly improve performance when similar searches are performed multiple times. The session ID is managed internally and doesn't require any user intervention. When you start a new chat session (or use the "clear" command), a new session ID is generated, and a new cache is created. ## Provider Options Probe Chat supports multiple AI providers, giving you flexibility in choosing which model to use for your code search and analysis: ### Supported Providers 1. **Anthropic Claude** - Default model: `claude-3-7-sonnet-latest` - Environment variable: `ANTHROPIC_API_KEY` - Best for: Complex code analysis, detailed explanations, and understanding nuanced patterns 2. **OpenAI GPT** - Default model: `gpt-4o-2024-05-13` - Environment variable: `OPENAI_API_KEY` - Best for: General code search, pattern recognition, and concise explanations 3. **Google Gemini** - Default model: `gemini-2.0-flash` - Environment variable: `GOOGLE_API_KEY` - Best for: Fast responses, code generation, and efficient search ### Forcing a Specific Provider You can force Probe Chat to use a specific provider in two ways: 1. **Using the command line option**: ```bash node index.js --force-provider anthropic node index.js --force-provider openai node index.js --force-provider google ``` 2. **Using the environment variable**: Add this to your `.env` file: ``` FORCE_PROVIDER=anthropic # or openai, google ``` When forcing a provider, Probe Chat will verify that you have the corresponding API key set. If the API key is missing, it will display an error message. ### Customizing Models You can specify which model to use for each provider: 1. **Using the command line option**: ```bash node index.js --model claude-3-7-sonnet-latest node index.js --model gpt-4o-2024-05-13 node index.js --model gemini-2.0-flash ``` 2. **Using the environment variable**: Add this to your `.env` file: ``` MODEL_NAME=claude-3-7-sonnet-latest ``` Note that the model must be compatible with the selected provider. If you force a specific provider and specify a model, the model must be available for that provider. ### Custom API Endpoints You can configure custom API endpoints for each provider: 1. **Generic endpoint for all providers**: ``` LLM_BASE_URL=https://your-custom-endpoint.com ``` This will be used for all providers unless a provider-specific URL is set. 2. **Provider-specific endpoints**: ``` ANTHROPIC_API_URL=https://your-anthropic-endpoint.com OPENAI_API_URL=https://your-openai-endpoint.com GOOGLE_API_URL=https://your-google-endpoint.com ``` These override the generic LLM_BASE_URL for their respective providers. Provider-specific URLs always take precedence over the generic LLM_BASE_URL. ## Example Queries - "How does the config loading work?" - "Show me all RPC handlers" - "What does the process_file function do?" - "Find all implementations of the extract tool" - "Show me the main entry point of the application" ## Architecture - `index.js`: Main entry point for both CLI and web interfaces - `probeChat.js`: Core chat functionality - `webServer.js`: Web server implementation - `auth.js`: Authentication middleware for web interface - `probeTool.js`: Tool definitions for code search, query, and extraction - `tokenCounter.js`: Utility for tracking token usage - `index.html`: Web interface HTML template ## License Apache-2.0