@buger/probe-chat
CLI and web interface for Probe code search (formerly @buger/probe-web and @buger/probe-chat)
# Probe Chat
A command-line and web interface for interacting with Probe code search using AI models through the Vercel AI SDK.
## Features
- Interactive CLI chat interface
- Web-based chat interface with Markdown and syntax highlighting
- Support for Anthropic Claude, OpenAI, and Google Gemini models
- `--force-provider` option to choose which AI provider to use
- Semantic code search using Probe's search capabilities
- AST-based code querying for finding specific code structures
- Code extraction for viewing complete context
- Session-based search caching for improved performance
- Token usage tracking
- Colorized output for better readability (CLI mode)
- Diagram generation with Mermaid.js (Web mode)
## Prerequisites
- Node.js 18 or higher
- Probe CLI installed and available in your PATH
- An API key for Anthropic Claude, OpenAI, or Google Gemini
## Installation
1. Clone the repository
2. Navigate to the `examples/chat` directory
3. Install dependencies:
```bash
npm install
```
4. Create a `.env` file with your API keys:
```
# API Keys (uncomment and add your key)
ANTHROPIC_API_KEY=your_anthropic_api_key
# OPENAI_API_KEY=your_openai_api_key
# GOOGLE_API_KEY=your_google_api_key
# Force a specific provider (optional)
# FORCE_PROVIDER=anthropic # Options: anthropic, openai, google
# Debug mode (set to true for verbose logging)
DEBUG=false
# Default model (optional)
# For Anthropic: MODEL_NAME=claude-3-7-sonnet-latest
# For OpenAI: MODEL_NAME=gpt-4o-2024-05-13
# For Google: MODEL_NAME=gemini-2.0-flash
# API URL configuration (optional)
# Generic base URL for all providers (if provider-specific URL not set)
# LLM_BASE_URL=https://your-custom-endpoint.com
# Provider-specific URLs (override LLM_BASE_URL)
# ANTHROPIC_API_URL=https://your-anthropic-endpoint.com
# OPENAI_API_URL=https://your-openai-endpoint.com
# GOOGLE_API_URL=https://your-google-endpoint.com
# Folders to search (comma-separated list of paths)
# If not specified, the current directory will be used by default
# ALLOWED_FOLDERS=/path/to/folder1,/path/to/folder2
# Web interface settings (optional)
# PORT=8080
# AUTH_ENABLED=false
# AUTH_USERNAME=admin
# AUTH_PASSWORD=password
```
## Usage
### CLI Mode
Start the chat interface in CLI mode:
```bash
node index.js
```
Or with npm:
```bash
npm start
```
### Web Mode
Start the chat interface in web mode:
```bash
node index.js --web
```
Or with npm:
```bash
npm run web
```
You can specify a custom port:
```bash
node index.js --web --port 3000
```
You can also specify a path to the codebase you want to search:
```bash
node index.js /path/to/codebase
```
For example, to search in a repository located at ../../tyk:
```bash
node index.js ../../tyk
```
A path given on the command line overrides any `ALLOWED_FOLDERS` setting in your `.env` file.
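The precedence between the path argument, `ALLOWED_FOLDERS`, and the current directory can be sketched as follows. This is a minimal illustration, not the project's actual code: `resolveSearchFolders` is a hypothetical helper, and trimming whitespace around comma-separated entries is an assumption.

```javascript
// Hypothetical helper: decide which folders Probe is allowed to search.
// Precedence: CLI path argument > ALLOWED_FOLDERS > current directory.
function resolveSearchFolders(cliPath, env, cwd) {
  if (cliPath) {
    return [cliPath];
  }
  const fromEnv = (env.ALLOWED_FOLDERS || "")
    .split(",")
    .map((folder) => folder.trim()) // assumption: surrounding spaces are ignored
    .filter((folder) => folder.length > 0);
  return fromEnv.length > 0 ? fromEnv : [cwd];
}

// The CLI path wins even when ALLOWED_FOLDERS is set.
console.log(resolveSearchFolders("../../tyk", { ALLOWED_FOLDERS: "/a,/b" }, "/work"));
// Without a CLI path, the comma-separated list is used.
console.log(resolveSearchFolders(undefined, { ALLOWED_FOLDERS: "/a,/b" }, "/work"));
```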
### Command-line Options
- `-d, --debug`: Enable debug mode for verbose logging
- `-m, --model <model>`: Specify the model to use (e.g., `claude-3-7-sonnet-latest`, `gpt-4o-2024-05-13`, `gemini-2.0-flash`)
- `-f, --force-provider <provider>`: Force a specific provider (options: `anthropic`, `openai`, `google`)
- `-w, --web`: Run in web interface mode
- `-p, --port <port>`: Port to run web server on (default: 8080)
- `[path]`: Path to the codebase to search (overrides ALLOWED_FOLDERS)
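As an illustration of how these options fit together, the sketch below parses them with `parseArgs` from Node's built-in `node:util` module (available from Node.js 18.3). The option names come from the list above; the choice of parser and the `parseCliOptions` helper are assumptions, and Probe Chat itself may use a different library.

```javascript
const { parseArgs } = require("node:util");

// Hypothetical parser mirroring the documented options.
function parseCliOptions(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    allowPositionals: true, // the optional [path] argument
    options: {
      debug: { type: "boolean", short: "d" },
      model: { type: "string", short: "m" },
      "force-provider": { type: "string", short: "f" },
      web: { type: "boolean", short: "w" },
      port: { type: "string", short: "p" },
    },
  });
  return {
    debug: values.debug === true,
    model: values.model,
    forceProvider: values["force-provider"],
    web: values.web === true,
    port: Number(values.port ?? 8080), // documented default port
    path: positionals[0],
  };
}

console.log(parseCliOptions(["--web", "-p", "3000", "../../tyk"]));
```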
### Special Commands
During the chat, you can use these special commands:
- `exit` or `quit`: End the chat session
- `usage`: Display token usage statistics
- `clear`: Clear the chat history and start a new session
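One way to separate these commands from ordinary questions is a small classifier that runs before the input is sent to the model. The sketch below is hypothetical: `classifyInput` is not part of Probe Chat, and matching commands case-insensitively is an assumption.

```javascript
// Hypothetical dispatcher: returns the action for a line of user input.
// Anything that is not a special command is treated as a chat message.
function classifyInput(line) {
  const command = line.trim().toLowerCase(); // assumption: case-insensitive
  switch (command) {
    case "exit":
    case "quit":
      return "exit";
    case "usage":
      return "usage";
    case "clear":
      return "clear";
    default:
      return "chat";
  }
}

console.log(classifyInput("quit"));
console.log(classifyInput("How does the config loading work?"));
```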
## How It Works
Probe Chat uses the Vercel AI SDK to interact with AI models and gives them three tools for searching and analyzing your codebase:
1. **search**: Searches code using Elasticsearch-like query syntax
2. **query**: Searches code using AST-based pattern matching
3. **extract**: Extracts code blocks from files with context
The AI is instructed to use these tools to answer your questions about the codebase, providing relevant code snippets and explanations.
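The mapping from a tool call to a Probe CLI invocation might look like the sketch below. It assumes the Probe CLI exposes `search`, `query`, and `extract` subcommands that take positional arguments; the parameter names (`query`, `pattern`, `path`, `file`) and the `buildProbeArgs` helper are hypothetical and do not describe the actual definitions in `probeTool.js`.

```javascript
// Hypothetical mapping from a model tool call to `probe` arguments.
// Subcommand names match the tool names; parameter names are assumptions.
function buildProbeArgs(toolName, params) {
  switch (toolName) {
    case "search":
      return ["search", params.query, params.path];
    case "query":
      return ["query", params.pattern, params.path];
    case "extract":
      return ["extract", params.file];
    default:
      throw new Error(`Unknown tool: ${toolName}`);
  }
}

console.log(buildProbeArgs("search", { query: "config AND load", path: "." }));
```

The resulting array could then be passed to a process launcher such as `child_process.execFile("probe", args)`, which avoids shell quoting problems with user-supplied queries.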
### Search Caching
Each chat session gets a unique, automatically generated session ID, which is passed to the Probe CLI commands through the `--session` parameter. Probe uses it to cache search results within the session, so repeated or similar searches return faster.
The session ID is managed internally and requires no user intervention. Starting a new chat session (or running the `clear` command) generates a new session ID and a fresh cache.
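The session lifecycle can be sketched as a small class. This is an illustration under assumptions: `ChatSession` is a hypothetical name, and using a random UUID as the session ID is a guess at the format, since the text only requires the ID to be unique per session.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical session holder: one ID per chat session, replaced on `clear`.
class ChatSession {
  constructor() {
    this.reset();
  }

  // Called at startup and when the user runs `clear`.
  reset() {
    this.id = randomUUID(); // assumption: any unique string would do
    this.history = [];
  }

  // Appends the documented --session parameter to a Probe argument list.
  withSession(probeArgs) {
    return [...probeArgs, "--session", this.id];
  }
}

const session = new ChatSession();
console.log(session.withSession(["search", "config", "."]));
session.reset(); // subsequent calls carry a different session ID
```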
## Provider Options
Probe Chat supports multiple AI providers, so you can choose which model to use for code search and analysis.
### Supported Providers
1. **Anthropic Claude**
   - Default model: `claude-3-7-sonnet-latest`
   - Environment variable: `ANTHROPIC_API_KEY`
   - Best for: Complex code analysis, detailed explanations, and understanding nuanced patterns
2. **OpenAI GPT**
   - Default model: `gpt-4o-2024-05-13`
   - Environment variable: `OPENAI_API_KEY`
   - Best for: General code search, pattern recognition, and concise explanations
3. **Google Gemini**
   - Default model: `gemini-2.0-flash`
   - Environment variable: `GOOGLE_API_KEY`
   - Best for: Fast responses, code generation, and efficient search
### Forcing a Specific Provider
You can force Probe Chat to use a specific provider in two ways:
1. **Using the command line option**:
```bash
node index.js --force-provider anthropic
node index.js --force-provider openai
node index.js --force-provider google
```
2. **Using the environment variable**:
Add this to your `.env` file:
```
FORCE_PROVIDER=anthropic # or openai, google
```
When forcing a provider, Probe Chat will verify that you have the corresponding API key set. If the API key is missing, it will display an error message.
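The check described above can be sketched as follows. `selectProvider` is a hypothetical helper, the error messages are invented for illustration, and the fallback order when no provider is forced (Anthropic, then OpenAI, then Google) is an assumption based only on the order in which the providers are listed.

```javascript
const API_KEY_VARS = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_API_KEY",
};

// Hypothetical provider selection: a forced provider must have its API key set.
function selectProvider(env, cliForced) {
  const forced = cliForced || env.FORCE_PROVIDER;
  if (forced) {
    if (!Object.hasOwn(API_KEY_VARS, forced)) {
      throw new Error(`Unknown provider "${forced}"`);
    }
    if (!env[API_KEY_VARS[forced]]) {
      throw new Error(`Provider "${forced}" requires ${API_KEY_VARS[forced]} to be set`);
    }
    return forced;
  }
  // Assumption: without a forced provider, use the first one with a key.
  const available = Object.keys(API_KEY_VARS).find((name) => env[API_KEY_VARS[name]]);
  if (!available) {
    throw new Error("No API key found for any supported provider");
  }
  return available;
}

console.log(selectProvider({ OPENAI_API_KEY: "sk-example" }, "openai"));
```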
### Customizing Models
You can specify which model to use for each provider:
1. **Using the command line option**:
```bash
node index.js --model claude-3-7-sonnet-latest
node index.js --model gpt-4o-2024-05-13
node index.js --model gemini-2.0-flash
```
2. **Using the environment variable**:
Add this to your `.env` file:
```
MODEL_NAME=claude-3-7-sonnet-latest
```
Note that the model must be compatible with the selected provider. If you force a specific provider and specify a model, the model must be available for that provider.
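A minimal sketch of model resolution, using the default models listed under Supported Providers. `resolveModel` is a hypothetical helper, and letting the `--model` flag take precedence over `MODEL_NAME` is an assumption; the text does not state which wins when both are set.

```javascript
const DEFAULT_MODELS = {
  anthropic: "claude-3-7-sonnet-latest",
  openai: "gpt-4o-2024-05-13",
  google: "gemini-2.0-flash",
};

// Hypothetical resolution order: --model flag, then MODEL_NAME, then the provider default.
function resolveModel(provider, cliModel, env) {
  return cliModel || env.MODEL_NAME || DEFAULT_MODELS[provider];
}

console.log(resolveModel("google", undefined, {}));
console.log(resolveModel("anthropic", undefined, { MODEL_NAME: "claude-3-7-sonnet-latest" }));
```

This sketch does not validate that the chosen model belongs to the selected provider; a mismatch would only surface when the provider rejects the request.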
### Custom API Endpoints
You can configure custom API endpoints for each provider:
1. **Generic endpoint for all providers**:
```
LLM_BASE_URL=https://your-custom-endpoint.com
```
This will be used for all providers unless a provider-specific URL is set.
2. **Provider-specific endpoints**:
```
ANTHROPIC_API_URL=https://your-anthropic-endpoint.com
OPENAI_API_URL=https://your-openai-endpoint.com
GOOGLE_API_URL=https://your-google-endpoint.com
```
Provider-specific URLs always take precedence over the generic `LLM_BASE_URL` for their respective providers.
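The lookup order can be expressed in a few lines. `resolveBaseUrl` is a hypothetical helper written for illustration; returning `undefined` when nothing is configured, so that the provider's standard endpoint applies, is an assumption.

```javascript
const API_URL_VARS = {
  anthropic: "ANTHROPIC_API_URL",
  openai: "OPENAI_API_URL",
  google: "GOOGLE_API_URL",
};

// Hypothetical lookup: provider-specific URL first, then LLM_BASE_URL.
function resolveBaseUrl(provider, env) {
  return env[API_URL_VARS[provider]] || env.LLM_BASE_URL || undefined;
}

const env = {
  LLM_BASE_URL: "https://your-custom-endpoint.com",
  OPENAI_API_URL: "https://your-openai-endpoint.com",
};
console.log(resolveBaseUrl("openai", env));    // provider-specific URL
console.log(resolveBaseUrl("anthropic", env)); // falls back to LLM_BASE_URL
```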
## Example Queries
- "How does the config loading work?"
- "Show me all RPC handlers"
- "What does the process_file function do?"
- "Find all implementations of the extract tool"
- "Show me the main entry point of the application"
## Architecture
- `index.js`: Main entry point for both CLI and web interfaces
- `probeChat.js`: Core chat functionality
- `webServer.js`: Web server implementation
- `auth.js`: Authentication middleware for web interface
- `probeTool.js`: Tool definitions for code search, query, and extraction
- `tokenCounter.js`: Utility for tracking token usage
- `index.html`: Web interface HTML template
## License
Apache-2.0