# Embark MCP Server
A Model Context Protocol (MCP) server that provides a proxy interface to Embark's semantic code search. It lets LLM applications search code repositories through Embark's indexing and similarity search.
## Features
- **Semantic Code Search**: Search for code using natural language queries through Embark's semantic search engine
- **Multi-Repository Search**: Search across multiple Git repositories simultaneously with filtering options
- **Dependency-Based Search**: Search for code that uses specific dependencies and libraries
- **MCP Protocol Compliance**: Fully compatible with the Model Context Protocol standard
- **JetBrains Account Authentication**: Secure authentication with Embark's API using JetBrains Account OAuth
- **Configurable Repositories**: Search across different code repositories with include/exclude filters
- **Fallback JWT Support**: Supports `GRAZIE_JWT_TOKEN` for non-interactive environments
## Installation
1. Clone the repository:
```bash
git clone <repository-url>
cd embark-mcp
```
2. Install dependencies:
```bash
npm install
```
3. Build the project:
```bash
npm run build
```
## Configuration
### Authentication
This server uses JetBrains Account OAuth for authentication by default.
#### First-Time Authorization
The OAuth flow is triggered when you first use a search tool. To manually trigger the authorization before using it with Claude Desktop:
1. Start the server: `npx embark-mcp`
2. In another terminal, trigger a search to start the OAuth flow:
```bash
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "semantic_code_search", "arguments": {"text": "test"}}}' | npx embark-mcp
```
3. The server will automatically open a browser window to `https://account.jetbrains.com`
4. Log in with your JetBrains Account credentials
5. Review and accept the authorization request for Embark access
6. The browser will redirect to a success page showing "Authorization successful, return to the terminal"
7. Return to your terminal - the server should now be authenticated
The authorization tokens are securely saved in `~/.jbaccount` for future sessions, so you only need to do this once.
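The request piped into the server in step 2 is an ordinary MCP `tools/call` message. As a minimal sketch, it could be built like this; `buildToolCall` is a hypothetical helper, not part of this package, and a full MCP client would normally perform the `initialize` handshake before calling a tool.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function buildToolCall(
  name: string,
  args: Record<string, unknown>,
  id = 1
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// A stdio MCP server reads one JSON message per line from stdin.
const requestLine = JSON.stringify(
  buildToolCall("semantic_code_search", { text: "test" })
);
```

Writing `requestLine` followed by a newline to the server's stdin has the same effect as the `echo` command above.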
### Environment Variables
#### Authentication
- `GRAZIE_JWT_TOKEN` (optional): JWT authentication token for Embark API. If provided, it will be used instead of the OAuth flow. This is useful for non-interactive environments.
- `JETBRAINS_AI_URL` (optional): Base URL for JetBrains AI API (defaults to `https://api.jetbrains.ai`, can be set to `https://api.stgn.jetbrains.ai/` for staging).
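The precedence described above (JWT first, OAuth otherwise) can be sketched as follows. The function and type names are hypothetical, and trimming the trailing slash from the base URL is an assumption made so that both example URLs join cleanly with an endpoint path; the server's actual handling may differ.

```typescript
type Env = Record<string, string | undefined>;

type Credential =
  | { source: "jwt"; token: string }
  | { source: "oauth"; token: string };

const DEFAULT_AI_URL = "https://api.jetbrains.ai";

// Use JETBRAINS_AI_URL when set, else the documented default.
// Assumption: trailing slashes are stripped before paths are appended.
function resolveBaseUrl(env: Env): string {
  const configured = env["JETBRAINS_AI_URL"];
  const url = configured && configured.trim() !== "" ? configured.trim() : DEFAULT_AI_URL;
  return url.replace(/\/+$/, "");
}

// GRAZIE_JWT_TOKEN wins over a stored OAuth token when both are present.
function selectCredential(env: Env, storedOAuthToken?: string): Credential {
  const jwt = env["GRAZIE_JWT_TOKEN"];
  if (jwt) return { source: "jwt", token: jwt };
  if (storedOAuthToken) return { source: "oauth", token: storedOAuthToken };
  throw new Error("No credentials: set GRAZIE_JWT_TOKEN or complete the OAuth flow");
}
```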
#### Repository Configuration
- `REPOSITORY_GIT_REMOTE_URL` (optional): Default repository Git remote URL to search in (can be overridden per search request).
- `REPOSITORY_ID` (optional): Alternative way to specify the default repository (used as a fallback when `REPOSITORY_GIT_REMOTE_URL` is not set).
- `REPOSITORY_REVISION` (optional): Default repository revision (commit hash, branch name, or tag) to search in. If not set, searches the latest indexed version.
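A minimal sketch of the lookup order these variables imply: a per-request URL first, then `REPOSITORY_GIT_REMOTE_URL`, then `REPOSITORY_ID`. `resolveRepository` and its types are hypothetical names, and returning `undefined` when nothing is configured is an assumption rather than documented behavior.

```typescript
type Env = Record<string, string | undefined>;

interface RepositoryTarget {
  repository: string;
  revision?: string; // omitted means "latest indexed version"
}

// Hypothetical helper: picks the repository and revision for one search request.
function resolveRepository(
  requestUrl: string | undefined,
  env: Env
): RepositoryTarget | undefined {
  const repository =
    requestUrl || env["REPOSITORY_GIT_REMOTE_URL"] || env["REPOSITORY_ID"];
  if (!repository) return undefined; // assumption: caller decides what to do next
  const revision = env["REPOSITORY_REVISION"];
  return revision ? { repository, revision } : { repository };
}
```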
#### Multi-Repository Filtering
- `INCLUDE_REPOSITORY_URLS` (optional): Comma-separated list of Git repository URLs to include in searches. When set, only these repositories will be searched. Example: `"https://github.com/owner/repo1.git,https://github.com/owner/repo2.git"`
- `EXCLUDE_REPOSITORY_URLS` (optional): Comma-separated list of Git repository URLs to exclude from searches. When set, these repositories will be skipped. Example: `"https://github.com/owner/large-repo.git"`
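To illustrate how the two lists combine, here is a rough sketch. The helper names are hypothetical, and two details are assumptions not confirmed by this README: URLs are compared as exact strings, and an exclude entry wins when a URL appears in both lists.

```typescript
type Env = Record<string, string | undefined>;

// Split a comma-separated variable into trimmed, non-empty URLs.
function parseUrlList(value: string | undefined): string[] {
  if (!value) return [];
  return value
    .split(",")
    .map((url) => url.trim())
    .filter((url) => url.length > 0);
}

// Keep only included repositories (when an include list is set), then drop excluded ones.
function filterRepositories(candidates: string[], env: Env): string[] {
  const include = new Set(parseUrlList(env["INCLUDE_REPOSITORY_URLS"]));
  const exclude = new Set(parseUrlList(env["EXCLUDE_REPOSITORY_URLS"]));
  return candidates
    .filter((url) => include.size === 0 || include.has(url))
    .filter((url) => !exclude.has(url));
}
```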
#### Other Options
- `TYPE_TOKEN` (optional): Endpoint type token, either `USER` (default) or `APPLICATION`. When set to `APPLICATION`, the server uses `/application/*` endpoints instead of `/user/*` endpoints.
- `ENABLE_REMOTE_LOGS` (optional): When set to `true`, includes `logAllowed=true` parameter in Embark API requests to enable remote logging for debugging purposes. Defaults to `false`.
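The effect of these two options can be sketched as below. The function names are hypothetical; the `/v5/indexing/search` path is the semantic search endpoint listed in the API Reference section. Rejecting unknown `TYPE_TOKEN` values and matching `true` case-sensitively are assumptions.

```typescript
type TokenType = "USER" | "APPLICATION";

// Unset or empty TYPE_TOKEN falls back to the documented default, USER.
function parseTokenType(value: string | undefined): TokenType {
  if (value === undefined || value === "" || value === "USER") return "USER";
  if (value === "APPLICATION") return "APPLICATION";
  throw new Error(`Unsupported TYPE_TOKEN: ${value}`); // assumption: fail fast
}

// Select the /user/* or /application/* variant of the search endpoint.
function searchEndpoint(tokenType: TokenType): string {
  const prefix = tokenType === "APPLICATION" ? "/application" : "/user";
  return `${prefix}/v5/indexing/search`;
}

// Remote logging is off unless the variable is exactly "true".
function remoteLogsEnabled(value: string | undefined): boolean {
  return value === "true";
}
```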
**Note**: See [MULTI_REPOSITORY_SEARCH.md](MULTI_REPOSITORY_SEARCH.md) for detailed documentation on multi-repository search and filtering.
### Setting up Fallback Authentication (Optional)
If you need to use the server in a non-interactive environment, you can use a JWT token.
1. Obtain a JWT token from your Embark service administrator
2. Set the token as an environment variable:
```bash
export GRAZIE_JWT_TOKEN="your-jwt-token-here"
export REPOSITORY_GIT_REMOTE_URL="https://github.com/owner/repo.git" # optional default repository
```
## Usage
### Running the Server
#### Using npx (Recommended)
You can run the server directly using `npx` without cloning the repository:
```bash
# Optional: set a default repository
export REPOSITORY_GIT_REMOTE_URL="https://github.com/owner/repo.git"
# Run the server and follow the on-screen instructions for OAuth login
npx embark-mcp
```
If you are using fallback JWT authentication:
```bash
# Set your authentication token
export GRAZIE_JWT_TOKEN="your-jwt-token-here"
export REPOSITORY_GIT_REMOTE_URL="https://github.com/owner/repo.git" # optional
# Run the server
npx embark-mcp
```
#### Running from Source
Start the MCP server:
```bash
npm start
```
Or for development with auto-reload:
```bash
npm run dev
```
### Integrating with Claude Desktop
Add the server to your Claude Desktop configuration file:
**macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
**Linux**: `~/.config/claude/claude_desktop_config.json`
**Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
#### Using npx (Recommended)
```json
{
  "mcpServers": {
    "embark-mcp": {
      "command": "npx",
      "args": ["embark-mcp"],
      "env": {
        "JETBRAINS_AI_URL": "https://api.jetbrains.ai",
        "REPOSITORY_GIT_REMOTE_URL": "https://github.com/owner/repo.git",
        "REPOSITORY_REVISION": "main",
        "TYPE_TOKEN": "USER",
        "ENABLE_LOCAL_LOGS": "true",
        "ENABLE_REMOTE_LOGS": "false"
      }
    }
  }
}
```
**Note**: For the OAuth flow to work with Claude Desktop, you must complete the initial authorization process first. Follow the steps in the "First-Time Authorization" section above to trigger the OAuth flow and save the tokens before Claude Desktop attempts to use the server.
#### Using Local Installation
```json
{
  "mcpServers": {
    "embark-mcp": {
      "command": "node",
      "args": ["/path/to/embark-mcp/dist/index.js"],
      "env": {
        "REPOSITORY_GIT_REMOTE_URL": "https://github.com/owner/repo.git"
      }
    }
  }
}
```
### Available Tools
#### `semantic_code_search`
Search for code using Embark's semantic search engine.
**Parameters:**
- `text` (required): The text/code to search for. Use detailed, descriptive natural language queries for best results. Examples: "function that validates user email addresses and returns boolean", "error handling middleware for HTTP requests with logging", "React component that renders a modal dialog with close button".
- `pathFilter` (optional): A specific directory or file to narrow the search. If not provided, the whole codebase is searched. Examples: "path/to/module", "path/to/module/submodule", "path/to/file/example.kt".
- `repositoryGitRemoteUrl` (optional): The repository Git remote URL to search in (defaults to the `REPOSITORY_GIT_REMOTE_URL` environment variable).
**Example:**
```json
{"text": "authentication middleware", "pathFilter": "src"}
```
**Response Format:**
```
Found 10 results for "authentication middleware" in repository "https://github.com/owner/repo.git" (revision: "5e7f1ab4bf58e473e5d7f878eb2b499d7deabd29", pathFilter: "src")
1. File=src/middleware/auth.js, offset=120:340, similarity=0.892, type=FUNCTION
Snippet:
"""
export function authMiddleware(req, res, next) {
  const token = req.headers['authorization'];
  if (!token) return res.status(401).send('Missing token');
  try {
    req.user = verifyToken(token);
    next();
  } catch (err) {
    res.status(403).send('Invalid token');
  }
}
"""
2. File=src/security/middleware.ts, offset=45:180, similarity=0.834, type=CLASS
Snippet:
"""
export class SecurityMiddleware {
  constructor(private readonly secret: string) {}
  handle(req: Request, res: Response, next: NextFunction) {
    if (!req.headers['x-api-key']) {
      res.status(401).json({ error: 'Unauthorized' });
    } else {
      next();
    }
  }
}
"""
3. File=src/routes/auth.js, offset=890:1120, similarity=0.776, type=FUNCTION
Snippet:
"""
router.post('/login', async (req, res) => {
  const { username, password } = req.body;
  const token = await authenticate(username, password);
  res.json({ token });
});
"""
...
```
#### `search_in_dependencies`
Search for code that uses specific dependencies using Embark's semantic search.
**Parameters:**
- `text` (required): The search query describing what to look for in relation to dependencies
- `dependencies` (required): Array of dependency objects with the following structure:
  - `dependency` (required): The dependency name
  - `version` (required): The dependency version
**Example:**
```
Search for code that uses React hooks with specific dependencies
```
**Dependencies Parameter Example:**
```json
[
  {"dependency": "react", "version": "18.0.0"},
  {"dependency": "react-hooks", "version": "1.0.0"}
]
```
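Because both fields are required on every entry, a caller may want to check the array before sending it. The sketch below uses hypothetical helper names; rejecting an empty array is an assumption, and the `name:version` label simply mirrors the summary line shown in the response format.

```typescript
interface Dependency {
  dependency: string;
  version: string;
}

// Validate untyped tool arguments against the documented structure.
function parseDependencies(input: unknown): Dependency[] {
  if (!Array.isArray(input) || input.length === 0) {
    throw new Error("dependencies must be a non-empty array"); // assumption
  }
  return input.map((entry, index) => {
    const candidate = entry as Partial<Dependency> | null;
    if (
      !candidate ||
      typeof candidate.dependency !== "string" ||
      typeof candidate.version !== "string"
    ) {
      throw new Error(`dependencies[${index}] needs string "dependency" and "version" fields`);
    }
    return { dependency: candidate.dependency, version: candidate.version };
  });
}

// Render the list the way the response summary does, e.g. [react:18.0.0].
function describeDependencies(deps: Dependency[]): string {
  return `[${deps.map((d) => `${d.dependency}:${d.version}`).join(", ")}]`;
}
```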
**Response Format:**
```
Found 3 results for "React hooks usage" with dependencies [react:18.0.0, react-hooks:1.0.0] in repository "https://github.com/owner/repo.git":
1. File=src/components/UserProfile.tsx, offset=45:180, similarity=0.891, type=FUNCTION
2. File=src/hooks/useAuth.ts, offset=12:95, similarity=0.834, type=FUNCTION
3. File=src/pages/Dashboard.tsx, offset=200:350, similarity=0.776, type=FUNCTION
Timings: search: 120ms, total: 150ms
```
## API Reference
### Embark API Integration
This server integrates with Embark's REST API endpoints:
#### Semantic Code Search
- **Endpoint**: `/user/v5/indexing/search` (or `/application/v5/indexing/search` when `TYPE_TOKEN=APPLICATION`)
- **Method**: POST
- **Authentication**: Bearer token from JetBrains Account OAuth or `GRAZIE_JWT_TOKEN` via `grazie-authenticate-jwt` header
- **Request Body**:
```jsonc
{
  "text": "search query",
  "repository": "https://github.com/owner/repo.git",
  "revision": "main",  // optional: commit hash, branch name, or tag
  "pathFilter": "dir"  // optional: a file or a dir
}
```
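As an illustration of the optional fields, the body could be assembled so that `revision` and `pathFilter` are left out entirely when unset. `buildSearchBody` is a hypothetical helper, and omitting unset fields (rather than sending `null`) is an assumption about what the API accepts.

```typescript
interface SearchRequestBody {
  text: string;
  repository: string;
  revision?: string;
  pathFilter?: string;
}

// Hypothetical helper: only includes optional fields that have a value.
function buildSearchBody(
  text: string,
  repository: string,
  options: { revision?: string; pathFilter?: string } = {}
): SearchRequestBody {
  const body: SearchRequestBody = { text, repository };
  if (options.revision) body.revision = options.revision;
  if (options.pathFilter) body.pathFilter = options.pathFilter;
  return body;
}
```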
#### Dependencies Search
- **Endpoint**: `/search-dependencies`
- **Method**: POST
- **Authentication**: Bearer token from JetBrains Account OAuth or `GRAZIE_JWT_TOKEN` via `grazie-authenticate-jwt` header
- **Request Body**:
```json
{
  "index": "ProductionIndices.CodeBlocks",
  "text": "search query",
  "dependencies": [
    {"dependency": "react", "version": "18.0.0"}
  ],
  "maxResults": 10,
  "minScore": 0.0,
  "logAllowed": false,
  "searchPipelineConfig": "SearchPipelineConfig.SearchOnly"
}
```
### Response Structure
#### Semantic Code Search Response
Embark returns search results with the following structure:
```typescript
interface SearchResponse {
  searchResponse: {
    res: Array<{
      scoredText: {
        text: string;
        similarity: number;
      };
      sourcePosition: {
        relativePath: string;
        startOffset: number;
        endOffset: number;
      };
      indexItemType: string;
    }>;
  };
}
```
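A rough sketch of how one entry of this structure might map onto the text output shown under `semantic_code_search`. `formatResult` is a hypothetical name; the three-decimal similarity and the `startOffset:endOffset` pairing are inferred from the sample output, not confirmed by the server's source.

```typescript
interface SearchResultItem {
  scoredText: { text: string; similarity: number };
  sourcePosition: { relativePath: string; startOffset: number; endOffset: number };
  indexItemType: string;
}

// Render one result as a header line followed by its quoted snippet.
function formatResult(item: SearchResultItem, index: number): string {
  const { relativePath, startOffset, endOffset } = item.sourcePosition;
  const header =
    `${index + 1}. File=${relativePath}, offset=${startOffset}:${endOffset}, ` +
    `similarity=${item.scoredText.similarity.toFixed(3)}, type=${item.indexItemType}`;
  return [header, "Snippet:", '"""', item.scoredText.text, '"""'].join("\n");
}

// Render every entry of a SearchResponse-shaped object, in the order received.
function formatResults(response: { searchResponse: { res: SearchResultItem[] } }): string {
  return response.searchResponse.res.map(formatResult).join("\n");
}
```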
#### Dependencies Search Response
The dependencies search endpoint returns results with timing information:
```typescript
interface DependenciesSearchResponse {
  results: Array<{
    searchResult: {
      sourcePosition: {
        relativePath: string;
        startOffset: number;
        endOffset: number;
      };
      indexItemType: string;
      similarity: number;
    };
    content: string;
  }>;
  timings: {
    [key: string]: number; // timing in milliseconds
  };
}
```
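The `timings` map corresponds to the trailing `Timings:` line in the `search_in_dependencies` output. A minimal sketch of that rendering, with hypothetical function names and the assumption that keys are printed in the order the API returns them:

```typescript
interface DependencyResult {
  searchResult: {
    sourcePosition: { relativePath: string; startOffset: number; endOffset: number };
    indexItemType: string;
    similarity: number;
  };
  content: string;
}

// Render the timings map as "Timings: name: 12ms, other: 34ms".
function formatTimings(timings: Record<string, number>): string {
  const parts = Object.keys(timings).map((name) => `${name}: ${timings[name]}ms`);
  return `Timings: ${parts.join(", ")}`;
}

// Render result headers followed by the timings line.
function formatDependencyResults(
  results: DependencyResult[],
  timings: Record<string, number>
): string[] {
  const lines = results.map((result, index) => {
    const { sourcePosition, indexItemType, similarity } = result.searchResult;
    return (
      `${index + 1}. File=${sourcePosition.relativePath}, ` +
      `offset=${sourcePosition.startOffset}:${sourcePosition.endOffset}, ` +
      `similarity=${similarity.toFixed(3)}, type=${indexItemType}`
    );
  });
  lines.push(formatTimings(timings));
  return lines;
}
```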
## Development
### Building
```bash
npm run build
```
### Development Mode
```bash
npm run dev
```
### Watch Mode
```bash
npm run watch
```
## Troubleshooting
### Common Issues
1. **OAuth Error**:
   - Ensure a browser is available to complete the login flow.
   - If behind a firewall, ensure that `https://account.jetbrains.com` and `http://localhost:62345` (or a nearby port) are accessible.
   - If the browser doesn't open automatically, copy the URL from the terminal and open it manually.
2. **Authentication Error (JWT)**: Ensure your `GRAZIE_JWT_TOKEN` is valid and not expired. This is only relevant if you are using fallback authentication.
3. **Connection Error**: Check that the `JETBRAINS_AI_URL` is correct and accessible.
4. **No Results**: Verify that the repository URL is correct and that the repository is accessible with your token.
### Error Messages
- `Failed to get authorization code`: The OAuth flow was not completed successfully.
- `Embark API error (401)`: Invalid or expired token (either from OAuth or JWT).
- `Embark API error (404)`: Repository not found or not accessible.
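For callers that want to surface these messages with a hint, a small mapping like the following could be used. `describeEmbarkError` is a hypothetical helper; only the `Embark API error (<status>)` prefix and the 401 and 404 meanings come from the list above.

```typescript
// Hypothetical helper: turn an HTTP status from the Embark API into a readable message.
function describeEmbarkError(status: number): string {
  const prefix = `Embark API error (${status})`;
  switch (status) {
    case 401:
      return `${prefix}: invalid or expired token (OAuth or GRAZIE_JWT_TOKEN)`;
    case 404:
      return `${prefix}: repository not found or not accessible`;
    default:
      return prefix; // other statuses are passed through without a hint
  }
}
```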
## License
MIT
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## Support
For issues related to:
- **This MCP server**: Open an issue in this repository
- **Embark API**: Contact your Embark service administrator
- **Model Context Protocol**: See the [official MCP documentation](https://modelcontextprotocol.io/)