@mastra/rag
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilities.
# createVectorQueryTool()
The `createVectorQueryTool()` function creates a tool for semantic search over vector stores. It supports filtering, reranking, database-specific configurations, and integrates with various vector store backends.
## Basic usage
```typescript
import { createVectorQueryTool } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```
## Parameters
> **Note:** **Parameter Requirements:** Most fields can be set at creation as defaults, and some can be overridden at runtime via the request context or the tool input. If a required field is missing from both creation and runtime, the tool throws an error. `model`, `id`, and `description` can only be set at creation time.
**id** (`string`): Custom ID for the tool. Default: 'VectorQuery {vectorStoreName} {indexName} Tool'. (Set at creation only.)
**description** (`string`): Custom description for the tool. Default: 'Access the knowledge base to find information needed to answer user questions'. (Set at creation only.)
**model** (`EmbeddingModel`): Embedding model to use for vector search. (Set at creation only.)
**vectorStoreName** (`string`): Name of the vector store to query. (Can be set at creation or overridden at runtime.)
**indexName** (`string`): Name of the index within the vector store. (Can be set at creation or overridden at runtime.)
**enableFilter** (`boolean`): Enable filtering of results based on metadata. (Set at creation only, but will be automatically enabled if a filter is provided in the request context.) (Default: `false`)
**includeVectors** (`boolean`): Include the embedding vectors in the results. (Can be set at creation or overridden at runtime.) (Default: `false`)
**includeSources** (`boolean`): Include the full retrieval objects in the results. (Can be set at creation or overridden at runtime.) (Default: `true`)
**reranker** (`RerankConfig`): Options for reranking results. (Can be set at creation or overridden at runtime.)
**reranker.model** (`MastraLanguageModel`): Language model to use for reranking
**reranker.options** (`RerankerOptions`): Options for the reranking process
**reranker.options.weights** (`WeightConfig`): Weights for scoring components (semantic: 0.4, vector: 0.4, position: 0.2)
**reranker.options.topK** (`number`): Number of top results to return
**databaseConfig** (`DatabaseConfig`): Database-specific configuration options for optimizing queries. (Can be set at creation or overridden at runtime.)
**databaseConfig.pinecone** (`PineconeConfig`): Configuration specific to Pinecone vector store
**databaseConfig.pinecone.namespace** (`string`): Pinecone namespace for organizing vectors
**databaseConfig.pinecone.sparseVector** (`{ indices: number[]; values: number[]; }`): Sparse vector for hybrid search
**databaseConfig.pgvector** (`PgVectorConfig`): Configuration specific to PostgreSQL with pgvector extension
**databaseConfig.pgvector.minScore** (`number`): Minimum similarity score threshold for results
**databaseConfig.pgvector.ef** (`number`): HNSW search parameter - controls accuracy vs speed tradeoff
**databaseConfig.pgvector.probes** (`number`): IVFFlat probe parameter - number of cells to visit during search
**databaseConfig.chroma** (`ChromaConfig`): Configuration specific to Chroma vector store
**databaseConfig.chroma.where** (`Record<string, any>`): Metadata filtering conditions
**databaseConfig.chroma.whereDocument** (`Record<string, any>`): Document content filtering conditions
**providerOptions** (`Record<string, Record<string, any>>`): Provider-specific options for the embedding model (e.g., `outputDimensionality`). **Important**: Only works with AI SDK EmbeddingModelV2 models. For V1 models, configure options when creating the model itself.
**vectorStore** (`MastraVector | VectorStoreResolver`): Direct vector store instance or a resolver function for dynamic selection. Use a function for multi-tenant applications where the vector store is selected based on request context. When provided, `vectorStoreName` becomes optional.
## Returns
The tool returns an object with:
**relevantContext** (`string`): Combined text from the most relevant document chunks
**sources** (`QueryResult[]`): Array of full retrieval result objects. Each object contains all information needed to reference the original document, chunk, and similarity score.
### `QueryResult` object structure
```typescript
{
  id: string; // Unique chunk/document identifier
  metadata: any; // All metadata fields (document ID, etc.)
  vector: number[]; // Embedding vector (if available)
  score: number; // Similarity score for this retrieval
  document: string; // Full chunk/document text (if available)
}
```
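As a minimal sketch of how a caller might consume `sources`, the snippet below orders results by score and formats one citation line per chunk. The `QueryResult` interface here is a local copy of the structure above with `vector` and `document` marked optional, and `formatCitations` is a hypothetical helper, not part of `@mastra/rag`.

```typescript
// Local copy of the documented QueryResult shape (assumption: vector and
// document are optional because the docs mark them "if available").
interface QueryResult {
  id: string
  metadata: any
  vector?: number[]
  score: number
  document?: string
}

// Hypothetical helper: sort by descending score and build citation lines.
function formatCitations(sources: QueryResult[]): string[] {
  return [...sources]
    .sort((a, b) => b.score - a.score)
    .map((source, i) => {
      const preview = (source.document ?? '').slice(0, 40)
      return `[${i + 1}] ${source.id} (score ${source.score.toFixed(2)}): ${preview}`
    })
}

const exampleSources: QueryResult[] = [
  { id: 'chunk-2', metadata: { docId: 'guide' }, score: 0.62, document: 'Deploy with the CLI.' },
  { id: 'chunk-1', metadata: { docId: 'guide' }, score: 0.91, document: 'Install the package first.' },
]

console.log(formatCitations(exampleSources).join('\n'))
```

Copying the array before sorting leaves the tool's original result order untouched, which matters if the order is also used elsewhere.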
## Default tool description
The default description focuses on:
- Finding relevant information in stored knowledge
- Answering user questions
- Retrieving factual content
## Result handling
The number of results is controlled by the `topK` input, which the agent sets based on the user's query. If no value is supplied, the tool returns 10 results.
## Example with filters
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  enableFilter: true,
})
```
With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:
1. A user makes a query with specific filter requirements like "Find content where the 'version' field is greater than 2.0"
2. The agent analyzes the query and constructs the appropriate filters:
```typescript
{
  "version": { "$gt": 2.0 }
}
```
This agent-driven approach:
- Processes natural language queries into filter specifications
- Implements vector store-specific filter syntax
- Translates query terms to filter operators
For detailed filter syntax and store-specific capabilities, see the [Metadata Filters](https://mastra.ai/reference/rag/metadata-filters) documentation.
For an example of how agent-driven filtering works, see the [Agent-Driven Metadata Filtering](https://github.com/mastra-ai/mastra/tree/main/examples/basics/rag/filter-rag) example.
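To make the filter semantics concrete, here is a small in-memory sketch of what a filter such as `{ version: { $gt: 2.0 } }` means when applied to chunk metadata. `matchesFilter` is a hypothetical helper written for illustration; real vector stores evaluate filters on the server side, and the operators other than `$gt` are assumptions modeled on MongoDB-style syntax.

```typescript
// Assumed operator set for this sketch; only $gt appears in the text above.
type RangeOperators = { $gt?: number; $gte?: number; $lt?: number; $lte?: number }
type MetadataFilter = Record<string, RangeOperators | string | number | boolean>

// Hypothetical matcher: every field in the filter must be satisfied.
function matchesFilter(metadata: Record<string, unknown>, filter: MetadataFilter): boolean {
  return Object.entries(filter).every(([field, condition]) => {
    const value = metadata[field]
    // A plain value means strict equality
    if (typeof condition !== 'object') return value === condition
    // Range operators only apply to numeric metadata in this sketch
    if (typeof value !== 'number') return false
    if (condition.$gt !== undefined && !(value > condition.$gt)) return false
    if (condition.$gte !== undefined && !(value >= condition.$gte)) return false
    if (condition.$lt !== undefined && !(value < condition.$lt)) return false
    if (condition.$lte !== undefined && !(value <= condition.$lte)) return false
    return true
  })
}

const versionFilter: MetadataFilter = { version: { $gt: 2.0 } }
console.log(matchesFilter({ version: 2.5 }, versionFilter))
console.log(matchesFilter({ version: 1.5 }, versionFilter))
```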
## Example with reranking
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'milvus',
  indexName: 'documentation',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  reranker: {
    model: 'openai/gpt-5.4',
    options: {
      weights: {
        semantic: 0.5, // Semantic relevance weight
        vector: 0.3, // Vector similarity weight
        position: 0.2, // Original position weight
      },
      topK: 5,
    },
  },
})
```
Reranking improves result quality by combining:
- Semantic relevance: Using LLM-based scoring of text similarity
- Vector similarity: Original vector distance scores
- Position bias: Consideration of original result ordering
- Query analysis: Adjustments based on query characteristics
The reranker processes the initial vector search results and returns a reordered list optimized for relevance.
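The effect of the weights can be sketched as a weighted sum over the three scoring components. This is an illustration under the assumption that the components are combined linearly and that each is already normalized to the range 0 to 1; `rerankByWeights` and the `ComponentScores` type are hypothetical and do not reflect the internals of the library's reranker.

```typescript
interface ComponentScores {
  id: string
  semantic: number // LLM-based relevance, assumed to be in [0, 1]
  vector: number // Original vector similarity, assumed to be in [0, 1]
  position: number // Score derived from original rank, assumed to be in [0, 1]
}

interface Weights {
  semantic: number
  vector: number
  position: number
}

// Hypothetical reranker: weighted sum, sort descending, keep the top K.
function rerankByWeights(results: ComponentScores[], weights: Weights, topK: number) {
  return results
    .map(r => ({
      id: r.id,
      score: r.semantic * weights.semantic + r.vector * weights.vector + r.position * weights.position,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
}

const initialResults: ComponentScores[] = [
  { id: 'b', semantic: 0.4, vector: 0.9, position: 1.0 },
  { id: 'a', semantic: 0.9, vector: 0.6, position: 0.5 },
  { id: 'c', semantic: 0.7, vector: 0.7, position: 0.0 },
]

const reranked = rerankByWeights(initialResults, { semantic: 0.5, vector: 0.3, position: 0.2 }, 2)
console.log(reranked.map(r => r.id))
```

With these sample numbers, result `a` overtakes `b` even though `b` ranked first in the original vector search, because the semantic component carries the largest weight.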
## Example with custom description
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  description:
    'Search through document archives to find relevant information for answering questions about company policies and procedures',
})
```
This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.
## Database-specific configuration examples
The `databaseConfig` parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.
### Pinecone Configuration
```typescript
const pineconeQueryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'production', // Organize vectors by environment
      sparseVector: {
        // Enable hybrid search
        indices: [0, 1, 2, 3],
        values: [0.1, 0.2, 0.15, 0.05],
      },
    },
  },
})
```
**Pinecone Features:**
- **Namespace**: Isolate different data sets within the same index
- **Sparse Vector**: Combine dense and sparse embeddings for improved search quality
- **Use Cases**: Multi-tenant applications, hybrid semantic search
### pgVector Configuration
```typescript
const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'postgres',
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Only return results above 70% similarity
      ef: 200, // Higher value = better accuracy, slower search
      probes: 10, // For IVFFlat: more probes = better recall
    },
  },
})
```
**pgVector Features:**
- **minScore**: Filter out low-quality matches
- **ef (HNSW)**: Control accuracy vs speed for HNSW indexes
- **probes (IVFFlat)**: Control recall vs speed for IVFFlat indexes
- **Use Cases**: Performance tuning, quality filtering
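The effect of `minScore` can be pictured as a threshold applied to the scored results. The sketch below is only an illustration: `applyMinScore` is a hypothetical helper, the threshold may be applied inside the database query rather than afterwards, and treating the boundary as inclusive is an assumption.

```typescript
interface Scored {
  id: string
  score: number
}

// Hypothetical post-filter: keep results at or above the threshold.
// Whether the real comparison is inclusive is an assumption of this sketch.
function applyMinScore<T extends Scored>(results: T[], minScore: number): T[] {
  return results.filter(result => result.score >= minScore)
}

const candidates: Scored[] = [
  { id: 'a', score: 0.92 },
  { id: 'b', score: 0.7 },
  { id: 'c', score: 0.41 },
]

const kept = applyMinScore(candidates, 0.7)
console.log(kept.map(result => result.id))
```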
### Chroma Configuration
```typescript
const chromaQueryTool = createVectorQueryTool({
  vectorStoreName: 'chroma',
  indexName: 'documents',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    chroma: {
      where: {
        // Metadata filtering
        category: 'technical',
        status: 'published',
      },
      whereDocument: {
        // Document content filtering
        $contains: 'API',
      },
    },
  },
})
```
**Chroma Features:**
- **where**: Filter by metadata fields
- **whereDocument**: Filter by document content
- **Use Cases**: Advanced filtering, content-based search
### Multiple Database Configurations
```typescript
// Configure for multiple databases (useful for dynamic stores)
const multiDbQueryTool = createVectorQueryTool({
  vectorStoreName: 'dynamic-store', // Will be set at runtime
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'default',
    },
    pgvector: {
      minScore: 0.8,
      ef: 150,
    },
    chroma: {
      where: { type: 'documentation' },
    },
  },
})
```
**Multi-Config Benefits:**
- Support multiple vector stores with one tool
- Database-specific optimizations are automatically applied
- Flexible deployment scenarios
### Runtime Configuration Override
You can override database configurations at runtime to adapt to different scenarios:
```typescript
import { RequestContext } from '@mastra/core/request-context'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'development',
    },
  },
})

// Override at runtime
const requestContext = new RequestContext()
requestContext.set('databaseConfig', {
  pinecone: {
    namespace: 'production', // Switch to production namespace
  },
})

const response = await agent.generate('Find information about deployment', {
  requestContext,
})
```
This approach allows you to:
- Switch between environments (dev/staging/prod)
- Adjust performance parameters based on load
- Apply different filtering strategies per request
## Example: Using request context
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```
Values set on the request context at execution time override the defaults given at creation:
```typescript
const requestContext = new RequestContext<{
  vectorStoreName: string
  indexName: string
  topK: number
  filter: VectorFilter
  databaseConfig: DatabaseConfig
}>()

requestContext.set('vectorStoreName', 'my-store')
requestContext.set('indexName', 'my-index')
requestContext.set('topK', 5)
requestContext.set('filter', { category: 'docs' })
requestContext.set('databaseConfig', {
  pinecone: { namespace: 'runtime-namespace' },
})
// `model` is not set here: it can only be configured at creation time

const response = await agent.generate('Find documentation from the knowledge base.', {
  requestContext,
})
```
For more information on request context, please see:
- [Request Context](https://mastra.ai/docs/server/request-context)
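The precedence between creation-time defaults and runtime values can be sketched as follows. This is a simplified model rather than the tool's actual code: `resolveSetting` is a hypothetical helper, a plain `Map` stands in for the request context, and only string settings are handled.

```typescript
// Creation-time defaults for the two overridable string settings.
type Settings = { vectorStoreName?: string; indexName?: string }

// Hypothetical resolver: a runtime value wins, then the creation default,
// and a missing value in both places is an error.
function resolveSetting(key: keyof Settings, creation: Settings, runtime: Map<string, unknown>): string {
  const runtimeValue = runtime.get(key)
  const value = typeof runtimeValue === 'string' ? runtimeValue : creation[key]
  if (value === undefined) {
    throw new Error(`Missing required setting: ${key}`)
  }
  return value
}

const creationDefaults: Settings = { vectorStoreName: 'pinecone', indexName: 'docs' }
const runtimeValues = new Map<string, unknown>([['indexName', 'my-index']])

console.log(resolveSetting('vectorStoreName', creationDefaults, runtimeValues))
console.log(resolveSetting('indexName', creationDefaults, runtimeValues))
```

Failing early when a value is missing from both places mirrors the error behavior described in the parameter requirements note.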
## Usage without a Mastra server
The tool can be used by itself to retrieve documents matching a query:
```typescript
import { RequestContext } from '@mastra/core/request-context'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { createVectorQueryTool } from '@mastra/rag'
import { PgVector } from '@mastra/pg'

const pgVector = new PgVector({
  id: 'pg-vector',
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
})

const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'pgVector', // optional since we're passing in a store
  vectorStore: pgVector,
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})

const requestContext = new RequestContext()
const queryResult = await vectorQueryTool.execute({ queryText: 'foo', topK: 1 }, { requestContext })

console.log(queryResult.sources)
```
## Dynamic vector store for multi-tenant applications
For multi-tenant applications where each tenant has isolated data (e.g., separate PostgreSQL schemas), you can pass a resolver function instead of a static vector store instance. The function receives the request context and can return the appropriate vector store for the current tenant:
```typescript
import { RequestContext } from '@mastra/core/request-context'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { createVectorQueryTool, type VectorStoreResolver } from '@mastra/rag'
import { PgVector } from '@mastra/pg'

// Cache for tenant-specific vector stores
const vectorStoreCache = new Map<string, PgVector>()

// Resolver function that returns the correct vector store based on tenant
const vectorStoreResolver: VectorStoreResolver = async ({ requestContext }) => {
  const tenantId = requestContext?.get('tenantId')
  if (!tenantId) {
    throw new Error('tenantId is required in request context')
  }

  // Return cached instance or create new one
  if (!vectorStoreCache.has(tenantId)) {
    vectorStoreCache.set(
      tenantId,
      new PgVector({
        id: `pg-vector-${tenantId}`,
        connectionString: process.env.POSTGRES_CONNECTION_STRING!,
        schemaName: `tenant_${tenantId}`, // Each tenant has their own schema
      }),
    )
  }
  return vectorStoreCache.get(tenantId)!
}

const vectorQueryTool = createVectorQueryTool({
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  vectorStore: vectorStoreResolver, // Dynamic resolution!
})

// Usage with tenant context
const requestContext = new RequestContext()
requestContext.set('tenantId', 'acme-corp')

const result = await vectorQueryTool.execute(
  { queryText: 'company policies', topK: 5 },
  { requestContext },
)
```
This pattern is similar to how `Agent.memory` supports dynamic configuration and enables:
- **Schema isolation**: Each tenant's data in separate PostgreSQL schemas
- **Database isolation**: Route to different database instances per tenant
- **Dynamic configuration**: Adjust vector store settings based on request context
## Tool details
The tool is created with:
- **ID**: `VectorQuery {vectorStoreName} {indexName} Tool`
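- **Input Schema**: Takes `queryText` and `topK`, plus a `filter` object when filtering is enabled
- **Output Schema**: Returns the `relevantContext` string and, when `includeSources` is enabled, the `sources` array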
## Related
- [rerank()](https://mastra.ai/reference/rag/rerank)
- [createGraphRAGTool](https://mastra.ai/reference/tools/graph-rag-tool)