@mastra/rag

The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilities.

# createVectorQueryTool()

The `createVectorQueryTool()` function creates a tool for semantic search over vector stores. It supports filtering, reranking, database-specific configurations, and integrates with various vector store backends.

## Basic usage

```typescript
import { createVectorQueryTool } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```

## Parameters

> **Note:** **Parameter Requirements:** Most fields can be set at creation as defaults. Some fields can be overridden at runtime via the request context or input. If a required field is missing from both creation and runtime, an error will be thrown. Note that `model`, `id`, and `description` can only be set at creation time.

**id** (`string`): Custom ID for the tool. By default: 'VectorQuery {vectorStoreName} {indexName} Tool'. (Set at creation only.)

**description** (`string`): Custom description for the tool. By default: 'Access the knowledge base to find information needed to answer user questions'. (Set at creation only.)

**model** (`EmbeddingModel`): Embedding model to use for vector search. (Set at creation only.)

**vectorStoreName** (`string`): Name of the vector store to query. (Can be set at creation or overridden at runtime.)

**indexName** (`string`): Name of the index within the vector store. (Can be set at creation or overridden at runtime.)

**enableFilter** (`boolean`): Enable filtering of results based on metadata. (Set at creation only, but will be automatically enabled if a filter is provided in the request context.) (Default: `false`)

**includeVectors** (`boolean`): Include the embedding vectors in the results. (Can be set at creation or overridden at runtime.) (Default: `false`)

**includeSources** (`boolean`): Include the full retrieval objects in the results. (Can be set at creation or overridden at runtime.) (Default: `true`)

**reranker** (`RerankConfig`): Options for reranking results. (Can be set at creation or overridden at runtime.)

**reranker.model** (`MastraLanguageModel`): Language model to use for reranking

**reranker.options** (`RerankerOptions`): Options for the reranking process

**reranker.options.weights** (`WeightConfig`): Weights for scoring components (semantic: 0.4, vector: 0.4, position: 0.2)

**reranker.options.topK** (`number`): Number of top results to return

**databaseConfig** (`DatabaseConfig`): Database-specific configuration options for optimizing queries. (Can be set at creation or overridden at runtime.)

**databaseConfig.pinecone** (`PineconeConfig`): Configuration specific to Pinecone vector store

**databaseConfig.pinecone.namespace** (`string`): Pinecone namespace for organizing vectors

**databaseConfig.pinecone.sparseVector** (`{ indices: number[]; values: number[]; }`): Sparse vector for hybrid search

**databaseConfig.pgvector** (`PgVectorConfig`): Configuration specific to PostgreSQL with pgvector extension

**databaseConfig.pgvector.minScore** (`number`): Minimum similarity score threshold for results

**databaseConfig.pgvector.ef** (`number`): HNSW search parameter - controls accuracy vs speed tradeoff

**databaseConfig.pgvector.probes** (`number`): IVFFlat probe parameter - number of cells to visit during search

**databaseConfig.chroma** (`ChromaConfig`): Configuration specific to Chroma vector store

**databaseConfig.chroma.where** (`Record<string, any>`): Metadata filtering conditions

**databaseConfig.chroma.whereDocument** (`Record<string, any>`): Document content filtering conditions

**providerOptions** (`Record<string, Record<string, any>>`): Provider-specific options for the embedding model (e.g., outputDimensionality). **Important**: Only works with AI SDK EmbeddingModelV2 models. For V1 models, configure options when creating the model itself.

**vectorStore** (`MastraVector | VectorStoreResolver`): Direct vector store instance or a resolver function for dynamic selection. Use a function for multi-tenant applications where the vector store is selected based on request context. When provided, `vectorStoreName` becomes optional.

## Returns

The tool returns an object with:

**relevantContext** (`string`): Combined text from the most relevant document chunks

**sources** (`QueryResult[]`): Array of full retrieval result objects. Each object contains all information needed to reference the original document, chunk, and similarity score.

### `QueryResult` object structure

```typescript
{
  id: string; // Unique chunk/document identifier
  metadata: any; // All metadata fields (document ID, etc.)
  vector: number[]; // Embedding vector (if available)
  score: number; // Similarity score for this retrieval
  document: string; // Full chunk/document text (if available)
}
```

## Default tool description

The default description focuses on:

- Finding relevant information in stored knowledge
- Answering user questions
- Retrieving factual content

## Result handling

The tool determines the number of results to return based on the user's query, with a default of 10 results. This can be adjusted based on the query requirements.

## Example with filters

```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  enableFilter: true,
})
```

With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:

1. A user makes a query with specific filter requirements like "Find content where the 'version' field is greater than 2.0"
2. The agent analyzes the query and constructs the appropriate filters:

   ```typescript
   {
     "version": { "$gt": 2.0 }
   }
   ```

This agent-driven approach:

- Processes natural language queries into filter specifications
- Implements vector store-specific filter syntax
- Translates query terms to filter operators

For detailed filter syntax and store-specific capabilities, see the [Metadata Filters](https://mastra.ai/reference/rag/metadata-filters) documentation.

For an example of how agent-driven filtering works, see the [Agent-Driven Metadata Filtering](https://github.com/mastra-ai/mastra/tree/main/examples/basics/rag/filter-rag) example.

## Example with reranking

```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'milvus',
  indexName: 'documentation',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  reranker: {
    model: 'openai/gpt-5.4',
    options: {
      weights: {
        semantic: 0.5, // Semantic relevance weight
        vector: 0.3, // Vector similarity weight
        position: 0.2, // Original position weight
      },
      topK: 5,
    },
  },
})
```

Reranking improves result quality by combining:

- Semantic relevance: Using LLM-based scoring of text similarity
- Vector similarity: Original vector distance scores
- Position bias: Consideration of original result ordering
- Query analysis: Adjustments based on query characteristics

The reranker processes the initial vector search results and returns a reordered list optimized for relevance.

## Example with custom description

```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  description:
    'Search through document archives to find relevant information for answering questions about company policies and procedures',
})
```

This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.

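Returning to the reranking weights shown earlier, the following is a minimal sketch of how three component scores could be combined into a single ranking score with a weighted sum. It assumes every component is normalized to the range 0 to 1 and that the position score decays linearly with the original rank. `positionScore`, `combineScores`, and `rerankSketch` are hypothetical helpers written for illustration; they are not part of `@mastra/rag`, and the library's actual scoring may differ.

```typescript
// Illustrative only: these types and helpers are hypothetical, not part of @mastra/rag.
interface Weights {
  semantic: number
  vector: number
  position: number
}

interface Candidate {
  id: string
  semantic: number // assumed LLM relevance score in [0, 1]
  vector: number // assumed vector similarity in [0, 1]
}

// Assumed linear decay: the first result scores 1, the last scores 1 / total.
function positionScore(index: number, total: number): number {
  return 1 - index / total
}

// Weighted sum of the three component scores.
function combineScores(
  parts: { semantic: number; vector: number; position: number },
  weights: Weights,
): number {
  return (
    weights.semantic * parts.semantic +
    weights.vector * parts.vector +
    weights.position * parts.position
  )
}

// Score every candidate, sort by combined score (descending), and keep the top K ids.
function rerankSketch(candidates: Candidate[], weights: Weights, topK: number): string[] {
  return candidates
    .map((c, i) => ({
      id: c.id,
      score: combineScores(
        { semantic: c.semantic, vector: c.vector, position: positionScore(i, candidates.length) },
        weights,
      ),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(r => r.id)
}

// Same weights as the reranking example; the candidate scores are made up.
const weights: Weights = { semantic: 0.5, vector: 0.3, position: 0.2 }
const initialResults: Candidate[] = [
  { id: 'a', semantic: 0.2, vector: 0.9 },
  { id: 'b', semantic: 0.9, vector: 0.6 },
  { id: 'c', semantic: 0.5, vector: 0.5 },
]

const topIds = rerankSketch(initialResults, weights, 2)
console.log(topIds)
```

With these made-up scores, `b` moves ahead of `a` because its higher semantic score outweighs `a`'s advantage in vector similarity and original position.
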
## Database-specific configuration examples

The `databaseConfig` parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.

### Pinecone Configuration

```typescript
const pineconeQueryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'production', // Organize vectors by environment
      sparseVector: {
        // Enable hybrid search
        indices: [0, 1, 2, 3],
        values: [0.1, 0.2, 0.15, 0.05],
      },
    },
  },
})
```

**Pinecone Features:**

- **Namespace**: Isolate different data sets within the same index
- **Sparse Vector**: Combine dense and sparse embeddings for improved search quality
- **Use Cases**: Multi-tenant applications, hybrid semantic search

### pgVector Configuration

```typescript
const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'postgres',
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Only return results above 70% similarity
      ef: 200, // Higher value = better accuracy, slower search
      probes: 10, // For IVFFlat: more probes = better recall
    },
  },
})
```

**pgVector Features:**

- **minScore**: Filter out low-quality matches
- **ef (HNSW)**: Control accuracy vs speed for HNSW indexes
- **probes (IVFFlat)**: Control recall vs speed for IVFFlat indexes
- **Use Cases**: Performance tuning, quality filtering

### Chroma Configuration

```typescript
const chromaQueryTool = createVectorQueryTool({
  vectorStoreName: 'chroma',
  indexName: 'documents',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    chroma: {
      where: {
        // Metadata filtering
        category: 'technical',
        status: 'published',
      },
      whereDocument: {
        // Document content filtering
        $contains: 'API',
      },
    },
  },
})
```

**Chroma Features:**

- **where**: Filter by metadata fields
- **whereDocument**: Filter by document content
- **Use Cases**: Advanced filtering, content-based search

### Multiple Database Configurations

```typescript
// Configure for multiple databases (useful for dynamic stores)
const multiDbQueryTool = createVectorQueryTool({
  vectorStoreName: 'dynamic-store', // Will be set at runtime
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'default',
    },
    pgvector: {
      minScore: 0.8,
      ef: 150,
    },
    chroma: {
      where: { type: 'documentation' },
    },
  },
})
```

**Multi-Config Benefits:**

- Support multiple vector stores with one tool
- Database-specific optimizations are automatically applied
- Flexible deployment scenarios

### Runtime Configuration Override

You can override database configurations at runtime to adapt to different scenarios:

```typescript
import { RequestContext } from '@mastra/core/request-context'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'development',
    },
  },
})

// Override at runtime
const requestContext = new RequestContext()
requestContext.set('databaseConfig', {
  pinecone: {
    namespace: 'production', // Switch to production namespace
  },
})

// `agent` is assumed to be an existing Agent that has queryTool registered
const response = await agent.generate('Find information about deployment', {
  requestContext,
})
```

This approach allows you to:

- Switch between environments (dev/staging/prod)
- Adjust performance parameters based on load
- Apply different filtering strategies per request

## Example: Using request context

```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```

When using request context, provide the runtime-overridable parameters at execution time via the request context. The embedding `model` is not among them, because it can only be set at creation:

```typescript
const requestContext = new RequestContext<{
  vectorStoreName: string
  indexName: string
  topK: number
  filter: VectorFilter
  databaseConfig: DatabaseConfig
}>()

requestContext.set('vectorStoreName', 'my-store')
requestContext.set('indexName', 'my-index')
requestContext.set('topK', 5)
requestContext.set('filter', { category: 'docs' })
requestContext.set('databaseConfig', {
  pinecone: { namespace: 'runtime-namespace' },
})

const response = await agent.generate('Find documentation from the knowledge base.', {
  requestContext,
})
```

For more information on request context, please see:

- [Agent Request Context](https://mastra.ai/docs/server/request-context)
- [Request Context](https://mastra.ai/docs/server/request-context)

## Usage without a Mastra server

The tool can be used by itself to retrieve documents matching a query:

```typescript
import { RequestContext } from '@mastra/core/request-context'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { createVectorQueryTool } from '@mastra/rag'
import { PgVector } from '@mastra/pg'

const pgVector = new PgVector({
  id: 'pg-vector',
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
})

const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'pgVector', // optional since we're passing in a store
  vectorStore: pgVector,
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})

const requestContext = new RequestContext()
const queryResult = await vectorQueryTool.execute({ queryText: 'foo', topK: 1 }, { requestContext })

console.log(queryResult.sources)
```

## Dynamic vector store for multi-tenant applications

For multi-tenant applications where each tenant has isolated data (e.g., separate PostgreSQL schemas), you can pass a resolver function instead of a static vector store instance.
The function receives the request context and can return the appropriate vector store for the current tenant:

```typescript
import { RequestContext } from '@mastra/core/request-context'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { createVectorQueryTool, type VectorStoreResolver } from '@mastra/rag'
import { PgVector } from '@mastra/pg'

// Cache for tenant-specific vector stores
const vectorStoreCache = new Map<string, PgVector>()

// Resolver function that returns the correct vector store based on tenant
const vectorStoreResolver: VectorStoreResolver = async ({ requestContext }) => {
  const tenantId = requestContext?.get('tenantId') as string | undefined

  if (!tenantId) {
    throw new Error('tenantId is required in request context')
  }

  // Return cached instance or create new one
  if (!vectorStoreCache.has(tenantId)) {
    vectorStoreCache.set(
      tenantId,
      new PgVector({
        id: `pg-vector-${tenantId}`,
        connectionString: process.env.POSTGRES_CONNECTION_STRING!,
        schemaName: `tenant_${tenantId}`, // Each tenant has their own schema
      }),
    )
  }

  return vectorStoreCache.get(tenantId)!
}

const vectorQueryTool = createVectorQueryTool({
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  vectorStore: vectorStoreResolver, // Dynamic resolution!
})

// Usage with tenant context
const requestContext = new RequestContext()
requestContext.set('tenantId', 'acme-corp')

const result = await vectorQueryTool.execute(
  { queryText: 'company policies', topK: 5 },
  { requestContext },
)
```

This pattern is similar to how `Agent.memory` supports dynamic configuration and enables:

- **Schema isolation**: Each tenant's data in separate PostgreSQL schemas
- **Database isolation**: Route to different database instances per tenant
- **Dynamic configuration**: Adjust vector store settings based on request context

## Tool details

The tool is created with:

- **ID**: `VectorQuery {vectorStoreName} {indexName} Tool` (unless a custom `id` is provided)
- **Input Schema**: Takes `queryText` and `topK`, plus a `filter` object when filtering is enabled
- **Output Schema**: Returns the `relevantContext` string and the `sources` array

## Related

- [rerank()](https://mastra.ai/reference/rag/rerank)
- [createGraphRAGTool](https://mastra.ai/reference/tools/graph-rag-tool)