@mastra/rag
Version:
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilities.
111 lines (75 loc) • 2.74 kB
Markdown
# GraphRAG
The `GraphRAG` class implements a graph-based approach to retrieval augmented generation. It creates a knowledge graph from document chunks where nodes represent documents and edges represent semantic relationships, enabling both direct similarity matching and discovery of related content through graph traversal.
## Basic usage
```typescript
import { GraphRAG } from '@mastra/rag'
const graphRag = new GraphRAG({
dimension: 1536,
threshold: 0.7,
})
// Create the graph from chunks and embeddings
graphRag.createGraph(documentChunks, embeddings)
// Query the graph with embedding
const results = await graphRag.query({
query: queryEmbedding,
topK: 10,
randomWalkSteps: 100,
restartProb: 0.15,
})
```
## Constructor parameters
**dimension** (`number`): Dimension of the embedding vectors (Default: `1536`)
**threshold** (`number`): Similarity threshold for creating edges between nodes (0-1) (Default: `0.7`)
## Methods
### `createGraph`
Creates a knowledge graph from document chunks and their embeddings.
```typescript
createGraph(chunks: GraphChunk[], embeddings: GraphEmbedding[]): void
```
#### Parameters
**chunks** (`GraphChunk[]`): Array of document chunks with text and metadata
**embeddings** (`GraphEmbedding[]`): Array of embeddings corresponding to chunks
### query
Performs a graph-based search combining vector similarity and graph traversal.
```typescript
query({
query,
topK = 10,
randomWalkSteps = 100,
restartProb = 0.15
}: {
query: number[];
topK?: number;
randomWalkSteps?: number;
restartProb?: number;
}): RankedNode[]
```
#### Parameters
**query** (`number[]`): Query embedding vector
**topK** (`number`): Number of results to return (Default: `10`)
**randomWalkSteps** (`number`): Number of steps in random walk (Default: `100`)
**restartProb** (`number`): Probability of restarting walk from query node (Default: `0.15`)
#### Returns
Returns an array of `RankedNode` objects, where each node contains:
**id** (`string`): Unique identifier for the node
**content** (`string`): Text content of the document chunk
**metadata** (`Record<string, any>`): Additional metadata associated with the chunk
**score** (`number`): Combined relevance score from graph traversal
## Advanced example
```typescript
const graphRag = new GraphRAG({
dimension: 1536,
threshold: 0.8, // Stricter similarity threshold
})
// Create graph from chunks and embeddings
graphRag.createGraph(documentChunks, embeddings)
// Query with custom parameters
const results = await graphRag.query({
query: queryEmbedding,
topK: 5,
randomWalkSteps: 200,
restartProb: 0.2,
})
```
## Related
- [createGraphRAGTool](https://mastra.ai/reference/tools/graph-rag-tool)