@devilsdev/rag-pipeline-utils
A modular toolkit for building RAG (Retrieval-Augmented Generation) pipelines in Node.js
# Architecture
This page outlines the internal architecture of the RAG pipeline utilities, emphasizing modularity, plugin design, and SOLID-compliant structure.
## Core Design Philosophy
The architecture follows these design principles:
- **Single Responsibility**: Each component handles one domain concern
- **Pluggable Interfaces**: Any layer (loader, retriever, LLM) can be swapped
- **Streaming-Ready**: Async flows support token-by-token output
- **Environment-Safe**: Config via `.ragrc.json`, not hardcoded
- **Testable**: All modules can be mocked and unit tested
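As an illustration of the environment-safe principle, the sketch below parses the text of a `.ragrc.json` file and falls back to defaults for missing keys. The `RagConfig` shape, its key names, the `"default"` plugin names, and the `parseRagConfig` helper are all hypothetical; the package's actual config schema is not described on this page.

```typescript
// Hypothetical config shape: these keys are assumptions for illustration,
// not the documented schema of .ragrc.json.
interface RagConfig {
  loader: string;
  embedder: string;
  retriever: string;
  llm: string;
}

// Assumed defaults; "pdf" mirrors the loader key used elsewhere on this page.
const DEFAULTS: RagConfig = {
  loader: "pdf",
  embedder: "default",
  retriever: "default",
  llm: "default",
};

// Parse the text of a config file, fill in missing keys from DEFAULTS,
// and reject values that are not non-empty strings.
function parseRagConfig(jsonText: string): RagConfig {
  const parsed: unknown = JSON.parse(jsonText);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("config must be a JSON object");
  }
  const merged: RagConfig = { ...DEFAULTS, ...(parsed as Partial<RagConfig>) };
  for (const key of Object.keys(DEFAULTS) as (keyof RagConfig)[]) {
    const value: unknown = merged[key];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`config key "${key}" must be a non-empty string`);
    }
  }
  return merged;
}
```

In a Node.js process, the input text could come from `readFileSync(".ragrc.json", "utf8")` in `node:fs`, which keeps plugin choices out of the source code.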
## Layered Components
```
   +-------------------------+
   |   createRagPipeline()   |
   +-------------------------+
      |          |          |
      ▼          ▼          ▼
   Loader    Embedder   Retriever
      ▼          |          ▼
   Chunks   Embeddings   Context
       \         |         /
        ▼        ▼        ▼
             LLM Runner
                 |
               Output
```
## Core Interfaces
Each plugin is registered with the `PluginRegistry` under a type and a name, and is looked up by the same pair.
```ts
registry.register('loader', 'pdf', new PDFLoader());
const loader = registry.get('loader', 'pdf');
```
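To make the register/get pattern concrete, here is a minimal sketch of how such a registry could work internally. `SketchRegistry`, the `PluginType` union, and the error messages are assumptions for illustration; the package's real `PluginRegistry` may behave differently (for example, it may allow overwriting an existing entry).

```typescript
// Illustrative only: not the package's PluginRegistry implementation.
// The set of plugin types is an assumption based on the layers named on this page.
type PluginType = "loader" | "embedder" | "retriever" | "llm";

class SketchRegistry {
  // Plugins are stored under a composite "type:name" key.
  private readonly plugins = new Map<string, unknown>();

  register(type: PluginType, name: string, plugin: unknown): void {
    const key = `${type}:${name}`;
    if (this.plugins.has(key)) {
      throw new Error(`Plugin already registered: ${key}`);
    }
    this.plugins.set(key, plugin);
  }

  // The caller states the expected plugin type; no runtime shape check is done.
  get<T = unknown>(type: PluginType, name: string): T {
    const key = `${type}:${name}`;
    if (!this.plugins.has(key)) {
      throw new Error(`Plugin not found: ${key}`);
    }
    return this.plugins.get(key) as T;
  }
}
```

Keying on both type and name lets a loader and a retriever share a name such as `'default'` without colliding.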
Each plugin type implements an interface contract such as:
```ts
interface Loader {
  load(path: string): Promise<{ chunk(): string[] }[]>;
}

interface Embedder {
  embed(chunks: string[]): Vector[];
  embedQuery(prompt: string): Vector;
}

interface Retriever {
  store(vectors: Vector[]): Promise<void>;
  retrieve(query: Vector): Promise<Context[]>;
}
```
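The contracts above can be satisfied by small in-memory stand-ins, which is useful for unit tests. The sketch below is one possible pair of implementations, not code from the package. It assumes `Vector` is `number[]` and that `Context` carries the index of a stored vector plus a similarity score, since this page does not define either type; `LetterCountEmbedder`, `InMemoryRetriever`, `cosine`, and `rank` are hypothetical names.

```typescript
// Assumed shapes: this page does not define Vector or Context.
type Vector = number[];
interface Context { index: number; score: number }

interface Embedder {
  embed(chunks: string[]): Vector[];
  embedQuery(prompt: string): Vector;
}
interface Retriever {
  store(vectors: Vector[]): Promise<void>;
  retrieve(query: Vector): Promise<Context[]>;
}

// Toy embedder: a 26-dimensional count of the letters a-z. Deterministic, test-only.
class LetterCountEmbedder implements Embedder {
  embed(chunks: string[]): Vector[] {
    return chunks.map((chunk) => this.embedQuery(chunk));
  }
  embedQuery(prompt: string): Vector {
    const v: Vector = new Array<number>(26).fill(0);
    for (const ch of prompt.toLowerCase()) {
      const i = ch.charCodeAt(0) - 97; // 97 is the char code of "a"
      if (i >= 0 && i < 26) v[i] = (v[i] ?? 0) + 1;
    }
    return v;
  }
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: Vector, b: Vector): number {
  if (a.length !== b.length) throw new Error("vector dimensions differ");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0, y = b[i] ?? 0;
    dot += x * y; normA += x * x; normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every stored vector against the query and keep the k best, highest first.
function rank(vectors: Vector[], query: Vector, k: number): Context[] {
  return vectors
    .map((v, index) => ({ index, score: cosine(v, query) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

class InMemoryRetriever implements Retriever {
  private vectors: Vector[] = [];
  private readonly topK: number;
  constructor(topK = 3) { this.topK = topK; }

  async store(vectors: Vector[]): Promise<void> {
    this.vectors.push(...vectors);
  }
  async retrieve(query: Vector): Promise<Context[]> {
    return rank(this.vectors, query, this.topK);
  }
}
```

Because `store` receives only vectors, this sketch returns indices into the stored list; a caller would map each index back to its chunk text.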
## DAG Support
The `dag-engine.js` module supports chaining multiple components:
- Example: Summarize → Retrieve → Rerank → LLM
- Enables workflows more complex than a linear ingest-then-query flow
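One way to picture such a chain is as a set of nodes executed in dependency order. The sketch below uses Kahn's algorithm for the ordering; it is not the code in `dag-engine.js`, and the `DagNode` shape, `topoOrder`, and `runDag` are hypothetical names chosen for illustration.

```typescript
// Hypothetical node shape; dag-engine.js may model nodes differently.
interface DagNode {
  id: string;
  deps: string[]; // ids of nodes whose outputs this node consumes
  run(inputs: unknown[]): unknown | Promise<unknown>;
}

// Kahn's algorithm: returns node ids so that every dependency precedes its dependents.
function topoOrder(nodes: DagNode[]): string[] {
  const remaining = new Map<string, number>();   // unmet dependency count per node
  const dependents = new Map<string, string[]>(); // reverse edges
  for (const n of nodes) {
    remaining.set(n.id, n.deps.length);
    dependents.set(n.id, []);
  }
  for (const n of nodes) {
    for (const dep of n.deps) {
      const list = dependents.get(dep);
      if (!list) throw new Error(`Unknown dependency "${dep}" of node "${n.id}"`);
      list.push(n.id);
    }
  }
  const ready = nodes.filter((n) => n.deps.length === 0).map((n) => n.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = (remaining.get(next) ?? 0) - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }
  if (order.length !== nodes.length) throw new Error("Cycle detected in DAG");
  return order;
}

// Run nodes sequentially in dependency order, passing each node its deps' outputs.
async function runDag(nodes: DagNode[]): Promise<Map<string, unknown>> {
  const byId = new Map(nodes.map((n) => [n.id, n] as const));
  const results = new Map<string, unknown>();
  for (const id of topoOrder(nodes)) {
    const node = byId.get(id) as DagNode;
    const inputs = node.deps.map((dep) => results.get(dep));
    results.set(id, await node.run(inputs));
  }
  return results;
}
```

For the Summarize → Retrieve → Rerank → LLM example, each node would list the previous stage as its only dependency, so `topoOrder` yields the stages in that order regardless of how the node array is sorted.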
Next → [Evaluation](./Evaluation.md)