# Git Contextor - LangChain Integration
[LangChain](https://www.langchain.com/) is a popular framework for developing applications powered by language models. Git Contextor can act as a powerful, repository-aware knowledge source for LangChain applications.
The primary integration point is to use Git Contextor as a custom **Retriever**.
## Custom LangChain Retriever
A retriever is a component that, given a query, returns relevant documents. Here's how you can create a custom retriever that calls the Git Contextor API.
### Prerequisites
- Git Contextor running with its API accessible (a quick connectivity check is sketched right after this list).
- Your Git Contextor API key.
- Python with the `langchain` and `requests` packages installed (the chain example further down also uses `langchain-openai`).
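Before building the retriever, it can help to confirm the search endpoint actually responds. The sketch below is only a sanity check: it reuses the default local URL, the `x-api-key` header, and the request/response fields shown in the retriever implementation that follows, and the key value is a placeholder you must replace.

```python
import requests

# Minimal sanity check against the Git Contextor search API.
# URL, header, and payload fields mirror the retriever below; the key is a placeholder.
API_URL = "http://localhost:3000/api/search"
API_KEY = "gctx_..."  # replace with your actual API key

response = requests.post(
    API_URL,
    headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
    json={"query": "how is user authentication implemented?", "max_tokens": 512},
    timeout=30,
)
response.raise_for_status()
print(response.json().get("context", "")[:500])  # preview the returned context
```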
### Python Implementation
Below is a sample implementation of a `BaseRetriever` for LangChain that queries your Git Contextor instance.
```python
import requests
from typing import List

from langchain_core.retrievers import BaseRetriever
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document


class GitContextorRetriever(BaseRetriever):
    """
    A custom retriever for LangChain that fetches context from a running
    Git Contextor instance.
    """

    api_url: str = "http://localhost:3000/api/search"
    api_key: str
    max_tokens: int = 2048

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        """
        Get documents from the Git Contextor API.
        """
        headers = {
            "x-api-key": self.api_key,
            "Content-Type": "application/json",
        }
        payload = {
            "query": query,
            "max_tokens": self.max_tokens,
        }
        try:
            response = requests.post(self.api_url, headers=headers, json=payload)
            response.raise_for_status()  # Raise an exception for bad status codes

            # The API returns a single string of optimized context.
            # We wrap it in a LangChain Document.
            context_string = response.json().get("context", "")
            if not context_string:
                return []
            return [Document(page_content=context_string)]
        except requests.exceptions.RequestException as e:
            print(f"Error calling Git Contextor API: {e}")
            return []


# --- Example Usage ---
if __name__ == "__main__":
    # Remember to replace with your actual API key
    GIT_CONTEXTOR_API_KEY = "gctx_..."

    retriever = GitContextorRetriever(api_key=GIT_CONTEXTOR_API_KEY)

    query = "how is user authentication implemented?"
    documents = retriever.invoke(query)

    if documents:
        print("--- Retrieved Context ---")
        print(documents[0].page_content)
    else:
        print("No context retrieved.")
```
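If your application calls the retriever asynchronously (for example with `ainvoke` or `astream`), `BaseRetriever` also exposes an async hook you can override. The following is a minimal sketch of that variant; it assumes the third-party `httpx` client as an extra dependency and simply mirrors the synchronous request above.

```python
import httpx
from typing import List

from langchain_core.callbacks import AsyncCallbackManagerForRetrieverRun
from langchain_core.documents import Document


class AsyncGitContextorRetriever(GitContextorRetriever):
    """Adds a non-blocking code path for async LangChain chains (sketch)."""

    async def _aget_relevant_documents(
        self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun
    ) -> List[Document]:
        headers = {
            "x-api-key": self.api_key,
            "Content-Type": "application/json",
        }
        payload = {"query": query, "max_tokens": self.max_tokens}
        # httpx is an assumed extra dependency; any async HTTP client works here.
        async with httpx.AsyncClient(timeout=30) as client:
            response = await client.post(self.api_url, headers=headers, json=payload)
            response.raise_for_status()
        context_string = response.json().get("context", "")
        return [Document(page_content=context_string)] if context_string else []
```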
### Using in a Chain
Once you have the retriever, you can easily use it in a Retrieval-Augmented Generation (RAG) chain.
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# Assuming 'retriever' is the GitContextorRetriever instance from above
# and your OpenAI credentials are configured in the environment.
llm = ChatOpenAI(model="gpt-4-turbo-preview")

# A minimal illustrative prompt template that injects the retrieved repository context.
prompt_template = ChatPromptTemplate.from_template(
    "Answer the question using only the following repository context.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def format_docs(docs):
    # Join the retrieved Documents into a single context string for the prompt.
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)

# Now you can invoke the chain with a question
result = rag_chain.invoke("What's the database schema for products?")
print(result)
```
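Because the chain is a standard LCEL runnable, the same object also supports streaming; the snippet below simply swaps `invoke` for `stream` and prints tokens as they arrive.

```python
# Stream the answer incrementally instead of waiting for the full response.
for chunk in rag_chain.stream("What's the database schema for products?"):
    print(chunk, end="", flush=True)
print()
```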
This setup allows LangChain to use your entire repository as a dynamic, real-time knowledge base for your LLM applications.