# PVM - Prompt Version Management
<div align="center">
**Git-like version control for AI prompts with automatic execution tracking**
[Features](#features) • [Quick Start](#quick-start) • [Simplified API](#simplified-api) • [Documentation](docs/) • [Examples](examples/)
</div>
## Overview
PVM (Prompt Version Management) brings Git-like version control to AI prompt development with automatic execution tracking. The simplified API makes it easy to integrate prompt management into any AI workflow.
## Quick Start with Simplified API
### Installation
```bash
pip install prompt-version-management
# or
npm install prompt-version-management
```
### Basic Usage in 3 Lines
```python
from pvm import prompts, model
# Load and render a prompt template
prompt = prompts("assistant.md", ["expert", "analyze this data"])
# Execute with automatic tracking (await requires an async context, e.g. asyncio.run)
response = await model.complete("gpt-4", prompt, "analysis-task")
print(response.content)
```
That's it! Every execution is automatically tracked with timing, tokens, costs, and model details.
## Simplified API Reference
### `prompts()` - Load and Render Templates
Load prompt templates with variable substitution:
```python
# Python
prompt = prompts("template_name.md", ["var1", "var2", "var3"])
```

```typescript
// TypeScript
const prompt = prompts("template_name.md", ["var1", "var2", "var3"])
```
**How it works:**
1. Templates are stored in `.pvm/prompts/` directory
2. Variables are replaced in order of first appearance (see the sketch after the usage example below)
3. Supports YAML frontmatter for metadata
**Example template** (`.pvm/prompts/assistant.md`):
```markdown
---
version: 1.0.0
description: General assistant prompt
---
You are a {{role}} assistant specialized in {{domain}}.
Task: {{task}}
Please be {{tone}} in your response.
```
**Usage:**
```python
prompt = prompts("assistant.md", [
    "helpful AI",     # {{role}}
    "data analysis",  # {{domain}}
    "analyze sales",  # {{task}}
    "concise"         # {{tone}}
])
```
### `model.complete()` - Execute with Auto-Tracking
Execute any LLM with automatic tracking:
```python
# Python
response = await model.complete(
    model_name="gpt-4",          # Model to use
    prompt="Analyze this text",  # Prompt or previous response
    tag="analysis",              # Tag for tracking
    json_output=MyModel,         # Optional: structured output
    temperature=0.7              # Optional: model parameters
)
```

```typescript
// TypeScript
const response = await model.complete(
  "gpt-4",                // Model to use
  "Analyze this text",    // Prompt or previous response
  "analysis",             // Tag for tracking
  { schema: mySchema },   // Optional: structured output
  { temperature: 0.7 }    // Optional: model parameters
)
```
**Automatic features:**
- Provider detection (OpenAI, Anthropic, Google)
- Execution tracking (tokens, latency, cost)
- SHA-256 prompt hashing (see the sketch below)
- Git-like auto-commits
- Chain detection
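Because the prompt hash is a plain SHA-256, it can be recomputed outside PVM to cross-check an execution record. A minimal sketch, assuming the hash is taken over the raw UTF-8 prompt text (PVM's exact canonicalization may differ):

```python
import hashlib

def prompt_hash(prompt: str) -> str:
    # Assumption: SHA-256 over the raw UTF-8 prompt text; PVM may
    # normalize whitespace or include metadata before hashing.
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()
```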
### Automatic Chaining
Pass a response directly to create execution chains:
```python
# First agent
analysis = await model.complete("gpt-4", prompt1, "analyze")
# Second agent - automatically chains!
summary = await model.complete("claude-3", analysis, "summarize")
# Third agent - chain continues
translation = await model.complete("gemini-pro", summary, "translate")
```
PVM automatically:
- Links executions with a chain ID (exposed in `response.metadata`, as shown below)
- Tracks parent-child relationships
- Preserves context across models
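For example, the linkage can be verified from the response metadata (assuming the `chain_id` key documented under ModelResponse below):

```python
# Chained executions share one chain ID
print(summary.metadata["chain_id"])
print(translation.metadata["chain_id"])  # same chain ID as above
```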
### Native Structured Output
Get typed responses using each provider's native capabilities:
```python
from pydantic import BaseModel

class Analysis(BaseModel):
    sentiment: str
    confidence: float
    keywords: list[str]

# Automatic structured output
result = await model.complete(
    "gpt-4",
    "Analyze: PVM is amazing!",
    "sentiment",
    json_output=Analysis  # Pass a Pydantic model
)

print(result.content.sentiment)   # "positive"
print(result.content.confidence)  # 0.95
print(result.content.keywords)    # ["PVM", "amazing"]
```
**Provider-specific implementations:**
- **OpenAI**: Uses `response_format` with Pydantic models
- **Anthropic**: Uses tool/function calling
- **Google**: Uses `response_schema` with Type constants
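For providers or models without a native structured-output path, a common fallback (not necessarily PVM's) is to validate the raw JSON response with the same Pydantic model:

```python
# Reusing the Analysis model defined above (Pydantic v2 API)
raw = '{"sentiment": "positive", "confidence": 0.95, "keywords": ["PVM"]}'
parsed = Analysis.model_validate_json(raw)  # raises ValidationError on mismatch
```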
### ModelResponse Object
Every execution returns a `ModelResponse` with:
```python
response.content        # The actual response (string or parsed object)
response.model          # Model used
response.execution_id   # Unique execution ID
response.prompt_hash    # SHA-256 of the prompt
response.metadata       # Execution details:
                        #   tag         - your tracking tag
                        #   provider    - detected provider
                        #   chain_id    - chain ID (if chained)
                        #   tokens      - token usage breakdown
                        #   latency_ms  - execution time
                        #   structured  - whether structured output was used

# Methods
str(response)           # Convert to string
response.to_prompt()    # Format for chaining
```
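A short usage note: `to_prompt()` makes chaining explicit when you don't pass the response object directly (tag name below is hypothetical):

```python
follow_up = await model.complete(
    "gpt-4",
    response.to_prompt(),  # same effect as passing `response` directly
    "follow-up"
)
```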
## Full Version Control Features
### Initialize Repository
```bash
pvm init
```
### Add and Track Prompts
```bash
# Add a prompt file
pvm add prompts/assistant.md -m "Add assistant prompt"
# Track executions automatically
python my_agent.py # Uses simplified API
# View execution history
pvm log
# See execution dashboard
pvm dashboard
```
### Branching and Experimentation
```bash
# Create experiment branch
pvm branch experiment/new-approach
pvm checkout experiment/new-approach
# Make changes and test
# ... edit prompts ...
# Merge back when ready
pvm checkout main
pvm merge experiment/new-approach
```
### Analytics and Insights
```bash
# View execution analytics
pvm analytics summary
# See token usage
pvm analytics tokens --days 7
# Track costs
pvm analytics cost --group-by model
```
## Repository Structure
```
.pvm/
├── prompts/           # Prompt templates
│   ├── assistant.md
│   ├── analyzer.md
│   └── summarizer.md
├── config.json        # Repository configuration
├── HEAD               # Current branch reference
└── objects/           # Content-addressed storage
    ├── commits/       # Commit objects
    ├── prompts/       # Prompt versions
    └── executions/    # Execution records
```
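Since templates are plain Markdown files, they can be inspected with ordinary tooling; for example, listing the available templates (path per the layout above):

```python
from pathlib import Path

for template in sorted(Path(".pvm/prompts").glob("*.md")):
    print(template.name)  # assistant.md, analyzer.md, ...
```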
## Advanced Features
### Templates with Complex Variables
```python
# A template with sections, as it would appear in
# .pvm/prompts/complex_template.md
template = """
You are a {{role}}.

## Instructions
{{instructions}}

## Context
{{context}}

## Output Format
{{format}}
"""

# Use with prompts()
prompt = prompts("complex_template.md", [
    "senior data analyst",
    "Analyze trends and patterns",
    "Q4 sales data from 2023",
    "Bullet points with insights"
])
```
### Multi-Provider Execution
```python
providers = ["gpt-4", "claude-3-opus", "gemini-ultra"]

# Run the same prompt across providers
for model_name in providers:
    response = await model.complete(
        model_name,
        prompt,
        f"comparison-{model_name}"
    )
    print(f"{model_name}: {response.content}")
```
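Since `model.complete()` is async, the comparison can also run concurrently with standard `asyncio` (assuming PVM's tracker is safe to call concurrently):

```python
import asyncio

# Fire all provider calls at once and collect the responses in order
responses = await asyncio.gather(*(
    model.complete(m, prompt, f"comparison-{m}") for m in providers
))
```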
### Custom Tracking Metadata
```python
response = await model.complete(
    "gpt-4",
    prompt,
    "analysis",
    temperature=0.3,
    max_tokens=1000,
    # Custom parameters stored in metadata
    user_id="12345",
    session_id="abc-def",
    experiment="v2"
)
```
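If the custom parameters are stored in metadata as described, they should be readable off the response afterwards (exact key layout assumed):

```python
# Assumption: custom kwargs surface under response.metadata
print(response.metadata["experiment"])  # "v2"
```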
### Working with Branches
```python
from pvm import Repository
repo = Repository(".")
# Create and switch branch
await repo.branch("feature/new-prompts")
await repo.checkout("feature/new-prompts")
# Make changes
prompt = prompts("new_assistant.md", ["expert", "task"])
response = await model.complete("gpt-4", prompt, "test")
# Commit and merge
await repo.commit("Test new prompts")
await repo.checkout("main")
await repo.merge("feature/new-prompts")
```
## Environment Setup
Set your API keys:
```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
```
Or use `.env` file:
```env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
```
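If your runtime doesn't pick up `.env` automatically, the standard `python-dotenv` package can load it before PVM runs (an optional step; PVM may handle this itself):

```python
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory into os.environ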
## Execution Analytics
View detailed analytics:
```bash
# Summary statistics
pvm analytics summary
# Token usage over time
pvm analytics tokens --days 30 --group-by model
# Cost analysis
pvm analytics cost --group-by tag
# Chain visualization
pvm analytics chains --limit 10
```
## Testing
Run tests for both implementations:
```bash
# Python tests
pytest tests/
# TypeScript tests
npm test
```
## Examples
### Two-Agent Analysis Pipeline
```python
from pvm import prompts, model
from pydantic import BaseModel
# Define output structures
class SentimentResult(BaseModel):
    sentiment: str
    confidence: float

class Summary(BaseModel):
    title: str
    key_points: list[str]

text = "Customer feedback to analyze..."  # input text (placeholder)

# Agent 1: Sentiment Analysis
sentiment_prompt = prompts("sentiment.md", ["expert analyst", text])
sentiment = await model.complete(
    "gpt-4",
    sentiment_prompt,
    "sentiment-analysis",
    json_output=SentimentResult
)

# Agent 2: Summarization (chained)
summary = await model.complete(
    "claude-3",
    sentiment,  # Chains automatically!
    "summarization",
    json_output=Summary
)
print(f"Sentiment: {sentiment.content.sentiment}")
print(f"Summary: {summary.content.title}")
```
### A/B Testing Prompts
```python
# Shared variables for both prompt versions
variables = ["expert", "analyze this data"]

# Version A
prompt_a = prompts("assistant_v1.md", variables)
response_a = await model.complete("gpt-4", prompt_a, "test-v1")

# Version B
prompt_b = prompts("assistant_v2.md", variables)
response_b = await model.complete("gpt-4", prompt_b, "test-v2")
# Compare results
print(f"Version A tokens: {response_a.metadata['tokens']['total']}")
print(f"Version B tokens: {response_b.metadata['tokens']['total']}")
```
## Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
## License
MIT License - see [LICENSE](LICENSE) for details.
## Links
- [Documentation](docs/)
- [API Reference](docs/api-reference.md)
- [Examples](examples/)
- [Migration Guide](docs/migration.md)
- [Template Guide](docs/template-guide.md)