# PostHog Agent SDK
TypeScript agent framework that wraps the Claude Agent SDK for PostHog's Array desktop app. It features a Git-based task execution system that stores task artifacts alongside your code.
## Quick Start
```bash
pnpm install
pnpm run example
```
## Key Features
- **Git-Based Task Execution**: Plans and artifacts stored in `.posthog/` folders and committed to Git
- **PostHog Integration**: Fetches existing tasks from PostHog API
- **3-Phase Execution**: Research → Plan → Build with automatic progression
- **Branch Management**: Automatic branch creation for planning and implementation
- **Progress Tracking**: Execution status stored in PostHog `TaskRun` records for easy polling
## Usage
```typescript
import { Agent, PermissionMode } from '@posthog/agent';
import type { AgentEvent } from '@posthog/agent';

const agent = new Agent({
  workingDirectory: "/path/to/repo",
  posthogApiUrl: "https://app.posthog.com",
  posthogApiKey: process.env.POSTHOG_API_KEY, // Used for both API and MCP
  posthogProjectId: 1,
  onEvent: (event) => {
    // Streamed updates for responsive UIs
    if (event.type !== 'token') {
      handleLiveEvent(event);
    }
  },
});

// Run a task
const taskId = "task_abc123";
const task = await agent.getPostHogClient()?.fetchTask(taskId);
// The optional chain yields undefined when no PostHog client is configured
if (!task) throw new Error(`Could not fetch task ${taskId}`);

await agent.runTask(task, {
  repositoryPath: "/path/to/repo",
  permissionMode: PermissionMode.ACCEPT_EDITS,
  isCloudMode: false,
  createPR: true, // Optional: create PR after build. This setting has no effect if running in cloud mode.
  autoProgress: true,
});
```
For local MCP development:
```typescript
const agent = new Agent({
  workingDirectory: "/path/to/repo",
  posthogMcpUrl: 'http://localhost:8787/mcp',
});
```
## Task Execution
Each task execution creates Git branches and follows a 3-phase approach:
1. **Research Phase**: Analyzes the codebase and may generate clarifying questions
2. **Planning Phase**: Creates an implementation plan in `.posthog/{id}/plan.md` on branch `posthog/task-{id}-planning`
3. **Build Phase**: Implements code changes on branch `posthog/task-{id}-implementation`
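The branch naming scheme above can be sketched as a small helper. This is only an illustration: `taskBranchNames` is a hypothetical function, not part of `@posthog/agent`, and it assumes nothing beyond the two branch patterns listed above.

```typescript
// Hypothetical helper (not part of @posthog/agent): derives the branch names
// described in the phase list from a task ID.
interface TaskBranchNames {
  planning: string;
  implementation: string;
}

function taskBranchNames(taskId: string): TaskBranchNames {
  if (taskId.trim() === "") {
    throw new Error("taskId must be a non-empty string");
  }
  return {
    planning: `posthog/task-${taskId}-planning`,
    implementation: `posthog/task-${taskId}-implementation`,
  };
}

const branches = taskBranchNames("abc123");
console.log(branches.planning);       // posthog/task-abc123-planning
console.log(branches.implementation); // posthog/task-abc123-implementation
```

A helper like this can be handy when a UI needs to link to the planning or implementation branch without inspecting the repository.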
## File System
```
your-repo/
├── .posthog/
│   ├── README.md
│   ├── .gitignore
│   └── {task-id}/
│       ├── plan.md
│       ├── questions.json (if research phase generated questions)
│       └── context.md (optional)
└── (your code)
```
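If you need to locate these artifacts from your own tooling, the layout above maps to paths like the following. This is a minimal sketch; `taskArtifactPaths` is a hypothetical helper rather than an export of `@posthog/agent`, and the paths are relative to the repository root.

```typescript
// Hypothetical helper (not part of @posthog/agent): builds repo-relative paths
// for the per-task artifacts shown in the directory tree.
interface TaskArtifactPaths {
  dir: string;
  plan: string;
  questions: string; // exists only if the research phase generated questions
  context: string;   // optional file
}

function taskArtifactPaths(taskId: string): TaskArtifactPaths {
  const dir = `.posthog/${taskId}`;
  return {
    dir,
    plan: `${dir}/plan.md`,
    questions: `${dir}/questions.json`,
    context: `${dir}/context.md`,
  };
}

console.log(taskArtifactPaths("abc123").plan); // .posthog/abc123/plan.md
```

Because `questions.json` and `context.md` are optional, callers should check that a file exists before reading it.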
## Progress Updates
Progress for each task execution is persisted to PostHog's `TaskRun` model, so UIs can poll for updates without relying on streaming hooks:
```typescript
const agent = new Agent({
  workingDirectory: repoPath,
  posthogApiUrl: "https://app.posthog.com",
  posthogApiKey: process.env.POSTHOG_API_KEY,
  posthogProjectId: 1,
});

const poller = setInterval(async () => {
  const client = agent.getPostHogClient();
  const runs = await client?.listTaskRuns(taskId);
  // Newest run first
  const latestRun = runs?.sort((a, b) =>
    new Date(b.created_at).getTime() - new Date(a.created_at).getTime()
  )[0];

  if (latestRun) {
    // Fetch logs from S3 using presigned URL
    const logs = await client?.fetchTaskRunLogs(latestRun);
    renderProgress(latestRun.status, logs || []);
  }
}, 3000);

try {
  await agent.runTask(task, { repositoryPath: repoPath });
} finally {
  clearInterval(poller);
}

// Live stream still available through the onEvent hook
function handleLiveEvent(event: AgentEvent) {
  switch (event.type) {
    case 'status':
      // optimistic UI update
      break;
    case 'error':
      notifyError(event.message);
      break;
    default:
      break;
  }
}
```
> Prefer streaming updates? Pass an `onEvent` handler when constructing the agent to keep receiving real-time events while progress is also written to PostHog.
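The polling example picks the newest run by sorting, which reorders the array in place. If the list is shared with other code, a non-mutating selection may be preferable. The sketch below assumes only that each run has a `created_at` timestamp string, as used above; `TaskRunLike` and `pickLatestRun` are hypothetical names, and the status values are purely illustrative.

```typescript
// Assumed minimal shape: only the fields used in the polling example.
interface TaskRunLike {
  status: string;
  created_at: string; // assumed to be an ISO 8601 timestamp
}

// Returns the most recently created run, or undefined for an empty or missing list.
// Unlike Array.prototype.sort, this leaves the input array untouched.
function pickLatestRun<T extends TaskRunLike>(
  runs: readonly T[] | undefined
): T | undefined {
  if (!runs || runs.length === 0) return undefined;
  return runs.reduce((latest, run) =>
    Date.parse(run.created_at) > Date.parse(latest.created_at) ? run : latest
  );
}

// Illustrative data only; real status values come from the PostHog API.
const newest = pickLatestRun([
  { status: "first", created_at: "2025-01-01T10:00:00Z" },
  { status: "second", created_at: "2025-01-02T09:30:00Z" },
]);
console.log(newest?.status); // second
```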
## Requirements
- PNPM
- Git repository
- PostHog API access
- Claude API access via `@anthropic-ai/claude-agent-sdk`
## Configuration Options
You can customize behavior using `TaskExecutionOptions`:
```ts
await agent.runTask(task, {
  repositoryPath: "/path/to/repo",
  permissionMode: PermissionMode.ACCEPT_EDITS, // or PLAN, DEFAULT, BYPASS
  isCloudMode: false, // local execution with pauses between phases
  autoProgress: true, // automatically progress through phases
  queryOverrides: {
    model: 'claude-sonnet-4-5-20250929',
    temperature: 0.7
  }
});
```
## Fine-Grained Permissions
For advanced control over agent actions, you can provide a `canUseTool` callback that intercepts every tool use during the **build phase** (for task execution) or **direct run calls**. This allows you to implement custom approval flows, logging, or restrictions.
See the [Claude Agent SDK Permissions docs](https://docs.claude.com/en/api/agent-sdk/permissions) for more details.
### Per-Agent Configuration
Apply the same permission hook to all task executions and direct runs:
```typescript
import { Agent } from '@posthog/agent';
import type { PermissionResult } from '@posthog/agent';

const agent = new Agent({
  workingDirectory: "/path/to/repo",
  posthogApiUrl: "https://app.posthog.com",
  posthogApiKey: process.env.POSTHOG_API_KEY,
  posthogProjectId: 1,
  canUseTool: async (toolName, input, { signal, suggestions }) => {
    // Block destructive commands
    if (toolName === 'Bash' && input.command?.includes('rm -rf')) {
      return {
        behavior: 'deny',
        message: 'Destructive rm -rf commands are not allowed',
        interrupt: true
      };
    }

    // Allow everything else
    return {
      behavior: 'allow',
      updatedInput: input
    };
  }
});
```
### Per-Task Configuration
Override permissions for specific tasks (only applied during build phase):
```typescript
await agent.runTask(task, {
  repositoryPath: "/path/to/repo",
  permissionMode: PermissionMode.DEFAULT,
  canUseTool: async (toolName, input, { signal, suggestions }) => {
    // Custom approval UI
    const approved = await showApprovalDialog({
      tool: toolName,
      input: input,
      suggestions: suggestions // Permission updates for "always allow"
    });

    if (approved.action === 'allow') {
      return {
        behavior: 'allow',
        updatedInput: approved.modifiedInput || input,
        updatedPermissions: approved.rememberChoice ? suggestions : undefined
      };
    }

    return {
      behavior: 'deny',
      message: approved.reason || 'User denied permission',
      interrupt: !approved.continueWithGuidance
    };
  }
});
```
### Direct Run Example
For one-off queries with custom permissions:
```typescript
const result = await agent.run("Fix the authentication bug", {
  repositoryPath: "/path/to/repo",
  permissionMode: PermissionMode.DEFAULT,
  canUseTool: async (toolName, input, { signal }) => {
    console.log(`Agent wants to use ${toolName}:`, input);

    // Simple approval logic
    if (toolName === 'Write' || toolName === 'Edit') {
      const allowedFiles = ['src/', 'tests/'];
      const filePath = input.file_path || input.path;
      const isAllowed = allowedFiles.some(prefix => filePath?.startsWith(prefix));

      if (!isAllowed) {
        return {
          behavior: 'deny',
          message: `Can only modify files in: ${allowedFiles.join(', ')}`
        };
      }
    }

    return { behavior: 'allow', updatedInput: input };
  }
});
```
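A plain `startsWith` check accepts paths such as `src/../secrets.env`. The sketch below shows one way to tighten it. `isPathAllowed` is a hypothetical helper, and it assumes the tool reports paths relative to the repository root, as the example above does; if your tools report absolute paths, convert them to repo-relative paths first.

```typescript
// Hypothetical helper: a stricter variant of the prefix check in the example.
// Assumes filePath is relative to the repository root.
function isPathAllowed(
  filePath: string | undefined,
  allowedPrefixes: readonly string[]
): boolean {
  if (!filePath) return false;

  // Normalize Windows separators and drop a leading "./"
  const normalized = filePath.replace(/\\/g, "/").replace(/^\.\//, "");

  // Reject absolute paths and any ".." segment that could escape the prefix
  if (normalized.startsWith("/")) return false;
  if (normalized.split("/").includes("..")) return false;

  return allowedPrefixes.some((prefix) => normalized.startsWith(prefix));
}

const allowed = ["src/", "tests/"];
console.log(isPathAllowed("src/auth/login.ts", allowed));  // true
console.log(isPathAllowed("src/../secrets.env", allowed)); // false
```

Inside `canUseTool`, the result of this check can replace the `isAllowed` expression without changing the rest of the callback.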
### Available Tool Names
The `canUseTool` callback receives one of these tool names:
- **Read-only**: `Read`, `Glob`, `Grep`, `WebFetch`, `WebSearch`, `ListMcpResources`, `ReadMcpResource`
- **Modifications**: `Write`, `Edit`, `NotebookEdit`
- **Execution**: `Bash`, `BashOutput`, `KillBash`, `Task`
- **Planning**: `ExitPlanMode`, `TodoWrite`
**Note**: Research and planning phases have fixed, read-only tool sets. The `canUseTool` hook only applies to the build phase and direct run calls.
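The categories above can drive a simple policy. The following is a minimal sketch using only the tool names listed in this section; `categorizeTool`, `readOnlyPolicy`, and the `PermissionDecision` type are hypothetical and merely mirror the result shape used in the earlier examples, not the SDK's own types.

```typescript
type ToolCategory = "read-only" | "modification" | "execution" | "planning" | "unknown";

// Tool names taken from the list in this section.
const TOOL_CATEGORIES = new Map<string, ToolCategory>([
  ["Read", "read-only"], ["Glob", "read-only"], ["Grep", "read-only"],
  ["WebFetch", "read-only"], ["WebSearch", "read-only"],
  ["ListMcpResources", "read-only"], ["ReadMcpResource", "read-only"],
  ["Write", "modification"], ["Edit", "modification"], ["NotebookEdit", "modification"],
  ["Bash", "execution"], ["BashOutput", "execution"],
  ["KillBash", "execution"], ["Task", "execution"],
  ["ExitPlanMode", "planning"], ["TodoWrite", "planning"],
]);

function categorizeTool(toolName: string): ToolCategory {
  return TOOL_CATEGORIES.get(toolName) ?? "unknown";
}

// Local stand-in for the result shape shown in the examples (an assumption).
type PermissionDecision =
  | { behavior: "allow"; updatedInput: Record<string, unknown> }
  | { behavior: "deny"; message: string };

// Allows read-only and planning tools; denies everything else, including unknown names.
function readOnlyPolicy(
  toolName: string,
  input: Record<string, unknown>
): PermissionDecision {
  const category = categorizeTool(toolName);
  if (category === "read-only" || category === "planning") {
    return { behavior: "allow", updatedInput: input };
  }
  return {
    behavior: "deny",
    message: `${toolName} (${category}) is blocked by the read-only policy`,
  };
}

console.log(readOnlyPolicy("Grep", { pattern: "TODO" }).behavior); // allow
console.log(readOnlyPolicy("Bash", { command: "ls" }).behavior);   // deny
```

Denying unknown names by default means any tool missing from the list is blocked until you explicitly categorize it.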