@dollhousemcp/mcp-server
Version:
DollhouseMCP - A Model Context Protocol (MCP) server that enables dynamic AI persona management from markdown files, allowing Claude and other compatible AI assistants to activate and switch between different behavioral personas.
603 lines (442 loc) • 12.8 kB
YAML
name: token-estimation-guidelines
type: memory
description: Guidelines for estimating and optimizing token usage in auto-load memories
version: 1.0.0
author: DollhouseMCP
category: documentation
tags:
- tokens
- optimization
- guidelines
- auto-load
- performance
triggers:
- token-estimation
- token-guidelines
- optimize-tokens
- token-budget
- token-usage
autoLoad: false
priority: 999
retention: permanent
privacyLevel: public
searchable: true
trustLevel: VALIDATED
# Token Estimation Guidelines for Auto-Load Memories
## Understanding Tokens
### What Are Tokens?
Tokens are the fundamental units that AI models process. Think of them as "word pieces":
- **1 token** ≈ 4 characters in English
- **1 token** ≈ 0.75 words (conservative estimate)
- **100 tokens** ≈ 75 words ≈ 1 paragraph
- **1,000 tokens** ≈ 750 words ≈ 1-2 pages
**Examples**:
```
"Hello, world!" = 4 tokens
"The quick brown fox jumps" = 5 tokens
"DollhouseMCP" = 3 tokens (Doll-house-MCP)
```
### Why Tokens Matter for Auto-Load
Auto-load memories consume tokens at **every startup**:
- **Token Budget**: Default 5,000 tokens (~3,750 words)
- **Impact**: Larger memories = less room for other memories
- **Performance**: More tokens = longer startup time
- **Cost**: Some AI services charge per token
**Goal**: Maximize value per token.
## DollhouseMCP Estimation Method
### The Formula
```typescript
estimatedTokens = Math.ceil(wordCount / 0.75)
```
**Conservative estimate**: 1 token per 0.75 words
### Example Calculation
```
Content: "The quick brown fox jumps over the lazy dog"
Word count: 9 words
Estimated tokens: 9 / 0.75 = 12 tokens
```
### Accuracy
- **Conservative**: Tends to overestimate slightly
- **English-optimized**: Best for English text
- **Code-aware**: May underestimate for code-heavy content
**Comparison to other estimators**:
- DollhouseMCP: 12 tokens (conservative)
- Actual (Claude): 10-11 tokens (varies by model)
- OpenAI tiktoken: 9-10 tokens
**Recommendation**: Use DollhouseMCP estimate for budgeting, actual usage will be slightly lower.
## Token Size Categories
### Micro (0-500 tokens)
**~0-375 words, ~0-0.5 pages**
**Best For**:
- Quick reference cards
- Cheat sheets
- Key definitions
- Contact lists
**Example**:
```yaml
name: team-contacts
tokens: 350
content: |
# Team Contacts
- PM: Jane (@jane)
- Tech Lead: Bob (@bob)
- DevOps: Alice (@alice)
## Escalation
1. Team Lead → Director → VP
2. After hours: On-call rotation (PagerDuty)
```
### Small (500-1,500 tokens)
**~375-1,125 words, ~0.5-1.5 pages**
**Best For**:
- Coding standards summaries
- API endpoint lists
- Configuration options
- Brief project overviews
**Example**:
```yaml
name: coding-standards
tokens: 1,200
content: |
# Coding Standards
## TypeScript
- Use strict mode
- No `any` types
- Prefer interfaces over types
## Testing
- Coverage >80%
- Jest for unit tests
- Playwright for E2E
## PR Process
1. Feature branch from main
2. Write tests first
3. Get 2 approvals
4. Squash merge
```
### Medium (1,500-5,000 tokens)
**~1,125-3,750 words, ~1.5-5 pages**
**Best For**:
- Project architecture overviews
- Domain knowledge primers
- Team onboarding guides
- Detailed API references
**Example**:
```yaml
name: project-architecture
tokens: 3,500
content: |
# Project Architecture
## System Overview
MyApp is a 3-tier web application...
## Components
### Frontend (React)
- User interface
- State management (Redux)
- API client
### Backend (Node.js)
- REST API (Express)
- Business logic
- Authentication (JWT)
### Database (PostgreSQL)
- User data
- Transaction logs
- Analytics
## Deployment
- Dev: AWS ECS (fargate)
- Staging: AWS ECS (fargate)
- Prod: AWS ECS (EC2 reserved)
[... detailed sections ...]
```
### Large (5,000-10,000 tokens)
**~3,750-7,500 words, ~5-10 pages**
**Best For**:
- Comprehensive documentation
- Extensive domain knowledge
- Multi-project context
**Warning**: May trigger warnings, consider splitting.
**Example**:
```yaml
name: healthcare-domain-knowledge
tokens: 7,800
content: |
# Healthcare Domain Knowledge
## Medical Terminology
[Extensive glossary...]
## Regulatory Compliance
[HIPAA, GDPR, etc...]
## Clinical Workflows
[Patient intake, treatment, discharge...]
## Coding Systems
[ICD-10, CPT, SNOMED...]
```
### Very Large (10,000+ tokens)
**~7,500+ words, ~10+ pages**
**Not Recommended for Auto-Load**
**Why**:
- Exceeds default budget alone
- Slows startup significantly
- Likely contains unnecessary detail
**Solution**: Split into focused memories or use on-demand loading.
## Optimization Strategies
### Strategy 1: Use Bullet Points
**Before** (verbose):
```
The application uses a three-tier architecture. The frontend is
implemented using React and handles all user interactions. The
backend is built with Node.js and Express and provides a REST API
for the frontend to consume. The database layer uses PostgreSQL
to store persistent data.
```
**Tokens**: ~60
**After** (bullet points):
```
Architecture:
- Frontend: React (user interface)
- Backend: Node.js + Express (REST API)
- Database: PostgreSQL (persistent storage)
```
**Tokens**: ~30 (50% reduction)
### Strategy 2: Remove Examples
**Before**:
```
Use TypeScript strict mode.
Bad:
function add(a, b) { return a + b; }
Good:
function add(a: number, b: number): number { return a + b; }
```
**Tokens**: ~40
**After**:
```
Use TypeScript strict mode (type all parameters and returns).
```
**Tokens**: ~12 (70% reduction)
### Strategy 3: Link Instead of Embed
**Before**:
```
[Paste entire API documentation - 5,000 tokens]
```
**After**:
```
API Docs: https://docs.company.com/api
Quick Reference:
- GET /users - List users
- POST /users - Create user
- PUT /users/:id - Update user
- DELETE /users/:id - Delete user
```
**Tokens**: ~50 (99% reduction)
### Strategy 4: Use Abbreviations
**Before**:
```
The application programming interface endpoint for creating
a new user account requires authentication via JSON Web Token
in the Authorization header following the Bearer authentication
scheme.
```
**Tokens**: ~40
**After**:
```
User creation API endpoint requires JWT Bearer auth in Authorization header.
```
**Tokens**: ~15 (62% reduction)
### Strategy 5: Remove Redundancy
**Before**:
```
The frontend uses React. React is a JavaScript library for
building user interfaces. We chose React because it's popular
and has a large ecosystem of libraries and tools.
```
**Tokens**: ~35
**After**:
```
Frontend: React (popular, large ecosystem)
```
**Tokens**: ~8 (77% reduction)
### Strategy 6: Structured Lists Instead of Prose
**Before**:
```
Our deployment process starts with creating a feature branch,
then you write your code and tests, after that you open a pull
request and get two approvals, and finally you merge to main
which triggers automatic deployment to staging and then after
QA approval it goes to production.
```
**Tokens**: ~60
**After**:
```
Deployment:
1. Create feature branch
2. Write code + tests
3. Open PR (2 approvals required)
4. Merge → staging (auto)
5. QA approval → production
```
**Tokens**: ~30 (50% reduction)
## Token Budgeting
### Default Budget (5,000 tokens)
**Recommended Distribution**:
```yaml
# System baseline (1,000 tokens)
dollhousemcp-baseline-knowledge: 1,000
# Organizational context (1,500 tokens)
company-coding-standards: 800
security-policies: 700
# Team context (1,500 tokens)
team-patterns: 1,000
domain-knowledge: 500
# Project context (1,000 tokens)
project-architecture: 600
current-sprint: 400
# Total: 5,000 tokens
```
### Expanded Budget (10,000 tokens)
For larger teams or complex projects:
```yaml
autoLoad:
maxTokenBudget: 10000
```
**Recommended Distribution**:
```yaml
# System (1,000)
dollhousemcp-baseline-knowledge: 1,000
# Organizational (2,500)
company-standards: 1,200
security-compliance: 800
architecture-principles: 500
# Team (3,500)
team-patterns: 1,500
domain-knowledge: 1,000
api-conventions: 1,000
# Project (2,500)
project-architecture: 1,500
current-sprint: 600
recent-decisions: 400
# Reference (500)
error-codes: 300
common-commands: 200
# Total: 10,000 tokens
```
### Aggressive Budget (15,000+ tokens)
**Warning**: May impact startup performance.
**Use Cases**:
- Highly specialized domains (medical, legal, financial)
- Large enterprise with complex context
- Documentation-heavy projects
**Recommendation**: Monitor startup time, consider splitting into multiple smaller memories instead.
## Measuring Token Usage
### Method 1: Validation Command
```bash
dollhouse validate memory my-memory
# Output includes:
# "Estimated tokens: 1,234"
```
### Method 2: Startup Logs
```bash
# Look for auto-load summary in logs:
# "[ServerStartup] Auto-load complete: 5 memories loaded (~3,200 tokens)"
```
### Method 3: Bulk Analysis
```bash
# Get token counts for all auto-load memories
for memory in $(dollhouse list memories --filter autoLoad=true --json | jq -r '.[].name'); do
echo "Memory: $memory"
dollhouse validate memory "$memory" | grep "Estimated tokens"
done
```
### Method 4: Configuration Review
```bash
# Check current budget
dollhouse config show autoLoad.maxTokenBudget
# Check budget utilization
dollhouse config show autoLoad | grep -E "(maxTokenBudget|totalLoaded)"
```
## Performance Impact
### Startup Time
Token count affects startup time:
- **1,000 tokens**: +5ms
- **5,000 tokens**: +25ms (default)
- **10,000 tokens**: +50ms
- **20,000 tokens**: +100ms (not recommended)
**Guideline**: Keep total under 10,000 tokens for fast startup.
### Memory Usage
Each token consumes ~4 bytes in memory:
- **5,000 tokens**: ~20 KB
- **10,000 tokens**: ~40 KB
- **50,000 tokens**: ~200 KB
**Guideline**: Memory usage is negligible for typical budgets.
### Cost Impact (for paid AI services)
Some AI services charge per token:
- **Claude**: $0.015 per 1K tokens (input)
- **GPT-4**: $0.03 per 1K tokens (input)
**Example**:
- 5,000 tokens per startup
- 100 startups per day
- 500,000 tokens per day
- Claude cost: $7.50/day = $225/month
**Guideline**: For high-frequency usage, optimize aggressively.
## Troubleshooting
### Issue 1: Budget Exceeded
**Symptom**: "Token budget reached, loaded X memories, skipping remaining Y"
**Diagnosis**:
```bash
# List auto-load memories with estimates
dollhouse list memories --filter autoLoad=true --format detailed
```
**Solutions**:
1. Increase budget: `autoLoad.maxTokenBudget: 10000`
2. Optimize large memories (see optimization strategies)
3. Reduce priority of less-important memories
4. Remove rarely-used memories from auto-load
### Issue 2: Memory Too Large Warning
**Symptom**: "Memory 'xyz' is very large (~12,000 tokens)"
**Diagnosis**:
```bash
# Validate specific memory
dollhouse validate memory xyz
# Check word count
wc -w ~/.dollhouse/portfolio/memories/xyz.yaml
```
**Solutions**:
1. Split into multiple focused memories
2. Remove verbose examples
3. Link to external docs instead of embedding
4. Use bullet points instead of paragraphs
5. Set `maxSingleMemoryTokens` limit
### Issue 3: Slow Startup
**Symptom**: Startup takes >500ms
**Diagnosis**:
```bash
# Check total tokens loaded
# Look in logs: "[ServerStartup] Auto-load complete: ... (~X tokens)"
```
**Solutions**:
1. Reduce total token budget
2. Optimize memories (target 60-80% budget utilization)
3. Remove auto-load flag from rarely-used memories
4. Consider caching (already implemented in v1.9.25+)
## Quick Reference
| Size | Tokens | Words | Pages | Best For |
|------|--------|-------|-------|----------|
| Micro | 0-500 | 0-375 | 0-0.5 | Quick reference, contact lists |
| Small | 500-1.5K | 375-1.1K | 0.5-1.5 | Standards summary, API lists |
| Medium | 1.5K-5K | 1.1K-3.8K | 1.5-5 | Architecture, domain knowledge |
| Large | 5K-10K | 3.8K-7.5K | 5-10 | Comprehensive docs (split recommended) |
| Very Large | 10K+ | 7.5K+ | 10+ | Not recommended for auto-load |
## Token Optimization Checklist
- [ ] Use bullet points instead of paragraphs
- [ ] Remove verbose examples (link to docs instead)
- [ ] Abbreviate common terms (API, JWT, DB, etc.)
- [ ] Use structured lists for processes
- [ ] Remove redundant explanations
- [ ] Link to external docs for details
- [ ] Use tables for comparisons
- [ ] Remove "filler" words (the, a, an, that, which)
- [ ] Target 60-80% budget utilization
- [ ] Review and optimize quarterly
## Related Documentation
- [How to Create Custom Auto-Load Memories](./how-to-create-custom-auto-load-memories.yaml)
- [Priority Best Practices for Teams](./priority-best-practices-for-teams.yaml)
**Last Updated**: v1.9.25 (October 2025)