# Extra Piper Voices - Implementation Summary

## Problem Statement

Users encountered difficulties when trying to download and use extra custom Piper voices (jenny, kristin, 16Speakers):

1. **Download Issue**: `tracy.onnx` file was incorrectly named (should be `16Speakers.onnx`)
2. **Voice Registration Issue**: Custom voices couldn't be switched to because they didn't match the Piper voice pattern detection (`*_*-*`)
3. **Multi-Speaker Support**: The 16Speakers model contains 16 individual voices that couldn't be accessed individually

## Solution Implemented

### 1. Fixed Download Script ✅

**File**: `.claude/hooks/download-extra-voices.sh`

**Changes**:
- Already uses correct filename `16Speakers.onnx` (not `tracy.onnx`)
- Downloads three custom voices:
  - `kristin.onnx` (64MB) - US English female
  - `jenny.onnx` (64MB) - UK English female with Irish accent
  - `16Speakers.onnx` (77MB) - Multi-speaker with 12 US + 4 UK voices

### 2. Enhanced Voice Manager ✅

**File**: `.claude/hooks/voice-manager.sh`

**Changes**:
- **Before**: Only detected Piper voices matching pattern `*_*-*` (e.g., `en_US-lessac-medium`)
- **After**: Scans voice directory for ALL `.onnx` files, including custom voices

**New Logic**:
```bash
# For Piper provider:
# 1. Check standard voice directory for .onnx files (jenny, kristin, 16Speakers)
# 2. If not found, check multi-speaker registry (Cori_Samuel, Rose_Ibex, etc.)
# 3. If multi-speaker found, store model + speaker ID separately
```

**Result**: Users can now switch to custom voices:
- `/agent-vibes:switch jenny`
- `/agent-vibes:switch kristin`
- `/agent-vibes:switch 16Speakers`

### 3. Multi-Speaker Registry ✅

**File**: `.claude/hooks/piper-multispeaker-registry.sh` (NEW)

**Purpose**: Maps individual speaker names to model files and speaker IDs

**Features**:
- Registry of all 16 speakers in the 16Speakers model
- Functions to lookup model and speaker ID by name
- Function to list all multi-speaker voices with descriptions

**Example Registry Entry**:
```bash
"Cori_Samuel:16Speakers:0:US English Female"
"Rose_Ibex:16Speakers:8:US English Female"
"Paul_Hampton:16Speakers:12:UK English Male"
```

**Usage**:
```bash
/agent-vibes:switch Cori_Samuel  # Uses 16Speakers.onnx with speaker ID 0
/agent-vibes:switch Rose_Ibex    # Uses 16Speakers.onnx with speaker ID 8
```

### 4. Multi-Speaker TTS Support ✅

**File**: `.claude/hooks/play-tts-piper.sh`

**Changes**:

**Voice Resolution**:
```bash
# Check for multi-speaker voice configuration
if [[ -f "tts-piper-model.txt" ]] && [[ -f "tts-piper-speaker-id.txt" ]]; then
  VOICE_MODEL=$(cat tts-piper-model.txt)       # e.g., "16Speakers"
  SPEAKER_ID=$(cat tts-piper-speaker-id.txt)   # e.g., "8"
fi
```

**TTS Synthesis**:
```bash
if [[ -n "$SPEAKER_ID" ]]; then
  # Multi-speaker: Pass speaker ID to Piper
  echo "$TEXT" | piper --model "$VOICE_PATH" --speaker "$SPEAKER_ID" --output_file "$OUTPUT"
else
  # Single-speaker: Standard synthesis
  echo "$TEXT" | piper --model "$VOICE_PATH" --output_file "$OUTPUT"
fi
```

## File Changes Summary

| File | Status | Changes |
|------|--------|---------|
| `download-extra-voices.sh` | ✅ Already Correct | Uses `16Speakers.onnx` (not tracy) |
| `voice-manager.sh` | ✅ Updated | Added Piper voice directory scanning, multi-speaker support |
| `piper-multispeaker-registry.sh` | ✅ Created | New registry for 16 speaker mappings |
| `play-tts-piper.sh` | ✅ Updated | Added multi-speaker voice resolution and speaker ID support |
| `docs/voice-registration-fix.md` | ✅ Created | Detailed problem analysis and solution design |
| `docs/extra-voices-implementation-summary.md` | ✅ Created | This file |

## Available Voices After Implementation

### Standard Custom Voices

- `jenny` - UK English female with Irish accent (CC BY)
- `kristin` - US English female (Public Domain)
- `16Speakers` - Access all 16 speakers at once (uses first speaker by default)

### Individual Multi-Speaker Voices (16Speakers Model)

**US English Speakers (12)**:
- `Cori_Samuel` (Female, ID: 0)
- `Kara_Shallenberg` (Female, ID: 1)
- `Kristin_Hughes` (Female, ID: 2)
- `Maria_Kasper` (Female, ID: 3)
- `Mike_Pelton` (Male, ID: 4)
- `Mark_Nelson` (Male, ID: 5)
- `Michael_Scherer` (Male, ID: 6)
- `James_K_White` (Male, ID: 7)
- `Rose_Ibex` (Female, ID: 8)
- `progressingamerica` (Male, ID: 9)
- `Steve_C` (Male, ID: 10)
- `Owlivia` (Female, ID: 11)

**UK English Speakers (4)**:
- `Paul_Hampton` (Male, ID: 12)
- `Jennifer_Dorr` (Female, ID: 13)
- `Emily_Cripps` (Female, ID: 14)
- `Martin_Clifton` (Male, ID: 15)

## Usage Examples

### Download Extra Voices

```bash
# Via MCP
mcp__agentvibes__download_extra_voices()

# Via slash command (when implemented)
/agent-vibes:provider download
```

### Switch to Custom Voice

```bash
# UK English female with Irish accent
/agent-vibes:switch jenny

# US English female
/agent-vibes:switch kristin
```

### Switch to Multi-Speaker Voice

```bash
# Use a specific speaker from 16Speakers model
/agent-vibes:switch Cori_Samuel   # US Female speaker
/agent-vibes:switch Rose_Ibex     # US Female speaker
/agent-vibes:switch Paul_Hampton  # UK Male speaker
```

### List Available Voices

```bash
/agent-vibes:list
```

Output will show:
- Standard Piper voices (en_US-lessac-medium, etc.)
- Custom voices (jenny, kristin, 16Speakers)
- Multi-speaker voices (Cori_Samuel, Rose_Ibex, etc.)

## Technical Details

### State Files for Multi-Speaker Voices

When you switch to a multi-speaker voice, three files are created:

1. `.claude/tts-voice.txt` - Stores speaker name (e.g., "Cori_Samuel")
2. `.claude/tts-piper-model.txt` - Stores model file (e.g., "16Speakers")
3. `.claude/tts-piper-speaker-id.txt` - Stores speaker ID (e.g., "0")

### Voice Lookup Priority (Piper Provider)

1. **Standard voice files**: Check voice directory for `<name>.onnx`
2. **Multi-speaker registry**: Check registry for speaker name
3. **Error**: Show available voices if not found

### Piper TTS Command Format

**Single-speaker voice**:
```bash
piper --model /path/to/jenny.onnx --output_file output.wav
```

**Multi-speaker voice**:
```bash
piper --model /path/to/16Speakers.onnx --speaker 8 --output_file output.wav
```

## Testing Checklist

- [x] Download script uses correct filename (16Speakers.onnx)
- [x] Voice manager scans directory for custom voices
- [x] Multi-speaker registry created with all 16 speakers
- [x] Voice manager checks multi-speaker registry
- [x] play-tts-piper.sh reads speaker ID from config
- [x] play-tts-piper.sh passes --speaker flag to Piper
- [ ] Test: Download extra voices via MCP
- [ ] Test: Switch to jenny voice
- [ ] Test: Switch to kristin voice
- [ ] Test: Switch to Cori_Samuel (multi-speaker)
- [ ] Test: Play TTS with each voice
- [ ] Test: List voices shows all custom + multi-speaker

## Next Steps

1. **Test the implementation** with actual voice downloads and switching
2. **Update voice listing** to show custom and multi-speaker voices in organized sections
3. **Add MCP methods** for listing multi-speaker voices
4. **Create user documentation** with voice preview samples

## Benefits

✅ **Users can now**:
- Download high-quality custom voices with one command
- Switch to custom voices (jenny, kristin) just like standard voices
- Access individual speakers from multi-speaker models
- Choose from 19 total custom voices (3 models + 16 individual speakers)

✅ **System improvements**:
- Consistent voice switching experience across all voice types
- Automatic multi-speaker detection and configuration
- Clear error messages with available voice listings
- Support for future multi-speaker models
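
## Appendix: Lookup Sketch

The voice lookup priority described under Technical Details (voice directory first, then the multi-speaker registry, then an error) could be sketched roughly as follows. This is a minimal illustration, not the code in `voice-manager.sh` or `piper-multispeaker-registry.sh`: the names `MULTISPEAKER_REGISTRY`, `lookup_speaker`, and `resolve_voice`, and the `single`/`multi` output format, are hypothetical. Only the `Name:Model:SpeakerID:Description` entry format and the three-step priority come from this document.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: all function and variable names here are illustrative.

# One "Name:Model:SpeakerID:Description" entry per line (subset of the 16 speakers).
MULTISPEAKER_REGISTRY='Cori_Samuel:16Speakers:0:US English Female
Rose_Ibex:16Speakers:8:US English Female
Paul_Hampton:16Speakers:12:UK English Male'

# Print "<model> <speaker_id>" for a registered speaker name; return 1 if unknown.
lookup_speaker() {
  local wanted="$1" name model id desc
  while IFS=':' read -r name model id desc; do
    if [ "$name" = "$wanted" ]; then
      printf '%s %s\n' "$model" "$id"
      return 0
    fi
  done <<EOF
$MULTISPEAKER_REGISTRY
EOF
  return 1
}

# Resolve a voice name using the documented priority:
#   1. <name>.onnx in the voice directory
#   2. multi-speaker registry
#   3. error
resolve_voice() {
  local voice="$1" voice_dir="$2" match
  if [ -f "$voice_dir/$voice.onnx" ]; then
    printf 'single %s\n' "$voice"
  elif match=$(lookup_speaker "$voice"); then
    printf 'multi %s\n' "$match"
  else
    printf 'Voice not found: %s\n' "$voice" >&2
    return 1
  fi
}
```

With a voice directory that contains `jenny.onnx` but no `Rose_Ibex.onnx`, `resolve_voice jenny "$dir"` prints `single jenny` and `resolve_voice Rose_Ibex "$dir"` prints `multi 16Speakers 8`. A caller could then write the model and speaker ID to the state files listed under Technical Details.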