UNPKG

agentvibes

Version:

Now your AI Agents can finally talk back! Professional TTS voice for Claude Code and Claude Desktop (via MCP) with multi-provider support.

146 lines (104 loc) 3.79 kB
# 🎤 Piper TTS Installation for AgentVibes This directory contains installation scripts for [Piper TTS](https://github.com/rhasspy/piper), a fast, local neural text-to-speech system. ## Why Piper? -**100% Free** - No API costs, ever -**Offline** - Works without internet connection -**Cross-platform** - Windows, macOS, Linux -**High Quality** - Comparable to Google TTS -**Fast** - Even runs on Raspberry Pi -**No Rate Limits** - Generate unlimited audio ## Installation ### For WSL (Windows Subsystem for Linux) ```bash # Run the installation script ./scripts/piper-voice/wsl-install.sh ``` **What it does:** 1. Installs Python dependencies (pipx) 2. Installs Piper TTS via pipx 3. Downloads English US voice (medium quality, ~20MB) 4. Tests the installation 5. Provides usage examples **Requirements:** - WSL2 (Ubuntu/Debian-based) - Internet connection (for initial download) - ~50MB disk space ### Manual Installation If you prefer manual installation or need a different platform: ```bash # Install via pipx pipx install piper-tts # Download voices manually mkdir -p ~/.local/share/piper/voices cd ~/.local/share/piper/voices # Download a voice from HuggingFace wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json ``` ## Usage ### Basic Usage ```bash # Generate speech to file echo "Hello from Piper!" | piper \ --model ~/.local/share/piper/voices/en_US-lessac-medium.onnx \ --output_file output.wav # Play immediately (WSL) echo "Hello from Piper!" | piper \ --model ~/.local/share/piper/voices/en_US-lessac-medium.onnx \ --output_file - | paplay ``` ### With AgentVibes Integration with AgentVibes is coming soon! This will allow you to: - Use Piper as an alternative to ElevenLabs - Switch between providers with `/agent-vibes:provider` - Save on API costs while maintaining quality ## Available Voices Browse and download voices from: - **HuggingFace**: https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0 - **Samples**: https://rhasspy.github.io/piper-samples/ ### Recommended Voices **English (US):** - `en_US-lessac-medium` - Clear, natural (included in install script) - `en_US-amy-medium` - Warm, friendly - `en_US-libritts-high` - Highest quality **Other Languages:** - Spanish: `es_ES-sharvard-medium` - French: `fr_FR-siwis-medium` - German: `de_DE-thorsten-medium` - Italian: `it_IT-riccardo-medium` ### Quality Levels - `x_low` - 16kHz, ~5-7MB - Fast, lower quality - `low` - 16kHz, ~15-20MB - Good balance - `medium` - 22.05kHz, ~15-20MB - **Recommended** - `high` - 22.05kHz, ~28-32MB - Best quality ## Troubleshooting ### Command not found: piper Add pipx bin directory to PATH: ```bash export PATH="$HOME/.local/bin:$PATH" # Add to ~/.bashrc or ~/.zshrc to make permanent ``` ### No audio output Install audio utilities: ```bash # For WSL sudo apt-get install pulseaudio-utils ``` ### Voice download fails - Check internet connection - Verify HuggingFace is accessible - Try alternative mirror or manual download ## Performance **Benchmark (WSL2 on modern laptop):** - Generation speed: ~10-20x real-time - Latency: ~100-200ms for short phrases - Memory: ~50-100MB per voice model ## Links - **Piper GitHub**: https://github.com/rhasspy/piper - **Voice Samples**: https://rhasspy.github.io/piper-samples/ - **HuggingFace Models**: https://huggingface.co/rhasspy/piper-voices - **AgentVibes Discussion**: https://github.com/paulpreibisch/AgentVibes/discussions/24 ## License Piper TTS is licensed under the MIT License. --- **Need help?** Open an issue at https://github.com/paulpreibisch/AgentVibes/issues