# AI Audiobook Maker (AIABM)
Transform your PDFs and text files into high-quality audiobooks using OpenAI's advanced Text-to-Speech technology. No installation required - just run with `npx`!
## Features
- **Zero Installation**: Run directly with `npx aiabm`
- **Smart File Handling**: Supports PDF and TXT files with drag & drop
- **Voice Preview**: Listen to all 6 OpenAI voices before choosing
- **Resume & Pause**: Continue interrupted conversions anytime
- **Secure API Key Management**: Encrypted local storage
- **Progress Tracking**: Real-time conversion progress with estimates
- **Advanced Controls**: Adjust speed, quality, and output format
- **Cost Transparency**: See the estimated price before conversion
## Quick Start
### Method 1: Direct Usage (Recommended)
```bash
# Convert a specific file
npx aiabm mybook.pdf
# Interactive mode
npx aiabm
```
### Method 2: Global Installation
```bash
npm install -g ai-audiobook-maker
aiabm mybook.pdf
```
## Prerequisites
- **Node.js 16+** (Download from [nodejs.org](https://nodejs.org/))
- **OpenAI API Key** (Get from [platform.openai.com](https://platform.openai.com/account/api-keys))
- **FFmpeg** (for audio combining - auto-installed on most systems)
## Usage Examples
### CLI Mode
```bash
# Basic conversion
npx aiabm document.pdf
# With specific options
npx aiabm book.txt --voice nova --speed 1.2 --model tts-1-hd
# Manage API key
npx aiabm --config
```
### Interactive Mode
```bash
npx aiabm
```
Then follow the interactive prompts to:
1. Select your file (browse, drag & drop, or enter path)
2. Preview and choose a voice
3. Configure settings (speed, quality, output format)
4. Monitor progress and resume if needed
## Available Voices
- **Alloy**: Neutral, versatile
- **Echo**: Clear, professional
- **Fable**: Warm, storytelling
- **Onyx**: Deep, authoritative
- **Nova**: Bright, engaging
- **Shimmer**: Gentle, soothing
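
As a rough sketch of how a voice and speed choice could map onto OpenAI's speech endpoint (`POST /v1/audio/speech`), the helper below builds the JSON request body. The name `buildSpeechRequest` and its validation are illustrative assumptions, not the tool's actual code. The voice names come from the list above, and the 0.25 to 4.0 speed range is the limit OpenAI documents for this endpoint.

```javascript
// Hypothetical helper: not part of aiabm's published API.
const VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"];

function buildSpeechRequest(text, { voice = "alloy", speed = 1.0, model = "tts-1" } = {}) {
  if (!VOICES.includes(voice)) {
    throw new Error(`Unknown voice: ${voice}`);
  }
  if (speed < 0.25 || speed > 4.0) {
    throw new Error(`Speed must be between 0.25 and 4.0, got ${speed}`);
  }
  return {
    url: "https://api.openai.com/v1/audio/speech",
    body: { model, voice, input: text, speed, response_format: "mp3" },
  };
}

// Mirrors: npx aiabm book.txt --voice nova --speed 1.2 --model tts-1-hd
const request = buildSpeechRequest("Chapter one.", { voice: "nova", speed: 1.2, model: "tts-1-hd" });
console.log(JSON.stringify(request.body));
```

The body would be sent with an `Authorization: Bearer <key>` header. Validating the voice and speed locally avoids paying for a round trip that the API would reject anyway.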
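## Pricing
OpenAI TTS pricing for the standard `tts-1` model: **$0.015 per 1,000 characters**. The `tts-1-hd` model is billed at a higher rate, so check OpenAI's pricing page before converting a long book.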
| Content Length | Estimated Cost | Example |
|----------------|----------------|---------|
| 10,000 characters | ~$0.15 | Short article |
| 50,000 characters | ~$0.75 | Small e-book |
| 100,000 characters | ~$1.50 | Average novel |
| 250,000 characters | ~$3.75 | Large book |
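
The table rows follow from a single multiplication. A minimal sketch, assuming the $0.015 per 1,000 characters rate quoted above (`estimateCostUsd` is a hypothetical name, not a function exported by the package):

```javascript
// Rate taken from the pricing line above; other models may be billed differently.
const USD_PER_1K_CHARS = 0.015;

// Hypothetical helper: estimated cost in US dollars for a given character count.
function estimateCostUsd(charCount, ratePer1k = USD_PER_1K_CHARS) {
  return (charCount / 1000) * ratePer1k;
}

for (const chars of [10000, 50000, 100000, 250000]) {
  console.log(`${chars} characters: ~$${estimateCostUsd(chars).toFixed(2)}`);
}
```

Passing a different `ratePer1k` lets the same helper cover another model's rate.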
## Advanced Features
### Resume Interrupted Conversions
If conversion stops, simply run the tool again - it will automatically detect and offer to resume your previous session.
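
One way to picture resuming: the session records which chunks already have audio, and a restart only converts the rest. The session shape and function names below are assumptions for illustration; the README does not document the real session format.

```javascript
// Hypothetical session shape: { totalChunks: number, completed: number[] }.
function pendingChunks(session) {
  const done = new Set(session.completed);
  const pending = [];
  for (let i = 0; i < session.totalChunks; i++) {
    if (!done.has(i)) pending.push(i);
  }
  return pending;
}

// Mark a chunk finished without mutating the previous session object.
function markDone(session, index) {
  if (session.completed.includes(index)) return session;
  return { ...session, completed: [...session.completed, index] };
}

let session = { totalChunks: 5, completed: [0, 1] };
session = markDone(session, 2);
console.log(pendingChunks(session)); // indexes of the chunks still to convert
```

Writing the session to disk after each `markDone` call is what would make an interrupted run recoverable.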
### Multiple Output Formats
- **Single File**: One complete audiobook MP3
- **Chapter Files**: Separate MP3 per chunk
- **Both**: Get both formats
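
Joining chunk MP3s into a single file can be done with FFmpeg's concat demuxer, which reads a list file with one `file '<path>'` line per input. This is a sketch of that approach, not a description of the tool's internals; `buildConcatList` and `buildConcatArgs` are hypothetical helpers.

```javascript
// Build the text of an FFmpeg concat list file: one "file '<path>'" line per chunk.
// A single quote inside a path is escaped as '\'' per the concat demuxer's quoting rules.
function buildConcatList(chunkPaths) {
  return chunkPaths
    .map((p) => `file '${p.replace(/'/g, "'\\''")}'`)
    .join("\n") + "\n";
}

// Arguments for: ffmpeg -f concat -safe 0 -i <list> -c copy <output>
function buildConcatArgs(listPath, outputPath) {
  return ["-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", outputPath];
}

console.log(buildConcatList(["chunk_001.mp3", "chunk_002.mp3"]));
console.log(["ffmpeg", ...buildConcatArgs("chunks.txt", "audiobook.mp3")].join(" "));
```

Using `-c copy` avoids re-encoding, so the merge is fast and does not degrade audio quality.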
### Voice Preview Caching
Voice previews are cached locally to save API costs and improve performance.
### Smart Text Chunking
- Respects sentence boundaries
- Preserves chapter structure for PDFs
- Configurable chunk sizes (default: 4000 characters)
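
The sentence-boundary rule can be sketched as a greedy packer: split the text into sentences, then fill each chunk until the next sentence would exceed the limit. The code below is an assumption about how such a chunker might look, not the package's implementation, and it ignores chapter structure. A 4000-character default stays under the 4096-character input limit of OpenAI's speech endpoint.

```javascript
// Hypothetical chunker: packs whole sentences into chunks of at most maxChars.
function chunkText(text, maxChars = 4000) {
  // Split after ., ! or ? followed by whitespace; punctuation stays with its sentence.
  const sentences = text.trim().split(/(?<=[.!?])\s+/).filter((s) => s.length > 0);
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    // Hard-split any single sentence that is longer than the limit.
    const pieces = [];
    for (let i = 0; i < sentence.length; i += maxChars) {
      pieces.push(sentence.slice(i, i + maxChars));
    }
    for (const piece of pieces) {
      if (current.length > 0 && current.length + 1 + piece.length > maxChars) {
        chunks.push(current);
        current = piece;
      } else {
        current = current.length > 0 ? `${current} ${piece}` : piece;
      }
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

console.log(chunkText("One. Two! Three? Four.", 10));
```

Only a sentence longer than the limit is ever cut mid-sentence; every other chunk ends on a sentence boundary.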
## File Support
### PDF Files
- Up to 50MB
- Text extraction with structure preservation
- Automatic chapter detection
### Text Files
- Up to 1M characters
- UTF-8 encoding
- Automatic formatting cleanup
## Configuration
### API Key Storage
Your OpenAI API key is encrypted and stored locally at:
- **macOS/Linux**: `~/.config/ai-audiobook-maker/config.json`
- **Windows**: `%APPDATA%\ai-audiobook-maker\config.json`
### Cache Location
Voice previews and temporary files:
- **macOS/Linux**: `~/.config/ai-audiobook-maker/cache/`
- **Windows**: `%APPDATA%\ai-audiobook-maker\cache\`
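
If you need to locate these files from a script, the paths above can be derived from the platform. A minimal sketch; `resolveDirs` is a hypothetical helper, and it assumes `APPDATA` is set on Windows:

```javascript
const nodePath = require("path");
const nodeOs = require("os");

// Hypothetical resolver mirroring the locations listed above.
function resolveDirs(platform, env, homeDir) {
  const p = platform === "win32" ? nodePath.win32 : nodePath.posix;
  const base = platform === "win32"
    ? p.join(env.APPDATA, "ai-audiobook-maker")
    : p.join(homeDir, ".config", "ai-audiobook-maker");
  return {
    configFile: p.join(base, "config.json"),
    cacheDir: p.join(base, "cache"),
  };
}

console.log(resolveDirs(process.platform, process.env, nodeOs.homedir()));
```

Taking the platform, environment, and home directory as arguments keeps the function testable on any operating system.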
## Troubleshooting
### Common Issues
**"FFmpeg not found"**
```bash
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Windows
# Download from https://ffmpeg.org/download.html
```
**"API key invalid"**
- Verify your key at [OpenAI Platform](https://platform.openai.com/account/api-keys)
- Use `npx aiabm --config` to update your key
**"File too large"**
- PDFs: Maximum 50MB
- Text: Maximum 1M characters
- Split large files before conversion
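
To check a file against these limits before running the tool, something like the following would do. It is an illustrative sketch: `checkLimits` is a hypothetical name, and reading "50MB" as 50 × 1024 × 1024 bytes is an assumption.

```javascript
const MAX_PDF_BYTES = 50 * 1024 * 1024; // assumption: "50MB" read as 50 MiB
const MAX_TEXT_CHARS = 1000000;

// Returns null when the input is within limits, otherwise a short message.
function checkLimits(file) {
  if (file.extension === ".pdf" && file.sizeBytes > MAX_PDF_BYTES) {
    return "File too large: PDFs are limited to 50MB";
  }
  if (file.extension === ".txt" && file.charCount > MAX_TEXT_CHARS) {
    return "File too large: text files are limited to 1M characters";
  }
  return null;
}

console.log(checkLimits({ extension: ".txt", charCount: 1200000 }));
```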
**Voice preview not playing**
- macOS: Uses built-in `afplay`
- Windows: Uses PowerShell media player
- Linux: Requires `ffplay`, `mpv`, `vlc`, or `mplayer`
### Performance Tips
- Use `tts-1` model for faster processing
- Use `tts-1-hd` for higher quality (slower)
- Cache clears automatically after 30 days
- Resume feature prevents re-processing completed chunks
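
The 30-day cache expiry amounts to comparing each cached file's modification time with the current time. A small sketch, with the entry shape and the name `expiredEntries` assumed for illustration:

```javascript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// entries: [{ name, modifiedAtMs }]; returns the names old enough to delete.
function expiredEntries(entries, nowMs = Date.now(), maxAgeMs = THIRTY_DAYS_MS) {
  return entries
    .filter((entry) => nowMs - entry.modifiedAtMs > maxAgeMs)
    .map((entry) => entry.name);
}

const dayMs = 24 * 60 * 60 * 1000;
const now = Date.now();
console.log(expiredEntries([
  { name: "nova-preview.mp3", modifiedAtMs: now - 45 * dayMs },
  { name: "echo-preview.mp3", modifiedAtMs: now - 2 * dayMs },
], now));
```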
## Privacy & Security
- API keys are encrypted locally using AES-192
- No data is sent to servers other than OpenAI
- Cache files are stored locally only
- Session data helps resume interrupted conversions
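
For readers curious what AES-192 encryption of a key looks like in Node.js, here is a minimal sketch using the built-in `crypto` module. The cipher mode (`aes-192-cbc`), the scrypt key derivation, the passphrase source, and the record layout are all assumptions; the README states only that AES-192 is used.

```javascript
const nodeCrypto = require("crypto");

const ALGORITHM = "aes-192-cbc"; // mode is an assumption; the README only says AES-192

// Hypothetical helper: encrypts an API key with a passphrase-derived 192-bit key.
function encryptKey(apiKey, passphrase) {
  const salt = nodeCrypto.randomBytes(16);
  const iv = nodeCrypto.randomBytes(16);
  const key = nodeCrypto.scryptSync(passphrase, salt, 24); // 24 bytes = 192 bits
  const cipher = nodeCrypto.createCipheriv(ALGORITHM, key, iv);
  const data = Buffer.concat([cipher.update(apiKey, "utf8"), cipher.final()]);
  return { salt: salt.toString("hex"), iv: iv.toString("hex"), data: data.toString("hex") };
}

// Reverses encryptKey using the salt and IV stored alongside the ciphertext.
function decryptKey(record, passphrase) {
  const key = nodeCrypto.scryptSync(passphrase, Buffer.from(record.salt, "hex"), 24);
  const decipher = nodeCrypto.createDecipheriv(ALGORITHM, key, Buffer.from(record.iv, "hex"));
  const plain = Buffer.concat([decipher.update(Buffer.from(record.data, "hex")), decipher.final()]);
  return plain.toString("utf8");
}

const record = encryptKey("sk-example", "local passphrase");
console.log(decryptKey(record, "local passphrase") === "sk-example");
```

A fresh random salt and IV per record means the same key encrypts to different ciphertext each time. CBC mode provides no integrity check, so an authenticated mode such as GCM would be a reasonable alternative.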
## Examples
### Converting a PDF Book
```bash
npx aiabm "My Great Novel.pdf"
```
### Interactive Voice Selection
```bash
npx aiabm
# Select "Preview all voices"
# Listen to each voice sample
# Choose your favorite
# Configure speed and quality
# Start conversion
```
### Batch Processing Tips
```bash
# Process multiple files
for file in *.pdf; do npx aiabm "$file" --voice nova --speed 1.1; done
```
## Contributing
Issues and feature requests welcome at: [GitHub Issues](https://github.com/iamthamanic/AI-Audiobook-Maker/issues)
## License
MIT License - see LICENSE file for details
## Acknowledgments
- Built on OpenAI's TTS API
- Inspired by the original bash script version
- Uses FFmpeg for audio processing
## Changelog
### v2.0.1 (2025-07-31)
- Fixed CLI command back to `aiabm` as originally intended
- Updated documentation to reflect correct command usage
### v2.0.0 (2025-07-31)
- Renamed npm package to `ai-audiobook-maker` for better discoverability
- CLI command remains `aiabm` for convenience
- Improved package structure and metadata
- Added proper .gitignore and .npmignore files
- Added MIT LICENSE file
- Updated documentation and installation instructions
- Ready for npm publishing
---
**Happy listening!** Turn any text into your personal audiobook library.