Create long Audiobooks with custom voices using Qwen3-TTS Voice Design
Description
This workflow automates the creation of audiobooks from structured text data using AI-powered text-to-speech and audio processing services.
Click here to listen to the result of my example.
Key Advantages
1. Fully Automated Audiobook Production
The entire pipeline, from text retrieval to final audio upload, is automated. This removes manual steps, reduces human error, and enables repeatable audiobook generation at scale.
2. Advanced Voice Customization
By using voice design prompts (voice description + style instruction), the workflow produces highly expressive and context-aware narration, ideal for audiobooks, storytelling, and branded audio content.
3. Scalable and API-Safe Architecture
The batch processing and looping logic respects external API limits. This makes the workflow robust even for large audiobooks with dozens or hundreds of segments.
4. Centralized Content Management
Google Sheets acts as a lightweight CMS:
Easy to edit scripts and voice parameters
Clear tracking of processed items
Temporary URLs and merge flags ensure full visibility into the workflow state
5. Asynchronous and Fault-Tolerant
The use of wait nodes and status checks allows the workflow to handle long-running audio operations without blocking or failing prematurely.
6. Seamless Cloud Storage Integration
Final audiobooks are automatically stored in Google Drive, making them immediately accessible for distribution, review, or further processing.
7. Modular and Extensible Design
Each step (TTS generation, batching, merging, storage) is modular. This makes it easy to:
Swap TTS providers
Change storage destinations
Add post-processing steps (e.g. metadata, chapter markers)
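To illustrate what the modular design means in practice, here is a minimal Python sketch, not taken from the workflow itself, in which each stage is a plain callable that can be swapped independently. The function name run_pipeline and its three parameters are hypothetical and stand in for the TTS, merge, and storage nodes.

```python
from typing import Callable


def run_pipeline(
    texts: list[str],
    synthesize: Callable[[str], str],    # text -> audio segment URL (e.g. a TTS provider)
    merge: Callable[[list[str]], str],   # segment URLs -> merged audio URL
    store: Callable[[str], str],         # merged audio URL -> stored file reference
) -> str:
    """Run the three stages in order; each stage is replaceable on its own."""
    segment_urls = [synthesize(text) for text in texts]
    merged_url = merge(segment_urls)
    return store(merged_url)
```

Swapping the TTS provider or the storage destination then amounts to passing a different callable, without touching the other stages.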
How it Works
This workflow automates the creation of audiobooks using AI-generated voice synthesis with custom voice design. The process begins by retrieving script data from a Google Sheets document containing text, speaker information, voice descriptions, and style instructions.
The workflow then processes each row in batches, sending the text to the Qwen3-TTS model on Replicate with specified voice parameters to generate individual audio segments.
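The batching step can be sketched in Python as follows. This is an illustration under assumptions, not the workflow's actual node configuration: the payload keys text, voice_description, and style_instruction are hypothetical placeholders, so check the model's input schema on Replicate for the real parameter names. The row keys match the spreadsheet columns listed in the setup steps.

```python
from typing import Iterator


def batched(rows: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` rows, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_tts_input(row: dict) -> dict:
    """Map one sheet row to a TTS request body.

    The output keys are hypothetical; the real model may use other names.
    """
    return {
        "text": row["Text"].strip(),
        "voice_description": row["Voice Description"].strip(),
        "style_instruction": row["Style Instruction"].strip(),
    }
```

Processing one small batch at a time, with a pause between batches, is what keeps the request rate under the external API's limits.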
Each generated audio URL is stored back in the spreadsheet.
Concurrently, once multiple audio segments are ready, they are merged into a single audio file using an external FFmpeg API service.
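Choosing which segments go into the merge can be sketched as below. This Python example assumes that a row is ready when its To Merge cell holds a truthy marker and its Temp URL cell is filled in; the accepted marker values ("x", "yes", "true", "1") are an assumption, since the source does not say how the flag is encoded.

```python
def urls_to_merge(rows: list[dict]) -> list[str]:
    """Return Temp URL values, in sheet order, for rows flagged for merging."""
    truthy = {"x", "yes", "true", "1"}  # assumed flag encodings
    urls = []
    for row in rows:
        flagged = str(row.get("To Merge", "")).strip().lower() in truthy
        url = str(row.get("Temp URL", "")).strip()
        # Skip rows that are not flagged or have no generated audio yet.
        if flagged and url:
            urls.append(url)
    return urls
```

Keeping sheet order matters here, because the order of the URLs determines the order of the segments in the final audiobook.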
The system polls for merge completion, retrieves the final merged audio file, and uploads it to Google Drive as a complete audiobook with a timestamped filename.
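The polling and naming logic can be sketched in Python like this. It is a simplified stand-in for the workflow's wait and status-check nodes: the status strings "COMPLETED" and "FAILED", the default interval, and the filename pattern are all assumptions, and get_status is a hypothetical callable that would wrap the merge service's status request.

```python
import time
from datetime import datetime
from typing import Callable


def wait_for_completion(
    get_status: Callable[[], str],
    interval_s: float = 5.0,
    max_checks: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Check the job status until it reaches a terminal state or checks run out."""
    for _ in range(max_checks):
        status = get_status()
        if status in ("COMPLETED", "FAILED"):  # assumed terminal states
            return status
        sleep(interval_s)  # wait before the next status check
    raise TimeoutError(f"job not finished after {max_checks} checks")


def timestamped_filename(now: datetime, prefix: str = "audiobook", ext: str = "mp3") -> str:
    """Build a filename such as audiobook_20250102_030405.mp3."""
    return f"{prefix}_{now:%Y%m%d_%H%M%S}.{ext}"
```

Capping the number of checks prevents the loop from running forever if the merge job never reaches a terminal state.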
Set up Steps
Data Source Configuration: Set up the Google Sheets node to connect to your spreadsheet containing the audiobook script with required columns: Text, Speaker, Voice Description, Style Instruction, Temp URL, and To Merge
API Credentials Setup:
Configure Replicate API credentials for Qwen3-TTS voice synthesis
Set up Fal.run API credentials for FFmpeg audio merging operations
Configure Google Drive OAuth2 credentials for uploading the final audiobook
Voice Design Parameters: Ensure your spreadsheet contains appropriate voice descriptions and style instructions compatible with the Qwen3-TTS model's requirements
Destination Settings: Verify the Google Drive folder ID in the upload node points to your desired storage location for the final audiobook
Execution: Trigger the workflow manually to begin processing your script rows and generating the complete audiobook with custom voice design
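Before the first run, it can help to confirm that the spreadsheet header matches the required columns. The following Python sketch shows one way to do that check outside n8n; the helper name missing_columns is hypothetical, while the column names come from the Data Source Configuration step.

```python
REQUIRED_COLUMNS = (
    "Text",
    "Speaker",
    "Voice Description",
    "Style Instruction",
    "Temp URL",
    "To Merge",
)


def missing_columns(header: list[str]) -> list[str]:
    """Return the required column names that are absent from the header row."""
    present = {name.strip() for name in header}
    return [column for column in REQUIRED_COLUMNS if column not in present]
```

An empty result means the sheet has every column the workflow reads from or writes to.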
Subscribe to my new YouTube channel. Here I'll share videos and Shorts with practical tutorials and FREE templates for n8n.
Need help customizing?
Contact me for consulting and support, or add me on LinkedIn.