Convert scripts to natural Chinese speech · voice clone · CapCut-ready MP3
Generate narration audio from long Chinese scripts with MiniMax T2A V2. Built for science explainers, commentary channels, and YouTube voiceovers that run to thousands of characters.
Upload a reference clip to clone your voice, or pick from curated Mandarin presets. Output MP3 or WAV ready for CapCut, Premiere, or direct YouTube upload.
Long scripts are automatically split, synthesized segment by segment, and merged into one continuous audio file with podcast-style pause enhancement.
API keys stay on the server — your MiniMax credentials are never exposed to the browser.
Generate 15-minute voiceovers in minutes instead of hours in the booth.
Clone once, reuse the same voice_id for every video in your series.
Download the MP3 and drop it directly onto your timeline; no format conversion needed.
High-quality Mandarin synthesis with emotional control.
Upload a sample clip (10 seconds to 5 minutes) to create a custom voice_id.
Automatic split-and-merge for 1000–4000 character scripts.
Optional natural pause markers for commentary-style delivery.
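The split step behind the split-and-merge feature can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function name `split_script`, the 1000-character default, and the choice of sentence-ending punctuation are all assumptions. It cuts at sentence boundaries where possible and hard-wraps only when a single sentence exceeds the limit.

```python
from __future__ import annotations

import re

# Zero-width split point right after sentence-ending punctuation
# (Chinese and ASCII) or a newline. Assumed punctuation set.
_SENTENCE_END = re.compile(r"(?<=[。！？!?；;\n])")


def split_script(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into segments of at most max_chars characters.

    Hypothetical helper: prefers sentence boundaries and falls back to
    a hard cut for any single sentence longer than max_chars.
    """
    sentences = [s for s in _SENTENCE_END.split(text) if s.strip()]
    segments: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap a sentence that cannot fit in one segment.
        while len(sentence) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Start a new segment when this sentence would overflow.
        if len(current) + len(sentence) > max_chars:
            segments.append(current)
            current = ""
        current += sentence
    if current:
        segments.append(current)
    return segments
```

Each segment would then be synthesized on its own and the resulting audio concatenated in order; the audio merge itself is not shown here.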
Enter or paste your narration text (1000–4000 chars recommended).
Pick a preset voice or upload a sample for cloning.
Click generate — progress shows per segment.
Play preview or download MP3 for your video editor.
Pair narration with video tools and AI Tools.
Supported models: speech-02-hd, speech-02-turbo, speech-2.8-hd, and speech-2.8-turbo.
Up to 10,000 characters per request. Longer scripts are auto-segmented.
MP3 at 32 kHz is recommended; select MP3 in settings before generating.
Upload a 10s–5min MP3/WAV sample. The API registers a custom voice_id, then synthesizes your full script.
The site admin configures MINIMAX_API_KEY and MINIMAX_GROUP_ID on the server.
English narration is supported: set language boost to English and pick an English preset voice_id.
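To make the answers above concrete, here is a hedged sketch of how a server might assemble one synthesis request. Only MINIMAX_API_KEY, MINIMAX_GROUP_ID, the model names, the 10,000-character limit, and the MP3 at 32 kHz recommendation come from this page. The endpoint URL, the JSON field names (`voice_setting`, `audio_setting`, `language_boost`), and the helper names are assumptions and should be checked against the MiniMax API reference before use.

```python
from __future__ import annotations

import json
import os
from typing import Mapping, Optional

# Assumed endpoint; verify against the official MiniMax documentation.
T2A_URL = "https://api.minimax.chat/v1/t2a_v2"
MAX_CHARS_PER_REQUEST = 10_000  # limit stated on this page


def load_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Read the server-side settings; they are never sent to the browser."""
    try:
        return env["MINIMAX_API_KEY"], env["MINIMAX_GROUP_ID"]
    except KeyError as exc:
        raise RuntimeError(f"missing server setting: {exc.args[0]}") from exc


def build_request(
    text: str,
    voice_id: str,
    api_key: str,
    group_id: str,
    model: str = "speech-02-hd",
    language_boost: Optional[str] = None,
) -> dict:
    """Return url, headers, and body for one request (hypothetical helper)."""
    if len(text) > MAX_CHARS_PER_REQUEST:
        raise ValueError("text exceeds the per-request limit; split it first")
    body = {
        "model": model,
        "text": text,
        "voice_setting": {"voice_id": voice_id},            # assumed field name
        "audio_setting": {"format": "mp3", "sample_rate": 32000},  # assumed
    }
    if language_boost:
        body["language_boost"] = language_boost              # assumed field name
    return {
        "url": f"{T2A_URL}?GroupId={group_id}",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "data": json.dumps(body, ensure_ascii=False).encode("utf-8"),
    }
```

The function only builds the request; sending it and decoding the returned audio are left out so that the sketch stays independent of any HTTP client.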
MiniMax T2A V2 – convert long scripts to natural narration with voice cloning
Podcast-style pauses
Adds natural pauses after punctuation – great for commentary channels
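One way the pause option could work is to insert explicit pause markers after punctuation before the text is sent for synthesis. The sketch below assumes a `<#x#>` marker syntax, where x is a pause length in seconds, and assumes that a marker should only sit between two spoken stretches of text. The function name and the default durations are hypothetical, and the marker syntax should be confirmed in the MiniMax documentation.

```python
import re

# Assumed punctuation groups: longer pause after sentence ends,
# shorter pause after clause separators.
_LONG_MARKS = "。！？!?"
_PAUSE_POINT = re.compile(r"[。！？!?，、；：,;:](?=\w)")


def add_pause_markers(text: str, short: float = 0.3, long: float = 0.6) -> str:
    """Insert <#seconds#> markers after punctuation that is followed by text."""

    def _insert(match: re.Match) -> str:
        mark = match.group(0)
        pause = long if mark in _LONG_MARKS else short
        return f"{mark}<#{pause:.2f}#>"

    return _PAUSE_POINT.sub(_insert, text)
```

The lookahead skips punctuation at the very end of the script and punctuation followed by more punctuation or whitespace, so no marker is left dangling without text after it.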