Audio Guide

Creating Audio & Dubs

Part 1: Gen Dub (SRT → Target Language Audio)

Gen Dub lets you create a dubbed audio track from an SRT subtitle file.

How It Works
  1. Upload your .SRT file.
  2. Click Generate. StoryTool creates a dubbed audio track aligned to the SRT timing.
Auto Language Detection (Gemini TTS)

Gemini TTS auto-detects the language of your SRT content, so you do NOT need to select the source language manually.

Auto Speed Matching (Important)

StoryTool automatically adjusts speaking speed per subtitle chunk to fit the SRT time window.

Example:

  • Subtitle slot = 5 seconds.
  • Target language line is longer and would take ~6–7 seconds at normal speed.
  • → StoryTool speeds up that chunk so it still finishes within the allowed SRT duration.
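
The arithmetic behind the example above can be sketched as follows. This is a minimal illustration, not StoryTool's actual implementation: the function name `required_speed` is hypothetical, and the choice to never slow a chunk below normal speed is an assumption.

```python
def required_speed(natural_seconds: float, slot_seconds: float) -> float:
    """Return the speed multiplier needed to fit speech into a subtitle slot.

    1.0 means normal speed; values above 1.0 mean the chunk is sped up.
    """
    if slot_seconds <= 0:
        raise ValueError("slot_seconds must be positive")
    # Assumption: a chunk that already fits is left at normal speed.
    return max(1.0, natural_seconds / slot_seconds)


# The example above: a 5-second slot and a line that takes 6 to 7 seconds.
low = required_speed(6.0, 5.0)   # 6 / 5 = 1.2x
high = required_speed(7.0, 5.0)  # 7 / 5 = 1.4x
```

A chunk needing roughly 1.2x to 1.4x is the kind of small speed change described here; the larger the ratio, the less natural the result is likely to sound.
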
Best Practices (For Clean Dub)
  • Use a clean SRT: correct timestamps, no overlaps, no missing sequence numbers.
  • Keep each subtitle line short and natural (avoid very long sentences in one subtitle).
  • Avoid mixing more than 2 languages inside the same SRT.
  • For best stability: keep Instruction empty, or use ONE short vibe line only.
  • For tricky names/terms, add a simple phonetic hint in the subtitle text.
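
The first practice (correct timestamps, no overlaps, no missing sequence numbers) can be checked before uploading. The sketch below is one possible pre-flight check, assuming the common SRT layout of a number line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then text. `check_srt` and `TIME_RE` are hypothetical names and are not part of StoryTool.

```python
import re
from typing import List

# Matches a standard SRT timing line, e.g. "00:00:01,500 --> 00:00:04,000".
TIME_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert timestamp parts to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def check_srt(text: str) -> List[str]:
    """Return a list of human-readable problems found in SRT text."""
    problems: List[str] = []
    # Cues are separated by one or more blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    expected = 1
    prev_end = 0
    for block in blocks:
        lines = block.strip().splitlines()
        if len(lines) < 2 or not lines[0].strip().isdigit():
            problems.append(f"malformed block starting with {lines[0]!r}")
            continue
        number = int(lines[0].strip())
        if number != expected:
            problems.append(f"expected sequence number {expected}, found {number}")
        expected = number + 1
        match = TIME_RE.match(lines[1].strip())
        if not match:
            problems.append(f"cue {number}: bad timestamp line")
            continue
        parts = match.groups()
        start, end = _to_ms(*parts[:4]), _to_ms(*parts[4:])
        if end <= start:
            problems.append(f"cue {number}: end time is not after start time")
        if start < prev_end:
            problems.append(f"cue {number}: overlaps the previous cue")
        prev_end = max(prev_end, end)
    return problems


sample = (
    "1\n00:00:00,000 --> 00:00:02,000\nHello\n\n"
    "3\n00:00:01,500 --> 00:00:04,000\nWorld\n"
)
problems = check_srt(sample)
```

For the sample above, `problems` reports both the skipped sequence number and the overlap. An empty list means none of the three issues was found.
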
What to Expect
  • Small speed changes are normal, especially when dubbing into a language whose lines take longer to say.
  • If a chunk must be sped up too much, it may sound less natural.

Fix: Split that subtitle into 2 shorter lines and generate again.
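One way to apply this fix is to split the text near its midpoint and divide the time window in proportion to the text length. The sketch below assumes the subtitle uses spaces between words, so it does not suit languages such as Thai or Japanese; `split_cue` is a hypothetical helper, not a StoryTool feature.

```python
from typing import List, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, text)


def split_cue(start_ms: int, end_ms: int, text: str) -> List[Cue]:
    """Split one cue into two, sharing the time window by text length."""
    words = text.split()
    if len(words) < 2:
        raise ValueError("need at least two words to split")
    # Pick the word boundary that makes the two halves closest in length.
    best = min(
        range(1, len(words)),
        key=lambda i: abs(len(" ".join(words[:i])) - len(" ".join(words[i:]))),
    )
    first = " ".join(words[:best])
    second = " ".join(words[best:])
    # Give each half a share of the window proportional to its length.
    share = len(first) / (len(first) + len(second))
    mid_ms = start_ms + round((end_ms - start_ms) * share)
    return [(start_ms, mid_ms, first), (mid_ms, end_ms, second)]


halves = split_cue(0, 5000, "aaaa bbbb")
# halves == [(0, 2500, "aaaa"), (2500, 5000, "bbbb")]
```

Remember to renumber the cues that follow, so the SRT has no missing or duplicate sequence numbers. Splitting at a punctuation mark, when one sits near the middle, usually reads more naturally than a pure length-based split.
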

Part 2: How to Write "Instructions" for the Voice (Google Gemini Only)

0) Purpose

"Instruction" is an OPTIONAL note for the AI voice about vibe/tone (how to speak).

Important (please read):

  • Instruction is optional.
  • For most scripts, leaving Instruction EMPTY gives the most stable, consistent voice.
  • Long / detailed Instructions can make the voice LESS consistent across long text (mood shifts, prosody changes). Keep it very short, or skip it.
Available Languages (Gemini)

Primary languages (24) — recommended for best quality:

Arabic (Egypt)
Bangla (Bangladesh)
Dutch (Netherlands)
English (India)
English (United States)
French (France)
German (Germany)
Hindi (India)
Indonesian (Indonesia)
Italian (Italy)
Japanese (Japan)
Korean (South Korea)
Marathi (India)
Polish (Poland)
Portuguese (Brazil)
Romanian (Romania)
Russian (Russia)
Spanish (Spain)
Tamil (India)
Telugu (India)
Thai (Thailand)
Turkish (Turkey)
Ukrainian (Ukraine)
Vietnamese (Vietnam)

Additional languages (63) — supported, but may have more mistakes than the 24 above:

Afrikaans (South Africa)
Albanian (Albania)
Amharic (Ethiopia)
Arabic (World)
Armenian (Armenia)
Azerbaijani (Azerbaijan)
Basque (Spain)
Belarusian (Belarus)
Bulgarian (Bulgaria)
Burmese (Myanmar)
Catalan (Spain)
Cebuano (Philippines)
Chinese, Mandarin (China)
Chinese, Mandarin (Taiwan)
Croatian (Croatia)
Czech (Czech Republic)
Danish (Denmark)
English (Australia)
English (United Kingdom)
Estonian (Estonia)
Filipino (Philippines)
Finnish (Finland)
French (Canada)
Galician (Spain)
Georgian (Georgia)
Greek (Greece)
Gujarati (India)
Haitian Creole (Haiti)
Hebrew (Israel)
Hungarian (Hungary)
Icelandic (Iceland)
Javanese (Java)
Kannada (India)
Konkani (India)
Lao (Laos)
Latin (Vatican City)
Latvian (Latvia)
Lithuanian (Lithuania)
Luxembourgish (Luxembourg)
Macedonian (North Macedonia)
Maithili (India)
Malagasy (Madagascar)
Malay (Malaysia)
Malayalam (India)
Mongolian (Mongolia)
Nepali (Nepal)
Norwegian, Bokmål (Norway)
Norwegian, Nynorsk (Norway)
Odia (India)
Pashto (Afghanistan)
Persian (Iran)
Portuguese (Portugal)
Punjabi (India)
Serbian (Serbia)
Sindhi (India)
Sinhala (Sri Lanka)
Slovak (Slovakia)
Slovenian (Slovenia)
Spanish (Latin America)
Spanish (Mexico)
Swahili (Kenya)
Swedish (Sweden)
Urdu (Pakistan)

Total: 87 languages.

1) Rules for Instructions (Keep It Short)
  1. Default = no Instruction
     Start with empty Instruction. Only add it if you truly need a different vibe.

  2. If you use Instruction: make it SUPER SHORT
     Best: 1 short line (or 2 lines max). Only describe vibe/tone. Avoid long "acting directions".

  3. Avoid contradictions
     Don't mix: "very calm" + "extremely excited". Pick ONE main vibe.

  4. Let your script do the work
     Good punctuation and line breaks are the best "control" for pauses and rhythm.
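
The length rules above can be turned into a simple check before an Instruction is submitted. This is only a sketch: `check_instruction` is a hypothetical helper, the 2-line limit comes from the rule above, and the 60-character limit is an assumed stand-in for "short".

```python
from typing import List

MAX_LINES = 2        # from the rules above: 1 short line, or 2 lines max
MAX_LINE_CHARS = 60  # assumption: an arbitrary cut-off for "short"


def check_instruction(instruction: str) -> List[str]:
    """Return warnings for an Instruction that breaks the length rules."""
    warnings: List[str] = []
    lines = [ln.strip() for ln in instruction.splitlines() if ln.strip()]
    if not lines:
        return warnings  # an empty Instruction is the recommended default
    if len(lines) > MAX_LINES:
        warnings.append(f"{len(lines)} lines; use at most {MAX_LINES}")
    for ln in lines:
        if len(ln) > MAX_LINE_CHARS:
            warnings.append(f"line too long ({len(ln)} chars): {ln[:20]}...")
    return warnings


ok = check_instruction("Vibe: calm, friendly, clear.")
too_many = check_instruction("Vibe: calm.\nPace: slow.\nPitch: low.")
```

Here `ok` is an empty list and `too_many` holds one warning about the line count. Contradictory vibes are harder to detect automatically and are best caught by reading the Instruction yourself.
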

2) Simple Instruction Template (Optional)

Use ONE of these formats:

Option A (1 line):

Vibe: calm, friendly, clear.

Option B (2 lines):

Vibe: serious documentary narrator.

Pace: medium, steady.

3) Ready-made Short Instructions (Copy/Paste)


(01) Customer support
Vibe: calm, professional, reassuring.
(02) YouTube host
Vibe: energetic, friendly, engaging.
(03) Documentary / history
Vibe: serious, steady, authoritative.
(04) Meditation
Vibe: soft, gentle, soothing.
(05) Teacher / tutor
Vibe: patient, clear, encouraging.
4) UI Suggestion (For Your Website)
  • Default: Instruction box is empty.
  • Provide "Presets" buttons that fill ONE short line (like the 5 examples above).
  • Show a tip:

"For long scripts, short or empty Instruction usually gives the most stable voice."

Part 3: Common Gemini TTS Issues and How to Avoid Them

0) General Notes

Gemini TTS is strong across many languages. Most issues come from:

  • unclear text formatting,
  • tricky names/terms,
  • unusual abbreviations,
  • overly long / overly detailed Instructions.

Below are common problems + simple fixes.

Short Summary for End Users
  1. Best stability: leave Instruction empty, rely on clean punctuation + line breaks.
  2. If needed: Instruction = 1 short vibe line (avoid long acting directions).
  3. Use phonetic hints for tricky names/terms.
  4. Avoid internal abbreviations, or define them once.
  5. Mixing 2 languages is OK; don't mix more than 2, and avoid frequent switching.