What can Custom Voices generate?

Custom Voices generates natural-sounding speech from text and creates reusable voice profiles, provided the user owns the source voice or has explicit permission to use it.

Does Custom Voices allow adult or NSFW content?

No. Custom Voices prohibits NSFW, adult, pornographic, illegal, hateful, fraudulent, and unauthorized impersonation content.

Audio examples

Write prompts that sound intentional.

Use these examples to shape pacing, emotion, pronunciation, and voice design before spending credits on a final take.

What makes a good audio request

A strong request separates the words to speak from the performance direction. Tell the model who is speaking, who is listening, the emotional temperature, and any words that need careful pronunciation.

Start with the listening context

Name the format first: product demo, support message, podcast intro, character line, learning narration, or short ad.

Give one clear performance direction

Choose a primary tone such as warm, calm, urgent, playful, documentary, or reassuring instead of stacking many moods.

Mark pauses and emphasis in the script

Use short sentences, commas, line breaks, and bracketed cues like [pause] or [softly] where a human narrator would naturally breathe.
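
The structure described above can be sketched in a few lines of Python. This is only an illustration, assuming requests are plain text in the "Style direction: ... Script: ..." layout used by the examples on this page. The name build_request and its parameters are hypothetical and are not part of any Custom Voices API.

```python
def build_request(role: str, tone: str, pace: str, script: str) -> str:
    """Combine one performance direction with the words to speak.

    Hypothetical helper: it only formats text and calls no service.
    """
    # Keep only the style elements that were actually provided.
    parts = [p.strip() for p in (role, tone, pace) if p.strip()]
    if not parts:
        raise ValueError("give at least one style element")
    if not script.strip():
        raise ValueError("script must not be empty")
    style = ", ".join(parts)
    return f"Style direction: {style}. Script: {script.strip()}"


request = build_request(
    role="calm product narrator",
    tone="confident but not salesy",
    pace="medium pace",
    script="Welcome to your weekly workspace summary. [pause] Three projects moved forward.",
)
print(request)
```

Keeping the direction and the script in separate arguments makes it harder to mix performance notes into the words that will be spoken.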

Prompt patterns

Product walkthrough

· Clear, trustworthy SaaS narration

Too vague

Read this product update in a nice voice.

Better request

Style direction: calm product narrator, confident but not salesy, medium pace. Script: Welcome to your weekly workspace summary. [pause] Three projects moved forward, two invoices are ready for review, and one deadline needs attention today.

The improved version defines role, tone, pace, and where the listener should feel a pause.

Customer support

· Reassuring message after a delay

Too emotional

Say sorry with a sad voice.

Better request

Style direction: sincere support specialist, steady pace, warm and accountable. Script: We are sorry for the delay. Your request is already with our review team, and we will send the next update before Friday afternoon.

The voice is empathetic without sounding theatrical, and the script includes concrete next-step information.

Short social ad

· Energetic but still natural

Too many adjectives

Make it super exciting, happy, premium, funny, dramatic, and viral.

Better request

Style direction: bright creator voice with a subtle smile, fast but understandable. Script: Your launch video does not need another rewrite. Drop in the script, choose a voice, and export a clean take in minutes.

One performance idea and a compact script usually produce a cleaner take than conflicting mood instructions.

Voice design

· Create a reusable narrator profile

Too generic

A good English voice.

Better request

Voice design prompt: English female narrator in her 30s, warm studio tone, slightly lower pitch, precise consonants, suitable for product tutorials and onboarding videos.

Voice design works better when you describe age range, pitch, texture, articulation, and the intended recurring use case.

Narrated story with emotional turns

· Move from suspense to relief without changing voices

Too flat

Read this story dramatically.

Better request

Style direction: cinematic audiobook narrator, low volume at first, slow pace, then warmer after the reveal. Script: [quietly] The hallway light flickered once. Then again. [pause] Mira held her breath. [tense] The door opened by itself. [pause] [relieved] It was only her brother, holding the birthday cake with both hands.

This gives the model a timeline of performance changes. The cues describe when the emotion changes instead of asking for one vague dramatic mood.

Two-character dialogue

· Make a short scene understandable in one generated take

Unclear speakers

Make this conversation sound real: Are we late? No, we still have time.

Better request

Style direction: light radio drama, natural reactions, keep both characters distinct with pacing rather than extreme voices. Script: Ava [worried, quick]: Are we late? Noah [calm, slight smile]: No. We still have time. Ava [exhale]: Good. I thought the doors closed at eight. Noah [reassuring]: They do. It is only seven forty.

Speaker names, emotional tags, and line breaks make the exchange easier to follow while avoiding overacted character voices.
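
As a rough sketch, a dialogue script in this shape can be assembled from structured data so that every line carries a speaker name and a cue. The Line type and format_dialogue function below are hypothetical names chosen for illustration; the "Speaker [cue]: text" layout is taken from the example above.

```python
from typing import NamedTuple


class Line(NamedTuple):
    speaker: str
    cue: str   # short emotional tag, for example "calm, slight smile"
    text: str


def format_dialogue(lines: list[Line]) -> str:
    """Render one 'Speaker [cue]: text' entry per line of the script."""
    rendered = []
    for line in lines:
        # Omit the brackets entirely when no cue is given.
        tag = f" [{line.cue}]" if line.cue else ""
        rendered.append(f"{line.speaker}{tag}: {line.text}")
    return "\n".join(rendered)


scene = [
    Line("Ava", "worried, quick", "Are we late?"),
    Line("Noah", "calm, slight smile", "No. We still have time."),
]
print(format_dialogue(scene))
```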

Tutorial with precise timing

· Create step-by-step audio that matches a screen recording

No timing structure

Explain how to upload a voice sample.

Better request

Style direction: patient tutorial narrator, medium-slow pace, leave room between steps for screen actions. Script: First, open the Voice Studio. [pause 1s] Choose Create voice. [pause 1s] Upload a clean MP3 or WAV sample. [pause 1s] Read the authorization statement carefully, then confirm only if you have permission to use the voice.

Explicit step boundaries and pauses help the audio fit a product demo or onboarding video without rushed narration.
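
A minimal sketch of the same idea in Python: keep the steps as a list and join them with an explicit pause cue. The join_steps helper is hypothetical, and the "[pause 1s]" syntax is copied from the example above; whether a given duration is honored exactly is an assumption to verify with a short test take.

```python
def join_steps(steps: list[str], pause_seconds: int = 1) -> str:
    """Place a '[pause Ns]' cue between consecutive steps, not after the last."""
    if pause_seconds < 1:
        raise ValueError("pause_seconds must be at least 1")
    cue = f" [pause {pause_seconds}s] "
    # Skip empty entries so no two cues end up back to back.
    return cue.join(step.strip() for step in steps if step.strip())


script = join_steps([
    "First, open the Voice Studio.",
    "Choose Create voice.",
    "Upload a clean MP3 or WAV sample.",
])
print(script)
```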

Pronunciation and brand names

· Reduce mistakes on product names, acronyms, and numbers

Ambiguous text

Say: CVX ships API v2.5 on 05/06 with 1200 new voices.

Better request

Style direction: clear launch announcer, precise pronunciation, no hype. Pronunciation notes: CVX is read as C V X. API is read as A P I. v2.5 is read as version two point five. 05/06 is read as May sixth. 1200 is read as twelve hundred. Script: C V X ships A P I version two point five on May sixth, with twelve hundred new voice options.

For names, versions, and dates, writing the spoken form directly is often more reliable than leaving the model to infer pronunciation.
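
One way to apply this consistently is to keep a small table of spoken forms and rewrite the script before sending it. The sketch below is illustrative only: SPOKEN_FORMS and spell_out are hypothetical names, and the table has to be maintained by hand for each project, since a date such as 05/06 can be read more than one way.

```python
import re

# Hand-maintained mapping from written form to the form that should be spoken.
SPOKEN_FORMS = {
    "CVX": "C V X",
    "API": "A P I",
    "v2.5": "version two point five",
    "05/06": "May sixth",
    "1200": "twelve hundred",
}


def spell_out(text: str, spoken_forms: dict[str, str]) -> str:
    """Replace each listed term with its spoken form in a single pass."""
    # Longest keys first so a short key never shadows a longer one.
    keys = sorted(spoken_forms, key=len, reverse=True)
    # The lookarounds keep a key from matching inside a longer word or number.
    pattern = r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    return re.sub(pattern, lambda m: spoken_forms[m.group(0)], text)


written = "CVX ships API v2.5 on 05/06 with 1200 new voices."
print(spell_out(written, SPOKEN_FORMS))
```

Because the substitution runs in one pass, text that was already replaced is not scanned again.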

Guided meditation

· Control silence and softness without making the voice dull

Missing breath and space

Read this meditation calmly.

Better request

Style direction: gentle meditation guide, soft volume, unhurried, warm but not sleepy. Script: Settle your shoulders. [long pause] Notice the weight of your hands. [softly] There is nothing to solve right now. [long pause] Breathe in slowly. [pause] Breathe out, and let the room become quiet around you.

Meditation audio depends on silence as much as speech. Longer pause cues and fewer words create a better rhythm.

Localized delivery

· Keep meaning natural when mixing English and Chinese

Language switch is abrupt

Read this in Chinese and English: 欢迎使用 Custom Voices. Create your first voice now.

Better request

Style direction: bilingual product host, smooth code-switching, Mandarin first, English brand words pronounced clearly. Script: 欢迎使用 Custom Voices。[pause] 你可以先上传授权样本，创建自己的 voice profile，然后用文本生成自然的英文或中文音频。

The request tells the model how to handle mixed-language terms and keeps the bilingual phrasing natural instead of sounding pasted together.

Clone voice sample selection

· Improve the source material before cloning

Poor sample guidance

Upload any clip of the speaker.

Better request

Sample guidance: choose 30 to 90 seconds of clean solo speech, stable microphone distance, no music, no overlapping speakers, and at least a few complete sentences with the speaker's normal tone. Voice use note: after cloning, use style direction for performance changes instead of trying to fix a noisy sample with stronger prompts.

Cloned voice quality starts with the sample. A clean, representative clip gives later style prompts much more room to work.
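
The length part of this guidance can be checked before upload. The sketch below uses Python's standard wave module and therefore handles uncompressed WAV files only; the function names are hypothetical, and it does not measure noise, music, or overlapping speakers, which still need a listen.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def duration_in_range(seconds: float, low: float = 30.0, high: float = 90.0) -> bool:
    """Check a duration against the 30 to 90 second guidance above."""
    return low <= seconds <= high
```

For example, duration_in_range(wav_duration_seconds("sample.wav")) would return False for a ten-second clip, where "sample.wav" stands in for your own file.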

Compliance-sensitive announcement

· Sound clear and trustworthy without sounding like a legal threat

Too harsh

Read this warning in a serious voice.

Better request

Style direction: professional compliance narrator, calm authority, neutral pace, no alarm. Script: This voice can only be used with permission from the voice owner. [pause] Do not use generated audio to impersonate, mislead, or imply endorsement without consent.

For policy or safety copy, a controlled neutral tone usually builds more trust than exaggerated severity.

Before generating

Remove filler words unless you want them spoken.
Keep numbers, dates, acronyms, and product names written the way they should be pronounced.
Generate a short test line before spending credits on a long script.
Use bracketed cues sparingly and place them exactly where the performance should change.
Split long scripts into scenes or paragraphs, each with a single style goal.
Use a cloned voice only when you own the voice or have explicit permission.
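
Splitting a long script can be sketched as follows, assuming scenes are separated by blank lines and each scene gets exactly one style direction. The function names are hypothetical, and the output is plain request text in the layout used by the examples above, not a call to any service.

```python
import re


def split_into_scenes(script: str) -> list[str]:
    """Split on blank lines and collapse whitespace inside each scene."""
    chunks = re.split(r"\n\s*\n", script.replace("\r\n", "\n"))
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


def scene_requests(script: str, styles: list[str]) -> list[str]:
    """Pair each scene with one style direction, in order."""
    scenes = split_into_scenes(script)
    if len(scenes) != len(styles):
        raise ValueError(f"{len(scenes)} scenes but {len(styles)} style directions")
    return [
        f"Style direction: {style}. Script: {scene}"
        for style, scene in zip(styles, scenes)
    ]


long_script = "The hallway light flickered once.\nThen again.\n\nIt was only her brother."
for request in scene_requests(long_script, ["quiet and tense", "warm and relieved"]):
    print(request)
```

Generating the first scene alone also serves as the short test line recommended above.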
