Voice Options Overview

TopCalls supports multiple voice providers, each optimized for different use cases:
| Provider | Custom Voices | Voice Cloning | Languages | Latency |
| --- | --- | --- | --- | --- |
| OpenAI Realtime | ❌ (Preset voices only) | | Auto-detect | ~200-500ms |
| ElevenLabs | ✅ (Thousands of voices) | ✅ | Multiple | ~75ms |
| Deepgram Aura-2 | ✅ (90+ voices) | | 7 | ~100ms |
Voice Availability: Check the current list of available voices via the API endpoints:
  • Realtime voices: GET /v1/voices/builtin?provider=openai
  • ElevenLabs voices: GET /v1/voices/builtin?provider=elevenlabs
  • Deepgram voices: GET /v1/voices/builtin?provider=deepgram
Voice availability may change over time. For the most up-to-date list, use the API endpoints or contact support.
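As a minimal sketch, the three listing endpoints above can be built from a single helper. The host `https://api.example.com` and the function name `builtin_voices_url` are placeholders assumed for illustration; this guide does not state the real base URL or the authentication scheme, so the sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Provider values taken from the endpoint list in this guide.
KNOWN_PROVIDERS = ("openai", "elevenlabs", "deepgram")


def builtin_voices_url(provider: str, base_url: str = "https://api.example.com") -> str:
    """Return the URL for GET /v1/voices/builtin?provider=<provider>.

    base_url is a placeholder (assumption); substitute your real API host.
    """
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    query = urlencode({"provider": provider})
    return f"{base_url.rstrip('/')}/v1/voices/builtin?{query}"


if __name__ == "__main__":
    for name in KNOWN_PROVIDERS:
        print(builtin_voices_url(name))
```

Pass the resulting URL to whichever HTTP client you already use, adding the credentials your account requires.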

Realtime Mode: OpenAI Voices

In Realtime Mode, you can choose from OpenAI’s preset voices. The default is alloy, which works well for most use cases.
{
  "mode": "realtime",
  "voice": "alloy"
}
Check available Realtime voices via GET /v1/voices/builtin?provider=openai or contact support for the current list.
Realtime Mode voices are optimized for ultra-low latency and natural conversation, which makes them a good fit for most production use cases.

Legacy Mode: ElevenLabs Voices

ElevenLabs offers the most flexibility with thousands of voices and voice cloning support.

Using ElevenLabs Voice Library

Browse the ElevenLabs Voice Library and use any voice ID:
{
  "mode": "legacy",
  "tts_provider": "elevenlabs",
  "voice": "21m00Tcm4TlvDq8ikWAM"
}

Voice Cloning

Clone any voice with 1-5 minutes of audio through the TopCalls dashboard:

Step 1: Prepare Audio Samples
  • 1-5 audio files (MP3, WAV)
  • At least 1 minute total duration
  • Clear, high-quality recordings
  • Single speaker, minimal background noise

Step 2: Clone via Dashboard
Upload your audio samples through the TopCalls dashboard. The platform will process them and create a custom voice for your account.

Step 3: Use Cloned Voice
Once cloned, use the voice ID in your API calls:
{
  "mode": "legacy",
  "tts_provider": "elevenlabs",
  "voice": "your_cloned_voice_id"
}
Your cloned voices are returned by GET /v1/voices alongside the built-in voices.
Voice cloning requires high-quality audio samples. Poor quality samples will result in poor voice quality. Use professional recordings when possible.
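The measurable requirements from Step 1 (file count, format, total duration) can be checked before uploading. The following is a sketch only: `check_clone_samples` is a hypothetical local helper, not part of the TopCalls API, and it expects the caller to supply each file's duration rather than reading the audio itself. Recording quality and speaker count cannot be checked this way.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav"}  # formats listed in Step 1
MIN_FILES, MAX_FILES = 1, 5            # "1-5 audio files"
MIN_TOTAL_SECONDS = 60.0               # "at least 1 minute total duration"


def check_clone_samples(samples: list[tuple[str, float]]) -> list[str]:
    """Return a list of problems; an empty list means the samples pass.

    Each sample is (filename, duration_in_seconds). Durations come from the
    caller; this sketch does not open or decode the audio files.
    """
    problems = []
    if not MIN_FILES <= len(samples) <= MAX_FILES:
        problems.append(f"expected {MIN_FILES}-{MAX_FILES} files, got {len(samples)}")
    for name, _ in samples:
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported format: {name}")
    total = sum(seconds for _, seconds in samples)
    if total < MIN_TOTAL_SECONDS:
        problems.append(f"total duration {total:.0f}s is under {MIN_TOTAL_SECONDS:.0f}s")
    return problems


if __name__ == "__main__":
    print(check_clone_samples([("intro.mp3", 45.0), ("pitch.wav", 30.0)]))
```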

Legacy Mode: Deepgram Aura-2 Voices

Deepgram Aura-2 offers 90+ natural-sounding voices optimized for dialogue across 7 languages.

Voice Pattern

aura-2-{name}-{lang}

Available Languages

Deepgram Aura-2 supports: English (en), Spanish (es), German (de), French (fr), Dutch (nl), Italian (it), and Japanese (ja).
Check the current list of available Deepgram Aura-2 voices via GET /v1/voices/builtin?provider=deepgram or contact support for the latest voice availability. Example voices include aura-2-thalia-en, aura-2-apollo-en, aura-2-helena-en, and many more.
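The voice pattern and language list above can be turned into a small builder and parser. This is a sketch under two assumptions: voice names consist of lowercase ASCII letters (true of the examples given here, but not confirmed for every voice), and the helper names are hypothetical. A well-formed ID does not guarantee the voice exists, so the API listing remains the source of truth.

```python
import re

# Language codes listed for Deepgram Aura-2 in this guide.
AURA2_LANGUAGES = {"en", "es", "de", "fr", "nl", "it", "ja"}

# Assumption: names are lowercase ASCII letters, as in aura-2-thalia-en.
_AURA2_RE = re.compile(r"aura-2-([a-z]+)-([a-z]{2})")


def build_aura2_voice(name: str, lang: str) -> str:
    """Format a voice ID following the aura-2-{name}-{lang} pattern."""
    if lang not in AURA2_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    voice = f"aura-2-{name.lower()}-{lang}"
    if _AURA2_RE.fullmatch(voice) is None:
        raise ValueError(f"invalid voice name: {name!r}")
    return voice


def parse_aura2_voice(voice: str) -> tuple[str, str]:
    """Split a voice ID into (name, lang), validating the language code."""
    match = _AURA2_RE.fullmatch(voice)
    if match is None:
        raise ValueError(f"not an Aura-2 voice ID: {voice!r}")
    name, lang = match.groups()
    if lang not in AURA2_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    return name, lang


if __name__ == "__main__":
    print(build_aura2_voice("apollo", "en"))
    print(parse_aura2_voice("aura-2-thalia-es"))
```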

Usage

{
  "mode": "legacy",
  "tts_provider": "deepgram",
  "voice": "aura-2-apollo-en"
}
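The request fragments shown so far differ only in `mode`, `tts_provider`, and `voice`, so they can be produced by one function. The sketch below assumes those three fields are all that is needed for voice selection; `voice_config` is a hypothetical helper, and rejecting `tts_provider` in Realtime Mode is a choice made for this sketch, not a documented API rule.

```python
import json
from typing import Optional

# TTS providers shown for Legacy Mode in this guide.
LEGACY_TTS_PROVIDERS = {"elevenlabs", "deepgram"}


def voice_config(mode: str, voice: str, tts_provider: Optional[str] = None) -> dict:
    """Build the voice-related fields of a request body."""
    if mode == "realtime":
        # Sketch-level choice: flag a provider that would have no effect here.
        if tts_provider is not None:
            raise ValueError("tts_provider is only used in legacy mode")
        return {"mode": "realtime", "voice": voice}
    if mode == "legacy":
        if tts_provider not in LEGACY_TTS_PROVIDERS:
            raise ValueError(f"legacy mode needs one of {sorted(LEGACY_TTS_PROVIDERS)}")
        return {"mode": "legacy", "tts_provider": tts_provider, "voice": voice}
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    print(json.dumps(voice_config("realtime", "alloy"), indent=2))
    print(json.dumps(voice_config("legacy", "aura-2-apollo-en", "deepgram"), indent=2))
```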

Choosing the Right Voice

For Customer Support

  • Realtime: Choose from available preset voices optimized for support
  • ElevenLabs: Professional, warm voices from the voice library
  • Deepgram: Caring, natural voices (check available voices via API)

For Sales

  • Realtime: Choose from available preset voices optimized for sales
  • ElevenLabs: Confident, energetic voices from the voice library
  • Deepgram: Confident, casual voices (check available voices via API)

For Healthcare

  • Realtime: Choose from available preset voices optimized for healthcare
  • ElevenLabs: Warm, empathetic voices from the voice library
  • Deepgram: Caring, natural voices (check available voices via API)
Check available voices for each provider via GET /v1/voices/builtin?provider={provider} to see current voice options and characteristics.

For Brand Consistency

  • ElevenLabs: Clone your brand spokesperson’s voice
  • Use the same voice across all channels

Multi-Language Voices

Legacy Mode with Language Control

This example sets stt_language to es-ES (Spanish, Spain) and uses a Spanish Deepgram voice:
{
  "mode": "legacy",
  "stt_language": "es-ES",
  "voice": "aura-2-thalia-es",
  "tts_provider": "deepgram"
}
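A mismatch between `stt_language` and the voice's language suffix is easy to introduce when copying configs. As a hedged sketch, a local check can compare the two before a request is sent. It assumes `stt_language` is a locale tag such as `es-ES` and that the voice follows the Deepgram `aura-2-{name}-{lang}` pattern; the function names are hypothetical, and the check does not apply to ElevenLabs voice IDs, which carry no language suffix.

```python
def primary_language(tag: str) -> str:
    """Return the primary subtag of a locale tag: 'es-ES' -> 'es'."""
    return tag.replace("_", "-").split("-")[0].lower()


def voice_language(voice: str) -> str:
    """Return the trailing language code of a Deepgram Aura-2 voice ID."""
    return voice.rsplit("-", 1)[-1].lower()


def languages_match(stt_language: str, voice: str) -> bool:
    """True when the speech-to-text language and the voice language agree."""
    return primary_language(stt_language) == voice_language(voice)


if __name__ == "__main__":
    config = {
        "mode": "legacy",
        "stt_language": "es-ES",
        "voice": "aura-2-thalia-es",
        "tts_provider": "deepgram",
    }
    print(languages_match(config["stt_language"], config["voice"]))
```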

ElevenLabs Multi-Language

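ElevenLabs voices support 32+ languages automatically, so the same voice ID works across languages. To keep the agent in one language, state it in the instructions: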
{
  "mode": "legacy",
  "tts_provider": "elevenlabs",
  "voice": "21m00Tcm4TlvDq8ikWAM",
  "instructions": "Habla solo en español. Eres un asistente de servicio al cliente..."
}

Best Practices

✅ Do This

  • Test voices: Try different voices to find the best fit
  • Match tone: Choose voices that match your brand
  • Consider use case: Support needs different voices than sales
  • Use cloning for brands: Clone spokesperson voices for consistency
  • Test quality: Always test cloned voices before production

❌ Don’t Do This

  • Ignore latency: Realtime Mode is faster but less flexible
  • Clone from poor audio samples: Use high-quality recordings for cloning
  • Mismatch languages: Ensure the voice language matches the language of your instructions
  • Use too many voices: Stick to 1-2 voices for consistency

Next Steps