Creating Your AI Persona
The most important part of your AI agent is its instructions (the system prompt). They define who the agent is, what it does, and how it behaves.
Writing Effective Instructions
Your instructions should include:
1. Identity & Role
2. Goal & Context
3. Tone & Style
4. Handling Scenarios
5. Boundaries
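As a rough illustration, the five parts listed above can be kept as separate fields and joined into one system prompt. This is a minimal sketch, not the platform's own format: the `PersonaInstructions` class, its field names, the section layout, and the sample clinic wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PersonaInstructions:
    """Hypothetical container for the five instruction components."""
    identity_and_role: str
    goal_and_context: str
    tone_and_style: str
    handling_scenarios: str
    boundaries: str

    def render(self) -> str:
        # Emit the sections in the same order as the list above.
        sections = [
            ("Identity & Role", self.identity_and_role),
            ("Goal & Context", self.goal_and_context),
            ("Tone & Style", self.tone_and_style),
            ("Handling Scenarios", self.handling_scenarios),
            ("Boundaries", self.boundaries),
        ]
        return "\n\n".join(f"{title}:\n{body.strip()}" for title, body in sections)


# Placeholder wording, invented for illustration only.
persona = PersonaInstructions(
    identity_and_role="You are a scheduling assistant for a dental clinic.",
    goal_and_context="Confirm or reschedule the caller's upcoming appointment.",
    tone_and_style="Friendly, brief, and professional.",
    handling_scenarios="If the caller wants to cancel, offer to reschedule first.",
    boundaries="Do not give medical advice or discuss billing.",
)
system_prompt = persona.render()
print(system_prompt)
```

Keeping each component in its own field makes it easier to revise one part, for example the boundaries, without rereading the whole prompt.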
Complete Example
Here’s a complete instruction set for an appointment reminder:
Voice Selection
Realtime Mode Voices
In Realtime Mode, choose from OpenAI’s preset voices. The default voice is alloy, which works well for most use cases. Other voices include echo, shimmer, ash, ballad, coral, sage, verse, marin, and cedar.
Realtime Mode voices are optimized for low latency and natural conversation. They work best for English but can handle other languages with proper instructions. Check the current list of available voices via the API endpoint GET /v1/voices/builtin?provider=openai or contact support.
Legacy Mode: Custom Voices
Legacy Mode unlocks unlimited voice customization:
ElevenLabs Voices
- Thousands of voices from the ElevenLabs Voice Library
- Voice cloning: Upload 1-5 minutes of audio to clone any voice
- Multiple languages supported
- Custom voice IDs: Use any voice ID from ElevenLabs
Check the current list of available ElevenLabs voices via the API endpoint GET /v1/voices/builtin?provider=elevenlabs or browse the ElevenLabs Voice Library directly.
Deepgram Aura-2 Voices
- 90+ voices across multiple languages with natural breathing and intonation
- 7 languages: en, es, de, fr, nl, it, ja
- Ultra-fast TTS (~100ms latency)
- Pattern: aura-2-{name}-{lang}
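The naming pattern above can be checked before a voice ID is saved. The sketch below builds and parses IDs of the form aura-2-{name}-{lang} and validates the language against the seven codes listed. The helper names are hypothetical, `examplename` is a placeholder rather than a real voice, and the assumption that names consist only of lowercase ASCII letters is not confirmed by the text.

```python
import re

# The seven language codes listed above.
AURA2_LANGUAGES = {"en", "es", "de", "fr", "nl", "it", "ja"}

# Assumption: voice names are lowercase ASCII letters only.
_AURA2_ID = re.compile(r"aura-2-(?P<name>[a-z]+)-(?P<lang>[a-z]{2})")


def build_aura2_voice_id(name: str, lang: str) -> str:
    """Return a voice ID following the aura-2-{name}-{lang} pattern."""
    name = name.strip().lower()
    lang = lang.strip().lower()
    if re.fullmatch(r"[a-z]+", name) is None:
        raise ValueError(f"unexpected voice name: {name!r}")
    if lang not in AURA2_LANGUAGES:
        raise ValueError(f"unsupported language code: {lang!r}")
    return f"aura-2-{name}-{lang}"


def parse_aura2_voice_id(voice_id: str) -> tuple[str, str]:
    """Split a voice ID into (name, lang), rejecting malformed IDs."""
    match = _AURA2_ID.fullmatch(voice_id)
    if match is None or match["lang"] not in AURA2_LANGUAGES:
        raise ValueError(f"not a valid Aura-2 voice ID: {voice_id!r}")
    return match["name"], match["lang"]


voice_id = build_aura2_voice_id("ExampleName", "EN")
print(voice_id)
print(parse_aura2_voice_id(voice_id))
```

A well-formed ID is not necessarily an available voice, so the provider's voice list remains the source of truth.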
Check the current list of available Deepgram Aura-2 voices via the API endpoint GET /v1/voices/builtin?provider=deepgram or contact support for the latest voice availability.
Language & Dialect Control
Realtime Mode
- Auto-detection: Works automatically but best with explicit instructions
- Instruction-based: Tell the AI “Speak only Spanish” or “Respond in French”
- Limited control: No explicit dialect selection
Legacy Mode
- Full control: Set stt_language for speech recognition
- 36+ languages: en-US, en-GB, en-AU, es-ES, es-MX, fr-FR, de-DE, etc.
- Dialect-specific: Choose British English (en-GB) vs American (en-US)
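The difference between the two modes can be sketched as a small helper: Realtime Mode steers language through the instructions text, while Legacy Mode sets stt_language to a dialect tag. Only the `stt_language` name comes from the text above. The function, the `instructions_suffix` key, and the mode strings are hypothetical and do not reflect a confirmed request schema.

```python
def language_settings(mode: str, language_name: str, language_tag: str) -> dict:
    """Return illustrative language settings for the given mode.

    language_name is a plain name such as "Spanish";
    language_tag is a dialect tag such as "es-MX".
    """
    mode = mode.lower()
    if mode == "realtime":
        # No explicit dialect selection: ask for the language in the instructions.
        return {"instructions_suffix": f"Speak only {language_name}."}
    if mode == "legacy":
        # Dialect-specific speech recognition via stt_language.
        return {"stt_language": language_tag}
    raise ValueError(f"unknown mode: {mode!r}")


print(language_settings("realtime", "Spanish", "es-MX"))
print(language_settings("legacy", "English", "en-GB"))
```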
Temperature & Creativity
Control how creative or focused your AI is:
- 0.0-0.3: Very focused, consistent responses (good for confirmations)
- 0.4-0.7: Balanced (default, good for most use cases)
- 0.8-1.0: Creative, varied responses (good for sales, engaging conversations)
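The bands above can be encoded as a quick check on a chosen temperature. This is a sketch under stated assumptions: the listed ranges leave 0.3-0.4 and 0.7-0.8 unassigned, so the function below places those values in the next band up, and the band labels and function name are hypothetical.

```python
def temperature_band(temperature: float) -> str:
    """Classify a temperature into one of the three bands listed above."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError(f"temperature must be between 0.0 and 1.0, got {temperature}")
    if temperature <= 0.3:
        return "focused"    # consistent responses, e.g. confirmations
    if temperature <= 0.7:
        return "balanced"   # default range for most use cases
    return "creative"       # varied responses, e.g. sales conversations


for value in (0.2, 0.5, 0.9):
    print(value, temperature_band(value))
```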
First Sentence
Set the opening line to control how the call starts:
Best Practices
✅ Do This
- Be specific: Include exact scenarios and how to handle them
- Set boundaries: Clearly define what the AI should and shouldn’t do
- Use variables: Leverage {{variable}} syntax for personalization
- Test thoroughly: Try different scenarios to ensure the AI behaves correctly
- Iterate: Refine instructions based on real call transcripts
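To preview how {{variable}} placeholders would read in an opening line, a small local renderer is enough. The sketch below does not show how the platform itself substitutes variables. The function name, the variable names, the sample sentence, and the tolerance for spaces inside the braces are all assumptions.

```python
import re

# Matches {{name}} and, as an assumption, {{ name }} with inner spaces.
_VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, values: dict) -> str:
    """Replace each {{variable}} with its value; fail loudly on missing ones."""
    missing = sorted({name for name in _VARIABLE.findall(template) if name not in values})
    if missing:
        raise KeyError(f"missing variables: {', '.join(missing)}")
    return _VARIABLE.sub(lambda match: str(values[match.group(1)]), template)


first_sentence = render_template(
    "Hi {{first_name}}, this is a reminder about your appointment on {{appointment_date}}.",
    {"first_name": "Alex", "appointment_date": "March 3"},
)
print(first_sentence)
```

Raising an error on a missing variable catches a blank name or date during testing, before it reaches a live call.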
❌ Don’t Do This
- Vague instructions: “Be helpful” is too generic
- Conflicting goals: Don’t ask the AI to both sell and not be pushy without clear boundaries
- Missing context: Provide relevant information about the customer, product, or situation
- Too long: Keep instructions focused and aim for 200-500 words
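The 200-500 word guideline above is easy to check automatically while iterating on a prompt. A minimal sketch, assuming words are counted by splitting on whitespace; the function name and the verdict labels are hypothetical.

```python
def check_instruction_length(instructions: str, min_words: int = 200, max_words: int = 500):
    """Return (word_count, verdict) where verdict is 'short', 'ok', or 'long'."""
    count = len(instructions.split())
    if count < min_words:
        verdict = "short"
    elif count > max_words:
        verdict = "long"
    else:
        verdict = "ok"
    return count, verdict


print(check_instruction_length("Be helpful"))
```

A short verdict on a prompt like "Be helpful" also flags the vague-instructions problem from the same list.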

