POST /v1/calls
curl --request POST \
  --url https://api.topcalls.ai/v1/calls \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data @- <<EOF
{
  "phone_number": "+14155551234",
  "task": "Call to confirm John's appointment tomorrow at 3 PM"
}
EOF
Example response:

{
  "call_id": "564d4fd4-03bc-400a-abe0-05540fbeff88",
  "provider_call_id": "64e9bf0e-7c2f-4443-a759-7eb1731cd583",
  "status": "queued"
}

Authorizations

Authorization
string
header
required

Use Authorization: Bearer tc_live_xxxxx

Body

application/json
phone_number
string
required

Phone number in E.164 format (e.g., +14155551234)

  • Must start with +
  • Country code must be 1-9 (not 0)
  • Total length: 1-15 digits after the +
Example:

"+14155551234"
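The E.164 rules above can be pre-checked on the client before issuing a request. A minimal sketch (the `is_e164` helper is hypothetical, not part of the API; the server performs its own validation):

```shell
# Check the documented E.164 rules: leading +, first digit 1-9,
# at most 15 digits total after the +.
is_e164() {
  printf '%s' "$1" | grep -Eq '^\+[1-9][0-9]{0,14}$'
}

is_e164 "+14155551234" && echo valid    # prints "valid"
is_e164 "0415555"      || echo invalid  # prints "invalid" (missing +)
```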

task
string
required

Simple prompt describing what the AI should do. Use this OR instructions (not both).

Minimum string length: 10
Example:

"Call to confirm John's appointment tomorrow at 3 PM"

from_number
string

Caller ID in E.164 format (optional, falls back to FROM_NUMBER env var)

  • Must be in E.164 format if provided
  • Either provide this or set FROM_NUMBER environment variable
Example:

"+18005551234"

first_sentence
string

The AI's opening line

Minimum string length: 1
Example:

"Hi, this is Sarah from TopView Dental calling about your appointment."

instructions
string

Full system instructions for the AI. Use this OR task (not both).

Minimum string length: 10
Example:

"You are Sarah, a friendly appointment coordinator..."

mode
enum<string>
default:realtime

Conversation mode

  • realtime: OpenAI Realtime API (speech-to-speech, low latency)
  • legacy: Separate STT → LLM → TTS pipeline (custom voices, voice cloning)
Available options:
realtime,
legacy
voice
string
default:alloy

Voice to use for AI responses.

Realtime mode:

  • OpenAI voices: alloy, echo, shimmer, ash, ballad, coral, sage, verse

Legacy mode (ElevenLabs):

  • Voice names: rachel, domi, bella, antoni, elli, josh, arnold, sam, adam, nicole, matilda
  • Or voice_id directly: 21m00Tcm4TlvDq8ikWAM (24-char alphanumeric)
  • Custom/cloned voices: Use the voice_id from your ElevenLabs account

Legacy mode (Deepgram):

  • Aura-2 voices: aura-2-thalia-en, aura-2-orion-en, etc.
Example:

"alloy"
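Combining these options, a legacy-mode request using a built-in ElevenLabs voice might look like this (a sketch built only from fields documented on this page; values are illustrative):

```json
{
  "phone_number": "+14155551234",
  "task": "Call to confirm John's appointment tomorrow at 3 PM",
  "mode": "legacy",
  "tts_provider": "elevenlabs",
  "voice": "rachel"
}
```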

model
string

AI model to use for the call. Default is selected based on mode.

See GET /v1/models for the complete list of available models and their capabilities.

Example:

"gemini-2.5-flash"

temperature
number
default:0.7

LLM creativity/temperature (0-1). Higher values = more creative responses.

  • Most models: Full range 0-1 supported
  • Some reasoning models only support default temperature
Required range: 0 <= x <= 1
stt_provider
enum<string>
default:deepgram

STT provider (legacy mode only)

  • deepgram: Deepgram (default, 36+ languages)
  • gladia: Gladia (100+ languages, automatic language detection, multilingual support)
  • Provider name must match telephony provider configuration
  • Use stt_language: "multi" for automatic multilingual detection (Gladia only)
  • Only used when mode=legacy
Available options:
deepgram,
gladia
Example:

"deepgram"

stt_model
string
default:nova-3

STT model (legacy mode only). See GET /v1/models for complete list of available STT models and their capabilities. Only used when mode=legacy.

Minimum string length: 1
Example:

"nova-3"

stt_language
string

STT language/dialect code (legacy mode only)

  • Examples: en-US, en-GB, en-AU, es-ES, nl-BE
  • Use multi for automatic multilingual detection (supported by Gladia only)
  • Controls speech recognition accent/dialect
  • Only used when mode=legacy
  • For restricted multi-language detection, use stt_languages array instead
Minimum string length: 2
Example:

"en-GB"

stt_languages
string[]

Array of language codes for restricted multi-language detection (Gladia only).

When multiple languages are provided:

  • Enables code_switching mode automatically
  • Restricts detection to ONLY these specified languages
  • Dramatically improves accuracy for short phrases

This is preferred over stt_language: "multi" when you know which languages your callers will speak, as it narrows the detection space from 100+ languages to just the ones you specify.

Examples:

  • ["en", "ro"] - Detect English and Romanian only
  • ["en", "es", "fr"] - Detect English, Spanish, and French
  • Use ISO 639-1 language codes (e.g., en, es, fr, de, ro)

Only used when mode=legacy and stt_provider=gladia.

Required array length: 1 - 10 elements
Minimum string length: 2
Example:
["en", "ro"]
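For restricted multi-language detection, the relevant fields combine in the request body like this (a sketch; values are illustrative):

```json
{
  "phone_number": "+14155551234",
  "task": "Confirm the delivery address with the customer",
  "mode": "legacy",
  "stt_provider": "gladia",
  "stt_languages": ["en", "ro"]
}
```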
stt_vocabulary
(string | object)[]

Custom vocabulary for STT (Gladia only). Boost recognition of domain-specific words and phrases in real time.

Formats supported:

  • Simple strings: ["Capex", "TopCalls"]
  • Objects with language: [{"value": "Capex", "language": "en"}]
  • Mixed: ["Capex", {"value": "مرحبا", "language": "ar"}]

Use cases:

  • Company/product names
  • Industry-specific terminology
  • Names that may be mispronounced
  • Technical terms

Only used when mode=legacy and stt_provider=gladia.

Required array length: 1 - 100 elements
Minimum string length: 1
Example:
[
  "Capex",
  { "value": "TopCalls", "language": "en" }
]
stt_endpoint_sensitivity
number
default:0.01

STT endpoint sensitivity (Gladia only). Controls how long to wait after silence before considering speech complete.

  • Range: 0.01 - 2.0 seconds
  • Default: 0.01 seconds (per Gladia recommendation for telephony audio)
  • Lower values (0.01-0.1): Recommended for telephony-quality audio, accented speech
  • Higher values (0.8-2.0): Better for thoughtful speakers, elderly users

Only used when mode=legacy and stt_provider=gladia.

Required range: 0.01 <= x <= 2
Example:

0.01

stt_interrupt_sensitivity
number
default:0.8

STT interrupt/speech detection sensitivity (Gladia only). Controls the speech detection threshold for distinguishing speech from noise.

  • Range: 0.0 - 1.0
  • Default: 0.8 (per Gladia recommendation for telephony audio)
  • Higher values (0.7-0.9): Recommended for telephony audio, background noise
  • Lower values (0.0-0.4): More sensitive to speech, may pick up more noise

Only used when mode=legacy and stt_provider=gladia.

Required range: 0 <= x <= 1
Example:

0.8

transcript_correction_vocabulary
(string | object)[]

Transcript correction vocabulary for LLM-based STT error correction (legacy mode only). Provides domain-specific terms that STT often mishears, allowing the LLM to use context to mentally correct transcription errors.

Formats supported:

  • Simple strings: ["Weaviate", "Kubernetes", "TopCalls"]
  • Objects with sounds_like hints:
    [
      { "correct": "Weaviate", "sounds_like": ["we activate", "web VT"] },
      { "correct": "NVIDIA", "sounds_like": ["in video"], "context": "hardware" }
    ]
  • Mixed: ["TopCalls", { "correct": "Kubernetes", "sounds_like": ["cube net ease"] }]

How it works:

  • The vocabulary is added to the LLM system prompt
  • When STT mishears a domain term, the LLM uses context to interpret correctly
  • No additional latency (processed in the main LLM call)
  • LLM responds naturally without mentioning the correction

Use cases:

  • Company/product names (Weaviate, Kubernetes, NVIDIA)
  • Industry-specific terminology (medical, legal, financial terms)
  • Technical terms that sound like common words
  • Names that may be mispronounced

Only used when mode=legacy.

Required array length: 1 - 100 elements
Minimum string length: 1
Example:
[
  "TopCalls",
  {
    "correct": "Weaviate",
    "sounds_like": ["we activate", "web VT"]
  },
  {
    "correct": "Kubernetes",
    "sounds_like": ["cube net ease", "cooper nettie"],
    "context": "technology"
  }
]
tts_provider
enum<string>
default:deepgram

TTS provider (legacy mode only). See GET /v1/voices/builtin for available voices per provider. Only used when mode=legacy.

Available options:
deepgram,
elevenlabs
Example:

"deepgram"

tts_model
string

TTS model (legacy mode only). See GET /v1/models for complete list of available TTS models. Only used when mode=legacy.

Minimum string length: 1
Example:

"eleven_flash_v2_5"

tts_stability
number

ElevenLabs voice stability (legacy mode, tts_provider=elevenlabs only). Controls the consistency of the voice output.

  • Lower values (0): More variable, emotional, expressive
  • Higher values (1): More consistent, stable, less expressive
  • Default: 0.75 (optimized for voice agents)
  • Only used when mode=legacy and tts_provider=elevenlabs
Required range: 0 <= x <= 1
Example:

0.75

tts_similarity_boost
number

ElevenLabs voice similarity boost (legacy mode, tts_provider=elevenlabs only). Controls how closely the generated voice matches the original.

  • Lower values (0): Less similar to original voice
  • Higher values (1): More similar to original voice
  • Default: 0.5 (balanced for voice agents)
  • Only used when mode=legacy and tts_provider=elevenlabs
Required range: 0 <= x <= 1
Example:

0.5

tts_speed
number

ElevenLabs speech speed (legacy mode, tts_provider=elevenlabs only). Controls the rate of speech.

  • Lower values (0.7): Slower speech
  • Higher values (1.2): Faster speech
  • Default: 0.78 (slightly slower for clarity)
  • Only used when mode=legacy and tts_provider=elevenlabs
Required range: 0.7 <= x <= 1.2
Example:

0.78
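The three ElevenLabs tuning knobs travel together in the request body. A fragment using the documented defaults (a sketch, not required settings):

```json
{
  "mode": "legacy",
  "tts_provider": "elevenlabs",
  "tts_model": "eleven_flash_v2_5",
  "voice": "rachel",
  "tts_stability": 0.75,
  "tts_similarity_boost": 0.5,
  "tts_speed": 0.78
}
```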

filler_enabled
boolean
default:false

Enable filler acknowledgments (legacy mode only). When enabled, the AI will generate brief acknowledgments (e.g., "Got it...", "Sure...") before the main response to reduce perceived latency.

  • false (default): No filler - AI responds directly
  • true: AI generates contextual filler before main response

Only used when mode=legacy.

Example:

false

block_interruption
boolean
default:false

Block interruption mode (legacy mode only). When enabled, the AI continues speaking even if the user talks over it.

  • User speech during TTS is buffered (not processed immediately)
  • When TTS ends, buffered speech is merged and checked:
    • If ≥5 words: processed through LLM (single call)
    • If <5 words: discarded (fillers like "uh huh", "okay")

Use cases:

  • Delivering critical information that shouldn't be interrupted
  • Users who provide active listening cues during AI speech
  • Noisy environments with background speech/noise

Only used when mode=legacy.

Example:

false

max_duration
number
default:5

Maximum call duration in minutes (enforced by telephony provider)

Required range: 1 <= x <= 60
background_audio
enum<string>
default:office

Background audio preset to play during the call.

  • office: Office ambiance (default) - subtle office sounds
  • none: No background audio

Background audio plays continuously under the conversation and helps create a professional atmosphere.

Available options:
office,
none
Example:

"office"

background_audio_gain
enum<string>
default:medium

Volume level for background audio relative to speech.

  • low: Subtle (-10 dB) - quieter background
  • medium: Balanced (-4 dB) - noticeable but balanced (default)
  • high: Full volume (0 dB) - background at same level as speech

Only used when background_audio is not none.

Available options:
low,
medium,
high
Example:

"medium"

webhook_url
string<uri>

Webhook URL to receive call completion/failure notifications. Webhook is sent after call finishes (includes recording_url and call_summary when available).

Example:

"https://your-app.com/webhooks/call-complete"
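The exact webhook payload schema is not shown on this page; based on the fields it mentions (`recording_url`, `call_summary`, `analysis`, `metadata`), a completion notification might look roughly like the following sketch (shape and values are assumptions):

```json
{
  "call_id": "564d4fd4-03bc-400a-abe0-05540fbeff88",
  "status": "completed",
  "recording_url": "https://...",
  "call_summary": "...",
  "analysis": { "converted": true },
  "metadata": { "patient_id": "pat_123" }
}
```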

analysis_schema
object

Schema for post-call AI analysis. Defines what information to extract from the transcript. After the call, AI analyzes the transcript and extracts structured data matching this schema. Results are included in the webhook payload under the analysis field.

Supported types:

  • boolean: true/false values (e.g., "converted", "appointment_confirmed")
  • string/text: Free-form text (e.g., "objections", "questions")
  • number: Numeric values (e.g., "rating", "call_count")
  • date: Date/time in ISO 8601 format (e.g., "appointment_time")

Simple format: Just specify the type

{ "converted": "boolean", "objections": "string" }

Rich format: Include description for better AI understanding

{
  "converted": {
    "type": "boolean",
    "description": "Whether the lead agreed to schedule an appointment"
  },
  "appointment_time": {
    "type": "date",
    "description": "The scheduled appointment date/time if booked"
  }
}
Example:
{
  "converted": {
    "type": "boolean",
    "description": "Whether the lead agreed to schedule an appointment or expressed buying interest"
  },
  "objections": {
    "type": "string",
    "description": "Any concerns or objections the lead raised during the call"
  },
  "appointment_time": {
    "type": "date",
    "description": "The scheduled appointment date and time if one was booked"
  }
}
metadata
object

Custom metadata to include in webhook payload. System fields (task, voice, model, etc.) are filtered out automatically.

Example:
{
  "patient_id": "pat_123",
  "source": "reminder_system"
}

Response

Call created successfully

call_id
string<uuid>

Call UUID

Example:

"564d4fd4-03bc-400a-abe0-05540fbeff88"

provider_call_id
string | null

Provider call ID (may be null if call creation failed)

Example:

"64e9bf0e-7c2f-4443-a759-7eb1731cd583"

status
enum<string>

Current call status

Available options:
queued,
pending,
in_progress,
completed,
failed,
cancelled
Example:

"queued"