Create a new call
Create and dispatch an AI-powered phone call. The call will be queued and executed immediately.
Phone Number Format: Must be in E.164 format (e.g., +14155551234)
- Must start with +
- Country code must be 1-9 (not 0)
- Total length: 1-15 digits after the +
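The format rules above can be checked client-side before a request is sent. The following is a minimal sketch, assuming the pattern `^\+[1-9]\d{1,14}$` that this reference lists for the phone number field; the helper name `is_e164` is illustrative and not part of the API. It checks the format only and does not tell you whether a number is dialable.

```python
import re

# Pattern as listed for the phone number field in this reference.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def is_e164(number: str) -> bool:
    """Return True if `number` matches the E.164 pattern (format check only)."""
    return E164_PATTERN.fullmatch(number) is not None


print(is_e164("+14155551234"))  # True
print(is_e164("14155551234"))   # False: missing leading +
print(is_e164("+04155551234"))  # False: first digit is 0
```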
Simple Mode: Provide task (simple prompt)
Advanced Mode: Provide instructions (full system prompt)
Documentation Index
Fetch the complete documentation index at: https://docs.topcalls.ai/llms.txt
Use this file to discover all available pages before exploring further.
Authorizations
Use Authorization: Bearer tc_live_xxxxx
Headers
Optional client-supplied idempotency key. When present, the gateway
caches the response for 24 hours and returns the same response on
retried requests with the same key (account-scoped). Safe for
retries on network blips. Format: 8-255 ASCII characters from
[A-Za-z0-9_-].
Length: 8 - 255
Pattern: ^[A-Za-z0-9_-]+$
Example: "a1b2c3d4-e5f6-7890-abcd-ef0123456789"
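A client typically generates one key per logical request and reuses it on every retry of that request. The sketch below shows one way to produce and check a key against the format stated above; the function names are hypothetical, and using a UUID4 string is an assumption (any value matching the format should work).

```python
import re
import uuid

# Format stated above: 8-255 ASCII characters from [A-Za-z0-9_-].
IDEMPOTENCY_KEY_PATTERN = re.compile(r"[A-Za-z0-9_-]{8,255}")


def is_valid_idempotency_key(key: str) -> bool:
    """Return True if `key` satisfies the documented length and character set."""
    return IDEMPOTENCY_KEY_PATTERN.fullmatch(key) is not None


def new_idempotency_key() -> str:
    """Generate a fresh key: a UUID4 string is 36 characters of hex digits and hyphens."""
    return str(uuid.uuid4())


key = new_idempotency_key()
print(key, is_valid_idempotency_key(key))
```

Store the key alongside the pending request so that a retry after a network failure sends the same value rather than a new one.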
Body
Phone number in E.164 format (e.g., +14155551234)
- Must start with +
- Country code must be 1-9 (not 0)
- Total length: 1-15 digits after the +
- Also validated as dialable by libphonenumber-js; country codes that don't exist are rejected
Pattern: ^\+[1-9]\d{1,14}$
Example: "+14155551234"
Simple prompt describing what the AI should do.
Use this OR instructions (not both).
Minimum length: 10
Example: "Call to confirm John's appointment tomorrow at 3 PM"
Caller ID in E.164 format (optional, falls back to FROM_NUMBER env var)
- Must be in E.164 format if provided
- Either provide this or set FROM_NUMBER environment variable
Pattern: ^\+[1-9]\d{1,14}$
Example: "+18005551234"
The AI's opening line
Minimum length: 1
Example: "Hi, this is Rachel from TopView Dental calling about your appointment."
Full system instructions for the AI.
Use this OR task (not both).
Minimum length: 10
Example: "You are Rachel, a friendly appointment coordinator..."
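Because task and instructions are mutually exclusive, it can help to enforce that rule before building the request body. This is a minimal sketch, not the gateway's validation: the function name is hypothetical, it assumes the `10` shown next to each example is a minimum length, and it assumes a plain call needs one of the two fields.

```python
from typing import Optional

MIN_PROMPT_LENGTH = 10  # assumed minimum length for both task and instructions


def prompt_fields(task: Optional[str] = None,
                  instructions: Optional[str] = None) -> dict:
    """Return the prompt part of a request body, enforcing task OR instructions."""
    if task is not None and instructions is not None:
        raise ValueError("Provide task or instructions, not both")
    if task is None and instructions is None:
        # Assumption: a plain call needs one of the two fields.
        raise ValueError("Provide either task or instructions")
    name, value = ("task", task) if task is not None else ("instructions", instructions)
    if len(value) < MIN_PROMPT_LENGTH:
        raise ValueError(f"{name} must be at least {MIN_PROMPT_LENGTH} characters")
    return {name: value}


print(prompt_fields(task="Call to confirm John's appointment tomorrow at 3 PM"))
```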
Conversation mode
- realtime: Speech-to-speech mode (low latency)
- legacy: Separate STT → LLM → TTS pipeline (custom voices, voice cloning)
Allowed values: realtime, legacy
Voice to use for AI responses.
Realtime mode:
- Available voices: alloy, echo, shimmer, ash, ballad, coral, sage, verse
Legacy mode (custom voices):
- Voice names: rachel, domi, bella, antoni, elli, josh, arnold, sam, adam, nicole, matilda
- Or voice_id directly: 21m00Tcm4TlvDq8ikWAM (24-char alphanumeric)
- Custom/cloned voices: Use the voice_id from your account
Legacy mode (fast voices):
- Aura-2 voices: aura-2-thalia-en, aura-2-orion-en, etc.
Example: "alloy"
AI model to use for the call. If not specified, a default is selected automatically based on mode.
See GET /v1/models for the complete list of available models and their capabilities.
LLM creativity/temperature (0-1). Higher values = more creative responses.
- Most models: Full range 0-1 supported
- Some reasoning models only support default temperature
Range: 0 <= x <= 1
STT provider (legacy mode only)
- deepgram: Default provider (36+ languages)
- gladia: Multi-language provider (100+ languages, automatic detection)
- speechmatics: High-accuracy multilingual provider with native Arabic+English code-switching (use stt_language: "ar_en")
- soniox: Real-time multilingual provider with native code-switching via stt_languages array (e.g. ["en", "ar"])
- Provider name must match telephony provider configuration
- Use stt_language: "multi" for automatic multilingual detection (gladia only)
- Only used when mode=legacy
Allowed values: deepgram, gladia, speechmatics, soniox
Example: "deepgram"
STT model (legacy mode only). See GET /v1/models for complete list of available STT models and their capabilities. Only used when mode=legacy.
Minimum length: 1
Example: "nova-3"
STT language code (legacy mode only). See GET /v1/config for the exact list per model.
- For deepgram: accepts ISO 639-1 base codes plus provider-supported regional variants (e.g. en, en-US, en-GB, en-IN, zh-CN, pt-BR)
- For gladia: accepts only ISO 639-1 base codes (e.g. en, ar, hi). Regional variants are not supported. Omit stt_languages (or send an empty array) for automatic multilingual detection
- multi is a deepgram-only sentinel for multilingual code-switching (supported by nova-3, nova-2, flux-general-multi)
- Only used when mode=legacy
- For restricted multi-language detection (gladia), use stt_languages array instead
Minimum length: 2
Example: "en-US"
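The per-provider rules in the bullets above can be approximated with a shape check before sending a request. The sketch below is only that: it tests whether a code looks like an ISO 639-1 base code or a regional variant, and it does not know which languages a given model supports (GET /v1/config remains the source for the exact list). The function name and the two regular expressions are assumptions, and providers other than deepgram and gladia are deliberately left out.

```python
import re

BASE_CODE = re.compile(r"[a-z]{2}")               # shape of "en", "ar"
REGIONAL_CODE = re.compile(r"[a-z]{2}-[A-Z]{2}")  # shape of "en-US", "pt-BR"


def stt_language_shape_ok(provider: str, code: str) -> bool:
    """Shape-only check of stt_language for deepgram and gladia."""
    is_base = BASE_CODE.fullmatch(code) is not None
    is_regional = REGIONAL_CODE.fullmatch(code) is not None
    if provider == "deepgram":
        # Base codes, regional variants, or the "multi" sentinel.
        return code == "multi" or is_base or is_regional
    if provider == "gladia":
        # Base codes only; regional variants are not accepted.
        return is_base
    raise ValueError(f"provider {provider!r} is not covered by this sketch")


print(stt_language_shape_ok("deepgram", "en-US"))  # True
print(stt_language_shape_ok("gladia", "en-US"))    # False
```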
Array of language codes for restricted multi-language detection (gladia only).
When multiple languages are provided:
- Enables code_switching mode automatically
- Restricts detection to ONLY these specified languages
- Dramatically improves accuracy for short phrases
Narrows the detection space from 99 languages to just the ones you specify. Omit this field (or send an empty array) for unrestricted multilingual auto-detection.
Must be ISO 639-1 base codes only - gladia does not accept regional
variants like en-US or zh-CN. See GET /v1/config for the full list.
Examples:
- ["en", "ro"] - Detect English and Romanian only
- ["en", "es", "fr"] - Detect English, Spanish, and French
- ["en", "ar", "hi"] - Detect English, Arabic, and Hindi
Only used when mode=legacy and stt_provider=gladia.
Length: 1 - 10 elements
Minimum item length: 2
Example: ["en", "ro"]
Custom vocabulary for STT (multi-language provider only). Boost recognition of domain-specific words and phrases in real time.
Formats supported:
- Simple strings: ["Capex", "TopCalls"]
- Objects with language: [{"value": "Capex", "language": "en"}]
- Mixed: ["Capex", {"value": "مرحبا", "language": "ar"}]
Use cases:
- Company/product names
- Industry-specific terminology
- Names that may be mispronounced
- Technical terms
Only used in legacy mode with the multi-language STT provider.
Length: 1 - 100 elements
Minimum item length: 1
Example:
[
  "Capex",
  { "value": "TopCalls", "language": "en" }
]
STT endpoint sensitivity (seconds to wait after silence before finalizing a transcript).
Effective range is provider-dependent:
- gladia: 0.01 - 2.0 seconds (default 0.01; 0.01-0.1 recommended for telephony)
- soniox: 0.5 - 3.0 seconds (default 2.0)
Lower values = snappier turn-ends; higher values = more patience for slow speakers. Values outside the active provider effective range may be clamped or rejected by the upstream provider. Only used in legacy mode and only honoured by providers that expose this knob.
Range: 0.01 <= x <= 3
Example: 0.01
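Since out-of-range values may be clamped or rejected upstream, a client can clamp to the provider's effective range itself so the behaviour is predictable. A minimal sketch, assuming the two ranges listed above; the function and constant names are illustrative.

```python
# Effective ranges in seconds, copied from the list above.
EFFECTIVE_RANGE = {
    "gladia": (0.01, 2.0),
    "soniox": (0.5, 3.0),
}


def clamp_endpoint_sensitivity(provider: str, seconds: float) -> float:
    """Clamp `seconds` into the effective range of `provider`.

    Raises KeyError for providers that have no range listed above.
    """
    low, high = EFFECTIVE_RANGE[provider]
    return max(low, min(high, seconds))


print(clamp_endpoint_sensitivity("gladia", 2.5))  # 2.0
print(clamp_endpoint_sensitivity("soniox", 0.1))  # 0.5
```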
STT interrupt/speech detection sensitivity (multi-language provider only). Controls the speech detection threshold for distinguishing speech from noise.
- Range: 0.0 - 1.0
- Default: 0.8 (recommended for telephony audio)
- Higher values (0.7-0.9): Recommended for telephony audio, background noise
- Lower values (0.0-0.4): More sensitive to speech, may pick up more noise
Only used in legacy mode with the multi-language STT provider.
Range: 0 <= x <= 1
Example: 0.8
Max wait in seconds before finalizing a transcript (speechmatics only). Lower = snappier turn-ends, higher = more patience for slow speakers.
- Range: 0.7 - 4.0 seconds
- Default: 1.5 seconds
- Fixed platform-wide for other providers.
Only used in legacy mode with stt_provider=speechmatics.
Range: 0.7 <= x <= 4
Example: 1.5
Transcript correction vocabulary for LLM-based STT error correction (legacy mode only). Provides domain-specific terms that STT often mishears, allowing the LLM to use context to mentally correct transcription errors.
Formats supported:
- Simple strings: ["Weaviate", "Kubernetes", "TopCalls"]
- Objects with sounds_like hints:
[
  { "correct": "Weaviate", "sounds_like": ["we activate", "web VT"] },
  { "correct": "NVIDIA", "sounds_like": ["in video"], "context": "hardware" }
]
- Mixed: ["TopCalls", { "correct": "Kubernetes", "sounds_like": ["cube net ease"] }]
How it works:
- The vocabulary is added to the LLM system prompt
- When STT mishears a domain term, the LLM uses context to interpret correctly
- No additional latency (processed in the main LLM call)
- LLM responds naturally without mentioning the correction
Use cases:
- Company/product names (Weaviate, Kubernetes, NVIDIA)
- Industry-specific terminology (medical, legal, financial terms)
- Technical terms that sound like common words
- Names that may be mispronounced
Only used when mode=legacy.
Length: 1 - 100 elements
Minimum item length: 1
Example:
[
  "TopCalls",
  {
    "correct": "Weaviate",
    "sounds_like": ["we activate", "web VT"]
  },
  {
    "correct": "Kubernetes",
    "sounds_like": ["cube net ease", "cooper nettie"],
    "context": "technology"
  }
]
TTS provider (legacy mode only). See GET /v1/voices/builtin for available voices per provider. Only used when mode=legacy.
Allowed values: deepgram, elevenlabs
Example: "deepgram"
TTS model (legacy mode only). See GET /v1/models for complete list of available TTS models. Only used when mode=legacy.
Minimum length: 1
Example: "eleven_flash_v2_5"
Voice stability (legacy mode). Controls the consistency of the voice output.
- Lower values (0): More variable, emotional, expressive
- Higher values (1): More consistent, stable, less expressive
- Default: 0.75 (optimized for voice agents)
- Only used in legacy mode with the corresponding TTS provider
Range: 0 <= x <= 1
Example: 0.75
Voice similarity boost (legacy mode). Controls how closely the generated voice matches the original.
- Lower values (0): Less similar to original voice
- Higher values (1): More similar to original voice
- Default: 0.5 (balanced for voice agents)
- Only used in legacy mode with the corresponding TTS provider
Range: 0 <= x <= 1
Example: 0.5
Speech speed (legacy mode). Controls the rate of speech.
- Lower values (0.7): Slower speech
- Higher values (1.2): Faster speech
- Default: 0.78 (slightly slower for clarity)
- Only used in legacy mode with the corresponding TTS provider
Range: 0.7 <= x <= 1.2
Example: 0.78
Enable filler acknowledgments (legacy mode only). When enabled, the AI will generate brief acknowledgments (e.g., "Got it...", "Sure...") before the main response to reduce perceived latency.
- false (default): No filler - AI responds directly
- true: AI generates contextual filler before main response
Only used when mode=legacy.
Example: false
Block interruption mode (legacy mode only). When enabled, the AI continues speaking even if the user talks over it.
- User speech during TTS is buffered (not processed immediately)
- When TTS ends, buffered speech is merged and checked:
  - If ≥5 words: processed through LLM (single call)
  - If <5 words: discarded (fillers like "uh huh", "okay")
Use cases:
- Delivering critical information that shouldn't be interrupted
- Users who provide active listening cues during AI speech
- Noisy environments with background speech/noise
Only used when mode=legacy.
Example: false
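The buffering rule described for block interruption mode can be pictured as a small decision function. This sketch illustrates the documented behaviour and is not the gateway's implementation: the function name is hypothetical, and counting words by splitting on whitespace is an assumption.

```python
from typing import List, Optional

MIN_WORDS_TO_PROCESS = 5  # threshold from the description above


def resolve_buffered_speech(segments: List[str]) -> Optional[str]:
    """Merge speech buffered while the AI was speaking.

    Returns the merged text to send to the LLM in a single call,
    or None when the utterance is short enough to be discarded.
    """
    merged = " ".join(s.strip() for s in segments if s.strip())
    if len(merged.split()) >= MIN_WORDS_TO_PROCESS:
        return merged
    return None


print(resolve_buffered_speech(["uh huh", "okay"]))                      # None
print(resolve_buffered_speech(["wait, can we", "move it to Friday?"]))  # wait, can we move it to Friday?
```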
Maximum call duration in minutes (enforced by telephony provider)
Range: 1 <= x <= 60
Background audio preset to play during the call.
- office: Office ambiance (default) - subtle office sounds
- none: No background audio
Background audio plays continuously under the conversation and helps create a professional atmosphere.
Allowed values: office, none
Example: "office"
Volume level for background audio relative to speech.
- low: Subtle (-10 dB) - quieter background
- medium: Balanced (-4 dB) - noticeable but balanced (default)
- high: Full volume (0 dB) - background at same level as speech
Only used when background_audio is not none.
Allowed values: low, medium, high
Example: "medium"
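To get a feel for what the three levels mean, the decibel figures can be converted to linear amplitude gain with gain = 10^(dB/20). The sketch below assumes the dB values above are amplitude ratios relative to speech; how the platform actually mixes audio is not documented here, and the names are illustrative.

```python
# dB offsets relative to speech, copied from the list above.
VOLUME_DB = {"low": -10.0, "medium": -4.0, "high": 0.0}


def linear_gain(level: str) -> float:
    """Convert a volume preset to a linear amplitude multiplier: 10 ** (dB / 20)."""
    return 10 ** (VOLUME_DB[level] / 20)


for level in ("low", "medium", "high"):
    print(level, round(linear_gain(level), 3))
# low 0.316
# medium 0.631
# high 1.0
```

Under that assumption, the low preset plays the background at roughly a third of the speech amplitude and medium at a little under two thirds.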
Webhook URL to receive call completion/failure notifications. Webhook is sent after call finishes (includes recording_url and call_summary when available).
"https://your-app.com/webhooks/call-complete"
Schema for post-call AI analysis. Defines what information to extract from the transcript.
After the call, AI analyzes the transcript and extracts structured data matching this schema.
Results are included in the webhook payload under the analysis field.
Supported types:
- boolean: true/false values (e.g., "converted", "appointment_confirmed")
- string/text: Free-form text (e.g., "objections", "questions")
- number: Numeric values (e.g., "rating", "call_count")
- date: Date/time in ISO 8601 format (e.g., "appointment_time")
Simple format: Just specify the type
{ "converted": "boolean", "objections": "string" }
Rich format: Include description for better AI understanding
{
  "converted": {
    "type": "boolean",
    "description": "Whether the lead agreed to schedule an appointment"
  },
  "appointment_time": {
    "type": "date",
    "description": "The scheduled appointment date/time if booked"
  }
}
Example:
{
  "converted": {
    "type": "boolean",
    "description": "Whether the lead agreed to schedule an appointment or expressed buying interest"
  },
  "objections": {
    "type": "string",
    "description": "Any concerns or objections the lead raised during the call"
  },
  "appointment_time": {
    "type": "date",
    "description": "The scheduled appointment date and time if one was booked"
  }
}
Optional MCP server URL for remote tool dispatch (Activepieces). When set, the gateway opens an SSE MCP client at call start and merges the remote tools into the LLM tool list.
Example: "https://integrations.example.com/mcp/abc123"
Bearer token for the MCP server. Redacted from logs by suffix rule (any field ending in _token/_secret/_key/_password).
Names of MCP tools (as returned by listTools()) that the AI is allowed to invoke during this call. When absent or empty, zero remote MCP tools are attached — only platform tools like end_call remain. Explicit opt-in to prevent prompt bloat from auto-attaching every flow in the connected workspace.
Example: ["book_callback", "send_sms_confirmation"]
Per-call timeout for MCP tool invocation in milliseconds. On timeout, the gateway feeds {error: "tool_timeout"} into the second LLM hop so the model can recover conversationally.
Range: 500 <= x <= 10000
Custom metadata to include in webhook payload. System fields (task, voice, model, etc.) are filtered out automatically.
Example:
{
  "patient_id": "pat_123",
  "source": "reminder_system"
}
Optional lead reference. When provided, the gateway loads the lead record (name, email, notes, status, plus any custom fields stored on the lead) and exposes them to the AI via lead_context. Caller-supplied lead_context takes precedence on key collision. Returns 404 if the lead does not exist in the calling account.
Optional campaign reference. When provided, the gateway loads the campaign's attached knowledge base and includes it in the call's runtime context. Returns 404 if the campaign does not exist in the calling account.
When used as a campaign execution path (without phone_number),
campaign_id, lead_id, idempotency_key, and attempt_number
are all required.
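The campaign execution rule can be checked before a request is sent. A minimal sketch, assuming the request body is a plain dict keyed by the field names used in this reference; the helper name is hypothetical and the check covers presence only, not value formats.

```python
CAMPAIGN_EXECUTION_REQUIRED = (
    "campaign_id",
    "lead_id",
    "idempotency_key",
    "attempt_number",
)


def missing_campaign_fields(body: dict) -> list:
    """List required fields absent from a campaign-execution body.

    Returns an empty list when the body is not in campaign execution mode
    (phone_number is present, or campaign_id is absent).
    """
    if "phone_number" in body or "campaign_id" not in body:
        return []
    return [name for name in CAMPAIGN_EXECUTION_REQUIRED if body.get(name) is None]


print(missing_campaign_fields({"campaign_id": "c1", "lead_id": "l1"}))
# ['idempotency_key', 'attempt_number']
```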
Attempt number for campaign execution mode (required when
campaign_id is provided without phone_number).
Range: x >= 1
Idempotency key for campaign execution mode (required when
campaign_id is provided without phone_number).
Minimum length: 8
Optional scheduled time for campaign execution mode.
Free-form key/value context surfaced to the AI during the call. When lead_id is set, the gateway auto-builds a base lead_context from the lead record; any keys passed here shallow-merge on top of the auto-built base and win on collision.
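The shallow-merge rule means that only top-level keys are combined: when a caller-supplied key collides with an auto-built one, the caller's value replaces it whole, including any nested object. A minimal sketch of that semantics in Python; the function name and the sample keys and values are hypothetical.

```python
def merge_lead_context(auto_built: dict, caller_supplied: dict) -> dict:
    """Shallow-merge: caller-supplied keys win on collision, nested values are not merged."""
    return {**auto_built, **caller_supplied}


auto_built = {
    "name": "John",
    "status": "new",
    "custom": {"plan": "basic", "seats": 3},
}
caller_supplied = {
    "status": "contacted",
    "custom": {"plan": "pro"},
}

print(merge_lead_context(auto_built, caller_supplied))
# {'name': 'John', 'status': 'contacted', 'custom': {'plan': 'pro'}}
```

Note that `seats` is gone from the result: because the merge is shallow, passing a nested object replaces the auto-built one rather than extending it.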
Response
Call created successfully
Call UUID
"564d4fd4-03bc-400a-abe0-05540fbeff88"
Provider call ID (may be null if call creation failed)
"64e9bf0e-7c2f-4443-a759-7eb1731cd583"
Current call status
Allowed values: queued, pending, in_progress, completed, failed, cancelled
Example: "queued"