Platform Overview

TopCalls bridges traditional telephony (SIP/PSTN) and modern AI to automate phone interactions at scale. We handle the hard parts: telephony infrastructure, AI orchestration, audio processing, and compliance tooling. You focus on what your agents say and do.

What TopCalls Does

Telephony Infrastructure

SIP trunking, carrier registration, number provisioning, and audio streaming. No telecom expertise needed.

AI Orchestration

Real-time speech recognition, intelligent conversation handling, and natural voice synthesis. All optimized for phone conversations.

Campaign Management

Queue management for automated outbound calls with retry logic, timezone awareness, and compliance tooling.

Analytics & Insights

Automatic call summaries, sentiment analysis, structured data extraction, and reporting.

System Architecture

┌──────────────────────────────────────┐
│   Your Application / SaaS App        │
│   (User Management, Campaigns,       │
│    Analytics Dashboard)              │
└──────────────┬───────────────────────┘

               │ REST API

┌──────────────────────────────────────┐
│   TopCalls Voice Gateway             │
│                                      │
│   ✅ Call Execution & Control        │
│   ✅ AI Conversation Handling        │
│   ✅ Campaign Queue & Dispatch       │
│   ✅ Quota Management                │
│   ✅ Transcript & Recording          │
│   ✅ Post-Call Analysis & Webhooks  │
│   ✅ Telephony Infrastructure        │
│   ✅ Audio Processing & Streaming    │
└──────────────────────────────────────┘

Conversation Modes

TopCalls supports two modes, each optimized for different use cases.

Realtime Mode

Speech-to-speech processing for the most natural conversations.

Feature   | Details
Latency   | Ultra-low (~200-500ms)
Voices    | Preset voices (default: alloy)
Languages | Auto-detects, works best with explicit instructions
Best For  | Customer support, appointment management, inbound reception

Realtime Mode provides the most natural conversations with the lowest latency. Use mode: "realtime" in your API calls.

Legacy Mode (Maximum Customization)

Separate speech recognition, language model, and voice synthesis for full control over each stage.
Feature   | Details
Latency   | Standard (~300-600ms)
Voices    | Hundreds of built-in voices + voice cloning
Languages | 36+ languages with explicit dialect control
Best For  | Brand-specific personas, voice cloning, multi-language

Check available models and voices via GET /v1/models and GET /v1/voices/builtin.
Legacy Mode gives you full control over voice, language, and model selection. Use mode: "legacy" in your API calls.
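
The feature lists above suggest a simple rule of thumb for picking a mode. The sketch below encodes it. The function name and its parameters are hypothetical; only the mode strings "realtime" and "legacy" come from this page.

```python
def choose_mode(needs_voice_cloning: bool = False,
                needs_dialect_control: bool = False) -> str:
    """Pick a conversation mode from the trade-offs described above.

    Hypothetical helper: only the returned strings are documented values.
    """
    # Voice cloning and explicit dialect control are listed under Legacy Mode.
    if needs_voice_cloning or needs_dialect_control:
        return "legacy"
    # Otherwise prefer the lower-latency speech-to-speech mode.
    return "realtime"
```

A support line with no branding requirements would get "realtime" here, while a cloned brand voice would push the choice to "legacy".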

The Call Lifecycle

Every call goes through these stages:

1. Call Creation

You trigger a call via API (POST /v1/calls) or it’s dispatched from a campaign. The system validates your request and reserves quota.
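
As a rough illustration of call creation, the sketch below builds a POST /v1/calls request without sending it. The endpoint path, mode, and first_sentence appear on this page. The host name, the "to" field, and the Bearer authorization header are assumptions made for the example, so check the API reference for the real request shape.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the base URL for your account.
BASE_URL = "https://api.example.invalid"


def build_create_call_request(api_key: str, to_number: str,
                              first_sentence: str,
                              mode: str = "realtime") -> urllib.request.Request:
    """Build (but do not send) a POST /v1/calls request."""
    if mode not in ("realtime", "legacy"):
        raise ValueError(f"unknown mode: {mode!r}")
    payload = {
        "to": to_number,                   # hypothetical field name
        "mode": mode,                      # documented on this page
        "first_sentence": first_sentence,  # documented on this page
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/calls",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Building the request separately keeps the payload easy to inspect and test.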

2. Dispatch

The call is dispatched to our telephony infrastructure. Its status changes to queued, then to in_progress.
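
If you mirror call status in your own application, a small transition table can reject out-of-order updates. This is a sketch only: "queued" and "in_progress" are the statuses named on this page, while "created", "completed", and "failed" are assumed names used for illustration.

```python
# Allowed status transitions. "queued" and "in_progress" appear on this page;
# "created", "completed", and "failed" are assumed names for illustration.
TRANSITIONS = {
    "created": {"queued", "failed"},
    "queued": {"in_progress", "failed"},
    "in_progress": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```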

3. Connection

The recipient picks up. The AI immediately greets them with the first_sentence you configured.

4. Conversation

Audio streams in real-time. The AI:
  • Transcribes speech
  • Processes with the language model (with knowledge base context if configured)
  • Responds naturally with voice synthesis
  • Executes tools/functions as needed
  • Can end the call gracefully when the conversation is complete
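
The stages listed above can be pictured as one conversational turn. The sketch below is a conceptual model, not the platform's implementation: every name in it is hypothetical, the three stages are passed in as plain callables, and tool execution is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class TurnResult:
    transcript: str
    reply_text: str
    reply_audio: bytes
    end_call: bool


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str, str], Tuple[str, bool]],
    synthesize: Callable[[str], bytes],
    knowledge: str = "",
) -> TurnResult:
    """Run one turn: speech -> text -> reply text -> reply audio."""
    transcript = transcribe(audio)
    # The language model sees the transcript plus any knowledge base context
    # and returns its reply together with an "end the call" flag.
    reply_text, end_call = generate(transcript, knowledge)
    reply_audio = synthesize(reply_text)
    return TurnResult(transcript, reply_text, reply_audio, end_call)
```

Injecting the stages as callables keeps the sketch runnable with stubs, for example a transcribe function that simply decodes bytes to text.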

5. Completion

The call ends when either the recipient or the AI hangs up. The system then:
  • Captures final transcript
  • Fetches recording URL (available ~15s after call ends)
  • Generates call summary (if configured)
  • Extracts structured data from transcript (if analysis_schema provided)
  • Maps analysis fields to outcomes using outcome_mapping rules
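
To make the last step concrete, here is one way a field-to-outcome mapping could work. The rule format below (field, equals, outcome) is invented for illustration and is not the documented outcome_mapping syntax, and the field and outcome names are hypothetical.

```python
from typing import Any, Dict, List


def map_outcome(analysis: Dict[str, Any],
                rules: List[Dict[str, Any]],
                default: str = "unclassified") -> str:
    """Return the outcome of the first rule whose field matches."""
    for rule in rules:
        if analysis.get(rule["field"]) == rule["equals"]:
            return rule["outcome"]
    return default


# Hypothetical rules: first match wins.
RULES = [
    {"field": "agreed_to_demo", "equals": True, "outcome": "demo_booked"},
    {"field": "agreed_to_demo", "equals": False, "outcome": "not_interested"},
]
```

With these rules, an analysis result of {"agreed_to_demo": True} maps to "demo_booked", and a result with no matching field falls back to the default.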

6. Webhook Delivery

Your server receives a webhook with complete call details including transcript, recording URL, call summary, structured analysis data, and all custom metadata.
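
On the receiving side, it helps to parse the webhook body into a typed object before acting on it. The sketch below assumes a JSON body and uses hypothetical key names (call_id, transcript, recording_url, summary, analysis, metadata); the real payload keys may differ, so treat it as a template.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CallReport:
    call_id: str
    transcript: str
    recording_url: Optional[str]
    summary: Optional[str]
    analysis: Dict[str, Any] = field(default_factory=dict)
    metadata: Dict[str, Any] = field(default_factory=dict)


def parse_webhook(body: bytes) -> CallReport:
    """Parse a webhook body. All key names here are assumptions."""
    data = json.loads(body)
    return CallReport(
        call_id=data["call_id"],
        transcript=data.get("transcript", ""),
        # Summary and analysis are optional: they depend on configuration.
        recording_url=data.get("recording_url"),
        summary=data.get("summary"),
        analysis=data.get("analysis") or {},
        metadata=data.get("metadata") or {},
    )
```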

Key Features

Intelligent Routing

Automatically detect voicemail, IVR systems, or human answers. Route accordingly or handle each scenario with custom logic.
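
Custom handling per scenario often reduces to a lookup from the detected answer type to an action. In the sketch below, the labels "human", "voicemail", and "ivr" and the action names are all hypothetical placeholders, not documented values.

```python
# Hypothetical answer types mapped to hypothetical actions.
ROUTES = {
    "human": "start_conversation",
    "voicemail": "leave_message",
    "ivr": "navigate_menu",
}


def route_answer(answer_type: str) -> str:
    """Choose an action for a detected answer type; hang up if unknown."""
    return ROUTES.get(answer_type, "hang_up")
```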

Function Calling

Give your AI agents tools to interact with your systems during calls: book appointments, look up orders, update CRMs, process payments, and end calls gracefully.
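
On your side of the integration, function calling usually means mapping a tool name and arguments to real code. The sketch below shows a minimal registry. The decorator, the dispatch function, and the look_up_order stub are hypothetical and do not reflect the platform's actual tool format.

```python
from typing import Any, Callable, Dict

# Registry of callable tools, keyed by function name.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so it can be invoked by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def look_up_order(order_id: str) -> Dict[str, str]:
    # Stub: a real implementation would query your order system.
    return {"order_id": order_id, "status": "shipped"}


def dispatch(name: str, arguments: Dict[str, Any]) -> Any:
    """Run the named tool with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)
```

Rejecting unknown tool names explicitly keeps a misbehaving model from triggering arbitrary code paths.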

Knowledge Base Injection

Upload documents, scrape websites, or provide structured data. The AI automatically accesses relevant context during conversations.

Structured Data Extraction

Define schemas to extract specific information from calls: “Did the customer agree to a demo?”, “What objections were raised?”, “What’s the next step?”
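
The three example questions could translate into a schema along these lines. The real analysis_schema format is not shown on this page, so the field names and the name-to-type layout below are assumptions, paired with a small validator you might run on extracted results.

```python
from typing import Any, Dict, List

# Hypothetical schema: field name -> expected Python type.
ANALYSIS_SCHEMA = {
    "agreed_to_demo": bool,  # "Did the customer agree to a demo?"
    "objections": list,      # "What objections were raised?"
    "next_step": str,        # "What's the next step?"
}


def validate_analysis(result: Dict[str, Any],
                      schema: Dict[str, type] = ANALYSIS_SCHEMA) -> List[str]:
    """Return a list of problems; an empty list means the result conforms."""
    problems = []
    for name, expected in schema.items():
        if name not in result:
            problems.append(f"missing field: {name}")
        elif not isinstance(result[name], expected):
            problems.append(
                f"wrong type for {name}: expected {expected.__name__}")
    return problems
```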

Multi-Language Support

36+ languages with explicit dialect control, for operations that span multiple regions.

What You Control

Aspect            | Your Control
AI Instructions   | Full control over persona, goals, and behavior
Voice Selection   | Choose from built-in voices or use custom/cloned voices
Call Flow         | Define first sentence, handle objections, set goals
Tools & Functions | Integrate with your systems in real-time
Knowledge         | Provide context via knowledge bases
Analytics         | Define what data to extract from calls

What We Handle

Aspect             | TopCalls Responsibility
Telephony          | SIP trunking, carrier management, number provisioning
Audio Processing   | Real-time streaming, VAD, echo cancellation
AI Orchestration   | Speech recognition, language model, and voice synthesis pipeline
Infrastructure     | Scaling, reliability, monitoring
Compliance Tooling | Features to help honor local calling laws (TCPA/TSR/DNC, GDPR)

Compliance Responsibility: TopCalls provides production-ready compliance tooling, but customers remain responsible for ensuring lawful use of the platform.

Next Steps

AI & Voice Customization

Control your agent’s personality, voice, and behavior.

Campaign Management

Scale your outbound calling with campaign features.