General Settings - Autocalls

Configure the fundamental settings for your AI assistant including call direction, phone numbers, voice selection, and technical parameters.

Quick Start Guide

Ready to set up your first AI assistant? Here’s the essential flow:

Choose Call Direction: Inbound (answers calls) or Outbound (makes calls)
Set Assistant Name: Internal label like “Support Bot” or “Sales Bot”
Configure Phone Numbers: Assign platform numbers, SIP, or Caller ID
Select Voice & Language: Choose from built-in voices or clone custom ones
Adjust Advanced Settings: Fine-tune models, timing, and audio parameters

Always test your changes by calling the assistant or running a small campaign to confirm it behaves as expected.

Follow this page section by section to configure your assistant. Each setting includes detailed explanations and best practices to help you make the right choices.

Call Direction & Basic Setup

Assistant Type

Choose whether your assistant handles inbound or outbound calls. This fundamental choice affects which other options become available.

Inbound (Receive calls): Handles incoming calls from customers. See Inbound calls overview.
Outbound (Make calls): Initiates calls to leads or customers. See Outbound calls overview.

Assistant Name

A descriptive name to identify your assistant in the dashboard. Use something memorable that describes the assistant’s purpose (e.g. “Sales Qualifier”, “Support Bot”, “Appointment Scheduler”).

Phone Number Configuration

Your assistant needs a phone number to operate. The available options depend on your call direction choice.

For Outbound Assistants

You can use:

Platform numbers: Numbers rented directly from our platform
SIP numbers: Connect your existing VoIP/PBX system
Caller ID only: Verify ownership of an existing number to display it on outbound calls

For Inbound Assistants

You can use:

Platform numbers: Numbers rented directly from our platform
SIP numbers: Connect your existing VoIP/PBX system

Note: Caller ID only numbers cannot handle inbound calls - they only display on outbound calls.
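The direction rules above can be sketched as a small lookup. This is only an illustration: the labels `platform`, `sip`, and `caller_id` and the function `can_use_number` are hypothetical names, not identifiers from the Autocalls dashboard or API.

```python
# Hypothetical labels for illustration; the dashboard may name these differently.
ALLOWED_NUMBER_TYPES = {
    "outbound": {"platform", "sip", "caller_id"},
    "inbound": {"platform", "sip"},  # Caller ID only numbers cannot receive calls
}


def can_use_number(direction: str, number_type: str) -> bool:
    """Return True if the number type is usable for the given call direction."""
    if direction not in ALLOWED_NUMBER_TYPES:
        raise ValueError(f"unknown call direction: {direction!r}")
    return number_type in ALLOWED_NUMBER_TYPES[direction]


print(can_use_number("outbound", "caller_id"))  # True
print(can_use_number("inbound", "caller_id"))   # False
```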

Pricing & Costs

Platform numbers: Monthly rental fees starting from $3.99/month. See renting a dedicated number for detailed pricing.
SIP integration: No monthly fee, only $0.00045/min for AI bridging. See SIP integration pricing.
Caller ID: No monthly fee, region-based per-minute rates (e.g., $0.01/min in the US). See Caller ID pricing.

See Phone number types for detailed explanations and SIP integration guide for VoIP setup.
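To compare the options, the listed fees can be turned into a rough estimate. The sketch below uses only the figures quoted in this section and assumes the US Caller ID rate. The function name `estimate_number_cost` is hypothetical. Other per-minute charges that may apply to a call are deliberately left out.

```python
# Rates quoted in this section; treat them as examples, since platform rental
# is a "starting from" price and Caller ID rates vary by region.
PLATFORM_MONTHLY_FEE = 3.99      # USD per month (starting price)
SIP_BRIDGING_PER_MIN = 0.00045   # USD per minute of AI bridging
CALLER_ID_US_PER_MIN = 0.01      # USD per minute (US example rate)


def estimate_number_cost(number_type: str, minutes: float) -> float:
    """Estimate the monthly number-related cost in USD for one number.

    Only the fees listed in this section are included; any other usage
    charges are out of scope for this sketch.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    if number_type == "platform":
        return PLATFORM_MONTHLY_FEE
    if number_type == "sip":
        return minutes * SIP_BRIDGING_PER_MIN
    if number_type == "caller_id":
        return minutes * CALLER_ID_US_PER_MIN
    raise ValueError(f"unknown number type: {number_type!r}")


for kind in ("platform", "sip", "caller_id"):
    print(f"{kind}: ${estimate_number_cost(kind, 10_000):.2f}")
```

At 10,000 minutes per month this gives $3.99 for a platform number, $4.50 for SIP bridging, and $100.00 for US Caller ID, which shows how the per-minute rate dominates at volume.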

Engine Type (Voice Processing Mode)

Choose how your AI processes speech and generates responses. Each mode is optimized for different use cases. See Assistant modes for detailed comparisons.

Pipeline Mode

Traditional Speech-to-Text → LLM → Text-to-Speech pipeline. Offers maximum control over voice selection and response generation.
Best for: Complex reasoning, function calling, custom voice requirements

Speech-to-Speech Mode

Direct speech-to-speech generation without intermediate text processing. Provides the most natural conversational flow.
Best for: Quick conversations, natural back-and-forth dialogue

Dualplex Mode (Beta)

Combines fast multimodal processing with premium ElevenLabs voice output.
Best for: Most use cases - recommended default

Language Configuration

Primary Language

The main language your assistant will use for speech recognition and synthesis. This affects:

Speech recognition accuracy
Available voice options
Filler audio phrases
Voice model selection

See Language support for all available languages and accents.

Secondary Languages

Additional languages your assistant can understand and speak. Useful for:

Multilingual customer support
International businesses
Code-switching conversations

Note: The AI can detect which language the customer is speaking and respond appropriately.

AI Voice Selection

You can choose from existing voices, clone a custom voice, or request voices from the ElevenLabs library.

Voice Options

You have three ways to get the perfect voice for your assistant:

1. Choose from existing voices:

Professional voices: Pre-trained, high-quality options from ElevenLabs
Multiple accents: Available for most languages
Gender options: Male and female voices for each language
Tone variety: From formal business to casual conversational

2. Clone a custom voice: Create a custom voice by uploading audio samples.

Requirements:

Clear, high-quality audio sample (1-5 minutes recommended)
MP3 or WAV format
Consistent speaking pace and tone
Minimal background noise
Same voice used throughout

Process:

Record yourself or a voice actor reading sample text
Upload the audio file in assistant settings
Wait for training to complete (from a few minutes to several hours)
Test the cloned voice before using in production

Use cases:

Brand consistency with company spokesperson
Personal touch for customer relationships
Matching voice to specific business persona

3. Request from ElevenLabs library: You can request specific voices from the ElevenLabs public library. Contact support to add them to your account. Browse the ElevenLabs Voice Library to discover thousands of professional voices across different languages, accents, and use cases.

See Voice selection guide for detailed setup instructions.

Timezone Configuration

Timezone

Set the timezone your assistant operates in. This affects:

Time-based variables in conversations
Appointment scheduling functions
“Current time” references in system prompts
Timestamps in call logs and data extraction

Important: Choose the timezone where your business operates or where most customers are located. The assistant will use this for any time-related calculations or scheduling.

Audio Enhancement Settings

Ambient Sound

Optional background sound mixed under your assistant’s voice to mask processing delays and create a more natural audio experience.

Options:

None: No background sound (default)
Office: Subtle office environment sounds

Volume control: Adjust the level of ambient sound relative to the voice. Lower values are usually better - too much background sound can interfere with speech recognition.

Turn ambient sound off or lower its volume if the assistant isn’t hearing the customer clearly.

Filler Audio

Short conversational phrases like “mhm”, “okay”, “I understand” that play during AI processing time. See Filler audio guide for full details.

Benefits

Eliminates awkward silences during processing
Keeps callers engaged
Creates more natural conversation flow
Reduces hang-up rates

Language-aware configuration: Filler phrases are automatically set for your selected language:

Positive responses

“Great!”, “Perfect!”, “Super!”

Negative responses

“Hmm.”, “I see.”, “Okay.”

Question responses

“Right?”, “Really?”, “How so?”

Neutral responses

“Okay.”, “I understand.”, “Got it.”

Customization: You can edit the default phrases for each category to match your brand voice or regional preferences.

Enable by default - most conversations benefit from fillers. Test with your target audience and adjust phrases to match your assistant’s personality.

Advanced Settings

LLM Model Selection

Choose the best language model for your assistant’s mode. See LLM model selection guide for detailed recommendations.

Recommended models by mode:

Model	Strengths	Best for
GPT-5 Mini	Balanced reasoning with low latency	Pipeline mode for complex reasoning
GPT-5 Realtime	Ultra-low-latency voice turns	Speech-to-Speech and Dualplex
GPT-4o	Strong reasoning and multimodal understanding	Complex tasks (higher latency)
Gemini Flash 2.0/2.5	Ultra-fast for voice turns	Dualplex/Multimodal for minimal latency

Quick selection guide:

Speed is critical: Use GPT-5 Realtime or Gemini Flash 2.0/2.5
Rich reasoning needed: Use GPT-4o or GPT-5 Mini with filler audio to offset latency

LLM Temperature

Range: 0.0 - 1.0 | Default: 0.1
Adjusts how creative the AI is when generating responses. Lower values yield more reliable function calls.

Lower (0.0-0.3)

More stable: Predictable responses, better for function calling and business use cases

Higher (0.7-1.0)

More random: Creative and varied responses, good for casual conversations

Special behavior: For GPT-5 Mini and GPT-5 Nano models in Pipeline mode, temperature is automatically set to 1.0 for optimal performance.
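The temperature rules above, including the special case, can be summarized in a few lines. This is a sketch under assumptions: the model identifiers `gpt-5-mini` and `gpt-5-nano`, the mode label `pipeline`, and the function `effective_temperature` are hypothetical names chosen for illustration, not platform identifiers.

```python
# Hypothetical identifiers for the models covered by the special behavior.
FORCED_TEMPERATURE_MODELS = {"gpt-5-mini", "gpt-5-nano"}


def effective_temperature(model: str, mode: str, requested: float = 0.1) -> float:
    """Return the temperature that applies under the rules described above."""
    if not 0.0 <= requested <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    # Documented special case: these models in Pipeline mode always run at 1.0.
    if mode == "pipeline" and model in FORCED_TEMPERATURE_MODELS:
        return 1.0
    return requested


print(effective_temperature("gpt-4o", "pipeline"))           # 0.1
print(effective_temperature("gpt-5-mini", "pipeline", 0.2))  # 1.0
```

In practice this means a low temperature chosen for function calling has no effect on those two models in Pipeline mode.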

Duration Settings

Control timing and call limits to optimize user experience and costs:

Re-engagement Interval

Range: 7 - 600 seconds | Default: 30 seconds
The AI tries to re-engage the user if no reply is detected within this time.
Recommended: 30-60 seconds for professional calls.

Max Call Duration

Range: 20 - 1200 seconds | Default: 600 seconds (10 minutes)
The call ends automatically when this value is reached.
Recommended: 5-10 minutes for lead qualification to control costs.

Max Silence Duration

Range: 1 - 120 seconds | Default: 40 seconds
The call ends if the user doesn’t reply within this time.
Recommended: 30-45 seconds to balance patience with efficiency.

Ringing Time

Range: 1 - 60 seconds | Default: 30 seconds
How long the call rings before it is marked as unanswered. Set a lower value to avoid reaching voicemail.

Cost optimization: Lower duration limits help control per-minute costs, especially important for high-volume campaigns.
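The ranges and defaults above lend themselves to a simple pre-flight check before settings are applied. The following is a minimal sketch: the setting keys and the function `validate_durations` are hypothetical names, and only the numeric limits come from this section.

```python
# Hypothetical setting keys mapped to (min, max, default), all in seconds.
DURATION_LIMITS = {
    "reengagement_interval": (7, 600, 30),
    "max_call_duration": (20, 1200, 600),
    "max_silence_duration": (1, 120, 40),
    "ringing_time": (1, 60, 30),
}


def validate_durations(overrides: dict) -> dict:
    """Merge overrides into the documented defaults, rejecting out-of-range values."""
    settings = {name: default for name, (_, _, default) in DURATION_LIMITS.items()}
    for name, value in overrides.items():
        if name not in DURATION_LIMITS:
            raise KeyError(f"unknown duration setting: {name!r}")
        low, high, _ = DURATION_LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high} seconds")
        settings[name] = value
    return settings


# Example: a lead-qualification profile with a 5-minute cap and shorter ringing.
print(validate_durations({"max_call_duration": 300, "ringing_time": 15}))
```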

Call Protection Settings

Noise Cancellation

Default: Enabled
Filters caller background noise for clearer speech recognition. Turn OFF if experiencing audio clipping.

End Call on Voicemail

Default: Enabled
Immediately ends the call if voicemail is detected during outbound calls (saves costs).

Record Calls

Default: Enabled
Records call audio for review and analysis. Ensure compliance with local recording laws.

Max Initial Silence

Range: 1 - 120 seconds | Default: 20 seconds (when enabled)
If enabled, the call ends if the user gives no first response within this time. Counts only from call start to the first user response.
Use case: Detect whether anyone actually answered the phone.

Synthesizer Settings

Configure text-to-speech voice parameters for natural-sounding conversations. Available for: Pipeline and Dualplex modes only. Speech-to-Speech mode uses native voice generation.

Voice Tuning Parameters

Fine-tune your assistant’s voice characteristics for optimal performance:

Voice Stability

Range: 0.0 - 1.0 | Default: 0.7
Lower settings make the voice more expressive but less predictable, while higher settings make it steadier but less emotional.

More Expressive (0.0-0.3)

Dynamic and varied delivery but less predictable

More Stable (0.7-1.0)

Consistent and steady but less emotional range

Voice Similarity

Range: 0.0 - 1.0 | Default: 0.5
Determines how closely the AI matches the original voice. Higher settings may carry over unwanted noise from the original recording.

More Stable (0.0-0.4)

Cleaner audio but less accurate to original voice

More Similar (0.6-1.0)

Accurate to original but may include background noise

For cloned voices: Start at 0.5 and increase gradually. Higher similarity can introduce unwanted artifacts from the original recording.

Speech Speed

Range: 0.7 - 1.2 | Default: 1.0
Adjust the speed of the AI’s speech for optimal comprehension and user experience.

Slower (0.7-0.85)

Better for complex information or older demographics

Normal (0.9-1.1)

Standard conversational pace for most use cases

Faster (1.15-1.2)

Quick conversations or time-sensitive scenarios
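The three voice tuning parameters can be modeled as one validated value object. This is a sketch only: the class `VoiceTuning` and its field names are hypothetical, while the ranges and defaults are the ones listed above.

```python
from dataclasses import dataclass


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError if value falls outside the documented range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


@dataclass(frozen=True)
class VoiceTuning:
    """Hypothetical container for the synthesizer parameters described above."""

    stability: float = 0.7   # 0.0 - 1.0
    similarity: float = 0.5  # 0.0 - 1.0
    speed: float = 1.0       # 0.7 - 1.2

    def __post_init__(self) -> None:
        _check_range("stability", self.stability, 0.0, 1.0)
        _check_range("similarity", self.similarity, 0.0, 1.0)
        _check_range("speed", self.speed, 0.7, 1.2)


# Example: a cloned voice nudged slightly above the default similarity.
cloned_voice = VoiceTuning(similarity=0.6, speed=0.95)
print(cloned_voice)
```

Validating at construction time keeps an out-of-range value, such as a speed of 1.5, from reaching a live call.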

Transcriber Settings

Configure speech-to-text recognition for optimal accuracy and speed. Available for: Pipeline mode only. Speech-to-Speech and Dualplex modes use integrated transcription.
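Which settings groups apply depends on the engine type: synthesizer settings are available in Pipeline and Dualplex modes, and transcriber settings in Pipeline mode only. As a rough summary, with hypothetical mode and group labels:

```python
# Hypothetical labels; availability follows the "Available for" notes on this page.
SETTINGS_BY_MODE = {
    "pipeline": {"synthesizer", "transcriber"},
    "dualplex": {"synthesizer"},   # transcription is integrated
    "speech_to_speech": set(),     # native voice generation and transcription
}


def is_configurable(mode: str, settings_group: str) -> bool:
    """Return True if the settings group can be tuned in the given mode."""
    if mode not in SETTINGS_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    return settings_group in SETTINGS_BY_MODE[mode]


print(is_configurable("dualplex", "synthesizer"))  # True
print(is_configurable("dualplex", "transcriber"))  # False
```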

Provider Selection

Choose the provider used to transcribe the caller’s speech, based on your language and use case.

Azure

Accuracy: ⭐⭐⭐⭐ | Latency: Slower
Best for highest transcription fidelity when accuracy is critical.

Gladia

Accuracy: ⭐⭐⭐ | Latency: Faster
Good all-rounder for most languages. Supports multilingual configurations.

Deepgram

Accuracy: ⭐⭐⭐ | Latency: Faster
Solid choice for English and major languages.

Different languages, accents, or background noise can impact each provider differently. Test which performs better for your specific language and audio setup.

Endpoint Configuration

AI Turn Detection

Uses AI to intelligently detect when the caller has finished speaking

Voice Activity Detection (VAD)

Default: Traditional voice activity detection
Choose how the AI will detect the end of the user phrase.

Voice Activity Detection (VAD)

Control when your assistant starts and stops talking. See Handling interruptions guide for detailed VAD configuration.

Fine-tune these settings if experiencing interruption issues or sluggish responses.

Endpoint Sensitivity

Range: 0 - 5 seconds | Default: 0.5 seconds
Adjust how long the AI waits after the caller’s last word before responding. Lower values make the AI respond faster; higher values are better for long user phrases.

0 (Faster): Quick responses but may cut off callers
5 (Slower): Waits longer, reduces interruptions

Interrupt Sensitivity

How easily the assistant stops when caller talks over it. Controls the sensitivity for detecting when a caller is trying to interrupt.

Minimum Interrupt Words

Require at least N caller words before the assistant is interrupted.
Use: Prevents false triggers from background noise or brief sounds.

Pro tip: Start with default VAD settings and adjust based on real call testing. Increase endpoint sensitivity if callers get cut off, decrease if responses feel slow.
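To make the effect of these settings concrete, here is a simplified model of the two decisions they influence. It is an illustration under stated assumptions, not the platform's actual detection logic: both function names are hypothetical, silence is compared directly against the endpoint sensitivity value, and words are counted by splitting the transcript on whitespace.

```python
def caller_finished(silence_seconds: float, endpoint_sensitivity: float = 0.5) -> bool:
    """Treat the caller's turn as finished once silence reaches the threshold."""
    if not 0 <= endpoint_sensitivity <= 5:
        raise ValueError("endpoint sensitivity must be between 0 and 5 seconds")
    return silence_seconds >= endpoint_sensitivity


def should_interrupt(caller_transcript: str, min_interrupt_words: int) -> bool:
    """Stop the assistant only after the caller has said enough words."""
    return len(caller_transcript.split()) >= min_interrupt_words


# A 0.8 s pause ends the turn at the default 0.5 s, but not at 1.5 s.
print(caller_finished(0.8))                            # True
print(caller_finished(0.8, endpoint_sensitivity=1.5))  # False

# A single "mhm" is ignored when two words are required; a real objection is not.
print(should_interrupt("mhm", 2))                # False
print(should_interrupt("wait, that's wrong", 2)) # True
```

The example shows the trade-off in the tip above: raising the endpoint value protects callers who pause mid-sentence, at the cost of slower replies.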
