Last updated: June 29, 2025

Getting great results often comes down to picking the right engine settings. Use this checklist when configuring an assistant:

1. Pick a Mode

| Mode | Why choose it? | Notes |
| --- | --- | --- |
| Speech-to-Speech (Multimodal) | Fastest turn-taking and most natural flow | We recommend starting here. Try the Gemini 2.5 engine (beta) for the lowest latency, but be aware that it is still experimental and may be less stable. |
| Pipeline | Maximum control over voice and long-form replies | If you select Pipeline, continue to the Transcriber step below. |

Want to know more about the differences between the modes? Read the Assistant modes guide.

Experiment with both modes: record the same scenario in each and compare response time and caller satisfaction.
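To make that comparison concrete, here is a minimal sketch of how you might summarize response times once you have measured them yourself. The function name and the sample numbers are placeholders for illustration, not measurements from either mode or part of the product.

```python
from statistics import mean, median


def summarize_latency(samples_ms):
    """Return mean, median, and worst-case response time in milliseconds."""
    return {
        "mean": mean(samples_ms),
        "median": median(samples_ms),
        "max": max(samples_ms),
    }


# Placeholder values only: replace them with your own timings, for example the
# gap between the caller finishing a sentence and the assistant starting to speak.
measurements = {
    "speech-to-speech": [420, 510, 390, 640, 450],
    "pipeline": [880, 790, 1020, 940, 860],
}

for mode, samples in measurements.items():
    stats = summarize_latency(samples)
    print(f"{mode}: mean={stats['mean']:.0f} ms, "
          f"median={stats['median']:.0f} ms, max={stats['max']} ms")
```

Looking at the median and the worst case together helps separate a mode that is slow on every turn from one that only stalls occasionally.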

2. Choose a Transcriber (Pipeline only)

| Transcriber | Accuracy | Latency | Best for |
| --- | --- | --- | --- |
| Azure | ⭐⭐⭐⭐ | ⏱️⏱️⏱️ (slower) | When you need the highest transcription fidelity. |
| Gladia | ⭐⭐⭐ | ⏱️ (faster) | Good all-rounder for most languages. |
| Deepgram | ⭐⭐⭐ | ⏱️ (faster) | Another solid choice. Test which performs better for your language and audio setup. |

Tip: Different languages, accents, or background noise can impact each engine differently. Run a quick A/B test and keep the best performer.
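One common way to score such an A/B test is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn an engine's transcript into a hand-checked reference, divided by the number of reference words. The sketch below assumes you have copied the transcripts out by hand. The engine labels and sentences are made up for illustration, and nothing here calls a transcriber API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts of the same test recording.
reference = "I would like to book a table for two"
candidates = {
    "engine_a": "I would like to book a table for two",
    "engine_b": "I would like to book the table for you",
}

scores = {name: word_error_rate(reference, text) for name, text in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

Lower is better. Score several recordings per language and audio setup before deciding, since a single sentence rarely separates two engines reliably.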

3. Select an LLM Model

| Model | Strengths | Trade-offs |
| --- | --- | --- |
| GPT-4o | Smartest reasoning, handles complex prompts | Slightly higher latency and cost. |
| Gemini 2.5-Flash-Lite | Blazing-fast, still highly capable | May miss nuance in very complex tasks. Test for your use case. |

If speed is critical, start with Gemini 2.5-Flash-Lite. For sophisticated reasoning, use GPT-4o and offset latency by shortening replies.

4. Noise Cancellation

If callers are on speakerphone or in a noisy environment, keep noise cancellation ON. If the caller's audio is quiet or some words are “clipped,” turn it OFF so the transcriber gets the full waveform.

If your assistant is not hearing you well during testing, try turning noise cancellation off.

5. Conversation Timers

| Parameter | Recommended | Why |
| --- | --- | --- |
| Re-engagement | ≈ 30 s | Gives callers enough time to think. Lower values can feel pushy. |
| Max silence duration | ≈ 60 s | Prevents premature hang-ups while still ending truly silent calls. |

Test different values in real calls: values that are too low interrupt callers, and values that are too high leave awkward gaps.
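As a mental model for how the two timers relate, here is a minimal sketch. It assumes that silence is measured from the caller's last speech, that the assistant re-engages once, and that the call ends when the max silence duration is reached. The names `SilenceTimers` and `next_action` are hypothetical, and the product's real behavior may differ (for example, in whether a re-engagement resets the silence clock).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SilenceTimers:
    # Defaults mirror the recommended values in this section.
    reengage_after_s: float = 30.0
    max_silence_s: float = 60.0


def next_action(silence_s: float, already_reengaged: bool,
                timers: SilenceTimers = SilenceTimers()) -> str:
    """Decide what to do after `silence_s` seconds without caller speech."""
    if silence_s >= timers.max_silence_s:
        return "end_call"
    if silence_s >= timers.reengage_after_s and not already_reengaged:
        return "reengage"
    return "wait"


# Walk through a silent call second by second and print each change of action.
reengaged = False
last = None
for second in range(0, 61):
    action = next_action(second, reengaged)
    if action == "reengage":
        reengaged = True
    if action != last:
        print(f"{second:>2} s: {action}")
        last = action
```

The sketch also shows why the max silence duration should stay comfortably above the re-engagement value: if the two are close, the caller gets almost no time to answer the re-engagement prompt before the call ends.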

6. Initial Message

| Mode | How it’s used | Best practice |
| --- | --- | --- |
| Pipeline | Read exactly as written (converted by TTS). | Write the greeting verbatim: “Hello, this is Alex from …”. |
| Speech-to-Speech | Interpreted as a prompt by the model. | Include instructions like “Greet the customer and say …” or prepend “say exactly:” to ensure literal output. |
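If you generate assistant configurations from a script, the rule in the table above can be captured in a small helper. This is only a sketch: the mode strings and the function name are hypothetical, and the only product-specific detail it relies on is the "say exactly:" prefix described above.

```python
def build_initial_message(mode: str, greeting: str) -> str:
    """Return the text to put in the Initial Message field for a given mode."""
    greeting = greeting.strip()
    if mode == "pipeline":
        # Pipeline reads the text verbatim through TTS.
        return greeting
    if mode == "speech-to-speech":
        # Speech-to-Speech treats the text as a prompt, so ask for literal output.
        return f"say exactly: {greeting}"
    raise ValueError(f"unknown mode: {mode!r}")


print(build_initial_message("pipeline", "Hello, thanks for calling."))
print(build_initial_message("speech-to-speech", "Hello, thanks for calling."))
```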

7. Ambient sound

Ambient sound adds background noise to the assistant’s voice. It is enabled by default.

If the assistant is not hearing you well, try turning ambient sound off or lowering its volume.

8. Endpointing sliders

Control when your assistant starts talking with the endpointing sensitivity slider at the bottom of assistant settings.

| Setting | Effect | Use when |
| --- | --- | --- |
| Lower sensitivity | Assistant responds faster after caller stops speaking | You want snappy, quick-turn conversations |
| Higher sensitivity | Assistant waits longer before responding | Callers give longer, more detailed replies |

Pro tip: If your assistant cuts off callers mid-sentence, increase the sensitivity. If responses feel sluggish, decrease it.

9. Debug using call transcript

If your assistant is not behaving as expected, use the call transcript to find out where the conversation went wrong.

  1. Go to the Call history page.
  2. Click on the last call you tested.
  3. The call transcript is shown, including function calls and their parameters.

10. Still have questions?

If you have any questions, please contact our support team via the chat widget inside the app.

Test different settings with real calls—the right balance depends on your conversation flow and caller behavior patterns.