Twilio is one of the most widely used telephony platforms for voice AI. If you run inbound or outbound voice calls through Twilio, Mise connects via Twilio Media Streams, a WebSocket-based feature that delivers real-time audio from both the caller and the agent to any endpoint you configure.
Mise is in private alpha. Your Mise Media Streams endpoint and API key are provided after your team is granted access. Request access.
What Mise captures from Twilio
For each Twilio call, Mise captures:
- Both call legs: Inbound (caller) and outbound (agent) audio are tracked separately
- Call metadata: Call SID, account SID, caller ID, called number, direction, and timestamps
- Turn-level acoustic features: Tone, prosody, frustration, silence, and interruption signals, segmented per speaker
- Call lifecycle events: Call initiated, answered, completed, and failed events via webhook
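To make the per-leg separation concrete, here is a minimal Python sketch of how a receiver could split a Twilio Media Streams feed by track. It relies on the message shape Twilio sends over the WebSocket: a `start` event carrying `callSid`, `accountSid`, and `tracks`, followed by `media` events that each carry a base64 `payload` and a `track` label. The function name `split_tracks` is hypothetical, and the sketch illustrates the protocol only; it is not Mise's implementation.

```python
import base64
import json
from collections import defaultdict


def split_tracks(messages):
    """Group decoded audio bytes by track from Media Streams JSON messages.

    `messages` is an iterable of JSON strings as received over the WebSocket.
    Returns (call_info, audio_by_track).
    """
    call_info = {}
    audio = defaultdict(bytearray)
    for raw in messages:
        msg = json.loads(raw)
        event = msg.get("event")
        if event == "start":
            # The start event identifies the call and lists the streamed tracks.
            start = msg["start"]
            call_info = {
                "call_sid": start["callSid"],
                "account_sid": start["accountSid"],
                "tracks": start["tracks"],
            }
        elif event == "media":
            # Each media event holds one base64-encoded audio chunk for one track.
            media = msg["media"]
            audio[media["track"]] += base64.b64decode(media["payload"])
        # Other events (connected, stop, mark, dtmf) are ignored in this sketch.
    return call_info, {track: bytes(buf) for track, buf in audio.items()}
```

Keeping one buffer per `track` value is what allows caller and agent audio to be analyzed separately later on.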
How to connect Twilio to Mise
Request alpha access
Request access to Mise. After onboarding, you will receive your Mise WebSocket endpoint URL and API key for Media Streams.
Add a Media Streams noun to your TwiML
In the TwiML that handles your voice calls, add a <Stream> noun inside a <Connect> verb and point it to your Mise WebSocket endpoint. This can be added to existing TwiML without removing any other logic.
Configure call event webhooks (optional)
For richer metadata, configure your Twilio phone number’s status callback URL to also point to Mise. This allows Mise to capture call lifecycle events alongside the audio stream.
Set your Twilio number’s Status Callback URL to:
This step is optional: Mise will index calls from audio alone, but adding status callbacks improves call boundary detection and metadata fidelity.
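Returning to the TwiML step, the following Python sketch shows one way to add a <Stream> noun to an existing TwiML document using only the standard library. The function name `add_stream_to_twiml` and the URL `wss://mise.example.com/stream` are placeholders, not real Mise values; use the endpoint you receive after onboarding. How the API key is supplied is not covered here, so the sketch leaves it out. One caveat from Twilio's own TwiML behavior: instructions placed after <Connect><Stream> do not run until the stream ends, while <Start><Stream> forks the audio and lets the rest of the TwiML continue, so the sketch takes the verb as a parameter. Inserting the stream as the first instruction is also an assumption; place it wherever suits your call flow.

```python
import xml.etree.ElementTree as ET


def add_stream_to_twiml(twiml, ws_url, verb="Connect"):
    """Return TwiML with a <Stream> noun inserted as the first instruction.

    twiml  -- existing TwiML string whose root element is <Response>
    ws_url -- WebSocket endpoint (placeholder here; Twilio requires wss://)
    verb   -- "Connect" (as described above) or "Start" (non-blocking fork)
    """
    if verb not in ("Connect", "Start"):
        raise ValueError("verb must be 'Connect' or 'Start'")
    if not ws_url.startswith("wss://"):
        raise ValueError("Media Streams URLs must use the wss:// scheme")

    root = ET.fromstring(twiml)
    if root.tag != "Response":
        raise ValueError("TwiML root element must be <Response>")

    # Build <Connect><Stream url="..."/></Connect> (or the <Start> variant).
    wrapper = ET.Element(verb)
    ET.SubElement(wrapper, "Stream", {"url": ws_url})

    # Existing instructions are kept; the stream wrapper goes in front of them.
    root.insert(0, wrapper)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    existing = "<Response><Say>Thanks for calling.</Say></Response>"
    print(add_stream_to_twiml(existing, "wss://mise.example.com/stream"))
```

The existing <Say> instruction is preserved, which mirrors the point that the stream can be added without removing other logic.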