Pipecat is an open-source Python framework for building real-time voice AI pipelines. It provides a composable, frame-based architecture where audio, transcripts, and events flow through a series of processors. Mise integrates with Pipecat as a pipeline observer: a processor that receives frames and forwards turn data to Mise without interrupting the primary pipeline flow.
Mise is in private alpha. The Mise SDK for Pipecat and your API key are provided after your team is granted access. Request access.
What Mise captures from Pipecat
For each Pipecat session, Mise captures:
- Turn events: TranscriptionFrame and UserStartedSpeakingFrame/UserStoppedSpeakingFrame events define turn boundaries
- Audio frames: Raw audio frames per turn for acoustic indexing (tone, prosody, frustration, interruptions)
- Pipeline events: Bot speech start and stop events, function call events, and interruption frames
- Session metadata: Session ID, transport type (WebRTC, WebSocket, telephony), and participant identifiers
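To make the list above concrete, here is one way the captured data could be modeled in Python. This is a minimal sketch, not Mise's actual schema: the class names `CapturedTurn` and `CapturedSession` and every field name are hypothetical, chosen only to mirror the four categories listed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CapturedTurn:
    """Hypothetical per-turn record; field names are illustrative only."""
    speaker: str                       # "user" or "bot"
    started_at: float                  # seconds since session start
    ended_at: Optional[float] = None   # set when the turn boundary is seen
    transcript: Optional[str] = None   # filled in if a transcription arrives
    audio: bytes = b""                 # raw audio accumulated during the turn
    interrupted: bool = False          # set if an interruption hit this turn


@dataclass
class CapturedSession:
    """Hypothetical session record grouping metadata, turns, and events."""
    session_id: str
    transport: str                     # e.g. "webrtc", "websocket", "telephony"
    participants: List[str] = field(default_factory=list)
    turns: List[CapturedTurn] = field(default_factory=list)
    pipeline_events: List[str] = field(default_factory=list)


# Build one session with a single completed user turn.
session = CapturedSession(session_id="demo-1", transport="webrtc",
                          participants=["caller"])
turn = CapturedTurn(speaker="user", started_at=0.0)
turn.audio += b"\x00\x01"
turn.transcript = "hello"
turn.ended_at = 1.2
session.turns.append(turn)
session.pipeline_events.append("function_call")
```

Keeping audio and transcript on the same turn record reflects the idea that acoustic signals and text are indexed together per turn; the real payload format may differ.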
How to add Mise to your Pipecat pipeline
Request alpha access
Request access to Mise. After onboarding, you will receive the Mise Pipecat package and your API key.
Install the Mise observer package
Once you have access, install the Mise Pipecat package into your project environment:
Add the MiseObserver to your pipeline
Import MiseObserver from the package and add it to your Pipecat pipeline. The observer runs in parallel to your existing processors and does not add latency to your pipeline.
How the observer works
MiseObserver subscribes to the Pipecat frame bus passively. It does not hold frames or delay their propagation. For each relevant frame type, it extracts turn data and queues it for asynchronous forwarding to Mise:
- AudioRawFrame from the user transport triggers audio capture for the current user turn
- UserStoppedSpeakingFrame signals the end of a user turn, after which the buffered audio is sent to Mise
- BotSpeakingFrame and BotStoppedSpeakingFrame delimit agent turns
- TranscriptionFrame enriches the indexed turn with transcript text if available
- BotInterruptionFrame is logged as an interruption event against the affected turn
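The passive, queue-based pattern described above can be sketched in plain Python with asyncio. This is not the MiseObserver implementation and does not use Pipecat itself: the frame classes below are simplified stand-ins that only borrow the names of the frames mentioned, and `SketchObserver`, `on_frame`, and `forward` are hypothetical names. The sketch also simplifies by attaching a transcript only when it arrives before the end of the turn.

```python
import asyncio
from dataclasses import dataclass


# Stand-in frame types for illustration only; they are not Pipecat's classes.
@dataclass
class AudioRawFrame:
    audio: bytes


@dataclass
class TranscriptionFrame:
    text: str


@dataclass
class BotInterruptionFrame:
    pass


@dataclass
class UserStoppedSpeakingFrame:
    pass


class SketchObserver:
    """Hypothetical observer: buffers turn data and never blocks the caller."""

    def __init__(self) -> None:
        self.outbox: asyncio.Queue = asyncio.Queue()
        self._audio = bytearray()
        self._transcript = None
        self._interrupted = False

    def on_frame(self, frame) -> None:
        # Only in-memory bookkeeping happens here, so the pipeline is not delayed.
        if isinstance(frame, AudioRawFrame):
            self._audio.extend(frame.audio)
        elif isinstance(frame, TranscriptionFrame):
            self._transcript = frame.text
        elif isinstance(frame, BotInterruptionFrame):
            self._interrupted = True
        elif isinstance(frame, UserStoppedSpeakingFrame):
            # End of the user turn: queue the buffered data and reset state.
            self.outbox.put_nowait({
                "audio": bytes(self._audio),
                "transcript": self._transcript,
                "interrupted": self._interrupted,
            })
            self._audio = bytearray()
            self._transcript = None
            self._interrupted = False


async def forward(observer: SketchObserver, sent: list) -> None:
    """Drain the queue in the background; appending stands in for a network call."""
    while True:
        turn = await observer.outbox.get()
        sent.append(turn)
        observer.outbox.task_done()


async def demo() -> list:
    observer = SketchObserver()
    sent: list = []
    worker = asyncio.create_task(forward(observer, sent))
    frames = [AudioRawFrame(b"\x01\x02"), AudioRawFrame(b"\x03"),
              TranscriptionFrame("hi there"), UserStoppedSpeakingFrame()]
    for frame in frames:
        observer.on_frame(frame)
    await observer.outbox.join()   # wait until the queued turn is forwarded
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return sent


sent_turns = asyncio.run(demo())
```

The design point is the split between `on_frame`, which does nothing slower than appending to a buffer, and the background `forward` task, which absorbs any network latency. That separation is what allows an observer of this kind to avoid adding delay to frame propagation.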