Text-to-Speech
ElevenLabs integration, voice selection, and TTS markers in ArgentOS.
Overview
ArgentOS supports text-to-speech (TTS) through ElevenLabs integration, allowing your agent to speak responses aloud through the dashboard. The dashboard handles voice synthesis using the user's selected voice.
TTS Priority System
The dashboard processes speech in this priority order:
- [TTS:text] markers: The agent explicitly marks text to be spoken
- Auto-summarize: Long responses are automatically summarized and spoken
- Short responses: Brief responses are spoken directly
- MEDIA: fallback: Agent-generated audio files (may use a different voice)
The dashboard TTS always uses the voice selected in Audio Settings, ensuring a consistent voice experience regardless of how the speech was triggered.
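The priority order above can be sketched as a single selection function. This is only an illustration under stated assumptions, not the dashboard's actual code: the `choose_speech` and `summarize` names, the 280-character cutoff for a "long" response, the first-sentence summarizer, and passing the media file as a separate argument are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical cutoff between "short" and "long" responses;
# the real threshold is not documented.
LONG_RESPONSE_CHARS = 280

# Matches [TTS:...] markers; assumes the spoken text contains no "]".
TTS_MARKER = re.compile(r"\[TTS:(.*?)\]", re.DOTALL)


def summarize(text: str) -> str:
    """Placeholder summarizer (assumption): keep only the first sentence."""
    first, _, _ = text.strip().partition(". ")
    return first if first.endswith(".") else first + "."


def choose_speech(text: str, media_path: Optional[str] = None) -> Tuple[str, str]:
    """Return (source, payload) following the documented priority order."""
    marked = TTS_MARKER.findall(text)
    if marked:
        # 1. Explicit markers win over everything else.
        return "marker", " ".join(part.strip() for part in marked)
    visible = text.strip()
    if len(visible) > LONG_RESPONSE_CHARS:
        # 2. Long responses are summarized before being spoken.
        return "summary", summarize(visible)
    if visible:
        # 3. Short responses are spoken as-is.
        return "short", visible
    if media_path:
        # 4. Agent-generated audio is only a fallback.
        return "media", media_path
    return "none", ""
```

In this sketch the first three sources would all be synthesized with the voice from Audio Settings; only the `"media"` result plays a pre-generated file, which is why it may sound different.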
TTS Markers
The agent can explicitly control what gets spoken using markers in its response:
Here's the full analysis of your server logs...
[TTS:Your server logs show three critical errors in the last hour. I've documented the details below.]
The text inside [TTS:...] is spoken aloud. The rest of the response is displayed visually but not spoken. This lets the agent provide detailed written content while speaking a concise summary.
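As a rough illustration of how a client might separate the two parts, the sketch below splits a response into displayed text and spoken text. The `split_response` helper is hypothetical, and it assumes that markers are stripped from the displayed text and that the marker body contains no `]` character; neither detail is confirmed by the ArgentOS documentation.

```python
import re
from typing import Tuple

# Matches [TTS:...] markers; assumes the spoken text contains no "]".
TTS_MARKER = re.compile(r"\[TTS:(.*?)\]", re.DOTALL)


def split_response(response: str) -> Tuple[str, str]:
    """Return (display_text, spoken_text) for one agent response."""
    # Collect the text of every marker, in order of appearance.
    spoken = " ".join(m.strip() for m in TTS_MARKER.findall(response))
    # Assumption: the marker itself is removed from what the user sees.
    display = TTS_MARKER.sub("", response).strip()
    return display, spoken


display, spoken = split_response(
    "Here's the full analysis of your server logs...\n"
    "[TTS:Your server logs show three critical errors in the last hour.]"
)
```

Here `display` holds the written analysis and `spoken` holds only the sentence inside the marker.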
Voice Selection
ElevenLabs Voices
Configure your preferred voice in the dashboard's Audio Settings:
- Open the dashboard
- Go to Settings > Audio
- Select a voice (Jessica, Lily, etc.)
- Adjust speech rate and volume
Configuration
{
  "dashboard": {
    "tts": {
      "provider": "elevenlabs",
      "voice": "jessica",
      "rate": 1.0,
      "volume": 0.8,
      "autoSpeak": true
    }
  }
}
Agent-Generated Audio
The agent's sag (Speech Audio Generation) tool can generate audio files directly. However, these may use a different voice than the dashboard's selected voice. The dashboard's own TTS takes priority over agent-generated audio files for consistency.
Disabling TTS
Set "autoSpeak" to false in the dashboard's "tts" settings to disable automatic speech. The agent can still generate [TTS:] markers, but they will be ignored unless you manually trigger playback.
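The gating behavior described here can be sketched as a small check against the configuration shown earlier. The `should_speak` function and its `manual_trigger` parameter are hypothetical, and treating a missing `autoSpeak` key as disabled is an assumption rather than documented behavior.

```python
import json


def should_speak(settings: dict, manual_trigger: bool = False) -> bool:
    """Decide whether speech should play for the current response."""
    tts = settings.get("dashboard", {}).get("tts", {})
    # Assumption: a missing autoSpeak key is treated as disabled.
    auto = bool(tts.get("autoSpeak", False))
    # Manual playback works even when automatic speech is off.
    return manual_trigger or auto


settings = json.loads(
    '{"dashboard": {"tts": {"provider": "elevenlabs", "autoSpeak": false}}}'
)
```

With `autoSpeak` set to false, `should_speak(settings)` is false, while `should_speak(settings, manual_trigger=True)` still allows playback.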
Requirements
- An ElevenLabs API key (configured in dashboard settings)
- Browser with Web Audio API support
- Speakers or headphones