Shipping a real-time voice agent means keeping audio flowing while text and control messages move independently. The new rtm-transport example pairs Agora RTC for audio with Agora RTM for reliable data so you can keep STT -> LLM -> TTS fully duplex without blocking.
TL;DR -- The new rtm-transport example shows how to run Agora RTC (audio + data channel) and Agora RTM (messaging) together. It routes per-user audio with streamid_adapter, chunks outbound text with message_collector2, and keeps STT -> LLM -> TTS fully real-time.
Why this update matters
- Dual transport: RTC handles audio while RTM carries text/data without blocking the audio pipeline.
- Multi-user safe: streamid_adapter maps each RTC stream_id to a unique session_id, so ASR sessions do not collide.
- Reliable messaging: message_collector2 chunks text to base64 and sends it via both the RTC data channel and RTM with pacing.
- Drop-in pipeline: STT -> LLM -> TTS stays real-time; you can swap providers in TMAN Designer.
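To make the multi-user point concrete, here is a minimal sketch of the stream_id -> session_id idea. It is not the streamid_adapter source: the SessionRegistry class, its method names, and the session_id format are all hypothetical, chosen only to show why a stable per-stream mapping keeps ASR sessions apart.

```python
import uuid


class SessionRegistry:
    """Hypothetical helper: one stable session_id per RTC stream_id."""

    def __init__(self) -> None:
        self._sessions: dict[int, str] = {}

    def session_for(self, stream_id: int) -> str:
        # Reuse the existing session so every audio frame from one speaker
        # lands in the same ASR session.
        if stream_id not in self._sessions:
            # Assumed format; the real adapter may derive the id differently.
            self._sessions[stream_id] = f"session-{stream_id}-{uuid.uuid4().hex[:8]}"
        return self._sessions[stream_id]

    def release(self, stream_id: int) -> None:
        # Forget the mapping once the user leaves the channel.
        self._sessions.pop(stream_id, None)


registry = SessionRegistry()
alice = registry.session_for(1001)
bob = registry.session_for(1002)
```

Two speakers get two different ids, and repeated lookups for the same stream_id return the same id, which is the property the transcripts depend on.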
Architecture at a glance
Agora RTC (incoming audio)
-> streamid_adapter (stream_id -> session_id)
-> STT
-> LLM
-> TTS
-> Agora RTC (outgoing audio)
message_collector2 ("message" input)
-> chunk to base64
-> RTC data channel
-> Agora RTM emits rtm_message_event -> main_control
Key extensions: agora_rtc, agora_rtm, streamid_adapter, message_collector2, main_control.
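The "chunk to base64" step in the diagram can be sketched roughly as follows. This assumes a simple message_id|index|total|payload framing and a fixed payload size; both are illustrative guesses rather than the documented wire format of message_collector2, and chunk_message and reassemble are hypothetical names.

```python
import base64


def chunk_message(text: str, message_id: str, max_payload: int = 200) -> list[str]:
    """Base64-encode text and split it into framed chunks (assumed framing)."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    # An empty message still produces one (empty) chunk so the receiver sees it.
    parts = [encoded[i:i + max_payload] for i in range(0, len(encoded), max_payload)] or [""]
    total = len(parts)
    return [f"{message_id}|{index}|{total}|{part}" for index, part in enumerate(parts, start=1)]


def reassemble(chunks: list[str]) -> str:
    """Rebuild the original text from chunks that may arrive out of order."""
    pieces: dict[int, str] = {}
    for chunk in chunks:
        # The base64 alphabet never contains "|", so splitting on it is safe.
        _message_id, index, _total, payload = chunk.split("|", 3)
        pieces[int(index)] = payload
    encoded = "".join(pieces[i] for i in sorted(pieces))
    return base64.b64decode(encoded).decode("utf-8")


chunks = chunk_message("The weather in Berlin is sunny.", message_id="m1", max_payload=16)
restored = reassemble(list(reversed(chunks)))
```

Encoding first and slicing afterwards means each chunk is plain ASCII, so multi-byte characters are never cut in half on the wire.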
Run it locally
cd ai_agents/agents/examples/rtm-transport
# 1) Configure env vars (required)
cat <<'EOF' > .env
AGORA_APP_ID=your_agora_app_id_here
AGORA_APP_CERTIFICATE=your_agora_certificate_here
DEEPGRAM_API_KEY=your_deepgram_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4o
ELEVENLABS_TTS_KEY=your_elevenlabs_api_key_here
EOF
# Optional extras
export OPENAI_PROXY_URL=...
export WEATHERAPI_API_KEY=...
# 2) Install & run
task install
task run
What you get:
- Frontend: http://localhost:3000
- API server: http://localhost:8080
- TMAN Designer: http://localhost:49483
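If the services do not come up, a quick check of the .env file from step 1 can save a restart cycle. The sketch below is a hypothetical helper, not part of the example: it takes the key names from the .env block above and assumes that any value still ending in "_here" is an unreplaced placeholder.

```python
from pathlib import Path

# Key names copied from the .env block in step 1.
REQUIRED_KEYS = [
    "AGORA_APP_ID",
    "AGORA_APP_CERTIFICATE",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "OPENAI_MODEL",
    "ELEVENLABS_TTS_KEY",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str, required: list[str] = REQUIRED_KEYS) -> list[str]:
    """Return keys that are absent, empty, or still hold a *_here placeholder."""
    values = parse_env(text)
    return [k for k in required if not values.get(k) or values[k].endswith("_here")]


if __name__ == "__main__":
    env_path = Path(".env")
    missing = missing_keys(env_path.read_text()) if env_path.exists() else REQUIRED_KEYS
    print("Missing or placeholder values:", missing or "none")
```

Run it from the rtm-transport directory before task run; an empty result means every variable from step 1 has a real value.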
Configuration highlights
- agora_rtc: publish_audio, publish_data, stream_id (local), remote_stream_id (subscribe), optional app_certificate.
- agora_rtm: channel, user_id, token, rtm_enabled flag.
- message_collector2: chunks outbound text at ~40 ms intervals to keep latency low.
- streamid_adapter: translates RTC stream_id into per-user session_id for ASR separation.
Edit tenapp/property.json or use TMAN Designer (right-click extensions → Properties) to swap STT/LLM/TTS providers or change stream IDs.
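The ~40 ms pacing noted for message_collector2 can be pictured with a small asyncio sketch. It is an illustration of the idea only: send_paced and the send callback are hypothetical names, and the real extension may schedule its sends differently.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def send_paced(
    chunks: Iterable[str],
    send: Callable[[str], Awaitable[None]],
    interval_s: float = 0.04,  # ~40 ms between chunks, per the note above
) -> int:
    """Send chunks one at a time, pausing between them; returns the count sent."""
    sent = 0
    for chunk in chunks:
        if sent:
            # Pause before every chunk except the first so the first one
            # goes out immediately.
            await asyncio.sleep(interval_s)
        await send(chunk)
        sent += 1
    return sent


async def demo() -> list[str]:
    delivered: list[str] = []

    async def fake_send(chunk: str) -> None:
        # Stand-in for a data channel or RTM publish call.
        delivered.append(chunk)

    await send_paced(["part-1", "part-2", "part-3"], fake_send)
    return delivered


delivered = asyncio.run(demo())
```

Spacing the chunks out keeps a long reply from flooding the channel in one burst, while the short interval keeps the text close to the audio.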
When to use this pattern
- Chat + voice apps that need low-latency text commands alongside audio.
- Multi-speaker experiences where each RTC stream should map to its own ASR session.
- Live streaming or gaming where RTM carries state/commands while RTC carries voice.
- Collaborative tools combining voice chat with reliable data delivery.
Release as a Docker image
cd ai_agents
docker build -f agents/examples/rtm-transport/Dockerfile -t rtm-transport-app .
docker run --rm -it --env-file .env -p 8080:8080 -p 3000:3000 rtm-transport-app
Troubleshooting tips
- No audio back? Verify remote_stream_id matches the publisher and publish_audio/subscribe_audio are true.
- Missing RTM messages? Check rtm_enabled, user_id, and token; watch the rtm_message_event in logs.
- Mixed-up transcripts? Confirm streamid_adapter is in the chain so each RTC stream gets its own session_id.
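The checklist above can also be written down as a small script. This is a sketch under assumptions: diagnose is a hypothetical function, and it takes the agora_rtc and agora_rtm properties as plain dictionaries because the layout of tenapp/property.json is not shown here. Only the property names come from this post.

```python
def diagnose(
    rtc: dict,
    rtm: dict,
    publisher_stream_id: int,
    graph_extensions: list[str],
) -> list[str]:
    """Return human-readable problems matching the troubleshooting tips."""
    problems: list[str] = []
    # No audio back?
    if rtc.get("remote_stream_id") != publisher_stream_id:
        problems.append("remote_stream_id does not match the publisher")
    for flag in ("publish_audio", "subscribe_audio"):
        if rtc.get(flag) is not True:
            problems.append(f"{flag} is not true")
    # Missing RTM messages?
    if rtm.get("rtm_enabled") is not True:
        problems.append("rtm_enabled is not true")
    for key in ("user_id", "token"):
        if not rtm.get(key):
            problems.append(f"RTM {key} is empty")
    # Mixed-up transcripts?
    if "streamid_adapter" not in graph_extensions:
        problems.append("streamid_adapter is missing from the chain")
    return problems


issues = diagnose(
    rtc={"remote_stream_id": 123, "publish_audio": True, "subscribe_audio": False},
    rtm={"rtm_enabled": True, "user_id": "agent", "token": ""},
    publisher_stream_id=123,
    graph_extensions=["agora_rtc", "agora_rtm", "main_control"],
)
```

For the sample values, issues lists the disabled subscribe_audio flag, the empty RTM token, and the missing streamid_adapter.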
Links
- Example source: https://github.com/TEN-framework/ten-framework/tree/main/ai_agents/agents/examples/rtm-transport
- TMAN Designer docs: https://theten.ai/docs/ten_agent/customize_agent/tman-designer
- Agora RTC: https://docs.agora.io/en/rtc/overview/product-overview
- Agora RTM: https://docs.agora.io/en/Real-time-Messaging/product_rtm