November 25, 2025

RTM Transport: Dual RTC + RTM Voice Agent

Build a TEN agent that streams audio over Agora RTC while sending low-latency text/data over Agora RTM.

Shipping a real-time voice agent means keeping audio flowing while text and control messages move independently. The new rtm-transport example pairs Agora RTC for audio with Agora RTM for reliable data so you can keep STT -> LLM -> TTS fully duplex without blocking.

TL;DR -- The new rtm-transport example shows how to run Agora RTC (audio + data channel) and Agora RTM (messaging) together. It routes per-user audio with streamid_adapter, chunks outbound text with message_collector2, and keeps STT -> LLM -> TTS fully real-time.

Why this update matters

Dual transport: RTC handles audio while RTM carries text/data without blocking the audio pipeline.
Multi-user safe: streamid_adapter maps each RTC stream_id to a unique session_id, so ASR sessions do not collide.
Reliable messaging: message_collector2 splits outbound text into base64-encoded chunks and sends them, paced, over both the RTC data channel and RTM.
Drop-in pipeline: STT -> LLM -> TTS stays real-time; you can swap providers in TMAN Designer.
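
To make the multi-user point concrete, here is a minimal sketch of the idea behind streamid_adapter: keep one session_id per RTC stream_id so two speakers never share an ASR session. The class name, the method name, and the "session-<id>" format are assumptions made for illustration, not the extension's actual code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StreamIdAdapter:
    """Hypothetical stand-in for streamid_adapter: one session per RTC stream."""

    prefix: str = "session"  # assumed naming scheme, not taken from the extension
    _sessions: Dict[int, str] = field(default_factory=dict)

    def session_for(self, stream_id: int) -> str:
        # Reuse the session for a known stream; create one the first time we see it.
        if stream_id not in self._sessions:
            self._sessions[stream_id] = f"{self.prefix}-{stream_id}"
        return self._sessions[stream_id]


adapter = StreamIdAdapter()
first = adapter.session_for(1001)
second = adapter.session_for(2002)
print(first, second)
```

Each distinct stream_id gets its own session_id, and repeated audio frames from the same stream keep landing in the same session.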

Architecture at a glance

Agora RTC (incoming audio)
  -> streamid_adapter (stream_id -> session_id)
    -> STT
      -> LLM
        -> TTS
          -> Agora RTC (outgoing audio)

message_collector2 ("message" input)
  -> chunk to base64
    -> RTC data channel
    -> Agora RTM emits rtm_message_event -> main_control

Key extensions: agora_rtc, agora_rtm, streamid_adapter, message_collector2, main_control.
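
The "chunk to base64" step in the diagram can be sketched roughly as follows. This is an illustration only: the function names, the 256-byte chunk size, and the bare base64 payload (no message ID or part index) are assumptions, and the real message_collector2 wire format may differ.

```python
import base64
from typing import List


def chunk_to_base64(text: str, max_bytes: int = 256) -> List[str]:
    """Split UTF-8 text into base64 chunks of at most max_bytes raw bytes each."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    raw = text.encode("utf-8")
    return [
        base64.b64encode(raw[i : i + max_bytes]).decode("ascii")
        for i in range(0, len(raw), max_bytes)
    ]


def reassemble(chunks: List[str]) -> str:
    """Decode every chunk, join the bytes, then decode the whole as UTF-8."""
    # Joining bytes before decoding keeps multi-byte characters intact even
    # when a chunk boundary falls in the middle of one.
    return b"".join(base64.b64decode(c) for c in chunks).decode("utf-8")


message = "héllo wörld, this is the agent speaking. " * 20
chunks = chunk_to_base64(message, max_bytes=64)
restored = reassemble(chunks)
print(len(chunks), restored == message)
```

Base64 keeps every chunk ASCII-safe, so the same payload can go over the RTC data channel and RTM without worrying about how each transport handles raw bytes.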

Run it locally

cd ai_agents/agents/examples/rtm-transport

# 1) Configure env vars (required)
cat <<'EOF' > .env
AGORA_APP_ID=your_agora_app_id_here
AGORA_APP_CERTIFICATE=your_agora_certificate_here
DEEPGRAM_API_KEY=your_deepgram_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4o
ELEVENLABS_TTS_KEY=your_elevenlabs_api_key_here
EOF

# Optional extras
export OPENAI_PROXY_URL=...
export WEATHERAPI_API_KEY=...

# 2) Install & run
task install
task run
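
Before running task run, it can help to confirm that no placeholder values are left in .env. The sketch below is a hypothetical helper, not part of the example: the key list mirrors the .env block above, and treating any value ending in "_here" as unset is an assumption based on those placeholders.

```python
from typing import Dict, List

# Keys taken from the .env block in this post.
REQUIRED_KEYS = [
    "AGORA_APP_ID",
    "AGORA_APP_CERTIFICATE",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "OPENAI_MODEL",
    "ELEVENLABS_TTS_KEY",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or values[key].endswith("_here")
    ]


sample = (
    "AGORA_APP_ID=abc123\n"
    "OPENAI_MODEL=gpt-4o\n"
    "OPENAI_API_KEY=your_openai_api_key_here\n"
)
missing = missing_keys(parse_env(sample))
print(missing)
```

To check a real file, pass the contents of .env (for example via pathlib.Path(".env").read_text()) to parse_env instead of the sample string.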

What you get:

Configuration highlights

agora_rtc: publish_audio, publish_data, stream_id (local), remote_stream_id (subscribe), optional app_certificate.
agora_rtm: channel, user_id, token, rtm_enabled flag.
message_collector2: sends outbound text chunks at ~40 ms intervals to keep latency low.
streamid_adapter: translates RTC stream_id into per-user session_id for ASR separation.
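
The ~40 ms pacing mentioned for message_collector2 can be pictured with a small asyncio sketch. The send_paced function and the fake_send callback are hypothetical; the actual extension sends through the RTC data channel and RTM rather than a local callback.

```python
import asyncio
import time
from typing import Awaitable, Callable, Iterable, List


async def send_paced(
    chunks: Iterable[str],
    send: Callable[[str], Awaitable[None]],
    interval_s: float = 0.04,  # ~40 ms, the figure quoted in this post
) -> None:
    """Send chunks in order, sleeping interval_s between consecutive sends."""
    first = True
    for chunk in chunks:
        if not first:
            await asyncio.sleep(interval_s)
        first = False
        await send(chunk)


async def demo() -> List[float]:
    sent_at: List[float] = []

    async def fake_send(chunk: str) -> None:
        # Stand-in for a transport call; only records when it was invoked.
        sent_at.append(time.monotonic())

    await send_paced(["a", "b", "c"], fake_send)
    return sent_at


timestamps = asyncio.run(demo())
print([round(t - timestamps[0], 3) for t in timestamps])
```

Spacing the sends out keeps a long LLM response from flooding the data path in one burst, while the first chunk still goes out immediately.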

Edit tenapp/property.json or use TMAN Designer (right-click extensions → Properties) to swap STT/LLM/TTS providers or change stream IDs.

When to use this pattern

Chat + voice apps that need low-latency text commands alongside audio.
Multi-speaker experiences where each RTC stream should map to its own ASR session.
Live streaming or gaming where RTM carries state/commands while RTC carries voice.
Collaborative tools combining voice chat with reliable data delivery.

Release as a Docker image

cd ai_agents
docker build -f agents/examples/rtm-transport/Dockerfile -t rtm-transport-app .
docker run --rm -it --env-file .env -p 8080:8080 -p 3000:3000 rtm-transport-app

Troubleshooting tips

No audio back? Verify remote_stream_id matches the publisher and publish_audio/subscribe_audio are true.
Missing RTM messages? Check rtm_enabled, user_id, and token; watch the rtm_message_event in logs.
Mixed-up transcripts? Confirm streamid_adapter is in the chain so each RTC stream gets its own session_id.
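
The checks above can be collected into a small script. This is a sketch under stated assumptions: the property names come from this post, but the flat dictionaries, the extensions list, and the diagnose function are hypothetical and do not reflect the actual layout of tenapp/property.json.

```python
from typing import Any, Dict, List


def diagnose(
    rtc: Dict[str, Any],
    rtm: Dict[str, Any],
    extensions: List[str],
    publisher_stream_id: int,
) -> List[str]:
    """Return a human-readable problem for each failed troubleshooting check."""
    problems: List[str] = []
    # No audio back?
    if rtc.get("remote_stream_id") != publisher_stream_id:
        problems.append("remote_stream_id does not match the publisher")
    for flag in ("publish_audio", "subscribe_audio"):
        if rtc.get(flag) is not True:
            problems.append(f"{flag} is not true")
    # Missing RTM messages?
    if rtm.get("rtm_enabled") is not True:
        problems.append("rtm_enabled is not true")
    for key in ("user_id", "token"):
        if not rtm.get(key):
            problems.append(f"agora_rtm {key} is empty")
    # Mixed-up transcripts?
    if "streamid_adapter" not in extensions:
        problems.append("streamid_adapter is missing from the chain")
    return problems


issues = diagnose(
    rtc={"remote_stream_id": 1001, "publish_audio": True, "subscribe_audio": False},
    rtm={"rtm_enabled": True, "user_id": "agent", "token": "example-token"},
    extensions=["agora_rtc", "streamid_adapter", "stt", "llm", "tts"],
    publisher_stream_id=1001,
)
print(issues)  # ['subscribe_audio is not true']
```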