The TEN Framework makes it possible to build real-time, low-latency AI phone call systems that can handle both inbound and outbound calls with natural voice conversations — all orchestrated through a single, unified pipeline.
In this tutorial, we'll show you how to create an AI assistant that can make and receive phone calls using Twilio's Voice API and the TEN Framework. The best part? You get real-time speech recognition, intelligent responses, and natural text-to-speech, all working together in one pipeline.
Prepare your environment variables by creating a .env file with all the required API keys:
    # Twilio (required for call handling)
    TWILIO_ACCOUNT_SID=your_twilio_account_sid_here
    TWILIO_AUTH_TOKEN=your_twilio_auth_token_here
    TWILIO_FROM_NUMBER=+1234567890
    TWILIO_PUBLIC_SERVER_URL=https://your-domain.com

    # Deepgram (required for speech-to-text)
    DEEPGRAM_API_KEY=your_deepgram_api_key_here

    # OpenAI (required for language model)
    OPENAI_API_KEY=your_openai_api_key_here
    OPENAI_MODEL=gpt-4

    # ElevenLabs (required for text-to-speech)
    ELEVENLABS_TTS_KEY=your_elevenlabs_api_key_here

    # Ngrok (required for local development)
    NGROK_AUTHTOKEN=your_ngrok_auth_token_here
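Before starting the agent, it can help to confirm that none of these keys is missing. The following is a minimal sketch rather than part of the TEN Framework: `missing_env_vars` is a hypothetical helper, and it assumes the values from `.env` have already been loaded into the process environment (for example by your shell or task runner). `OPENAI_MODEL` is left out of the check on the assumption that it is a model choice rather than a credential.

```python
import os

# Variable names are taken from the .env example above.
REQUIRED_VARS = [
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_FROM_NUMBER",
    "TWILIO_PUBLIC_SERVER_URL",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "ELEVENLABS_TTS_KEY",
    "NGROK_AUTHTOKEN",
]


def missing_env_vars(env=None):
    """Return the required names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise in isolation.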
Set up ngrok for local development (required for Twilio webhooks):
    # Install ngrok if you haven't already
    # Download from https://ngrok.com/download or use a package manager

    # Authenticate ngrok with your auth token (from your .env file)
    ngrok config add-authtoken $NGROK_AUTHTOKEN
Use the Twilio voice assistant:
    cd agents/examples/voice-assistant-sip-twilio
    task install
When you make an outbound call through the frontend or API:
    # From: tenapp/ten_packages/extension/main_python/server.py
    @self.app.post("/api/call")
    async def create_call(request: Request):
        """Create a new outbound call"""
        body = await request.json()
        phone_number = body.get("phone_number")
        message = body.get("message", "Hello from Twilio!")

        # Create TwiML response with media stream
        twiml_response = VoiceResponse()

        # Configure WebSocket URL for real-time audio streaming
        ws_protocol = "wss" if self.config.twilio_use_wss else "ws"
        media_ws_url = f"{ws_protocol}://{self.config.twilio_public_server_url}/media"

        connect = twiml_response.connect()
        connect.stream(url=media_ws_url)  # This tells Twilio to connect to our WebSocket
        twiml_response.append(connect)

        # Create the call via Twilio API
        call = self.twilio_client.calls.create(
            to=phone_number,
            from_=self.config.twilio_from_number,
            twiml=str(twiml_response)
        )
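To make the shape of that TwiML concrete, here is a rough, dependency-free sketch that builds an equivalent document with Python's standard `xml.etree.ElementTree` instead of the Twilio helper library. `build_stream_twiml` is a hypothetical function, not part of the server code, and it assumes `public_host` is a bare hostname with no `https://` prefix, since the server snippet places the configured value directly after `ws://` or `wss://`.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(public_host, use_wss=True):
    """Build TwiML asking Twilio to stream call audio to <public_host>/media.

    Hypothetical helper; public_host is assumed to be a bare hostname.
    """
    scheme = "wss" if use_wss else "ws"
    media_ws_url = f"{scheme}://{public_host}/media"

    # Structure: <Response> wraps <Connect>, which wraps <Stream url="...">
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", {"url": media_ws_url})
    return ET.tostring(response, encoding="unicode")


# Prints a single-line XML document for a placeholder hostname.
print(build_stream_twiml("your-domain.com"))
```

The nesting is the part that matters: a `<Stream>` element inside `<Connect>`, whose `url` attribute points at the server's `/media` WebSocket path.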
👉 Key Point: The TwiML includes a <Stream> instruction that tells Twilio to establish a WebSocket connection to our server for real-time audio streaming.
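The `/api/call` handler reads `phone_number` and `message` from a JSON body, so an outbound call can also be triggered from a script. The sketch below only constructs the request using the standard library. `build_call_request` is a hypothetical helper, the base URL `http://localhost:8000` is a placeholder assumption (use whatever address your server listens on), and the response format is not shown in this tutorial, so it is not parsed here.

```python
import json
import urllib.request


def build_call_request(base_url, phone_number, message="Hello from Twilio!"):
    """Build a POST request for the /api/call endpoint (hypothetical helper)."""
    payload = json.dumps(
        {"phone_number": phone_number, "message": message}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/call",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and phone number; replace with your own values.
request = build_call_request("http://localhost:8000", "+15551234567")

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it straightforward to inspect the payload before any real call is placed.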