Get Started with TEN Framework in 5 Minutes
Building real-time AI applications often means wrestling with complex architectures, language barriers, and integration headaches. The TEN Framework eliminates these pain points by letting you orchestrate multi-language extensions through a unified runtime—and you can have your first app running in under 5 minutes.
This guide walks through the quick start process using the transcriber demo, a real-world example that showcases Go for WebSocket handling, Python for speech recognition, and TypeScript for subtitle generation—all working together seamlessly.
Why TEN Framework?
Before diving into the setup, here's what makes TEN different:
- Multi-Language by Design → Use the best tool for each job: Go for networking, Python for AI, TypeScript for UI logic
- Real-Time First → Built-in support for audio/video streaming and low-latency data flows
- Extension-Based → Modular architecture lets you swap components without rewriting your entire stack
- Production Ready → Handles cross-language communication, memory management, and concurrency out of the box
The framework abstracts away the complexity of multi-language orchestration while preserving the performance characteristics of each language.
System Requirements
Before you begin, verify your system meets these requirements:
Supported Platforms
- Linux (x64)
- macOS Intel (x64)
- macOS Apple Silicon (arm64)
Required Software
- Python 3.10 → For AI extensions and speech processing
- Go 1.20+ → For WebSocket and networking extensions
- Node.js & npm → For frontend and TypeScript extensions
Quick Verification
Run these commands to check your setup:
```bash
python --version   # Should show 3.10.x
go version         # Should show 1.20 or higher
node --version     # Should show a recent version
```

Python Environment Setup
We recommend using a virtual environment to avoid conflicts with your system Python. You can use either pyenv or venv:
```bash
# Using pyenv
pyenv install 3.10
pyenv local 3.10

# Or using venv
python3.10 -m venv .venv
source .venv/bin/activate
```

Installation Process
Step 1: Install TEN Manager
The TEN Manager handles project creation, dependency management, and builds. Install it with a single command:
```bash
curl -fsSL https://get.theten.ai/install.sh | bash
```

After installation, verify it's in your PATH:

```bash
tman --version
```

If the command isn't found, add /usr/local/bin to your PATH:

```bash
export PATH="/usr/local/bin:$PATH"
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bashrc  # or ~/.zshrc
```

Step 2: Create Your First Project
Now generate the transcriber demo application:
```bash
tman create transcriber_demo
cd transcriber_demo
```

This scaffolds a complete project structure with all necessary configuration files, extension definitions, and the multi-language runtime graph.
Step 3: Install Dependencies
The framework needs to install two types of dependencies:
```bash
# Install TEN packages and programming language dependencies
tman install
```

This step typically takes 1-2 minutes and:
- Downloads required TEN framework packages
- Installs Python dependencies (speech recognition libraries)
- Sets up Go modules (WebSocket handling)
- Configures TypeScript dependencies (subtitle generation)
Step 4: Build the Project
Compile all extensions across all languages:
```bash
tman build
```

The build process takes approximately 30 seconds and compiles:
- Go extensions → Native binaries for WebSocket functions
- Python extensions → Bytecode and dependency linking
- TypeScript extensions → Transpiled JavaScript modules
Configuration
Before running the demo, you need to configure your speech service credentials. The transcriber demo uses Azure Speech Service by default.
Create Environment File
Create a .env file in your project root:
```bash
cp .env.example .env
```

Add Azure Credentials
Open .env and add your Azure Speech Service credentials:
```bash
AZURE_SPEECH_KEY=your_azure_speech_key_here
AZURE_SPEECH_REGION=your_region   # e.g., eastus, westus2
```

Don't have Azure credentials? You can:
- Sign up for a free Azure account (includes free tier for Speech Service)
- Or swap in a different STT provider by modifying the extension configuration
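If you're wondering how those values get consumed, here's a minimal sketch of loading them from Python, assuming the python-dotenv and azure-cognitiveservices-speech packages are installed; the demo's own extension code may differ:

```python
# Minimal sketch: load .env credentials and build an Azure Speech config.
# Assumes python-dotenv and azure-cognitiveservices-speech are installed;
# the variable names match the .env file above.
import os

from dotenv import load_dotenv
import azure.cognitiveservices.speech as speechsdk

load_dotenv()  # reads .env from the project root into the environment

speech_config = speechsdk.SpeechConfig(
    subscription=os.environ["AZURE_SPEECH_KEY"],
    region=os.environ["AZURE_SPEECH_REGION"],
)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
```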
Running Your First TEN Application
With everything configured, start the application:
```bash
tman start
```

The framework will:
- Initialize the multi-language runtime
- Load and connect all extensions (Go, Python, TypeScript)
- Start the WebSocket server
- Launch the web interface on port 8080
Access the Demo
Open your browser and navigate to:
http://localhost:8080

You should see the transcriber interface with two main features:
- Real-Time Voice Transcription → Click to allow microphone access, then start speaking. Your speech appears as text in real time.
- Audio File Upload → Upload pre-recorded audio files and watch as subtitles are generated with timestamps.
Understanding the Demo Architecture
The transcriber demo showcases TEN's multi-language orchestration capabilities. Here's how data flows through the system:
```
┌─────────────────┐
│ Browser/Client  │
└────────┬────────┘
         │ WebSocket Audio Stream
         ▼
┌─────────────────┐
│  Go Extension   │  ← WebSocket handling & audio routing
└────────┬────────┘
         │ PCM Audio Frames
         ▼
┌─────────────────┐
│Python Extension │  ← Azure Speech Recognition
└────────┬────────┘
         │ Transcription Events
         ▼
┌─────────────────┐
│ TypeScript Ext. │  ← Subtitle formatting & timestamps
└────────┬────────┘
         │ Formatted Subtitles
         ▼
┌─────────────────┐
│ Browser/Client  │
└─────────────────┘
```

Why This Architecture Matters
- Go handles I/O → Efficient WebSocket connections and audio streaming
- Python processes AI → Leverages Azure's Python SDK and ML libraries
- TypeScript manages UI logic → Formats subtitles, manages timestamps, handles display
Each component runs in its native runtime but communicates through TEN's unified messaging system. You get the performance of Go, the AI ecosystem of Python, and the web integration of TypeScript—without manual IPC, serialization, or protocol design.
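To make that concrete, here's a toy sketch of the Python hop in the diagram above. Every name in it is an illustrative stand-in rather than TEN's actual extension API; in the real demo, the framework's messaging calls take the place of the `emit` callback:

```python
# Toy sketch of the Python stage: accept PCM frames, emit transcription
# events downstream. All names here are illustrative, NOT TEN's real API.
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranscriptionEvent:
    text: str
    timestamp_ms: int


class TranscriberStage:
    """Middle hop: Go hands us PCM frames; we hand TypeScript transcripts."""

    def __init__(self, emit: Callable[[TranscriptionEvent], None]):
        # `emit` stands in for TEN's messaging API, which would route the
        # event to whatever extension the runtime graph connects next.
        self.emit = emit
        self._buffer = bytearray()

    def on_audio_frame(self, frame: bytes, timestamp_ms: int) -> None:
        # The real demo feeds audio to Azure's streaming recognizer; the
        # buffering here just marks where recognition would plug in.
        self._buffer.extend(frame)
        if len(self._buffer) >= 32000:  # ~1 s of 16 kHz 16-bit mono audio
            self.emit(TranscriptionEvent("(recognized text)", timestamp_ms))
            self._buffer.clear()


# Wire the stage to a downstream consumer and push one fake frame.
stage = TranscriberStage(emit=print)
stage.on_audio_frame(b"\x00" * 32000, timestamp_ms=0)
```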
Common Issues and Solutions
macOS Library Loading Failures
Symptom: Error loading dynamic libraries on macOS
Solution: Remove macOS's quarantine attribute from the binary:

```bash
xattr -d com.apple.quarantine /path/to/tman
```

Network Connectivity Problems
Symptom: Can't reach Azure Speech Service
Solutions:
- Check your firewall settings
- Verify internet connectivity
- Confirm Azure credentials in .env
- Test with a different network
Port Already in Use
Symptom: "Address already in use: port 8080"
Solution: Either stop the conflicting process or change the port:
```bash
# Find what's using port 8080
lsof -i :8080
```

Or change the port in config/server.json:

```json
{
  "port": 8081
}
```

Build Errors
Symptom: Compilation fails with missing dependencies
Solutions:
- Re-run tman install to ensure all dependencies are present
- Check that Go, Python, and Node.js versions meet the requirements
- Clear the build cache: tman clean && tman build
Dependency Installation Challenges
Symptom: tman install fails with package resolution errors
Solutions:
- Use a virtual environment for Python isolation
- Check network access to package registries
- Try clearing the package cache: rm -rf ~/.tman/cache
What's Actually Happening Under the Hood
When you run the demo, TEN Framework:
- Loads the Runtime Graph → Parses your configuration to understand which extensions connect to which
- Spawns Language Runtimes → Starts separate processes for Go, Python, and TypeScript
- Establishes Message Channels → Creates high-performance IPC channels between languages
- Routes Data → Forwards audio frames from Go → Python, transcriptions from Python → TypeScript
- Handles Lifecycle → Manages startup, shutdown, and crash recovery across all components
All of this happens transparently. Your extensions just send and receive messages through the TEN API—no manual process management, no custom serialization, no protocol design.
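As a rough picture of step 1, a runtime graph boils down to a set of nodes plus the connections between them. The sketch below is a Python dict with made-up keys, not TEN's real configuration schema:

```python
# Hypothetical shape of the transcriber's runtime graph, written as a Python
# dict purely for illustration. TEN's actual configuration format and key
# names differ; this only shows the nodes-plus-connections idea.
graph = {
    "nodes": [
        {"name": "ws_server", "language": "go"},
        {"name": "azure_stt", "language": "python"},
        {"name": "subtitles", "language": "typescript"},
    ],
    "connections": [
        {"from": "ws_server", "to": "azure_stt", "payload": "pcm_audio_frames"},
        {"from": "azure_stt", "to": "subtitles", "payload": "transcription_events"},
        {"from": "subtitles", "to": "ws_server", "payload": "formatted_subtitles"},
    ],
}
```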
Next Steps: Building Your Own Extensions
Once you've run the demo successfully, you're ready to build custom extensions. The framework makes it easy to:
Swap Out Components
Replace Azure Speech with a different STT provider:
```bash
tman extension add deepgram_stt
# Update graph configuration to use new extension
```

Add New Languages
TEN supports C++, Rust, and more. Add a Rust extension for audio processing:
```bash
tman extension create my_audio_processor --language rust
```

Extend Functionality
Add sentiment analysis to transcriptions:
```bash
tman extension create sentiment_analyzer --language python
# Connect it after the STT extension in your graph
```

Create Custom Workflows
Build a complete voice assistant by chaining (a toy sketch follows the list):
- STT → Transcription
- LLM → Response generation
- TTS → Voice synthesis
- WebSocket → Real-time delivery
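Conceptually, that chain is plain composition over the runtime's message channels. Here's a toy, self-contained sketch in which every function is a fake stand-in for an extension:

```python
# Toy sketch of the voice-assistant chain as plain function composition.
# In TEN, each step would be its own extension and the runtime graph would
# replace the explicit calls below; every function here is a fake stand-in.
def stt(audio: bytes) -> str:
    return "what's the weather?"      # speech -> text

def llm(prompt: str) -> str:
    return f"You asked: {prompt}"     # text -> response

def tts(text: str) -> bytes:
    return text.encode("utf-8")       # response -> audio (placeholder)

def deliver(audio: bytes) -> None:
    print(f"delivering {len(audio)} bytes over the WebSocket")

deliver(tts(llm(stt(b"...pcm audio..."))))
```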
Performance Characteristics
The transcriber demo showcases TEN's real-time capabilities:
- Audio Latency → Sub-100ms from microphone to STT extension (Go WebSocket handling)
- Transcription Speed → Real-time processing (Azure's streaming API via Python)
- End-to-End → Text appears in browser typically within 200-300ms of speech
- Cross-Language Overhead → Minimal—TEN uses zero-copy message passing where possible
This performance holds even as you add more extensions. The framework's message routing scales linearly with the complexity of your graph.
Beyond the Quick Start
This 5-minute guide gets you running, but TEN Framework offers much more:
- Visual Graph Designer → Build extension graphs with TMAN Designer's drag-and-drop interface
- Production Deployment → Docker support, Kubernetes orchestration, and cloud-native tooling
- Advanced Patterns → Implement pub/sub, request/response, and streaming patterns
- Extension Marketplace → Reuse community extensions for common tasks
The transcription demo is intentionally minimal to show the core concepts. In production, you'd add error handling, state management, authentication, and monitoring—all supported by the framework's extension API.
Why This Matters
Traditional approaches to multi-language integration force you to choose:
- Single Language → Use one language and accept its limitations for certain tasks
- Microservices → Build separate services with REST/gRPC, accept network latency
- FFI/Bindings → Write complex C bindings, manage memory across language boundaries
TEN Framework gives you a fourth option: language-native extensions orchestrated by a real-time runtime. You write idiomatic code in each language, and the framework handles the rest.
The result? Go's concurrency, Python's AI ecosystem, TypeScript's web integration—all in a single application, with real-time performance.
Conclusion
In five minutes, you've:
- Installed the TEN Framework toolchain
- Created a multi-language AI application
- Configured external services
- Run a real-time transcription system
- Understood the extension architecture
The transcriber demo is just the beginning. Use it as a template for:
- Live captioning systems
- Voice assistants
- Meeting transcription tools
- Real-time translation services
- Accessibility features
The framework abstracts the hard parts—process management, serialization, language interop—so you can focus on building features that matter.
Ready to build something real?
👉 Explore the TEN Framework Documentation
💬 Join the Discord Community to connect with other developers
📦 Browse Extension Marketplace for reusable components
Continue Learning: