💡 Important: TEN Framework currently supports only Python 3.10. It's recommended to use pyenv or venv to manage your Python environment:
```bash
# Install and manage Python 3.10 using pyenv (recommended)
pyenv install 3.10.14
pyenv local 3.10.14

# Or create a virtual environment using venv
python3.10 -m venv ~/ten-venv
source ~/ten-venv/bin/activate
```
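Once the environment is active, you can confirm the interpreter version programmatically. A minimal sketch (the `check_python` helper is illustrative, not part of TEN):

```python
import sys

def check_python(version_info):
    """Return True when the interpreter is a 3.10.x release."""
    major, minor = version_info[:2]
    return (major, minor) == (3, 10)

if not check_python(sys.version_info):
    print(f"Warning: Python {sys.version_info[0]}.{sys.version_info[1]} "
          "detected, but TEN Framework requires 3.10")
```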
💡 Note: If tman is already installed on your system, the installation script will ask whether you want to reinstall/upgrade it. Press y to continue or n to cancel.
Non-interactive Installation (for automation scripts or CI environments):
```bash
# Remote installation
yes y | bash <(curl -fsSL https://raw.githubusercontent.com/TEN-framework/ten-framework/main/tools/tman/install_tman.sh)

# Local installation
yes y | bash tools/tman/install_tman.sh
```
Verify Installation:
```bash
tman --version
```
💡 Tip: If you see `tman: command not found`, make sure `/usr/local/bin` is in your `PATH` (for example, add `export PATH="/usr/local/bin:$PATH"` to your shell profile and reload it).
Before running the app, you need to configure the ASR (Automatic Speech Recognition) service credentials. This example uses the Azure ASR extension, so fill in the configuration in the transcriber_demo/.env file:
```bash
# Create .env file
cat > .env << EOF
# Azure Speech Service Configuration
AZURE_STT_KEY=your_azure_speech_api_key
AZURE_STT_REGION=your_azure_region  # e.g., eastus
AZURE_STT_LANGUAGE=en-US  # Set according to your audio or real-time recording language, e.g., zh-CN, ja-JP, ko-KR, etc.
EOF
```
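To sanity-check the resulting file, a naive `.env` parser can verify that all three keys are present. This is a minimal standard-library sketch (the `parse_env` helper is hypothetical; real extensions typically load these variables via a dotenv library or the process environment):

```python
def parse_env(text):
    """Naive .env parser: KEY=VALUE lines; '#' starts a comment."""
    env = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            env[key.strip()] = value.strip()
    return env

# Same shape as the .env generated above
sample = """\
# Azure Speech Service Configuration
AZURE_STT_KEY=your_azure_speech_api_key
AZURE_STT_REGION=eastus
AZURE_STT_LANGUAGE=en-US
"""
cfg = parse_env(sample)
missing = [k for k in ("AZURE_STT_KEY", "AZURE_STT_REGION", "AZURE_STT_LANGUAGE")
           if not cfg.get(k)]
print("Missing keys:", missing or "none")  # → Missing keys: none
```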
💡 Tip: To use other ASR extensions (such as OpenAI Whisper or Google Speech), download them from the cloud store and swap them in, then configure the corresponding API keys and environment variables in the .env file.
As an example of installing a C++ extension from the cloud store, install the WebRTC VAD (Voice Activity Detection) extension:
```bash
cd transcriber_demo
tman install extension webrtc_vad_cpp
```
💡 Note: webrtc_vad_cpp is a voice activity detection extension implemented in C++. It can filter out non-speech segments in real-time speech recognition scenarios.
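The WebRTC VAD itself is a statistical classifier, but the idea of flagging silent frames can be illustrated with a much simpler energy-based sketch (the `is_silence` helper and its threshold are purely illustrative, not the webrtc_vad_cpp implementation):

```python
import math

def is_silence(frame, threshold=500.0):
    """Toy energy-based detector: True when a frame of 16-bit PCM
    samples has RMS energy below an arbitrary threshold."""
    if not frame:
        return True
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms < threshold

quiet = [10, -12, 8, -9] * 40           # low-amplitude background noise
loud = [4000, -3800, 4200, -3900] * 40  # speech-like amplitudes

print(is_silence(quiet), is_silence(loud))  # → True False
```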
After starting the app, you should see log output similar to:

```
[web_audio_control_go] Web server started on port 8080
[vad] WebRTC VAD initialized with mode 2
[audio_file_player_python] AudioFilePlayerExtension on_start
```
Now open http://localhost:8080 in your browser and navigate to the microphone real-time transcription page. You'll see how the silence state changes after VAD processing: when silence is true, the current audio contains no speech.