
Build a Live2D Voice Assistant with Real-Time Lip Sync

Create an interactive anime character voice assistant with synchronized mouth movement using TEN Framework, Live2D, and PixiJS.

plutoless · October 20, 2025

Imagine building a voice assistant that's more than just audio — one with an animated anime character that moves its lips in perfect sync with speech, reacts naturally, and brings conversations to life.

With the TEN Framework, you can create exactly that: a real-time Live2D voice assistant where animated characters respond to your voice with audio-driven mouth movement and seamless interaction.

In this tutorial, we'll show you how to integrate Live2D models with TEN's voice pipeline to create an immersive conversational experience.


What is this article all about?

TEN Framework provides the real-time audio pipeline (STT → LLM → TTS), and your job is to create a beautiful frontend that brings it to life:

  • Real-time lip sync → mouth movements synchronized with TTS audio output.
  • Interactive Live2D characters → 2D anime models that react and animate naturally.
  • Seamless audio integration → Agora RTC streams audio directly to Live2D MotionSync.
  • Modular backend → reuse the standard voice assistant backend, focus on frontend magic.

This means you get all the power of TEN's voice pipeline while delivering a visually engaging user experience that goes beyond traditional chatbots.
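Concretely, the integration hinges on a single component contract: the Live2D character receives a model path and the remote TTS audio track, while STT, LLM, and TTS all stay on the backend. Here is a rough sketch of that contract, with prop names taken from the component usage later in this post (the exact typing in the example may differ):

import type { IRemoteAudioTrack } from "agora-rtc-sdk-ng";
 
// Illustrative props for the Live2D character component (assumed typing;
// prop names match the <ClientOnlyLive2D> usage shown later in this post).
interface Live2DCharacterProps {
    modelPath: string;                      // path to the .model3.json asset
    audioTrack: IRemoteAudioTrack | null;   // remote TTS audio from Agora RTC
    onModelLoaded?: () => void;             // fired once the model has rendered
}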


Project Structure

You don't need to build everything from scratch — TEN Framework provides a ready-to-use Live2D voice assistant example in the repository.

👉 Find it here: voice-assistant-live2d example on GitHub

Backend (tenapp)

The backend is essentially the same as other TEN voice assistant examples:

tenapp/
├── property.json       → TEN graph configuration
└── ten_packages/
    └── extension/
        └── main_python/ → Main control extension

By default, it uses:

  • Agora RTC for real-time audio streaming
  • Deepgram for speech-to-text
  • OpenAI LLM for conversation
  • ElevenLabs for text-to-speech

Just like other TEN examples, you can easily swap in different vendors (e.g., Google ASR, Azure TTS, Anthropic Claude, etc.) using the graph designer at http://localhost:49483 (TMAN Designer) — no code changes needed. This gives you full flexibility to mix and match components to fit your needs.

The real innovation is in the frontend.

Frontend (Next.js + Live2D)

The frontend is where the magic happens:

frontend/
├── src/
│   ├── components/
│   │   ├── Live2DCharacter.tsx    → Main Live2D rendering component
│   │   ├── ClientOnlyLive2D.tsx   → SSR-safe wrapper
│   │   ├── ConnectionPanel.tsx    → RTC connection controls
│   │   └── TranscriptPanel.tsx    → Conversation display
│   ├── lib/
│   │   ├── pixi-setup.ts          → PixiJS global initialization
│   │   ├── live2d-loader.ts       → Live2D model loader
│   │   └── request.ts             → API client
│   └── app/
│       └── page.tsx               → Main application
└── public/
    └── models/
        └── kei_vowels_pro/        → Live2D model assets

This structure keeps rendering logic, audio processing, and RTC integration cleanly separated.


Frontend Implementation

The frontend brings the Live2D character to life. Let's break down the key parts:


PixiJS Setup

Before loading Live2D models, we need to initialize PixiJS globally, because pixi-live2d-display expects a global PIXI object to be available.

pixi-setup.ts handles this:

import * as PIXI from 'pixi.js';
 
// Set up PIXI globally for pixi-live2d-display compatibility
if (typeof window !== 'undefined') {
    window.PIXI = PIXI;
    globalThis.PIXI = PIXI;
}
 
export { PIXI };
export default PIXI;

👉 This ensures PixiJS is available before any Live2D operations begin.
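If pixi-setup.ts is compiled under strict TypeScript, the assignments to window.PIXI and globalThis.PIXI also need a global type augmentation. A minimal sketch of what that declaration could look like (this is an assumption, not copied from the example):

import type * as PixiNamespace from 'pixi.js';
 
// Teach TypeScript about the global PIXI property that pixi-live2d-display reads.
declare global {
    interface Window {
        PIXI: typeof PixiNamespace;
    }
    // Covers the globalThis.PIXI assignment as well.
    var PIXI: typeof PixiNamespace;
}
 
export {};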


Loading Live2D Models

live2d-loader.ts dynamically imports the Live2D library after PixiJS is ready:

import PIXI from './pixi-setup';
 
export async function loadLive2DModel() {
    // Wait for PIXI to be fully set up
    await new Promise(resolve => setTimeout(resolve, 200));
 
    // Now dynamically import Live2D
    const { Live2DModel } = await import('pixi-live2d-display/cubism4');
 
    return { Live2DModel, PIXI };
}

👉 This lazy-loading strategy prevents SSR issues and ensures proper initialization order.
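A minimal usage sketch, assuming the model lives under public/models/kei_vowels_pro/ (the exact .model3.json file name below is illustrative):

import { loadLive2DModel } from '@/lib/live2d-loader';
 
// Resolve the Live2D class lazily, then load a model served from /public.
export async function createCharacterModel() {
    const { Live2DModel } = await loadLive2DModel();
    return Live2DModel.from('/models/kei_vowels_pro/kei_vowels_pro.model3.json');
}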


The Live2DCharacter Component

Live2DCharacter.tsx is the heart of the frontend. It handles:

Model Initialization

const initLive2D = async () => {
    // Create PIXI application with canvas renderer
    const app = new PIXI.Application({
        view: canvasRef.current!,
        autoStart: true,
        resizeTo: canvasRef.current?.parentElement || window,
        backgroundColor: 0x000000,
        backgroundAlpha: 0,
        forceCanvas: true,  // Use canvas for stability
    });
 
    // Load Live2D model (lazy-loaded to avoid SSR issues)
    const { loadLive2DModel } = await import('@/lib/live2d-loader');
    const { Live2DModel } = await loadLive2DModel();
 
    const model = await Live2DModel.from(modelPath);
    app.stage.addChild(model);
 
    // Position and scale the model to fill the canvas's parent container
    const parent = canvasRef.current!.parentElement!;
    model.scale.set(parent.clientHeight / model.height);
    model.x = (parent.clientWidth - model.width) / 2;
};
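In the component this runs inside a React effect, so the PIXI application is created once per model and destroyed on unmount. A simplified sketch of that wiring (the actual cleanup in the example may differ):

import { useEffect, useRef } from 'react';
import type { Application } from 'pixi.js';
import PIXI from '@/lib/pixi-setup';
 
// Illustrative hook: create the PIXI app when the canvas mounts, tear it down
// when the component unmounts or the model path changes.
function useLive2D(modelPath: string) {
    const canvasRef = useRef<HTMLCanvasElement>(null);
 
    useEffect(() => {
        let app: Application | undefined;
        let cancelled = false;
 
        (async () => {
            if (!canvasRef.current) return;
            app = new PIXI.Application({
                view: canvasRef.current,
                autoStart: true,
                backgroundAlpha: 0,
                forceCanvas: true,
            });
            const { loadLive2DModel } = await import('@/lib/live2d-loader');
            const { Live2DModel } = await loadLive2DModel();
            const model = await Live2DModel.from(modelPath);
            if (!cancelled) app.stage.addChild(model);
        })();
 
        return () => {
            cancelled = true;
            // Destroy the renderer together with any loaded display objects.
            app?.destroy(true, { children: true });
        };
    }, [modelPath]);
 
    return canvasRef;
}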

MotionSync for Lip Sync

The component uses Live2D MotionSync to synchronize mouth movements with audio:

// Initialize MotionSync
const motionSyncUrl = modelPath.replace('.model3.json', '.motionsync3.json');
const motionSync = new MotionSync(model.internalModel);
await motionSync.loadMotionSyncFromUrl(motionSyncUrl);
 
// When audio track arrives from Agora...
if (audioTrack && audioTrack.getMediaStreamTrack) {
    const stream = new MediaStream([audioTrack.getMediaStreamTrack()]);
 
    // Start lip sync playback
    motionSync.play(stream);
 
    // Also play actual audio
    const audio = document.createElement("audio");
    audio.autoplay = true;
    audio.srcObject = stream;
    audio.play();
}

👉 This creates perfectly synchronized lip movements — the character's mouth moves exactly as it "speaks".
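In the component this usually sits inside an effect keyed on the audioTrack prop, so lip sync starts whenever a new TTS track arrives and the hidden audio element is cleaned up afterwards. A sketch under those assumptions (only the play call shown above is assumed about the MotionSync API):

import { useEffect } from 'react';
import type { IRemoteAudioTrack } from 'agora-rtc-sdk-ng';
 
// Minimal structural type for the MotionSync instance created above;
// import the real class from whatever module the example uses.
type MotionSyncLike = { play: (stream: MediaStream) => void };
 
// Illustrative hook: feed remote TTS audio into MotionSync and an <audio> element.
function useLipSync(audioTrack: IRemoteAudioTrack | null, motionSync: MotionSyncLike | null) {
    useEffect(() => {
        if (!audioTrack || !motionSync) return;
 
        // Wrap the Agora track in a MediaStream that MotionSync can analyze.
        const stream = new MediaStream([audioTrack.getMediaStreamTrack()]);
        motionSync.play(stream);
 
        // Play the audible output through a detached <audio> element.
        const audio = document.createElement('audio');
        audio.autoplay = true;
        audio.srcObject = stream;
        audio.play();
 
        return () => {
            // Stop audible playback when the track changes or the component unmounts.
            audio.pause();
            audio.srcObject = null;
        };
    }, [audioTrack, motionSync]);
}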


Agora RTC Integration

The main page (page.tsx) connects everything:

// When remote audio arrives from TTS
rtcClient.on("user-published", async (user, mediaType) => {
    if (mediaType === "audio") {
        await rtcClient.subscribe(user, "audio");
        const remoteAudioTrack = user.audioTrack;
 
        // Pass audio to Live2D component for lip sync
        setRemoteAudioTrack(remoteAudioTrack);
 
        // Play the audio
        remoteAudioTrack?.play();
    }
});
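It's also worth handling the opposite event so a stale track doesn't keep driving the character after the agent stops publishing audio. A minimal sketch using the standard agora-rtc-sdk-ng user-unpublished event:

// Clear the track when the agent's audio stream goes away.
rtcClient.on("user-unpublished", async (user, mediaType) => {
    if (mediaType === "audio") {
        await rtcClient.unsubscribe(user, "audio");
        setRemoteAudioTrack(null);
    }
});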

Then pass it to the Live2D component:

<ClientOnlyLive2D
    modelPath={currentModel.path}
    audioTrack={remoteAudioTrack}
    onModelLoaded={() => console.log("Model loaded!")}
/>

👉 The character automatically syncs its mouth whenever the voice assistant speaks.
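For completeness, remoteAudioTrack is just React state shared between the RTC handler and the component. A minimal sketch (IRemoteAudioTrack is the standard agora-rtc-sdk-ng type; the exact typing in the example is an assumption):

import { useState } from "react";
import type { IRemoteAudioTrack } from "agora-rtc-sdk-ng";
 
// Inside the page component: state shared between the RTC handlers and <ClientOnlyLive2D>.
const [remoteAudioTrack, setRemoteAudioTrack] = useState<IRemoteAudioTrack | null>(null);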


SSR Handling

Since Live2D requires browser APIs, we use dynamic imports to prevent SSR issues:

// next/dynamic, imported under a different local name
import dynamicImport from "next/dynamic";
 
const ClientOnlyLive2D = dynamicImport(
    () => import("@/components/ClientOnlyLive2D"),
    {
        ssr: false,
        loading: () => <div>Loading Live2D Model...</div>
    }
);

👉 This ensures the Live2D component only renders on the client side.
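The ClientOnlyLive2D wrapper itself is typically just a mount guard layered on top of this, so no Live2D or PixiJS code ever runs during server-side rendering. An illustrative shape for such a wrapper (the actual component in the example may differ):

"use client";
 
import { useEffect, useState, type ComponentProps } from "react";
import Live2DCharacter from "@/components/Live2DCharacter";
 
// Render the Live2D canvas only after the component has mounted in the browser.
export default function ClientOnlyLive2D(props: ComponentProps<typeof Live2DCharacter>) {
    const [mounted, setMounted] = useState(false);
 
    useEffect(() => {
        setMounted(true);
    }, []);
 
    if (!mounted) return <div>Loading Live2D Model...</div>;
    return <Live2DCharacter {...props} />;
}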


Test It Out

Ready to build your own Live2D voice assistant? Let's set it up.

Prerequisites

Follow the TEN Framework Getting Started guide for the basic Docker dev environment setup.

For detailed setup instructions specific to this example, refer to the README.md in the voice-assistant-live2d folder on GitHub.

Environment Variables

You'll need an environment file for the backend:

Backend (ai_agents/.env)

AGORA_APP_ID=your_agora_app_id
DEEPGRAM_API_KEY=your_deepgram_key
OPENAI_API_KEY=your_openai_key
OPENAI_MODEL=gpt-4o-mini
ELEVENLABS_TTS_KEY=your_elevenlabs_key

Installation & Running

  1. Navigate to the example directory:
cd agents/examples/voice-assistant-live2d
  2. Install dependencies:
task install
  3. Run all services (in separate terminals):
task run
  4. Open http://localhost:3000
  5. Click Connect to start the RTC session
  6. Start speaking — watch your Live2D character respond with perfectly synced lip movement!

✨ That's it — you now have a Live2D voice assistant powered by TEN Framework!


Learn More