
Tutorial: Building Conversational NPCs in Unity 6 with MorVoice SDK (Zero-Latency Setup)

Unity Integration Team
1/20/2026

The holy grail of modern gaming is the 'Smart NPC': a character you can talk to that replies intelligently. LLMs (like GPT-4) solved the brain part, but the voice part has remained a bottleneck. Traditional TTS is too slow (its latency stalls the conversation) and too robotic (it breaks immersion).

This tutorial shows you how to implement **MorVoice Streaming SDK** in Unity 6. The target is a voice-response latency of under 200ms (see our [Latency Benchmark](/blog/websocket-vs-http-tts-latency-benchmark-2026)), low enough that the conversation feels instantaneous.

Prerequisites

- Unity 2022.3 LTS or higher (Unity 6 recommended)
- MorVoice SDK (Install via Package Manager: https://npm.morvoice.com)
- An API Key from dashboard.morvoice.com
- A basic NPC GameObject with an AudioSource component

Architecture: The Streaming Pipeline

Do NOT save audio to disk. File I/O adds 50-100ms of lag. We will stream raw PCM data directly from the WebSocket memory buffer into the AudioSource's clip buffer.
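
To make the pipeline above concrete, here is a minimal sketch of the in-memory hand-off between the network side and the audio side. It is not the SDK's implementation: `PcmRingBuffer` is a hypothetical helper, and it assumes the chunks arrive as 16-bit little-endian mono PCM, which this tutorial does not confirm. The `Read` method has the same shape as the callback Unity expects from a streaming AudioClip (a `float[]` to fill).

```csharp
using System;

// Hypothetical helper, not part of the MorVoice SDK: a fixed-capacity ring buffer
// that the network side writes into and the audio callback reads from.
public sealed class PcmRingBuffer
{
    private readonly float[] _buffer;
    private readonly object _gate = new object();
    private int _readPos;
    private int _count; // samples currently buffered

    public PcmRingBuffer(int capacitySamples)
    {
        if (capacitySamples <= 0) throw new ArgumentOutOfRangeException(nameof(capacitySamples));
        _buffer = new float[capacitySamples];
    }

    public int Count { get { lock (_gate) { return _count; } } }

    // Assumption: input is 16-bit little-endian PCM. Converts to floats in [-1, 1)
    // and appends them. Returns the number of samples accepted; samples that do
    // not fit in the remaining capacity are dropped.
    public int WritePcm16(byte[] chunk, int byteCount)
    {
        lock (_gate)
        {
            int samples = Math.Min(byteCount / 2, _buffer.Length - _count);
            int writePos = (_readPos + _count) % _buffer.Length;
            for (int i = 0; i < samples; i++)
            {
                short s = (short)(chunk[2 * i] | (chunk[2 * i + 1] << 8));
                _buffer[writePos] = s / 32768f;
                writePos = (writePos + 1) % _buffer.Length;
            }
            _count += samples;
            return samples;
        }
    }

    // Fills `data` completely, padding with silence when the network has not
    // delivered enough samples yet (an underrun).
    public void Read(float[] data)
    {
        lock (_gate)
        {
            int available = Math.Min(data.Length, _count);
            for (int i = 0; i < available; i++)
            {
                data[i] = _buffer[_readPos];
                _readPos = (_readPos + 1) % _buffer.Length;
            }
            _count -= available;
            Array.Clear(data, available, data.Length - available);
        }
    }
}
```

Padding with silence on underrun keeps playback running while more audio arrives, at the cost of a brief gap; no file is ever written, which is the point of the design.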

Step 1: The NPC Voice Controller

Create a new script `NPCVoiceController.cs` and attach it to your character.

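using System;
using System.Collections;
using MorVoice.SDK;
using UnityEngine;

[RequireComponent(typeof(AudioSource))]
public class NPCVoiceController : MonoBehaviour
{
    private const int SampleRate = 44100;   // Hz; must match the stream's output format
    private const int MaxClipSeconds = 60;  // upper bound on a single utterance

    [SerializeField] private string voiceId = "orc_warrior_v2";
    private MorVoiceClient _client;
    private AudioSource _audioSource;

    void Start()
    {
        _client = new MorVoiceClient(ApiKey.LoadFromEnv());
        _audioSource = GetComponent<AudioSource>();
    }

    // async void is fire-and-forget: the caller cannot observe failures, so handle them here.
    public async void Speak(string text)
    {
        try
        {
            // 1. Open the stream. The await completes once the connection is active,
            //    not when the full audio has been generated.
            var stream = await _client.StreamSpeechAsync(text, voiceId);

            // 2. Create a streaming AudioClip. Unity invokes the PCM reader callback
            //    whenever it needs more samples, and the stream fills the buffer.
            var clip = AudioClip.Create("VoiceStream", SampleRate * MaxClipSeconds, 1, SampleRate, true,
                (float[] data) => stream.ReadBuffer(data));

            _audioSource.clip = clip;
            _audioSource.Play();
        }
        catch (Exception e)
        {
            Debug.LogError($"MorVoice stream failed: {e.Message}");
        }
    }
}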

Step 2: Lipsync Integration

Audio isn't enough; the mouth must move. MorVoice sends 'viseme' events (mouth shapes) alongside the audio chunks via the WebSocket. This is much faster than analyzing the audio on the client side.

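// Inside Speak(), before _audioSource.Play(): subscribe to viseme events.
// Assumes a controller field: [SerializeField] private SkinnedMeshRenderer faceRenderer;
stream.OnViseme += (visemeCode, duration) => {
    // Map MorVoice viseme codes to your character's blend shape indices.
    // Example: code 4 = 'Ah' sound -> set blend shape 'MouthOpen' to 100.
    // Here the code doubles as the blend shape index, which only works if the mesh is authored that way.
    float intensity = 100f;
    faceRenderer.SetBlendShapeWeight(visemeCode, intensity);

    // Auto-close the mouth after the viseme's duration.
    StartCoroutine(ResetMouth(visemeCode, duration));
};

// Controller method: zero the blend shape once the viseme ends (assumes duration is in seconds).
private IEnumerator ResetMouth(int blendShapeIndex, float duration)
{
    yield return new WaitForSeconds(duration);
    faceRenderer.SetBlendShapeWeight(blendShapeIndex, 0f);
}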
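
In practice the viseme code rarely matches a blend shape index one-to-one, so a small lookup table is the usual bridge. The sketch below is illustrative only: `VisemeMap`, the codes 0 and 7, and every blend shape name are placeholders, not values documented by MorVoice. Only code 4 ('Ah' to 'MouthOpen') comes from the example above.

```csharp
using System.Collections.Generic;

// Hypothetical mapping table: codes and blend shape names are placeholders,
// to be replaced with the values your SDK version and character rig actually use.
public sealed class VisemeMap
{
    private readonly Dictionary<int, string> _codeToShape = new Dictionary<int, string>
    {
        { 0, "MouthClosed" },  // assumed rest pose
        { 4, "MouthOpen" },    // 'Ah', as in the example above
        { 7, "MouthRound" },   // assumed 'Oh'
    };

    private readonly string _fallback;

    public VisemeMap(string fallbackShape = "MouthClosed")
    {
        _fallback = fallbackShape;
    }

    // Returns the blend shape name for a viseme code, or the fallback for unknown codes.
    public string Resolve(int visemeCode)
    {
        return _codeToShape.TryGetValue(visemeCode, out var shape) ? shape : _fallback;
    }
}
```

In Unity, the returned name can be converted to an index once at startup with `sharedMesh.GetBlendShapeIndex(name)` on the character's SkinnedMeshRenderer, so the per-event handler only does a dictionary lookup.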

Optimization Tips

1. Pre-warming Connection

Establish the WebSocket connection when the player enters the room, not when they start talking. This removes the initial TLS handshake (approx. 100ms) from the first reply.
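
One way to structure pre-warming is to start the connection once and let every later caller await the same task. This tutorial does not show the SDK's own connect call, so the sketch below takes it as a delegate; `ConnectionWarmer` and `EnsureConnectedAsync` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: runs an expensive connect step at most once while it is
// healthy, and hands every caller the same in-flight or completed task.
public sealed class ConnectionWarmer
{
    private readonly Func<Task> _connect;
    private readonly object _gate = new object();
    private Task _pending;

    public ConnectionWarmer(Func<Task> connect)
    {
        _connect = connect ?? throw new ArgumentNullException(nameof(connect));
    }

    // Call when the player enters the room, and again right before speaking.
    public Task EnsureConnectedAsync()
    {
        lock (_gate)
        {
            // Start a new attempt only if none exists or the last one failed.
            if (_pending == null || _pending.IsFaulted || _pending.IsCanceled)
                _pending = _connect();
            return _pending;
        }
    }
}
```

A typical call site would be `OnTriggerEnter` on the room's trigger volume for the early call, with `Speak` awaiting `EnsureConnectedAsync()` again before it streams. By then the task has usually completed, so the second await costs nothing.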

2. Caching Common Phrases

For standard replies like 'Hello', 'What do you want?', or 'Goodbye', generate them once and cache them locally. Only use Streaming TTS for dynamic LLM responses.
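
A minimal sketch of such a cache, assuming the fixed lines are stored as decoded float samples keyed by voice and text. `PhraseCache` is a hypothetical helper, and the synthesis step is passed in as a delegate because the tutorial does not show a non-streaming SDK call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache: stores decoded samples per (voiceId, text) pair so that
// fixed lines are synthesized once and replayed from memory afterwards.
public sealed class PhraseCache
{
    private readonly Dictionary<(string voiceId, string text), float[]> _clips
        = new Dictionary<(string voiceId, string text), float[]>();
    private readonly Func<string, string, float[]> _synthesize;

    public PhraseCache(Func<string, string, float[]> synthesize)
    {
        _synthesize = synthesize ?? throw new ArgumentNullException(nameof(synthesize));
    }

    // Returns cached samples, calling the synthesize delegate only on a miss.
    public float[] GetOrCreate(string voiceId, string text)
    {
        var key = (voiceId, text);
        if (!_clips.TryGetValue(key, out var samples))
        {
            samples = _synthesize(voiceId, text);
            _clips[key] = samples;
        }
        return samples;
    }
}
```

In Unity, the cached samples can be loaded into a regular (non-streaming) clip with `AudioClip.Create` followed by `clip.SetData(samples, 0)`. Keying on the voice as well as the text keeps two NPCs with different voices from sharing a greeting.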

Common Pitfalls

❌ NEVER call .ToArray() on the stream. That waits for the full audio to download.
✅ ALWAYS use the streaming callback or buffer reader.

❌ WARNING: Don't use standard HTTP requests. They block the main thread in WebGL builds.
✅ Use the async/await pattern shown above.

Conclusion

With this setup, your NPCs can interrupt players, react to game events in real-time, and whisper or shout dynamically. The MorVoice SDK handles the heavy lifting of buffering and decoding, letting you focus on the gameplay logic.

Download the complete Unity Project example from our GitHub repository.
