

TTS algorithms (Re: Comments on the Text to Speech "algorithm")

From: A
Subject: TTS algorithms (Re: Comments on the Text to Speech "algorithm")
Date: Sun, 28 Feb 2010 08:35:02 +0200

Thanks for the nice summary of speech engine architecture. But I
think you're too pessimistic about latency with large voice banks.

On Sun, Feb 28, 2010 at 7:11 AM, Klaus Knopper <speechd at knopper.net> wrote:
> Back to unit selection: Because of time-critical issues, selection and
> processing of real recordings requires a lot of IO throughput, so you
> will need a very fast harddisk (maybe raid) or database, possibly cached

I think the main problem is seek time. If the voice is kept compressed,
the data size by itself wouldn't be much.
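
To put rough numbers on the seek point, here is a back-of-the-envelope
sketch. Every figure in it (unit count, unit size, seek time, drive
throughput) is a made-up assumption for illustration, not a measurement
of any real voice bank or drive:

```python
def read_time_s(n_units, bytes_per_unit, seek_s, throughput_bytes_per_s):
    """Split the cost of fetching n_units scattered units into seek and transfer time.

    Assumes the worst case of one seek per unit (no caching, no locality).
    """
    seek_total = n_units * seek_s
    transfer_total = n_units * bytes_per_unit / throughput_bytes_per_s
    return seek_total, transfer_total


# Hypothetical sentence: 50 units of 4000 compressed bytes each,
# on a disk with a 10 ms seek and 50 MB/s sequential throughput.
seek_total, transfer_total = read_time_s(50, 4000, 0.010, 50e6)
print(f"seek: {seek_total:.3f} s, transfer: {transfer_total:.3f} s")
```

With these assumed values the seeks cost about half a second in total
while the transfer itself is a few milliseconds, which is why I'd worry
about seeks and not about throughput.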

> in RAM, or just accept the output data to be generated "offline" with
> playback a few seconds or even minutes after the original text was sent,

A few seconds or minutes!? I guess modern SSDs can help a lot here,
but even with a slow drive an engine that is well optimized for latency
should start speaking in less than a second.
I'll say it again: you don't need to read and generate all the data
before you start outputting sound, just as your mp3 player doesn't read
and decode the whole mp3 before it starts to play.
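
A minimal sketch of what I mean by streaming, in Python. The names
load_unit, synthesize_stream and play are hypothetical, and load_unit
just returns silence where a real engine would seek to the unit, read it
and decompress it. The only point is the order of events: the first
chunk is played before the second unit is even loaded.

```python
from typing import Iterable, Iterator, List

EVENTS: List[str] = []  # records the order of loads and plays


def load_unit(unit: str) -> bytes:
    # Hypothetical stand-in for "seek, read and decompress one unit".
    # Returns 16-bit silence, 160 samples per character of the unit name.
    EVENTS.append(f"load {unit}")
    return b"\x00\x00" * (160 * len(unit))


def synthesize_stream(units: Iterable[str]) -> Iterator[bytes]:
    # Generator: each unit is loaded only when the consumer asks for it.
    for unit in units:
        yield load_unit(unit)


def play(chunks: Iterable[bytes], sink: List[bytes]) -> None:
    # Stand-in for an audio device: "plays" each chunk as soon as it arrives.
    for chunk in chunks:
        EVENTS.append("play")
        sink.append(chunk)


def demo() -> List[str]:
    EVENTS.clear()
    play(synthesize_stream(["he", "llo"]), [])
    return list(EVENTS)


print(demo())
```

Latency to the first sound then depends on the first unit only, not on
the length of the whole text.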

> output being in form of a WAV, Ogg or also the aforementioned MP3 if you
> don't mind using a patented format with its problematic legal issues.

