Re: [gnuspeech-contact] Sampe input for SoftwareTRM
From: Nickolay V. Shmyrev
Subject: Re: [gnuspeech-contact] Sampe input for SoftwareTRM
Date: Sun, 07 Jan 2007 20:55:34 +0300
On Sat, 06/01/2007 at 07:38 -0700, Steve Nygard wrote:
> The first set of lines represent fixed values, and are commented in
> the sample file. After that each line has a bunch of values. These
> are parameters to the tube model, including the radii of eight
> sections of the tube, some values controlling frication, and a few
> other parameters.
>
> If I recall correctly, the input control rate controls the time
> represented by each line. The sample file is 4 Hz, so each line
> should represent 0.25 seconds of sound. For generating speech the
> input control rate is higher, something like 250 Hz.
>
> If you look in the diphones.mxml file in the source for the Monet
> application, you'll find the parameters that the tube uses listed in
> the <parameters> section -- the values in the tube model input file
> occur in the same order they are listed in diphones.mxml.
>
> The Monet application is used to create and edit the diphones.mxml,
> which is a set of rules for creating "key frames" between sets of
> tube model parameters and interpolating between these key frames to
> generate the input to the tube model.
>
> I'll send you a bigger sample file generated from Monet.
>
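To make the timing above concrete, here is a minimal Python sketch of the
arithmetic: one parameter line covers 1 / (input control rate) seconds, so
4 Hz gives 0.25 s per line and 250 Hz gives 0.004 s. The second function
expands two key frames into one line per control period. The function
names are hypothetical, and plain linear interpolation is only an
assumption for illustration; the actual Monet rules may interpolate
differently.

```python
def seconds_per_line(control_rate_hz):
    """Duration of sound covered by one parameter line, in seconds."""
    if control_rate_hz <= 0:
        raise ValueError("control rate must be positive")
    return 1.0 / control_rate_hz


def interpolate_keyframes(start, end, duration_s, control_rate_hz):
    """Expand two key frames into one parameter line per control period.

    `start` and `end` are lists of tube model parameter values in the same
    order. Linear interpolation is an assumption made for this sketch, not
    a description of what Monet does.
    """
    if len(start) != len(end):
        raise ValueError("key frames must have the same number of parameters")
    steps = round(duration_s * control_rate_hz)
    if steps < 1:
        raise ValueError("duration is shorter than one control period")
    lines = []
    for i in range(steps + 1):  # include both end points
        t = i / steps
        lines.append([a + (b - a) * t for a, b in zip(start, end)])
    return lines


print(seconds_per_line(4))    # sample file: 0.25 s per line
print(seconds_per_line(250))  # speech generation: 0.004 s per line
```

At 250 Hz, a one-second transition between two key frames would therefore
produce 251 lines of tube model input, counting both end points.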
Thanks a lot, Steve, for the precise description; it sounds impressive.
I've uploaded the result to the wiki for those interested:
http://festlang.berlios.de/docu/doku.php?id=gnuspeech
Such a description of speech really looks more natural than the
probabilistic parameters popular these days. So I wonder: is it possible
to update some existing Linux TTS, say Festival, to generate speech with
SoftwareTRM from the diphone parameters file you have?