gnuspeech-contact

Re: [gnuspeech-contact] a Kelly-Lochbaum TRM


From: David Hill
Subject: Re: [gnuspeech-contact] a Kelly-Lochbaum TRM
Date: Mon, 5 Nov 2012 20:17:12 -0800

Dear Andras,

I apologise for the long delay in responding to your email. It has been a very 
busy month.

I checked out the two links you provided and was intrigued to know how you 
would propose controlling the Kelly-Lochbaum tube model to produce connected 
speech, including consonants. I also wonder which/whose demonstrations of 
gnuspeech you have listened to, and what audio equipment you used for the 
reproduction. The fidelity of the gnuspeech TRM-based speech is completely 
vitiated by the typical computer sound system. This is the URL for a .wav file 
comparing male, female and child speech; I wonder if you have heard it before:

http://pages.cpsc.ucalgary.ca/~hill/helloComparison/helloComparison.wav

Also, I wonder if you have read any of the background papers on gnuspeech, 
especially:

http://pages.cpsc.ucalgary.ca/~hill/papers/avios95/index.htm

or some of the papers documenting the work we did on rhythm and intonation, and 
the subjective testing that was part of that. The AVIOS 95 paper provides some 
of the background references we used.

A series of sixteen tube sections of equal length simply fails to provide the 
degree of control needed to emulate the human vocal tract speaking. In fact we 
apparently use only eight, but there is an underlying set of 10 sections that 
allows a fair approximation to the unequal sections needed for the task. The 
length was restricted by the need to compute in real time. These days you can 
get a perfect representation of the 8 unequal-length sections needed to provide 
completely independent control of the human formants by using 32 sections and 
combining them appropriately to meet the boundaries determined by Fant and 
Pauli at KTH, Stockholm (Fant, G. & Pauli, S. (1974). Spatial characteristics 
of vocal tract resonance models. Proceedings of the Stockholm Speech 
Communication Seminar, KTH, Stockholm, Sweden).
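
As a rough illustration of combining equal-length sections into a smaller 
number of unequal-length regions, the Python sketch below expands eight region 
areas into 32 equal-length sections and computes the junction reflection 
coefficients of a Kelly-Lochbaum style tube. It is not the gnuspeech 
implementation. The function names, the region areas and the 
sections-per-region counts are all hypothetical placeholders; in particular the 
counts are not the Fant and Pauli boundaries, which are not reproduced here. 
The sign convention k = (A_i - A_{i+1}) / (A_i + A_{i+1}) is one common choice 
for pressure waves and is assumed, not taken from the gnuspeech sources.

```python
def expand_regions(region_areas, sections_per_region):
    """Repeat each region's area over its equal-length sections.

    region_areas        -- cross-sectional area of each region (any unit)
    sections_per_region -- how many equal-length sections each region spans
    """
    if len(region_areas) != len(sections_per_region):
        raise ValueError("need one section count per region")
    areas = []
    for area, count in zip(region_areas, sections_per_region):
        if area <= 0 or count <= 0:
            raise ValueError("areas and section counts must be positive")
        areas.extend([area] * count)
    return areas


def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent sections.

    Assumed convention (pressure waves): k = (A_i - A_next) / (A_i + A_next).
    """
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


if __name__ == "__main__":
    # Hypothetical values for illustration only: eight regions mapped onto
    # 32 equal-length sections. These are NOT the Fant and Pauli boundaries.
    region_areas = [1.0, 0.8, 0.6, 1.2, 2.0, 2.5, 1.5, 1.0]
    sections_per_region = [5, 3, 4, 4, 3, 4, 4, 5]

    areas = expand_regions(region_areas, sections_per_region)
    ks = reflection_coefficients(areas)
    print(len(areas), "sections,", len(ks), "junctions")
    print([round(k, 3) for k in ks])
```

Junctions between two sections of the same region get a coefficient of zero, 
so in this sketch only the region boundaries scatter; the finer 32-section grid 
serves only to place those boundaries more precisely along the tube.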

There is a fair amount of stuff at:

http://pages.cpsc.ucalgary.ca/~hill/gnuspeech/gnuspeech-index.htm

and a more complete listing in section F. of:

http://pages.cpsc.ucalgary.ca/~hill/papers/index.htm

with an overview of gnuspeech at:

http://www.gnu.org/software/gnuspeech/

and a recent paper on the history of the work is at:

http://pages.cpsc.ucalgary.ca/~hill/papers/creating-n-applying-rhythm-n-intonation.pdf

The Kelly and Lochbaum work is quite old, of course.

I hope all this helps. I'll be interested in your response.

All good wishes.

david

On Oct 9, 2012, at 9:19 AM, Andras Kadinger wrote:

> I listened to the gnuspeech demos a number of times over the years. I was 
> always fascinated by the pleasant intonation and formant trajectories - but 
> was quite put off by the unnatural, formant synthesizer-like timbre.
> 
> To rekindle the love, I hacked up a Kelly-Lochbaum TRM in Java: 
> http://www.youtube.com/watch?v=tzAnkDki8SU
> 
> The toy UI I have on top of it is the "Four Tube Vocal Tract Models of 
> Vowels" from 
> http://clas.mq.edu.au/acoustics/frequency/vocal_tract_resonance.html just to 
> have a simple yet phonetically somewhat meaningful way of playing with it to 
> get a rough idea of the vowel quality to be expected from such a model.
> 
> Andras
> 



