Re: [gnuspeech-contact] a Kelly-Lochbaum TRM
From: David Hill
Subject: Re: [gnuspeech-contact] a Kelly-Lochbaum TRM
Date: Mon, 5 Nov 2012 20:17:12 -0800
Dear Andras,
I apologise for the long delay in responding to your email. It has been a very
busy month.
I checked out the two links you provided and was intrigued to know how you
would propose controlling the Kelly-Lochbaum tube model to produce connected
speech, including consonants. I also wonder which/whose demonstrations of
gnuspeech you have listened to, and what audio equipment you used for the
reproduction. The fidelity of the gnuspeech TRM-based speech is completely
vitiated by the typical computer sound system. Here is the URL for a .wav file
comparing male, female and child speech; I wonder if you have heard it before:
http://pages.cpsc.ucalgary.ca/~hill/helloComparison/helloComparison.wav
Also, I wonder if you have read any of the background papers on gnuspeech,
especially:
http://pages.cpsc.ucalgary.ca/~hill/papers/avios95/index.htm
or some of the papers documenting the work we did on rhythm and intonation, and
the subjective testing that was part of that. The AVIOS 95 paper provides some
of the background references we used. A series of sixteen tube sections of
equal length simply fails to provide the degree of control needed to emulate
the human vocal tract speaking. In fact we apparently use only eight, but there
are 10 underlying sections that allow a fair approximation to the unequal
sections needed for the task. The length was restricted by the need to compute
in real time. These days you can get a perfect representation of the 8
unequal-length sections needed to provide completely independent control of the
human formants by using 32 sections and combining them appropriately to meet
the boundaries determined by Fant and Pauli at KTH, Stockholm (FANT, G. &
PAULI, S. (1974) Spatial characteristics of vocal tract resonance models.
Proceedings of the Stockholm Speech Communication Seminar, KTH, Stockholm,
Sweden).
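As a rough illustration of the "combining" idea, here is a minimal Python sketch. It spreads eight region areas over 32 equal-length sections and then computes the junction reflection coefficients of a Kelly-Lochbaum-style tube. The section counts and areas are made-up placeholders, NOT the Fant and Pauli boundaries, and the function names are hypothetical rather than anything taken from the gnuspeech source. The coefficient formula uses one common sign convention (pressure waves).

```python
# Hypothetical sketch: section counts and areas are illustrative placeholders,
# NOT the Fant and Pauli region boundaries or any gnuspeech values.

def expand_regions(region_areas, sections_per_region):
    """Spread each region's area over its share of equal-length sections."""
    if len(region_areas) != len(sections_per_region):
        raise ValueError("need one section count per region")
    areas = []
    for area, count in zip(region_areas, sections_per_region):
        areas.extend([area] * count)
    return areas


def reflection_coefficients(areas):
    """Junction coefficients between adjacent sections (pressure-wave form):
    k_i = (A_i - A_{i+1}) / (A_i + A_{i+1})."""
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


# Eight unequal-length regions mapped onto 32 equal sections (placeholders).
SECTIONS_PER_REGION = [6, 4, 3, 3, 4, 4, 3, 5]  # sums to 32
REGION_AREAS_CM2 = [0.8, 1.2, 2.0, 3.5, 4.0, 2.5, 1.5, 1.0]  # made-up areas

section_areas = expand_regions(REGION_AREAS_CM2, SECTIONS_PER_REGION)
coeffs = reflection_coefficients(section_areas)
```

With these placeholder values, sections inside a region share one area, so the only non-zero coefficients are at the seven region boundaries. The unequal region lengths come entirely from how many equal sections each region is given.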
There is a fair amount of stuff at:
http://pages.cpsc.ucalgary.ca/~hill/gnuspeech/gnuspeech-index.htm
and a more complete listing in section F. of:
http://pages.cpsc.ucalgary.ca/~hill/papers/index.htm
with an overview of gnuspeech at:
http://www.gnu.org/software/gnuspeech/
and a recent paper on the history of the work is at:
http://pages.cpsc.ucalgary.ca/~hill/papers/creating-n-applying-rhythm-n-intonation.pdf
The Kelly and Lochbaum work is quite old, of course.
I hope all this helps. I'll be interested in your response.
All good wishes.
david
On Oct 9, 2012, at 9:19 AM, Andras Kadinger wrote:
> I listened to the gnuspeech demos a number of times over the years. I was
> always fascinated by the pleasant intonation and formant trajectories - but
> was quite put off by the unnatural, formant synthesizer-like timbre.
>
> To rekindle the love, I hacked up a Kelly-Lochbaum TRM in Java:
> http://www.youtube.com/watch?v=tzAnkDki8SU
>
> The toy UI I have on top of it is the "Four Tube Vocal Tract Models of
> Vowels" from
> http://clas.mq.edu.au/acoustics/frequency/vocal_tract_resonance.html just to
> have a simple yet phonetically somewhat meaningful way of playing with it to
> get a rough idea of the vowel quality to be expected from such a model.
>
> Andras