Re: [gnuspeech-contact] gnuspeech & latin / vergil, plainchant (fwd)

From:	D.R. Hill
Subject:	Re: [gnuspeech-contact] gnuspeech & latin / vergil, plainchant (fwd)
Date:	Wed, 30 Nov 2005 21:25:27 -0700 (MST)


Hi Lee,

I am quite embarrassed by the time that has passed since you wrote to
me.  I hope you haven't written me off as being not a useful
contact.  The last 6 weeks have been extraordinarily busy.

Your questions do not admit of easy answers, because so much is
unstated/assumed.  I presume you are asking mainly about the use of
the Monet system to create spoken and sung Latin.

I have put some comments in your message below.

On Oct 10, 2005, at 6:38 PM, Lee Butterman wrote:

Hi, Professor Hill.  My name's Lee Butterman; I just graduated Brown,
double majoring in Latin and Computer Science, and now I'm doing my
Master's in Classics at Tufts.  For my bachelor's thesis, I wrote an
mbrola-based text-to-speech system for Latin poetry.  It's online at
www.poetaexmachina.net and I'm still updating it.  The intonation is
basic, but it worked as a proof of concept.
I was wondering about a few things.

Concerning developing articulatory speech databases: how feasible is
it to design one for (say) Italian?  For Vergil and Catullus and such,
an Italian voice (specifically IT2/mbrola) is a good approximation,
though I'm not certain if I could create one myself;


Latin being a dead language, there is considerable debate about how it
was pronounced, though there are many clues (from poetry and that
sort of thing), plus the use of Latin in church services (though I am
unconvinced that clerical Latin is a good guide).

Creating the necessary databases could be onerous or relatively
straightforward, depending on the assumptions you make and the goals
you attempt.

The tough approach!

If you really think you know how Latin is pronounced, and can obtain
good recordings of someone speaking it that way, you could take the
tough route and begin at the beginning: first (a) create a database
of postures underlying the basic phonemes of the language (based on
some understanding of the articulation necessary, plus the acoustic
effect expected in terms of (virtual or real) "target" formant
values); then (b) create the context rules to allow coarticulation
effects to emulate the dynamics and produce the required variety of
allophones; finally (c) make up a dictionary that relates the spelling
of Latin to the phonetic realisation in terms of the postures/phones
you have defined.  This would allow you to use Monet to generate
arbitrary "Latin" utterances.
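
To make steps (a) and (b) a little more concrete, here is a minimal
Python sketch.  It is not Monet's data format or rule language: the
Posture class, the POSTURES table, the formant values, and the
transition() function are all hypothetical, and a straight-line glide
between targets is only a crude stand-in for real context rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    """One entry in a hypothetical posture database (step a)."""
    name: str
    f1: float           # target first formant in Hz (placeholder value)
    f2: float           # target second formant in Hz (placeholder value)
    duration_ms: float  # nominal steady-state duration (placeholder value)


# Illustrative targets only; real values would come from measured speech.
POSTURES = {
    "a": Posture("a", 750.0, 1300.0, 120.0),
    "i": Posture("i", 300.0, 2300.0, 100.0),
    "u": Posture("u", 320.0, 800.0, 100.0),
}


def transition(p, q, steps):
    """Stand-in for a context rule (step b): glide linearly from the
    formant targets of posture p to those of posture q."""
    if steps < 2:
        raise ValueError("need at least two steps for a transition")
    track = []
    for k in range(steps):
        t = k / (steps - 1)  # runs from 0.0 at p to 1.0 at q
        track.append((p.f1 + t * (q.f1 - p.f1),
                      p.f2 + t * (q.f2 - p.f2)))
    return track


if __name__ == "__main__":
    for f1, f2 in transition(POSTURES["a"], POSTURES["i"], 5):
        print(f"F1={f1:7.1f} Hz  F2={f2:7.1f} Hz")
```

The point of the sketch is only that the postures hold targets, while
the rules decide how the parameters move between them.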

Some snags.  You need some kind of model of Latin rhythm and
intonation.  Both are difficult, even for a current language.  You'd
have to make some assumptions for Latin (again, based on things like
poetry).  The rhythm is closely tied to the time-quantity of the
postures/phones and is poorly done in most contemporary synthesisers
(even concatenative synthesisers).  Rhythm would require a good model
of Latin rhythm within which to gather data.  English is considered a
stress-timed language, whereas French is considered a syllable-timed
language, for example, and there has been an ongoing debate for
decades about exactly how such differences translate into rhythmic
variation, and especially the choice of duration for the various
speech elements.  We reviewed a lot of research and concluded that
MAK Halliday's model was simple, practical, and convincing for spoken
English, but we still had to go and gather real data from a corpus of
English speech in order to fill in the values needed for the model.
It then made sense to consider Halliday's model of British English
intonation, and we carried out listening trials and experiments with
synthetic speech to compare the Halliday system with other
possibilities, and also to investigate at least some of the
potentially relevant pitch variation cues to the perception of
natural intonation.

This is a lot of work!


The simple approach!

Just create a new dictionary that tells how to pronounce Latin in
terms of the postures that have already been created, accept that the
dynamics are probably not that different from English (especially if
you are emulating church Latin as spoken in the English-speaking
world), tailor intonation contours to suit your taste using the
facilities in Monet, and you are there.
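
As a rough illustration of what such a dictionary amounts to, the
Python sketch below maps Latin spelling onto a sequence of posture
symbols.  Everything in it is an assumption made for the example: the
RULES table, the to_postures() function, and the symbol names are
hypothetical and are not the posture inventory or dictionary format
that Monet actually uses, and the letter values follow just one
possible reconstruction of the pronunciation.

```python
# Hypothetical spelling-to-posture rules.  Two-letter spellings are
# tried before single letters; any letter without a rule maps to itself.
RULES = {
    "qu": ["k", "w"],
    "ae": ["a", "i"],
    "c": ["k"],
    "v": ["w"],
    "x": ["k", "s"],
}


def to_postures(word):
    """Convert one Latin word into a list of posture symbols."""
    letters = word.lower()
    postures = []
    i = 0
    while i < len(letters):
        pair = letters[i:i + 2]
        if len(pair) == 2 and pair in RULES:
            postures.extend(RULES[pair])
            i += 2
        else:
            ch = letters[i]
            postures.extend(RULES.get(ch, [ch]))
            i += 1
    return postures


if __name__ == "__main__":
    for word in ["arma", "virumque", "cano"]:
        print(word, "->", " ".join(to_postures(word)))
```

A real dictionary would also have to mark vowel length and stress,
which the spelling alone does not give you.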

You could even simply accept intonation contours as generated
automatically for English by Monet.  Who really knows how Latin
should be intoned?  Singing, as you note, is pretty easy because both
the rhythm and pitch are closely determined, and can be specified as
part of the synthesis (but not automatically in the current system
-- we always thought a singing mode should eventually be added).
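
To show why singing pins the prosody down, here is a small Python
sketch that turns a melody into explicit pitch and duration values per
syllable.  It is not a Monet feature: note_to_hz() and sung_events()
are hypothetical names, and the use of MIDI note numbers, equal
temperament with A4 = 440 Hz, and a fixed tempo are assumptions made
for the example.

```python
def note_to_hz(midi_note):
    """Equal-tempered pitch for a MIDI note number, assuming A4 (69) = 440 Hz."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)


def sung_events(syllables, notes, beats, bpm):
    """Pair each syllable with a pitch in Hz and a duration in seconds.

    syllables, notes, and beats are parallel lists; bpm is the tempo in
    beats per minute.  All of this is specified by hand, not derived.
    """
    if not (len(syllables) == len(notes) == len(beats)):
        raise ValueError("syllables, notes and beats must have equal length")
    events = []
    for syllable, note, beat_count in zip(syllables, notes, beats):
        events.append((syllable, note_to_hz(note), beat_count * 60.0 / bpm))
    return events


if __name__ == "__main__":
    # An invented three-syllable fragment, not a transcription of real chant.
    for syllable, hz, seconds in sung_events(["Sal", "ve", "te"],
                                             [62, 64, 62], [1, 1, 2], bpm=60):
        print(f"{syllable:4s} {hz:7.2f} Hz  {seconds:.2f} s")
```

With the pitch and timing fixed by the score like this, nothing is
left for an intonation or rhythm model to guess.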

for Latin plainchant, much the same thing, and also spoken words sound
very different from sung words, but singing simplifies the prosody.

What sort of a Linux setup works best with gnuspeech?


Currently, the system only works on NeXT (complete, but you'd have to
go to http://www.blackholeinc.com to order the hardware -- the
software is on the gnuspeech CVS site); and on Mac OS X 10.3 or 10.4
(Monet and a text-to-posture utility are working, but "Synthesizer",
which is very useful for developing new posture data and
understanding the articulatory synthesiser in general, as well as
various useful utilities like the dictionary editor PrEditor, are not
yet ported).  The Linux version is planned to run under GNUStep, but
the port is incomplete at present.  The source code was ported to the
Mac by Steve Nygard, and Greg Casamento has been modifying the code
so it will also compile under GNUStep.

Also, how can I help?  From a coding perspective, I've done some nlp
(poetaexmachina) and some data compression work (an error-resilient
bzip), though I've never worked on an open-source project before.

From a non-coding perspective, I worked on literary journals during
college, and I can help with writing and editing manuals, and I've
also done some graphic design work, if that's useful.


Good question.  It would probably be best to work on something that
is close to your heart.  There isn't a lot of competition ;-)  You
may need to think about where you are headed, and then look at what
is missing to help you get there.


Thanks so much,
Lee Butterman



All good wishes

david

David Hill, Prof. Emeritus      |----------------------------------------|
CS Dept, U. Calgary             | Imagination is more                    |
Calgary, AB T2N 1N4 Canada      | important than knowledge.              |
address@hidden                  |               Alberta Einstein         |
http://www.cpsc.ucalgary.ca     |                                        |
OR address@hidden               | Kill your television!                  |
http://www.firethorne.com       |----------------------------------------|
