Re: [gnuspeech-contact] system lexicon

Michael Forbes was working on an improved format for PrEditor dictionaries to include tempo information, AFAIR, and I am trying to reconstruct how far he'd got and what he did from old memos. I don't think I have enough of a handle to start sending any yet, but I do attach one memo >at the very end< that indicates some of what was going on. Fortunately I am a bit of a pack rat at heart!

------

The parts of speech information is in the Main Dictionary (2.0e is indeed the latest -- I hesitate to call it "best" ;-).

Identifying letters follow the % sign at the end of each entry. The parts-of-speech key is as follows:

NOUN 'a'
VERB 'b'
ADJECTIVE 'c'
ADVERB 'd'
PRONOUN 'e'
ARTICLE 'f'
PREPOSITION 'g'
CONJUNCTION 'h'
INTERJECTION 'i'
UNKNOWN 'j'
PROPERNAME (NOUN) 'k'
LOCATIONNAME (NOUN) 'l'
CONCEPTNAME (NOUN) 'm'
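
For anyone wanting to decode these letters programmatically, here is a minimal Python sketch. Only the letter-to-category mapping comes from the key above; the entry layout (text, then '%', then one or more letters), the function name and the sample entry are assumptions made for illustration, and the real dictionary file may be laid out differently.

```python
# Letter codes taken from the parts-of-speech key above.
POS_KEY = {
    "a": "NOUN",
    "b": "VERB",
    "c": "ADJECTIVE",
    "d": "ADVERB",
    "e": "PRONOUN",
    "f": "ARTICLE",
    "g": "PREPOSITION",
    "h": "CONJUNCTION",
    "i": "INTERJECTION",
    "j": "UNKNOWN",
    "k": "PROPERNAME (NOUN)",
    "l": "LOCATIONNAME (NOUN)",
    "m": "CONCEPTNAME (NOUN)",
}


def parts_of_speech(entry):
    """Return the category names for the letters after the last '%'."""
    _head, sep, letters = entry.rpartition("%")
    if not sep:
        return []  # no '%' marker in this entry
    # Letters outside the key are reported as UNKNOWN rather than raising.
    return [POS_KEY.get(ch, "UNKNOWN") for ch in letters.strip()]


# Hypothetical entry, for illustration only.
print(parts_of_speech("test t_e_s_t%ab"))
```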

-----------

Whilst on the subject, I found Craig's original email to me concerning the Monet syntax. Here it is. Note that it all refers to the original NeXT TTS system, which is what Steve Nygard used as a model (but he doped it out from the code!).

-----------

From: Craig-Richard Taube-Schock <address@hidden>
Date: Sat, 20 Jul 96 18:46:53 0600
To: david r hill <uudavid!david>
Subject: Re: MONET Syntax

Hi David,

I did actually get your earlier email, but there has been so much to
get done... I haven't been able to reply :-(

The syntax is fairly straightforward. The main problem is that it is
slightly optimised for MONET, which makes it a little difficult to
decipher sometimes. I've bolded some of the more important items in
the text below. [drh: no bolding in this plain text version. it's just
the "/c" items]

To highlight the syntax, I will send the following utterances to the
TTS-Server and point out the reply:

hello there, this is a test. I would like to buy some cheese.

This, of course, comprises two of my favorite utterances for synthesis. I
hope you aren't too bored with them, yet!

I wanted to send two utterances to point out how "chunking" works.

The reply from the server is as follows:

/c // /3 # /w h_e./_l_uh_uu /w /1 /*dher # ^ // /0 # /w /_dh_i_s /w i_z /w uh /w /1 /*test # // /c // /0 # /w /_ah_i /w /_w_u_d /w /lahik /w t_uu /w /_b_ah_i /w /_s_a_m /w /1 /*cheez # // /c

[drh: note that there should be no new-lines in the above]

The embedded symbols are:

/w - word boundary
/c - chunk boundary
/* - tonic placement
/1 - last word in tone-group
// - tone group boundaries
/_ - foot boundary. Also implied by syllable boundary
. - syllable boundary
/<number> (<number>=0-4)
    /0 = Statement
    /1 = Exclamation
    /2 = Question
    /3 = Continuation (i.e. comma or colon)
    /4 = Statement/Continuation hybrid. Used with semicolons.
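
As a rough illustration of how the markers carve up the sample reply, here is a small Python sketch. It treats the reply as whitespace-separated tokens, splits it at the "/c" markers, counts "/w" markers per chunk, and picks out the tokens carrying the "/*" prefix seen in the sample. The function names are made up for this example, and nothing here reflects how MONET or the TTS server actually parses its input.

```python
# The sample reply from the message above, joined onto one logical line.
REPLY = (
    "/c // /3 # /w h_e./_l_uh_uu /w /1 /*dher # ^ // /0 # /w /_dh_i_s "
    "/w i_z /w uh /w /1 /*test # // /c // /0 # /w /_ah_i /w /_w_u_d "
    "/w /lahik /w t_uu /w /_b_ah_i /w /_s_a_m /w /1 /*cheez # // /c"
)


def split_chunks(reply):
    """Return the runs of tokens found between '/c' chunk markers."""
    chunks, current = [], []
    for token in reply.split():
        if token == "/c":
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(token)
    if current:  # tolerate a reply with no closing '/c'
        chunks.append(current)
    return chunks


def count_words(chunk):
    """Count the '/w' word-boundary markers in one chunk."""
    return chunk.count("/w")


def tonic_tokens(chunk):
    """Return the tokens marked with '/*', minus the marker itself."""
    return [tok[2:] for tok in chunk if tok.startswith("/*")]


for chunk in split_chunks(REPLY):
    print(count_words(chunk), tonic_tokens(chunk))
```

Run on the sample, this finds two chunks, matching the two utterances that were sent.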

The important thing to note about chunks is that these are the units
which are most closely associated with utterances. A chunk can have
several tone groups (even several sentences) within it which will be
synthesized as one unit. I've bolded the chunk markers in the above
sentence to show their locations. [drh -- no bolding in this plain text
version.]

The general template I use is as follows:

/c // /3 # %s # // /c

where %s is replaced by all of the stuff I want to synthesize. Please
note that you must include a tonic placement or you will crash the
server. This is a bug, but since this is an internal standard, this is
generally not a problem as all of our stuff conforms to this
standard. It is also wise to put in the "#" characters; this gets rid
of any popping which may occur due to extreme initial and final
changes in the synthesizer parameters. You will not crash MONET or
TTS_Server if you do not put them in, but you may get some pops.
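
Since a missing tonic placement crashes the server, a client could guard against it before sending anything. The following Python sketch fills in the template above and refuses a body with no "/*" marker (the form the tonic marker takes in the sample reply). The function name and the check are illustrative assumptions, not part of MONET or TTS_Server, and passing this check does not make a string valid in any other respect.

```python
# Template quoted in the message above; the "#" padding guards against pops.
TEMPLATE = "/c // /3 # %s # // /c"


def wrap_utterance(body):
    """Wrap a phonetic string in the template, insisting on a tonic marker."""
    if "/*" not in body:
        # The message warns that a missing tonic placement crashes the server.
        raise ValueError("utterance body has no tonic placement (/*)")
    return TEMPLATE % body.strip()


# Body fragment borrowed from the sample reply.
print(wrap_utterance("/w /1 /*test"))
```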

It is also important to point out that NO TESTING for validity is
performed by MONET or TTS_Server. If you get the syntax wrong, you
will most likely crash the server (although MONET is usually a little
more robust). This is not a big problem, as the server will be
restarted transparently, but you won't get any synthesis of the
offending utterance. No testing is done for reasons of optimisation,
and because this interface is (for the most part) internal to
Trillium.

Hope this is enough information for you. I suggest playing around just
a bit to see if there is anything I've forgotten.

Craig

--------

And here's a memo about PrEditor dictionary formats. Not wonderfully complete, but gives a little flavour. I am working on getting more info.

------

Date: Wed, 9 Aug 1995 01:21:52 -0400
From: Michael Forbes <address@hidden m. ab. ca>
To: uudavid!trillium.ab.ca!address@hidden
Subject: Re: PrEditor
In-Reply-To: <address@hidden>

The version of PrEditor that I gave you should be able to open both
.preditor and .ded files, then save them in either format (using the
Save As menu command); this should allow you to convert between file
types. Also, you should be able to insert (I mean merge, but insert on
the version you have) either type of file into any open file, no
matter what the file type.

As far as working with the user dictionary, the server uses an
interface to the old PrDict dictionary object, which still works with
the new format, but any tempo information in the new format will be
turned into the pronunciation string "u_u_u_p_s" by the old conversion
routines because they do not recognize numbers (which the old version
of PrEditor strips out of the dictionary entries). To use the words
without tempo information, you must save them as a .preditor file
first. I think that we should prevent the server from working with
.ded files right now because there are no checks to ensure that
pronunciations will not cause invalid strings to be sent to the
server.

I am just testing the new interface functions to the new PrDict
object and will probably give those to Craig tomorrow so that he can
compile a new server which will be able to understand preditor files
with tempo information.

From your message, it sounded like it was not possible to convert
between file types. I am not sure what other features would be useful
to work with the two file types "interchangeably". If the conversion
between the two types is not working, please call me tomorrow and
describe the problem as I am pretty sure it was working on the IBM PC.
I did not test it with long dictionaries though.

Michael.

---------

From:	David Hill
Subject:	Re: [gnuspeech-contact] system lexicon
Date:	Tue, 30 May 2006 13:59:39 -0700