gnuspeech-contact

Re: [gnuspeech-contact] system lexicon


From: David Hill
Subject: Re: [gnuspeech-contact] system lexicon
Date: Tue, 30 May 2006 13:59:39 -0700

Hi Eric,

Sorry for the delay -- there's more than just too many alligators in the swamp ;-)

On May 25, 2006, at 9:47 AM, Eric Zoerner wrote:
I found the dictionary files in gnuspeech/trillium/src/SpeechObject/Dictionary/. However, the dictionary files seem to be text files, are missing part of speech info, and they cannot be opened by PrEditor. Is there a system dictionary file somewhere in PrEditor format?

Which is the official best version of the dictionary, 2.0eMainDictionary?


The dictionary has to be compiled into PrEditor form to be readable by PrEditor.  The original Trillium system used an encrypted dictionary for reasons of protecting proprietary, copyrighted information.  This needs to be changed.  I cannot remember whether the two dictionary formats were the same.  I rather suspect not, again for protection reasons, so the PrEditor dictionaries were almost certainly not encrypted.

Michael Forbes was working on an improved format for PrEditor dictionaries to include tempo information, AFAIR, and I am trying to reconstruct how far he'd got and what he did from old memos.  I don't think I have enough of a handle to start sending any yet, but I do attach one memo >at the very end< that indicates some of what was going on.  Fortunately I am a bit of a pack rat at heart!

------

The parts of speech information is in the Main Dictionary (2.0e is indeed the latest -- I hesitate to call it "best" ;-).

Identifying letters follow the % sign at the end of each entry.  The parts-of-speech key is as follows:

NOUN         'a'
VERB         'b'
ADJECTIVE    'c'
ADVERB       'd'
PRONOUN      'e'
ARTICLE      'f'
PREPOSITION  'g'
CONJUNCTION  'h'
INTERJECTION 'i'
UNKNOWN      'j'
PROPERNAME (NOUN) 'k'
LOCATIONNAME (NOUN) 'l'
CONCEPTNAME (NOUN)  'm'
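
As a rough illustration of the key above, here is a minimal Python sketch that decodes the letters following the % sign. The layout of the rest of a dictionary entry is not described here, so the sample entry is purely hypothetical, as are the names POS_KEY and parts_of_speech; only the letter-to-category mapping comes from the key.

```python
# Letter codes taken from the parts-of-speech key above.
POS_KEY = {
    "a": "NOUN",
    "b": "VERB",
    "c": "ADJECTIVE",
    "d": "ADVERB",
    "e": "PRONOUN",
    "f": "ARTICLE",
    "g": "PREPOSITION",
    "h": "CONJUNCTION",
    "i": "INTERJECTION",
    "j": "UNKNOWN",
    "k": "PROPERNAME (NOUN)",
    "l": "LOCATIONNAME (NOUN)",
    "m": "CONCEPTNAME (NOUN)",
}


def parts_of_speech(entry):
    """Return the categories coded after the last '%' in an entry.

    Assumption: everything after the final '%' is a run of key letters.
    """
    _body, sep, letters = entry.rpartition("%")
    if not sep:
        return []  # no '%' in the entry, so no part-of-speech codes
    return [POS_KEY.get(ch, "UNRECOGNISED") for ch in letters.strip()]


# Hypothetical entry; only the "%ab" tail follows the documented convention.
print(parts_of_speech("example_entry%ab"))  # ['NOUN', 'VERB']
```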

-----------

Whilst on the subject, I found Craig's original email to me concerning the Monet syntax.  Here it is.  Note that it all refers to the original NeXT TTS system which is what Steve Nygard used as a model (but he doped it out from the code!)

-----------

From: Craig-Richard Taube-Schock <address@hidden>
Date: Sat, 20 Jul 96 18:46:53 0600
To: david r hill <uudavid!david>
Subject: Re: MONET Syntax

Hi David,

I did actually get your earlier email, but there has been so much to
get done... I haven't been able to reply :-(

The syntax is fairly straight-forward. The main problem is that it is
slightly optimised for MONET, which makes it a little difficult to
decipher sometimes. I've bolded some of the more important items in
the text below. [drh: no bolding in this plain text version.  it's just the "/c"
items]

To highlight the syntax, I will send the following utterances to the
TTS-Server and point out the reply:

hello there, this is a test. I would like to buy some cheese.

This, of course, comprises two of my favorite utterances for synthesis. I
hope you aren't too bored with them, yet!

I wanted to send two utterances to point out how "chunking" works.

The reply from the server is as follows:

/c // /3 # /w h_e./_l_uh_uu /w /1 /*dher # ^ // /0 # /w /_dh_i_s /w i_z /w uh /w /1 /*test # // /c // /0 # /w /_ah_i /w /_w_u_d /w /lahik /w t_uu /w /_b_ah_i /w /_s_a_m /w /1 /*cheez # // /c

[drh: note that there should be no new-lines in the above]

The embedded symbols are:

/w - word boundary
/c - chunk boundary
/* - tonic placement
/1 - last word in tone-group
// - tone group boundaries
/_ - foot boundary. Also implied by syllable boundary
. - syllable boundary
/<number> (<number>=0-4)
   /0 = Statement
   /1 = Exclamation
   /2 = Question
   /3 = Continuation (i.e. comma or colon)
   /4 = Statement/Continuation hybrid. Used with semicolons.
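
To make the chunk and tone-group structure concrete, here is a small Python sketch that splits the server reply shown earlier on /c and lists the tone-group type of each tone group. It assumes, judging only from that one example, that the type marker is the /<number> token immediately after a // boundary, which is how it is told apart from the /1 "last word" marker. The function names are made up for illustration.

```python
TONE_TYPES = {
    0: "statement",
    1: "exclamation",
    2: "question",
    3: "continuation",
    4: "statement/continuation hybrid",
}

# The reply quoted above, joined into a single line.
REPLY = (
    "/c // /3 # /w h_e./_l_uh_uu /w /1 /*dher # ^ // /0 # /w /_dh_i_s "
    "/w i_z /w uh /w /1 /*test # // /c // /0 # /w /_ah_i /w /_w_u_d "
    "/w /lahik /w t_uu /w /_b_ah_i /w /_s_a_m /w /1 /*cheez # // /c"
)


def split_chunks(reply):
    """Split a reply into chunks (lists of tokens) on the /c marker."""
    chunks, current = [], []
    for token in reply.split():
        if token == "/c":
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(token)
    if current:  # tokens after the last /c, if any
        chunks.append(current)
    return chunks


def tone_types(chunk):
    """List tone-group types: a /0../4 token directly after a // token."""
    found = []
    for prev, tok in zip(chunk, chunk[1:]):
        if prev == "//" and len(tok) == 2 and tok[0] == "/" and tok[1] in "01234":
            found.append(TONE_TYPES[int(tok[1])])
    return found


for chunk in split_chunks(REPLY):
    print(tone_types(chunk))
# ['continuation', 'statement']
# ['statement']
```

The first chunk holds two tone groups ("hello there," and "this is a test.") and the second holds one, which matches the point about chunking.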

The important thing to note about chunks is that these are the units
which are most closely associated with utterances. A chunk can have
several tone groups (even several sentences) within it which will be
synthesized as one unit. I've bolded the chunk markers in the above
sentence to show their locations. [drh -- no bolding in this plain text
version.]

The general template I use is as follows:

/c // /3 # %s # // /c

where %s is replaced by all of the stuff I want to synthesize. Please
note that you must include a tonic placement or you will crash the
server. This is a bug, but since this is an internal standard, this is
generally not a problem as all of our stuff conforms to this
standard. It is also wise to put in the "#" characters; this gets rid
of any popping which may occur due to extreme initial and final
changes in the synthesizer parameters. You will not crash MONET or
TTS_Server if you do not put them in, but you may get some pops.
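
A minimal Python sketch of filling in that template follows. It refuses a body with no tonic placement, taking the marker to be "/*" as it appears in the example reply (/*dher, /*test). The name build_utterance is hypothetical, and the sketch only builds the string; how it is handed to the server is not covered here.

```python
TEMPLATE = "/c // /3 # %s # // /c"
TONIC_MARK = "/*"  # as seen in the example reply, e.g. /*dher


def build_utterance(body):
    """Wrap a phonetic string in the general template.

    Raises ValueError when the body has no tonic placement, since the
    text above warns that such a string crashes the server.
    """
    if TONIC_MARK not in body:
        raise ValueError("no tonic placement (/*) in utterance body")
    return TEMPLATE % body


print(build_utterance("/w h_e./_l_uh_uu /w /1 /*dher"))
# /c // /3 # /w h_e./_l_uh_uu /w /1 /*dher # // /c
```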

It is also important to point out that NO TESTING for validity is
performed by MONET or TTS_Server. If you get the syntax wrong, you
will most likely crash the server (although MONET is usually a little
more robust). This is not a big problem, as the server will be
restarted transparently, but you won't get any synthesis of the
offending utterance. No testing is done for reasons of optimisation,
and because this interface is (for the most part) internal to
Trillium.
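
Since nothing is checked on the server side, a caller could run a rough pre-flight check before sending a string. The Python sketch below is only that: the rules it applies (a /c at each end, a // at each end of every chunk, and a "/*" tonic mark in every tone group) are assumptions drawn from the example reply and the template, not a specification of what MONET or TTS_Server accept. All names are hypothetical.

```python
def split_on(tokens, marker):
    """Split a token list on a marker token, dropping empty pieces."""
    pieces, current = [], []
    for tok in tokens:
        if tok == marker:
            if current:
                pieces.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        pieces.append(current)
    return pieces


def check_string(text):
    """Return a list of problems found; an empty list means none found."""
    tokens = text.split()
    problems = []
    if len(tokens) < 2 or tokens[0] != "/c" or tokens[-1] != "/c":
        problems.append("missing /c at start or end")
    for n, chunk in enumerate(split_on(tokens, "/c"), start=1):
        if chunk[0] != "//" or chunk[-1] != "//":
            problems.append(f"chunk {n}: missing // at start or end")
        # Assumed rule: every tone group needs its own tonic mark.
        for group in split_on(chunk, "//"):
            if not any(tok.startswith("/*") for tok in group):
                problems.append(f"chunk {n}: tone group without tonic (/*)")
    return problems


print(check_string("/c // /0 # /w /1 /*test # // /c"))  # []
print(check_string("/c // /0 # /w test # // /c"))
# ['chunk 1: tone group without tonic (/*)']
```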

Hope this is enough information for you. I suggest playing around just
a bit to see if there is anything I've forgotten.

Craig

--------

And here's a memo about PrEditor dictionary formats.  Not wonderfully complete, but gives a little flavour.  I am working on getting more info.

------

Date: Wed, 9 Aug 1995 01:21:52 -0400
From: Michael Forbes
   <address@hidden m. ab. ca>
To: uudavid!trilljum.ab.ca!address@hidden
Subject: Re: PrEditor
In-Reply-To: address@hidden>

The version of preditor that I gave you should be able to open both
.preditor and .ded files, then save them in either format (using the
save as menu command); this should allow you to convert between file
types. Also, you should be able to insert (I mean merge, but insert on
the version you have) either type of file into any open file, no
matter what the file type.

As far as working with the user dictionary, the server uses an
interface to the old PrDict dictionary object, which still works with
the new format, but any tempo information in the new format will be
turned into the pronunciation string "u_u_u_p_s" by the old conversion
routines because they do not recognize numbers (which the old version of
PrEditor strips out of the dictionary entries). To use the words
without tempo information, you must save them as a .preditor file
first. I think that we should prevent the server from working with ded
files right now because there are no checks to ensure that
pronunciations will not cause invalid strings to be sent to the
server.

I am just testing the new interface functions to the new PrDict
object and will probably give those to Craig tomorrow so that he can
compile a new server which will be able to understand preditor files
with tempo information.

From your message, it sounded like it was not possible to convert
between file types. I am not sure what other features would be useful
to work with the two file types "interchangeably". If the conversion
between the two types is not working, please call me tomorrow and
describe the problem as I am pretty sure it was working on the IBM PC.
I did not test it with long dictionaries though.

Michael.

---------


