Idea: language plugins
From: Jonathan Duddington
Subject: Idea: language plugins
Date: Mon, 03 Mar 2008 11:36:47 +0000 (GMT)
On 03 Mar, Tomas Cerha <cerha at brailcom.org> wrote:
> I understand it is a tempting idea to manage those conversions
> centrally and be able to do a good job even with a very dumb
> synthesizer, but I also believe Speech Dispatcher is not a good place
> for this. One example of a technical problem of such a design is the
> support for callbacks. If the text is changed on its way from
> application to the synthesizer, the indexes reported by the
> synthesizer are no longer valid within the original text.
I like the idea of language-specific plugins in a centralised place
(i.e. in Speech Dispatcher) which can be used with different
synthesizers.
eSpeak does some basic language-specific interpretation of numbers,
with several options which can be set for each language. For example,
whether "123" is spoken as:
One hundred and twenty three.
Hundred twenty three.
Hundred three and twenty.
etc.
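As a rough illustration of how such per-language options might work,
here is a minimal sketch in Python. The option names (one_before_hundred,
and_after_hundred, units_before_tens) and the function speak_number are
invented for this example and are not eSpeak's actual settings; the word
lists are English only and the range is limited to 0..999.

```python
from dataclasses import dataclass

UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


@dataclass
class NumberOptions:
    """Hypothetical per-language switches (not eSpeak's real option names)."""
    one_before_hundred: bool = True   # "one hundred" vs. "hundred"
    and_after_hundred: bool = True    # "hundred and ..." vs. "hundred ..."
    units_before_tens: bool = False   # "three and twenty" vs. "twenty three"


def speak_number(n: int, opts: NumberOptions) -> str:
    """Render 0..999 as words according to the given options."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        if hundreds > 1 or opts.one_before_hundred:
            words.append(UNITS[hundreds])
        words.append("hundred")
        if rest and opts.and_after_hundred:
            words.append("and")
    if rest or not hundreds:
        if rest < 20:
            words.append(UNITS[rest])
        else:
            tens, units = divmod(rest, 10)
            if units == 0:
                words.append(TENS[tens])
            elif opts.units_before_tens:
                words += [UNITS[units], "and", TENS[tens]]
            else:
                words += [TENS[tens], UNITS[units]]
    return " ".join(words)


print(speak_number(123, NumberOptions()))
print(speak_number(123, NumberOptions(False, False, False)))
print(speak_number(123, NumberOptions(False, False, True)))
```

The three calls produce the three variants listed above. Real languages
need many more switches than this, which is part of the argument for
putting the logic in a language-specific plugin.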
But applying different grammatical inflections to numbers, depending on
context, is beyond the ability of a general-purpose multi-language
synthesizer such as eSpeak. This may also apply to questions about how
to interpret text such as "2/3/04" or numbers which might be telephone
numbers.
It would be useful if someone who has the knowledge and interest to do
this could write a language-specific plugin. Such a plugin would operate
at the text level (e.g. converting numbers to text) and would be
independent of which synthesizer is used.
There is certainly a problem with indices into the text, but it can be
solved. eSpeak has a similar problem internally when it processes SSML
tags. This produces text for speaking which is different from the
original text which it received. It is solved by a table which
translates the indices in the output text to the equivalent indices in
the input text.
An alternative would be to provide a plugin interface to eSpeak for
these functions, but Speech Dispatcher seems a more suitable place.