speechd-discuss

Speech Dispatcher roadmap discussion.


From: Bohdan R. Rau
Subject: Speech Dispatcher roadmap discussion.
Date: Thu, 16 Oct 2014 08:39:36 +0200

On 2014-10-15 22:21, Trevor Saunders wrote:
> On Wed, Oct 15, 2014 at 12:33:39PM +0200, Bohdan R. Rau wrote:
>
> I'd prefer we kept things as is and the protocol is just UTF-8.

The SSIP protocol is UTF-8. An application may work in a different
encoding (ncursesw applications work internally with wchar_t, not 8-bit
strings). I see no sense in manually converting the UCS or wchar_t data
an application uses internally into UTF-8 only because someone decided
"I don't use this function, so it's unnecessary".

For me (as an application writer) there should also be a function:

int spd_say_wstring(SPDConnection *conn, int priority, wchar_t *wstring);

It may be very useful for ncurses application writers ;)
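
As a rough sketch of the conversion step such a function would need: the helper below (wstring_to_utf8 is a hypothetical name, not part of libspeechd) encodes a wide string as UTF-8 by hand. It assumes wchar_t holds one Unicode code point per element, as on Linux with glibc; on platforms with a 16-bit wchar_t it rejects surrogate pairs instead of combining them.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Encode one code point as UTF-8 into out (at least 4 bytes).
 * Returns the number of bytes written, or 0 for an invalid value. */
static size_t utf8_encode(unsigned long cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* lone surrogate */
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* outside the Unicode range */
}

/* Hypothetical helper: returns a malloc'ed UTF-8 copy of src,
 * or NULL on allocation failure or invalid input. Caller frees. */
char *wstring_to_utf8(const wchar_t *src)
{
    assert(src != NULL);
    size_t len = wcslen(src);
    char *buf = malloc(len * 4 + 1);        /* worst case: 4 bytes per element */
    if (buf == NULL)
        return NULL;
    size_t pos = 0;
    for (size_t i = 0; i < len; i++) {
        size_t n = utf8_encode((unsigned long)src[i], buf + pos);
        if (n == 0) {
            free(buf);
            return NULL;
        }
        pos += n;
    }
    buf[pos] = '\0';
    return buf;
}
```

A wrapper like the proposed spd_say_wstring could then pass the result to the existing spd_say() and free the buffer afterwards.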

>
> imo it's an error to pass more than one character to the char command,

Speech-dispatcher documentation, chapter 4.1.4 Characters and Keys:

"character is a NULL terminated string of chars containing one UTF-8 
character. If it contains more characters, only the first one is 
processed."
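
To make "only the first one is processed" concrete, here is a minimal sketch of finding where the first UTF-8 character of a string ends. The function name utf8_first_char_len is made up for this example and this is not Speech Dispatcher's actual code; it only checks the lead byte and continuation bytes, not overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the byte length (1-4) of the first UTF-8 character in s,
 * or 0 if s is empty or does not start with a well-formed sequence. */
size_t utf8_first_char_len(const char *s)
{
    assert(s != NULL);
    unsigned char c = (unsigned char)s[0];
    size_t len;

    if (c == 0)
        return 0;                           /* empty string */
    if (c < 0x80)
        len = 1;                            /* ASCII */
    else if ((c & 0xE0) == 0xC0)
        len = 2;
    else if ((c & 0xF0) == 0xE0)
        len = 3;
    else if ((c & 0xF8) == 0xF0)
        len = 4;
    else
        return 0;                           /* stray continuation or invalid lead byte */

    /* Every following byte must look like 10xxxxxx; this also
     * catches a string that ends in the middle of a sequence. */
    for (size_t i = 1; i < len; i++)
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            return 0;
    return len;
}
```

Anything after the returned length would be the part that, according to the documentation, is ignored.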


> We could translate back from offsets in plain text to positions in
> ssml, so client doesn't need to know if synth can deal with ssml.

No, we can't.

Example:

<speak>Hello, <mark name="s1" /> Dolly! How are you?</speak>

Plain text parts will be:
"Hello, Dolly!" and "How are you?"

Please translate that back to the index mark.
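
To illustrate what reaches a plain-text-only synthesizer in the example above, here is a naive sketch of stripping the markup. ssml_to_plain is a hypothetical helper, not Speech Dispatcher code; it ignores entities, comments and a '>' inside attribute values, and it only collapses runs of spaces.

```c
#include <assert.h>
#include <stddef.h>

/* Copies in to out without <...> tags, collapsing runs of spaces.
 * Writes at most outsz - 1 characters plus the terminator and
 * returns the number of characters written. */
size_t ssml_to_plain(const char *in, char *out, size_t outsz)
{
    assert(in != NULL && out != NULL && outsz > 0);
    size_t n = 0;
    int in_tag = 0;

    for (; *in != '\0' && n + 1 < outsz; in++) {
        if (in_tag) {
            if (*in == '>')
                in_tag = 0;                 /* end of tag, resume copying */
            continue;
        }
        if (*in == '<') {
            in_tag = 1;                     /* start of tag, skip its content */
            continue;
        }
        if (*in == ' ' && n > 0 && out[n - 1] == ' ')
            continue;                       /* collapse repeated spaces */
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```

For the SSML above the output is the text of the two plain parts, with no trace of the mark named "s1" left in it.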


>
> It seems weird to me a module would only support something with plain
> text, but maybe such a thing exists.

See above.

>>
>> Requesting features which are known to be impossible is a bug - or we
>> have different ideas of what a bug is :)
>
> I'm not convinced it is a bug, simple client might not want to worry
> about what synth it is using and what it supports.

So a simple client won't request extra features.

>
> I'll agree it's useful to know the speaking position, but I wonder if we
> need to expose two different ways of dealing with it to clients.

See above.

>> > I'm unconvinced, it seems like that's a problem synthesizer should
>> > already be solving, so why should we duplicate that?
>>
>> Because synthesizers are for synthesis, not for dealing with
>> grammatical problems.
>
> that may be the technical definition, but in practice I think most of
> the synthesis packages out there do both, I'm pretty sure espeak does,
> and I think pico / ibmtts / festival variants all do too.

Espeak does. Pico does not (or there is no documented function for it in
its API). I've never played with ibmtts or festival.

In any case, the world is not limited to ibmtts, festival, pico and espeak.

ethanak
-- 
http://milena.polip.com/ - Pa pa, Ivonko!


