Design suggestion: The server just for synthesis
From: Tomas Cerha
Subject: Design suggestion: The server just for synthesis
Date: Mon, 15 Nov 2010 15:01:47 +0100
On 15.11.2010 13:38, Andrei Kholodnyi wrote:
> Does it mean we want to provide to the application system wide
> capabilities instead of particular driver capabilities?
I'd say NO.
> e.g. if a particular driver does not implement SSML we would implement
> SSML for it inside provider?
I'd say YES.
> and deliver "can_parse_ssml" to the application?
If the applications care whether they get "emulated" or "real" SSML, I'd
say we should be able to tell them. But this doesn't mean they need to
care.
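
To make the idea concrete, here is a minimal sketch of a provider that
emulates SSML for a driver lacking it and still tells the application which
kind it gets. Everything except the "can_parse_ssml" capability name is an
assumption made for illustration (the Driver and Provider classes, the
ssml_support() method), and the emulation is deliberately crude: it only
strips the markup.

```python
import re
from dataclasses import dataclass


@dataclass
class Driver:
    """Hypothetical stand-in for a driver; not part of any real API."""
    name: str
    can_parse_ssml: bool

    def say(self, text: str) -> str:
        # A real driver would synthesize speech; this one returns its input
        # so the behaviour can be inspected.
        return text


_TAG = re.compile(r"<[^>]+>")


class Provider:
    """Hypothetical layer that hides missing SSML support from clients."""

    def __init__(self, driver: Driver):
        self.driver = driver

    def ssml_support(self) -> str:
        # Clients that care can ask; clients that don't can ignore this.
        return "real" if self.driver.can_parse_ssml else "emulated"

    def say_ssml(self, ssml: str) -> str:
        if self.driver.can_parse_ssml:
            return self.driver.say(ssml)
        # Crude emulation: drop the tags and collapse whitespace.
        plain = " ".join(_TAG.sub(" ", ssml).split())
        return self.driver.say(plain)
```

An application calls say_ssml() the same way for every driver and consults
ssml_support() only if the distinction matters to it.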
There is currently no specification of the client API, so it is up to the
discussion to decide which features of the lower level API we want to
expose to the client. The current SSIP is a good start and we can extend
it with the features the clients need.
> If yes, what about the capabilities which we can not implement e.g.
> generic drivers can not generate speech samples as output?
These would not be emulated. There is a distinct set of capabilities which
cannot be emulated in principle, so the applications need to be able to
handle both situations in this case, or ignore the drivers which don't
support the capability if it is essential.
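
A small sketch of that split, under stated assumptions: the set of
capabilities the provider can emulate (EMULATABLE) and the driver names are
made up for the example; only the capability names come from this thread.

```python
# Assumption: the provider can emulate these whatever the driver supports.
EMULATABLE = {"can_parse_ssml"}


def effective_capabilities(driver_caps):
    """Capabilities a client sees: native ones plus emulated ones."""
    return set(driver_caps) | EMULATABLE


def usable_drivers(drivers, required):
    """Names of drivers that satisfy every required capability.

    Capabilities that cannot be emulated (e.g. can_retrieve_audio) must be
    native, so drivers lacking them are simply left out.
    """
    return sorted(
        name
        for name, caps in drivers.items()
        if set(required) <= effective_capabilities(caps)
    )


# Hypothetical drivers and their native capabilities.
drivers = {
    "driver_a": set(),
    "driver_b": {"can_retrieve_audio", "can_parse_ssml"},
}
```

With this data, asking for "can_parse_ssml" keeps both drivers, while asking
for "can_retrieve_audio" leaves only driver_b.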
> For me it is a key design question. someone shall aggregate/handle all
> these differences.
> If we do not do it, then each app shall do this job.
Sure. I believe we should do it wherever possible.
> E.g. app wants to know all voices it can get back as speech samples.
> currently it will probably do:
> for all drivers get capability "can_retrieve_audio"
>   if can_retrieve_audio
>     list all voices
>     add them to my favorite list
>
> whereas we can do it for the app with High level API like:
> list voices with capabilities can_retrieve_audio, i.e. hide particular
> driver capabilities
>
> This I could imagine as a high level API on top of TTS API
If I understand what you mean, the difference is whether you think of a
driver as a property of voice or vice versa. Otherwise it is equivalent.
Both approaches can be implemented above TTS API.
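
The equivalence can be sketched as two functions over the same data. The
data layout, driver and voice names, and both function names are
assumptions for illustration; neither is an existing API.

```python
# Hypothetical registry: each driver has capabilities and voices.
DRIVERS = {
    "driver_a": {"capabilities": {"can_retrieve_audio"}, "voices": ["alice", "bob"]},
    "driver_b": {"capabilities": set(), "voices": ["carol"]},
}


def voices_per_driver(drivers, capability):
    """Driver-centric view: a voice is a property of its driver."""
    favorites = []
    for name, info in drivers.items():
        if capability in info["capabilities"]:
            favorites.extend((voice, name) for voice in info["voices"])
    return favorites


def list_voices(drivers, capabilities=frozenset()):
    """Voice-centric view: the driver is a property of each voice."""
    flat = [
        {"voice": voice, "driver": name, "capabilities": info["capabilities"]}
        for name, info in drivers.items()
        for voice in info["voices"]
    ]
    return [
        (entry["voice"], entry["driver"])
        for entry in flat
        if set(capabilities) <= entry["capabilities"]
    ]
```

Both calls return the same voices for the same capability; the second one
just does the loop on the application's behalf.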
>> An SSIP bridge can also be written on top of the new API for backwards
>> compatibility.
>> Libspeechd, Python library and other client libraries could run without a
>> change through this bridge.
>
> the only difference in SSIP versus TTS API AFAIR are priority handling
> and history. Not sure how it can be smoothly integrated.
> probably it can be added on top of TTS API as well, but there are APIs
> missing for it,
> probably some tags can be incorporated in the messages?
Yes, TTS API is a low level API. Priorities are handled within the layer
above it. Thus the client API must have some features not present in TTS
API specification. Also many features present in TTS API specification do
not need to be exposed to the client API.
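
As an illustration of keeping priorities above the low level API, here is a
minimal sketch of a dispatcher that orders messages itself and hands plain
text to a driver callback that knows nothing about priorities. The priority
names are the SSIP ones; the numeric ranks, the Dispatcher class and the
simple "highest priority first" behaviour are assumptions, and the real
SSIP rules for cancelling and discarding messages are not modelled.

```python
import heapq
import itertools

# SSIP priority names; the ranks are an assumption (lower is spoken first).
PRIORITY_RANK = {
    "important": 0,
    "message": 1,
    "text": 2,
    "notification": 3,
    "progress": 4,
}


class Dispatcher:
    """Hypothetical layer between clients and a priority-unaware driver."""

    def __init__(self, driver_say):
        self._say = driver_say             # low level call: just speaks text
        self._queue = []                   # heap of (rank, sequence, text)
        self._counter = itertools.count()  # keeps FIFO order within a rank

    def speak(self, text, priority="text"):
        rank = PRIORITY_RANK[priority]
        heapq.heappush(self._queue, (rank, next(self._counter), text))

    def run(self):
        # Drain the queue, highest priority first.
        while self._queue:
            _, _, text = heapq.heappop(self._queue)
            self._say(text)
```

The driver callback receives only text, so the same driver could serve a
different service with a different priority model.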
In case it was not clear from the previous discussion, the ambition of TTS
API is to become a standard API for access to TTS engines. Speech
Dispatcher would be the consumer of this API -- the layer between the
clients and the drivers which implement TTS API. Another speech service
(like Speech Dispatcher) should be able to use the same API and reuse the
same drivers to access speech engines. This other service might have a
different client API, but we can also decide to standardize the client API.

Standardization of the client API would be a benefit for assistive
technologies and other client applications. TTS API, on the other hand,
benefits output modules (TTS engine drivers): one common driver API can be
used by different speech systems, so the output drivers can be shared. Both
levels of standardization make sense, but we believe the low level API is
easier to standardize, since it is easier to agree on a common set of low
level features. So we started with this one.
Thanks everyone for your valuable input.
Best regards, Tomas