speechd-discuss

kttsd using Speech-Dispatcher


From: Jeremy Whiting
Subject: kttsd using Speech-Dispatcher
Date: Fri, 5 Jun 2009 07:37:41 -0600

On Friday 05 June 2009 01:57:58 Hynek Hanke wrote:
> Jeremy Whiting wrote:
> > I've recently been taking over where Gary Cramblitt left off in porting
> > kttsd to use speech-dispatcher exclusively for speech synthesis. I've
> > been using the C api provided in libspeechd.so and have a couple
> > questions.
>
> This is great, thank you! I think it is an important job.
> I must warn you though that we have very few resources
> to develop Speech Dispatcher now and for some time to come.
> Good news is that Luke Yelavich from Ubuntu got some funding
> and is working on fixing some issues and on Speechd integration.
> So it might be best if you post your emails to
> speechd at lists.freebsoft.org instead where he can read it as well.
>
> > 1. I can't seem to link both Qt and libspeechd unless I comment out FILE
> > *spd_debug in libspeechd.h. Maybe this is something I am doing wrong on my
> > end, I'm not sure, but if not, I wonder if that could be commented out of
> > libspeechd.h in the next release.
>
> There might be some conflict. Actually I think libspeechd needs some
> cleanup, fixes and extensions. If you want to work on it, we can give
> you CVS access for speechd, just send the account name you want.
>
> We can of course make a release then.

After discussing the issue with a couple of KDE developers, we've come up with
a couple of possible solutions.  I'd prefer #1 if it works for you and doesn't
change binary compatibility with existing speech-dispatcher client apps.

1) Change the line in libspeechd.h that says "FILE* spd_debug;" to say "extern
FILE* spd_debug;".  The problem I'm seeing comes from including the header
in two or more cpp files.

2) I'm currently working around the above issue in kttsd by #defining spd_debug 
as spd_debug2 in one of the 2 cpp files that includes libspeechd.h =) It's a 
hack, but it gets around the issue for now.
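
To make option 1 concrete, here is a minimal single-file sketch in C. The comments mark which lines would live in the header and which in one library source file. That layout is an assumption for illustration, not the actual libspeechd sources, and spd_debug_is_enabled() is a hypothetical helper. Whether binary compatibility is preserved would depend on the library still exporting the spd_debug symbol from exactly one of its own source files.

```c
#include <stdio.h>

/* Would live in the header (hypothetical excerpt of libspeechd.h):
 * a declaration only, so including it from any number of source
 * files allocates no storage and cannot clash at link time. */
extern FILE *spd_debug;

/* A repeated declaration, as if the header were included again,
 * is harmless. */
extern FILE *spd_debug;

/* Would live in exactly one source file of the library:
 * the single definition that owns the storage. */
FILE *spd_debug = NULL;

/* Hypothetical helper, only here so the sketch has observable behaviour. */
int spd_debug_is_enabled(void)
{
    return spd_debug != NULL;
}
```

With the plain "FILE* spd_debug;" form, every C++ file that includes the header defines the variable, which is what produces the multiple-definition error at link time.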

P.S. I'd love to get cvs access to speech-dispatcher. I would of course get 
peer reviews before committing any changes.

> > 2. Should kttsd be using the C API at all, or using TCP/IP directly to
> > talk to speech-dispatcher?  Using the C API has been pretty useful so
> > far, but I've not got everything ported yet (speech output is ported, but
> > not job management, etc.).
>
> The most stable and definitive API is the textual SSIP protocol,
> where you should also always look for reference and the list of SD
> capabilities; the C API is not complete. So if you do not want to use
> libspeechd, you can write your own or fork it into the kttsd codebase. I
> however think that, if it is possible, it would be best to
> fix/modify/extend the C API and use it. We must however be careful
> about backwards-incompatible changes, as other projects
> (brltty and Speakup--speechd-up) are using it as well.

Yes, I agree. I'll continue to do so for now as it seems to be working fine so 
far for what I've ported to it. =)
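
For readers who have not seen the textual protocol mentioned above, here is a small hedged sketch in C of what a client writes to the socket after an SSIP "SPEAK" command: the text, then a line holding a single dot, each terminated by CR LF. The function name ssip_format_speak_data() is hypothetical. The sketch assumes one line of text that does not begin with a dot, so it does no escaping, and it leaves out the socket handling and the reading of server replies.

```c
#include <stdio.h>

/* Hypothetical helper: writes the data block that follows an SSIP
 * "SPEAK" command into out, i.e. the text, CR LF, then a line that
 * holds only a dot, CR LF.
 * Assumption: text is a single line that does not start with a dot,
 * so no escaping is attempted here.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if out is too small. */
int ssip_format_speak_data(char *out, size_t out_size, const char *text)
{
    int n = snprintf(out, out_size, "%s\r\n.\r\n", text);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

The C API wraps this kind of exchange, which is one reason to stay with libspeechd where it already covers what kttsd needs.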

> > 3. What would be needed for speech-dispatcher to possibly use Phonon as
> > its backend?  Mostly just for KDE users probably, but I'm just curious
> > what would be involved to make that happen I guess.
>
> Roughly speaking, it is just adding another audio interface to speechd.
> It already supports ALSA/Pulse, NAS etc. So you should look into
> speechd/src/audio/. You would also have to add the relevant
> options and their handling to speechd.conf.
>
> I would however think that it is best to stick to one of the current
> backends (ALSA/Pulse) and improve them. I warn you that the
> requirements on Audio are very heavy from speech synthesis
> for accessibility purposes, and developing an audio interface
> is not easy in terms of overcoming all the problems you encounter
> with performance, bugs in the audio frameworks under high load
> etc. Basically ALSA is ok now except that libasound sometimes
> crashes speechd. Pulse Audio doesn't really work very well yet
> unless you have the latest version of everything, compile your
> own kernel for low latency, etc. (last checked with PA 0.9.13).
>
> So if you would like to work on this, the best contribution
> would be to work with Lennart Poettering from Pulse Audio
> to fix the remaining issues in Pulse Audio.

Ok, I understand. That is more of a long term goal for me anyway, to get
Phonon support/backend into speech-dispatcher.  I'll talk with the Phonon
maintainer about his thoughts in this regard also. (kttsd from the latest
release works much better using ALSA directly than using Phonon with the xine
backend; it works ok with the gstreamer backend though.)

> > 4. I'm wondering about making kttsmgr contain a gui to wrap spd-conf so
> > getting kttsd up and running with freshly installed speech-dispatcher and
> > festival/espeak, etc. would be straightforward for those not wanting to
> > use the command line. Would this be very tricky?
>
> I think it should be rather easy. It would be best if you coordinate
> your work with Luke Yelavich, as he will be working on very
> similar things for Ubuntu/Gnome.
>
> spd-conf itself could use some improvements too, so I think you
> should count on also needing to develop spd-conf itself. It is
> however a small and easy Python application, separate from the
> rest of the codebase.

If Luke Yelavich is on this list I'd love to start coordinating with him 
directly also.  I am also frequently on gnome's irc on the #a11y channel if 
either/any of you would like to contact me there.

Currently I think wrapping spd-conf itself should be manageable; it is fairly
straightforward, as I've used it to set up my speech-dispatcher on a couple of
machines.  It's all Python? That will simplify things a great deal also
(assuming I can remember any Python skills =)

> > Would the gui just modify the
> > speech-dispatcher config files directly and restart the daemon, or
> > interact with speech-dispatcher directly and let it modify its own
> > configuration files?
>
> The latter would be better of course but I don't think we can easily
> do that in the current Speech Dispatcher architecture. We also
> started developing TTS API Provider (and TTS API) as a second generation
> project some years ago, you can find it on www.freebsoft.org, but we got
> stuck before completion for lack of resources and a general lack of
> interest in the accessibility community (mainly that our work with Gary
> stopped and the Orca team didn't want to contribute to that). Maybe the
> situation is changing now. Sun can't continue to develop Gnome Speech as Gnome
> is moving away from Corba, so they either need to use Speech Dispatcher,
> help to develop TTS API Provider or create something from scratch.
> Now you seem to have interest for KDE. So perhaps you should look
> into TTS API Provider as well, just to understand where we were
> heading. Because Speech Dispatcher, while it works ok, has its limitations,
> some things (like development of output modules) are harder than they
> need to be etc.
>
> Have a nice day & sorry I can't help with the technicalities now,
> Hynek

Thanks, I'll definitely look into TTS API Provider.  I looked at it some time 
ago (just glancing) and saw it was incomplete, so didn't look much further. 
I'm going to try to contact Gary also to see if he has any advice for me in 
pursuing this.  He has contacted me once before, so I don't think his job 
prevents him from giving advice, but I could be wrong there.

Anyway, thanks again,

Jeremy Whiting

cc-ing the list now too =)


