
Re: [Accessibility] Call to Arms

From: Eric S. Johansson
Subject: Re: [Accessibility] Call to Arms
Date: Mon, 26 Jul 2010 14:44:25 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv: Gecko/20100713 Thunderbird/3.1.1

On 7/25/2010 10:52 PM, Richard Stallman wrote:

        I was speaking shorthand. It's not an add-on to
        NaturallySpeaking. It is an add-on to the communications framework
        between recognition application and user application.

    Something like that might be independent enough of the recognizer
    to be a valid project.  But there ARE free software packages for
    speech recognition.  So people should develop it to work with them.
    If users can also run it with NaturallySpeaking, that is ok,
    as long as we don't suggest it.

Sorry, I really have to correct this and correct it hard.

***There are no usable large-vocabulary continuous speech recognition engines out there today.*** From what I can tell, Simon is the closest, and it's pretty far off. Sphinx is a great tool for keeping grad students busy. To keep accuracy high enough, you need to keep your recognition vocabulary in the 1000-word range. I spoke with the Sphinx-4 developer about using it when I was part of the open-source speech recognition initiative, and he told us it's IVR only: don't even think about using it for dictation.

When we did a survey of all the available packages, the closest one we found was the MIT dugout package. But its creator admitted it was missing all of the language modeling, acoustic modeling, etc. that it needed, and it was still better than all the alternatives.

The first step for this whole process should be collecting a corpus for training and experimenting with different recognition parameters. You need to have one before you can ship a working recognition system. Hopefully you can get it done in a couple of years. Dragon Systems took something on the order of a year or two, with a heavy interview/recording schedule, for the baseline, and then kept gradually improving it.
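To make that concrete, here is a minimal sketch of how a training corpus is typically organized: each recorded utterance is paired with its exact transcript in a manifest that an acoustic-model trainer can consume. The file layout and manifest format here are hypothetical, invented for illustration; real toolkits each define their own.

```python
# Sketch of a training-corpus manifest builder (hypothetical format).
# The core idea: every audio file gets an utterance id and an exact
# transcript, so the trainer can align audio against text.

def build_manifest(prompts):
    """Pair each prompt with a synthetic utterance id and audio path."""
    manifest = []
    for i, text in enumerate(prompts):
        utt_id = f"utt{i:04d}"
        manifest.append((utt_id, f"wav/{utt_id}.wav", text.lower()))
    return manifest

prompts = ["Call to arms", "Free software speech recognition"]
for utt_id, path, text in build_manifest(prompts):
    print(f"{utt_id}\t{path}\t{text}")
```

The hard part, of course, is not the bookkeeping but recording enough speakers, accents, and microphones to make the acoustic model generalize.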

    However, I think we should not include such things in THIS project,
    because we need to focus energy on the goal of making those free
    recognizers better.  For us, replacing important proprietary software
    takes priority over advancing the capabilities of software.
Richard, you use the language of someone who has no clue about how difficult this problem is. I've been friends with speech recognition developers and living with speech recognition for 15 years. I have an idea of the problems they've encountered. I have no idea how to solve them, nor do I fully understand them, but I've learned enough to have a clue. I'm not saying you don't know anything about the problem, but the language and expectations you express frighten me, because expectations like the ones I'm seeing have been responsible for the failure of more than one speech recognition tools program, something much less complex than the recognizer / language model / acoustic model / audio processing / predictive search engines / correction systems / training... that all go into a full recognition system.

And we still haven't started talking about how badly screwed up the Linux audio sound system is. If you want good speech recognition, you may need to rewrite the entire audio system to make it work well for speech recognition. This really is a big chunk of work you are biting off.

Ask yourself this question: why was NaturallySpeaking the only large-vocabulary continuous speech recognition product on the market? (Hint: it's really f-n hard, and it's a small market.)

I want this to succeed, but it's got to have realistic expectations and, most importantly, serve the needs of the users, because unlike any other project you've ever been on, the users are the most important thing. Yes, I know this runs counter to the Free Software Foundation philosophy, but being injured and working with other injured people, I can't see myself looking at this project in any way but compassionately. Doing otherwise is just wrong according to my spiritual/ethical/moral/greedy self-interest foundations.

I really apologize for being blunt. You have been one of my heroes for a long time, but I am willing to kick even my heroes in the shins if I think they are going really wrong, and I think you are going really wrong. If it would help any, I could come down for lunch and talk about some of these issues the next time you're in Boston. If I remember the location correctly, we could probably ask the guy (gs) to the left of your office to join us and act as a moderator/referee :-). He knows me through ATMoB.

For the meantime, I'm going to have to drop this, but it is extremely important. I am willing to help out with requirements for the toolset and basic thinking about what the user needs to do, until we get to the point where I can write code using speech recognition. Are you okay with that?

--- eric

PS: not sure where to fit this in, but it's an example to think about. VR-mode is a bridge between Emacs and NaturallySpeaking. It gives full voice control and editing capabilities in Emacs, like NaturallySpeaking gives to proprietary programs. If it worked more consistently, I would be using Emacs instead of the proprietary programs that currently work better with speech recognition.

Here's another thing: if I had VR-mode working, I would be able to write a moderate amount of Python code with a bare-bones recognition system.
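As a rough illustration of what "bare-bones" could mean, here is a hypothetical sketch of a small command grammar: a fixed vocabulary of spoken phrases expanded into Python source text. A limited-vocabulary recognizer only has to distinguish these fixed phrases, which is exactly the ~1000-word regime where accuracy stays high. The phrases and templates below are invented for illustration, not taken from any real grammar.

```python
# Hypothetical command grammar: map a few fixed spoken phrases to
# Python code templates. A small-vocabulary recognizer only needs to
# match these phrases; expansion into source text happens here.

TEMPLATES = {
    "define function": "def {name}():\n    pass\n",
    "for loop": "for {name} in range(10):\n    pass\n",
    "print it": "print({name})\n",
}

def expand(phrase, name="value"):
    """Expand a recognized phrase into a Python snippet.

    Raises KeyError for phrases outside the grammar, which is how a
    restricted-vocabulary system rejects out-of-grammar input.
    """
    return TEMPLATES[phrase].format(name=name)

print(expand("define function", name="hello"), end="")
```

Even a toy grammar like this would cover a surprising amount of boilerplate; the real work is in correction and editing commands, which are far harder than initial dictation.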

This is the kind of incremental migration to a freer software environment that I'm hoping for. First you modify the applications; then, with a proper bridge design, you pull out the evil proprietary stuff and replace it with good free stuff. Right now, it's proprietary software all the way. I have no choice if I want to work or play. I really hate it, and I want to be working with free software that works well for disabled users.
