Re: [Accessibility] Call to Arms

From: Eric S. Johansson
Subject: Re: [Accessibility] Call to Arms
Date: Tue, 27 Jul 2010 21:04:09 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv: Gecko/20100713 Thunderbird/3.1.1

 On 7/27/2010 7:34 PM, Bill Cox wrote:
Just seeing RMS and Eric exchanging e-mails is frightening!

You should see what Richard and I are going to be doing for Halloween. :-) I think I'm going to get to play the evil bloodsucking software capitalist. I don't stand a chance against the Saint. :-)

Eric seems to have some excellent ideas about what sort of tools we
should build.  I really like his concept of "discoverability".  Emacs
is popular partly because after only a few minutes of training, you
learn how to learn everything else directly in Emacs.  There are
thousands of features and commands, yet you don't have to know them
all, and emacs helps you find them when you need them.  A major
problem with writing code with voice macros, like I used to do, is I
had to learn over a thousand voice commands.  That's a large mental
load.  If a window would show me all the commands that could be spoken
next whenever I pause, sorted by how often they're used in that
context, it would be a huge improvement.

So, I'm finding Eric's participation quite useful.  That's the single
measure I rate people on.  I do not currently agree with Eric that we
should first make Naturally Speaking work well in wine, and then focus
on the interface tools.  However, I do see his point, and I could see
changing my mind in the future if we can't get alternatives like
Julius working well enough.

Here's another, shorter-term alternative, in exchange for more resource consumption. It's a valid trade-off, but it's not as freeing as the wine solution.

Run NaturallySpeaking in a virtual machine with a mumble mumble mumble OS.

Yeah, it sucks. But it's fast and won't take much in the way of project resources. The only thing we would need to fix (assuming we use VirtualBox) is the way we get audio from the host side to the guest side. I advocate using USB and fixing the current problems with USB and whatever VM system we choose.

The next problem will be getting the output back to the host. This is nothing more than the two-machine problem I've spoken about here (I think). I think the two-machine problem is important enough that it should be solved right at the very beginning.

I don't like it. It's ugly, but it will work, and I'm hoping it's ugly enough that people will migrate en masse to the free version when it's working.

The problem with using NS is that NS is limited, and unfixable.  It
has great continuous speech recognition, but it wants me to speak in
grammatically correct English all the time.  What I want instead is a
voice recognition engine that understands context, and allows me to
speak in C, Python, bash, etc.  If I'm in a bash console, and I say "C
D documents", I want that to recognise the cd command, and any word
that bash would normally auto-complete, like documents in this case.
Instead, NS used to give me text like "Seedy documents".  I want the
vocabulary limited to what's valid in the context.  I believe with
such features, it should be possible to achieve pretty good
productivity without high-speed highly accurate large-vocabulary
recognition.  Also, by actually using the FOSS speech-recognition
engines, I think we'll be able to accelerate improving them.  Relying
on NS in wine will just delay the FOSS alternative, and as I said, NS
is limited.  I'm much more interested in developing the next
generation of accessibility tools than living with the crap I can
currently buy.

Bill, you are speaking the keyboard. You don't say "cd"; you say "change directory", because it's easier on the voice and the mind, and it's much more robust against misrecognition errors. You also say "push directory" and "pop directory".
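To make the spoken-phrase idea concrete, here is a minimal sketch in Python. The three phrases are the ones named above; the `SPOKEN_COMMANDS` table and the `expand` helper are hypothetical names made up for illustration, not part of any existing tool.

```python
# Hypothetical table mapping speakable phrases to shell commands.
SPOKEN_COMMANDS = {
    "change directory": "cd",
    "push directory": "pushd",
    "pop directory": "popd",
}


def expand(utterance):
    """Turn a spoken phrase plus arguments into a shell command line.

    Returns None when the utterance does not start with a known phrase.
    """
    words = utterance.split()
    lowered = [w.lower() for w in words]
    # Try the longest prefix first so multi-word phrases are matched whole.
    for length in range(len(lowered), 0, -1):
        phrase = " ".join(lowered[:length])
        if phrase in SPOKEN_COMMANDS:
            return " ".join([SPOKEN_COMMANDS[phrase]] + words[length:])
    return None


print(expand("change directory Documents"))
print(expand("pop directory"))
```

A table like this is only a fixed lookup: it knows nothing about where you are in the file system or which words are valid next.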

But surprisingly, this won't work with a simple macro system. The shell environment needs to tell the recognition environment its context, and this is where the "interrupting cow" model comes in.

Knock knock?
Who's there?
Interrupting cow

In this case the cow (the recognition target) doesn't interrupt blindly while the user is speaking; it interrupts when it detects an end of utterance. So if you were to say "change directory", the interrupting cow would present a list of what you could say at that point, based on your location in the file hierarchy plus whatever macros you want. That list is placed into the existing grammar, and you continue by saying where you want to go. If you stop there, the process is repeated at the new file location. When you're done, you can say "remember as <name>", so the next time you want to go there, you have a shorthand name.

Change directory  /etc. moin nelpag

Generates /etc/moin/nelpag.

Remember directory nelpag wiki

Push directory nelpag wiki
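The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: `TREE` is a hypothetical in-memory stand-in for the real file hierarchy, and `DirectoryContext` with its methods is an invented name, not an existing recognizer or shell API. A real system would feed the candidate list into the recognizer's grammar at each end of utterance.

```python
# Hypothetical in-memory stand-in for the file hierarchy in the example.
TREE = {"etc": {"moin": {"nelpag": {}}, "ssh": {}}}


class DirectoryContext:
    """Tracks the current location and what may be spoken next."""

    def __init__(self, tree):
        self.tree = tree
        self.parts = []        # path components chosen so far
        self.remembered = {}   # shorthand name -> list of path components

    def _node(self):
        node = self.tree
        for part in self.parts:
            node = node[part]
        return node

    def candidates(self):
        """Phrases valid right now: subdirectories plus remembered names."""
        return sorted(self._node()) + sorted(self.remembered)

    def say(self, phrase):
        """Consume one spoken phrase; reject anything outside the grammar."""
        if phrase in self.remembered:
            self.parts = list(self.remembered[phrase])
        elif phrase in self._node():
            self.parts.append(phrase)
        else:
            raise ValueError(f"{phrase!r} is not in the current grammar")
        return self.path()

    def path(self):
        return "/" + "/".join(self.parts)

    def remember(self, name):
        """Store the current location under a shorthand name."""
        self.remembered[name] = list(self.parts)

    def reset(self):
        """Go back to the root, keeping the remembered names."""
        self.parts = []


ctx = DirectoryContext(TREE)
for word in ["etc", "moin", "nelpag"]:
    print(ctx.candidates(), "->", ctx.say(word))
ctx.remember("nelpag wiki")
ctx.reset()
print(ctx.say("nelpag wiki"))
```

Because each step only accepts phrases from the current candidate list, a word that is not a valid continuation is rejected instead of being typed as stray text.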

Making a little more sense? Notice that this technique is also more resilient to misrecognition. You could build up the path in one window as you watch the navigation results in another. Then, if you see one misrecognized word, you could change that word easily using speech recognition editing techniques.

Now, another example I've used for speech-recognition-friendly design is the Web framework akasha (see Launchpad). It uses a simple notation of [ followed by a word. That keyword is sufficient to define what needs to be done in that context. It's very speakable and even user friendly (or so I like to flatter myself). I believe this is one possible path for us to take in a coding project. It might end up looking like Lisp, which is a scary thought.

While some may view the requirement of sticking to an English-language vocabulary as a hindrance, it's a huge benefit to the English-speaking user. Using an English-language grammar reduces vocal strain because the brain is already prepared to move your throat and vocal apparatus in a smooth and efficient way. It reduces cognitive load because you're already thinking in that grammatical structure, and forcing yourself to remember a different grammar is difficult.

Embrace the grammar, Bill. The implementation may be the dark side, but the concepts are not.
