Re: [Accessibility] Can you help write a free version of HTK?

From: Bill Cox
Subject: Re: [Accessibility] Can you help write a free version of HTK?
Date: Mon, 12 Jul 2010 13:39:43 -0700

Hi, Eric.  I really like some of the ideas you mentioned.  Comments below.

On Mon, Jul 12, 2010 at 10:03 AM, Eric S. Johansson <address@hidden> wrote:
>  On 7/12/2010 4:24 AM, Bill Cox wrote:
> My rationale for requiring the equivalent to NaturallySpeaking is twofold.
> First is programmatic control over the environment and second is vocal load.
> NaturallySpeaking has far better control over its environment through some
> toolkit like dragonfly than DragonDictate ever did. I believe that's because
> of the environments (Windows versus DOS). Having used DragonDictate, I found
> it incredibly wearing on my throat.

I agree.  I used Dragon Dictate in Windows to drive emacs.  Emacs
provided the programmability, which is essential.  At the time,
without emacs, I do not know what I would have done.

> In any case, there's one thing to remember about programming by voice. You
> need the same vocabulary size for coding as you do for writing. There are two
> reasons for that which I will let you think about and I will address later.

I only had a hundred or so commands I would actually use in any given
emacs window.  In comments, I needed full recognition of English, and
Naturally Speaking improved that over Dragon Dictate, but it didn't
make much difference in my productivity.  With a smaller number of
commands available at any time, we could have a tool like you suggest
that shows you what commands are available at any given time.
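A rough sketch of what such a tool might look like, assuming a simple registry keyed by context. The class name, the context names, and the phrases here are hypothetical and only for illustration; the keystrokes are ordinary emacs bindings.

```python
from typing import Dict, List, Optional


class CommandRegistry:
    """Maps a context (for example an emacs mode) to its spoken commands."""

    def __init__(self) -> None:
        # context name -> {spoken phrase -> keystrokes to send}
        self._commands: Dict[str, Dict[str, str]] = {}

    def register(self, context: str, phrase: str, keys: str) -> None:
        self._commands.setdefault(context, {})[phrase] = keys

    def available(self, context: str, heard: str = "") -> List[str]:
        """Phrases valid in `context` that start with what was heard so far."""
        phrases = self._commands.get(context, {})
        return sorted(p for p in phrases if p.startswith(heard))

    def lookup(self, context: str, phrase: str) -> Optional[str]:
        """Keystrokes for a phrase, or None if it is not valid in this context."""
        return self._commands.get(context, {}).get(phrase)


# Hypothetical example data: a handful of commands per context.
registry = CommandRegistry()
registry.register("emacs", "save buffer", "C-x C-s")
registry.register("emacs", "set mark", "C-SPC")
registry.register("emacs", "search forward", "C-s")
registry.register("shell", "change directory", "cd ")

# Show the user what they could say next after starting with "s".
for phrase in registry.available("emacs", heard="s"):
    print(phrase)
```

Keeping the per-context set small is the point: the list a user has to scan (or remember) stays short, and the recognizer only has to tell a few phrases apart.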

> However, I will suggest that writing e-mails faster is not a bad
> thing. I write fiction for a hobby and if I can improve accuracy, I can
> write more because editing sucks.

Writing faster is a very good thing.  However, NS does it so well, I'd
rather not focus on it.  People who can't type can suck it up and get
NS for emails and other documents until a FOSS version is comparable.
In the meantime, tools like dragonfly and vocola are helping people
write code, and we could do similar things in Vinux, and possibly take
the ideas further.

> when you look at a piece of rough text and try to change it, you really see
> the lack of inventive or creative effort necessary to make editing easier.
> Because I don't use speech recognition enabled editors, I can't say
> something like "select a sentence containing "brilliance of her smile" and
> have a sentence placed into a dictation box for editing. And yes, I
> deliberately used an odd number of quote marks because, why do you need to"
> was on the end of the line in a command mode. Also, it insistence of using
> the Windows selection mechanism (drag with mouse) makes it difficult to
> select a small number of words if your hands are like mine. You really want
> something I can Emacs Mark in point so that you can use a tablet or even a
> mouse and say "leave market" and "end region". Yes, I left the previous
> sentence uncorrected just because was too much work to drive the mouse.

How do you drive the mouse?  There is a very cool project in Linux
land that tracks your head movement.  I think we should be able to
make progress towards implementing features like you describe above
using the at-spi interface, the way the Orca screen reader does.  We
may even want to leverage some of the Orca infrastructure for things
like controlling the mouse in Firefox.

> I believe innovation comes from people like us. Back in the bad old days of
> Dragon Systems, disabled users would be brought in occasionally to experiment
> with different interfaces or talk about their experience with the product. I
> would make some radical changes if I had sufficient hands to write the UI.
> For example, I would make dictation box with filters on both the input and
> output so you could modify code to look like English text thereby enabling
> familiar editing patterns in a dictation box. And on output, I would
> retranslate the text back into code. But also I want plug-ins on dictation
> box to make it possible to edit other things.

This sounds extremely cool, and doing it without modifying the
applications would be outstanding.  The reason I had 1,600 macros was
primarily to convert from my natural speech into cryptic keystroke and
edits.  If the tool could show me the natural speech it might expect
me to say, as well as convert my code to regular words, it could
dramatically lighten the cognitive load.
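As a minimal sketch of the input and output filters, assuming the only translation needed is between identifiers and plain words: the function names below are hypothetical, and a real filter would also have to deal with punctuation, keywords, and ambiguous names.

```python
import re


def identifier_to_words(name: str) -> str:
    """Turn snake_case or camelCase into plain lowercase words."""
    parts = []
    for chunk in name.split("_"):
        # Acronym runs, capitalized or lowercase words, and digit runs.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    # Fall back to the original text for names like "_" with no words in them.
    return " ".join(p.lower() for p in parts) or name


def words_to_identifier(phrase: str, style: str = "snake") -> str:
    """Turn spoken words back into an identifier in the given style."""
    words = phrase.lower().split()
    if not words:
        raise ValueError("empty phrase")
    if style == "snake":
        return "_".join(words)
    if style == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    raise ValueError(f"unknown style: {style}")


def code_to_english(line: str) -> str:
    """Input filter: replace every identifier in a line with plain words."""
    return re.sub(
        r"[A-Za-z_][A-Za-z0-9_]*",
        lambda m: identifier_to_words(m.group()),
        line,
    )


print(code_to_english("total_count = maxValue + 1"))
print(words_to_identifier("parse config file", style="camel"))
```

The round trip is lossy as written (an acronym such as `HTTPServer` comes back as `http_server`), so an output filter would probably need to remember the original spelling of each symbol instead of regenerating it.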

> You are far more optimistic than I am. My experience trying to get Emacs
> updated and dtach modified for crip use has not been successful at
> attracting help even though they are far more useful on day one than a new
> speech recognizer.

I really never had any help at coding by voice.

> As for a pool of experts, we can try mining the OSSRI Board of Directors for
> possible candidates. That's something we'll have to talk to Susan about.
>> When I do simple estimates, I just can't see how we don't have enough
>> potential volunteers to do this.  I just can't believe that 99.9% of
>> us with RSI injuries or visual impairments are the sort of people to
>> sit on our butts and do nothing.  From what I've seen, a fair
>> percentage of us happen to be decent programmers, and are the sort
>> that refuse to believe we have limitations.
> I can unfortunately. Because programming by voice has been so difficult and
> the hostility of employers to anyone using something like speech recognition
> in an open office plan, many programmers, including myself, have left the
> field. Some migrated to completely different fields such as bicycle design
> and others, like myself, have become self-employed as it's the only way to
> insulate oneself from corporate stupidity and the egregious workloads that
> injured us in the first place.
>> Perhaps I have a strong voice, but I spoke non-stop to my computer for
>> 10 hours a day for over three years, and found that all I had to do
>> was sip water constantly.  I programmed by voice using macros,
>> eventually writing over 1,600 of them, mostly to control emacs.  I
>> think it was the best way to continue my career, without giving into
>> my typing limitations.
> You are a very different person than I am. I was able to program in Python
> using Emacs with less than 50 macros. I could not remember 1600 of them.
>  Something about RSI and its treatment messes with your memory.  Most
> developers I've known would not be able to remember 1600 macros as well as
> the entire body of code they are working with.  When I have written code, I
> have changed how I write classes as a way of accommodating my memory
> deficits.  I also tried to write a small number of macros that were easy on
> the voice.  As I said before, many developers also suffer vocal strain at a
> far lower level of effort than you have put yourself through.  Memory
> shortcomings are something else we will need to accommodate. I think this is
> the driving force behind the methods I've developed for exploring a speech
> interface. I can't remember what I'm supposed to say next so, the system
> should prompt me and give me the ability to navigate within that prompt.
>  The great example is change directory. It's a delightful intellectual
> exercise as well as a demonstration of the flexibility of a discoverable
> speech interface.
>> I am very interested in ideas like you suggest for enabling
>> applications without modifications, and doing anything that reduces
>> vocal and cognitive load.  We need new ideas, and I agree with your
>> point about not needing another useless type-by-voice project.  Part
>> of the problem is that many of these projects are funded by well
>> meaning institutions, but implemented by people interested in research
>> and their own careers.  I think the code we write would be far better
>> focused on our own needs.
> Okay, this is a conversation for when I have far more time, and possibly one
> message per topic. Should pop up in the next week or so.
>> Sorry, but I have to ask: if you can dictate e-mail, why can't you
>> write code?
> That's a real good question. I think the best answer is:
>       If it's too difficult to do, it's not worth doing until it's simple.
> This is the classic programmer hubris, laziness, arrogance all rolled into
> one. It's actually design philosophy for me even before I was injured. If
> it's hard to do, you're doing something the wrong way. You don't understand
> the problem. You don't even know you're an idiot. When you sit down and
> answer all of the questions the back of your mind creates and manifests as
> "I'm not comfortable with this" only then should you start thinking about
> implementation.
> Now, I did write Python byte code. I created a Web framework with a markup
> language that accommodates disabled users. It will work with speech
> recognition but it will also, theoretically, be accessible to blind,
> text-to-speech users. It's simple, the current implementation is a bit of a
> pig but I just wanted to prove the concept of the usability of a disabled
> user focused markup language.
> It's on launchpad under the name "akasha"
> Python is the only language I've seen so far that isn't completely hostile
> to unenhanced speech recognition.  I can't manipulate C, Java, or any other
> language with the same ease.  I consider the whole C language family to be so
> ungodly hostile to speech recognition that it would take a huge interface
> layer to cross between the two.
> I bet you're asking why. An overabundance of special characters with special
> spacing. I shouldn't have to do that. The environment should know enough
> about what I'm saying to put things in the right place. Jumble cap
> misspelled words used for symbols. Again, why should I have to spell that? I
> should really say the nearest English equivalent and the tool translates.
>  These two features alone will significantly drop the vocal load of
> programming by voice. They will reduce the cognitive load of trying to
> remember how to generate that symbol. Done right you will be able to edit a
> misrecognition in the middle of a misspelled word, possibly even before you
> inject it into your code. By using the default code and simple style, code
> generation will be easier on so many levels.
> I could say more but I will spare you. :-)
>> Anyway, you don't have to type code to contribute.  I
>> would like to hear more about your models.  I want to put together
>> an e-mail list to discuss programming by voice, and the direction we
>> should take in implementing and improving the tools we need.  Your
>> input is welcome!  Would it be better to host that e-mail list in
>> vinux land, or in land?  Regardless, I would like to work in
>> Vinux to enable programming by voice at some basic level, and then I'd
>> like to get lots of voice coders on board to make it better.
> Models later when I have more time. Probably this weekend coming up. Like I
> said, there is already a list but, I think I would choose the vinux world as
> being more culturally/philosophically on board with what we are trying to do
> regarding accessibility approaches.
> I'm out of time for today. I'll try to get back to the rest of this later.
