Common Database for Abbreviations, Acronyms, etc.

From: Peter Grasch
Subject: Common Database for Abbreviations, Acronyms, etc.
Date: Mon, 24 Sep 2012 13:12:10 +0200


I am currently working on building an AT-SPI 2 plugin for the Simon
speech recognition software and was looking into how to go from strings
like "Write to Mr. Smith" (written in the application) to "Write to
Mister Smith" (command to expect from the user).

While this functionality is already available in e.g. Orca, it seems
that this problem is in fact solved redundantly on many different layers
(screen reader, engine, etc.).

With something like abbreviation expansion, a large data set is
definitely key, so adding *another* implementation to Simon is imho the
wrong approach.

For that reason, I would like to propose a dedicated, shared library
that offers translations from "application text" to "spoken text". This
library could also offer a configuration interface to allow applications
like Orca to provide a way for users to manage this dictionary.
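
As a rough illustration of the kind of translation meant here, the
following is a minimal Python sketch, not an existing API. The names
ABBREVIATIONS and to_spoken_text, and the three dictionary entries, are
hypothetical; a real shared library would load a much larger,
user-editable data set.

```python
import re

# Hypothetical sample dictionary (assumption, for illustration only).
ABBREVIATIONS = {
    "Mr.": "Mister",
    "Dr.": "Doctor",
    "etc.": "et cetera",
}


def to_spoken_text(text, dictionary=ABBREVIATIONS):
    """Replace each known abbreviation in `text` with its spoken form."""
    if not dictionary:
        return text
    # Try longer abbreviations first so they win over shorter ones.
    keys = sorted(dictionary, key=len, reverse=True)
    alternatives = "|".join(re.escape(key) for key in keys)
    # Only match abbreviations that are not glued to other word characters.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    return pattern.sub(lambda match: dictionary[match.group(0)], text)


print(to_spoken_text("Write to Mr. Smith"))
```

Passing the dictionary as a parameter is one way a front end such as a
screen reader could supply user-managed entries without changing the
lookup code.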

As you have probably noticed, I posted this message to the teams of
Orca, KDE accessibility, Speech Dispatcher, eSpeak and Festival.
Coordinating such a large number of diverse teams is always difficult,
but I'd really like to hear from all of you: Would you be interested in
helping to build such a library? If such a library existed, would
you consider using it instead of your current implementation?

Best regards,

PS: I'm subscribed to all the mailing lists I sent this to. You can
either reply there or send me private mails. If we can manage to
establish a team of interested people, we can open a separate mailing

