[aspell] Portable Spell Checker Interface Library (take 2)
From: Kevin Atkinson
Subject: [aspell] Portable Spell Checker Interface Library (take 2)
Date: Sun, 5 Mar 2000 04:44:35 -0500 (EST)
Here is a better description of my proposed generic spell checker library
for those who are interested.
Feedback most welcome.
Portable Spell Checker Interface Library
Kevin Atkinson
address@hidden
March 5, 2000
(First Draft)
1 Goal
The goal of the library is to provide a generic interface to the spell
checker libraries installed on the system.
2 Overview
The Pspell library contains two main classes and several helper
classes. The two main classes are PspellConfig and PspellManager. The
PspellConfig class is used to set initial defaults and to change spell
checker specific options. The PspellManager class does most of the
real work. It is responsible for managing the dictionaries, checking
whether a word is in the dictionary, and coming up with suggestions,
among other things. There are many helper classes; the important ones
are PspellWordList, PspellMutableWordList, Pspell*Emulation, and
PspellObject. The PspellWordList class is used for accessing the
suggestion list, as well as the personal and session word lists
currently in use. The PspellMutableWordList class is used to manage
the personal, and perhaps other, word lists. The Pspell*Emulation
classes are used for iterating through a list. Finally, the
PspellObject class is simply a base class which most other classes
inherit from. It contains methods for cloning objects as well as
accessing error messages.
3 Usage
When your application first starts you should get a new configuration
object with the command:
PspellConfig * spell_config = new_pspell_config();
which will create a new PspellConfig object. It is allocated with new
and it is your responsibility to delete it with delete (not free).
Once you have the config object you should set some variables. The
most important one is the language variable. To do so use the command:
spell_config->replace("lang", "english");
which will set the default language to English. Other things you might
want to set are the preferred spell checker to use, the search path
for dictionaries, and the like.
Whenever a new document is created, a new PspellManager object should
also be created. There should be one manager object per document. To
create a new manager object use the command:
PspellManager * spell_checker = new_pspell_manager(spell_config);
which will create a new PspellManager object using the defaults found
in spell_config. If for some reason you want to use different defaults,
simply clone spell_config and change the setting like so:
PspellConfig * spell_config2 = spell_config->clone();
spell_config2->replace("lang","dutch");
PspellManager * spell_checker2 = new_pspell_manager(spell_config2);
delete spell_config2;
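The clone-then-override pattern above can be sketched as follows. This
is only an illustration: MockConfig and language_after_override are
hypothetical stand-ins built on std::map, not the proposed PspellConfig
class, and they model nothing beyond clone, replace, and a simple
retrieve.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for PspellConfig; only the calls used above are
// modelled, and the storage is an assumption.
class MockConfig {
public:
    // Like the proposed clone(): returns a new heap object that the
    // caller must release with delete.
    MockConfig * clone() const { return new MockConfig(*this); }

    // Like the proposed replace(): sets the key, overwriting any old value.
    bool replace(const std::string & key, const std::string & value) {
        values_[key] = value;
        return true;
    }

    // Returns the stored value, or an empty string if the key is unset.
    std::string retrieve(const std::string & key) const {
        auto it = values_.find(key);
        return it == values_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> values_;
};

// Derives a per-document setting without touching the shared defaults.
std::string language_after_override(const MockConfig & defaults,
                                    const std::string & lang) {
    MockConfig * copy = defaults.clone();
    copy->replace("lang", lang);
    std::string result = copy->retrieve("lang");
    delete copy;  // allocated with new, so released with delete
    return result;
}
```

The point of the sketch is that the shared defaults keep their original
language after the clone has been changed and deleted.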
Once the manager object is created you can use the check method to see
if a word in the document is correct like so:
bool correct = spell_checker->check(<word>);
<word> can be any one of const char *, const unsigned short *, or
const unsigned int *. Strings of const char * are expected to use
iso8859-1 or some other 8-bit (256 character) character set as
determined by the current language in use. Strings of const unsigned
short * and const unsigned int * are expected to be in Unicode.
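One way to picture the three accepted string types is a set of
overloads that all normalize to the same internal form. The sketch
below is an assumption about how that could look, not the proposed
implementation: MockChecker is hypothetical, it treats const char *
input as iso8859-1 (where each byte value equals its Unicode code
point), and it assumes the 16 bit strings contain no surrogate pairs.

```cpp
#include <cassert>
#include <set>
#include <vector>

typedef std::vector<unsigned int> CodePoints;

// Hypothetical checker that stores every word as Unicode code points.
class MockChecker {
public:
    void add(const char * word) { words_.insert(from_latin1(word)); }

    bool check(const char * word) const {
        return words_.count(from_latin1(word)) != 0;
    }
    bool check(const unsigned short * word) const {
        return words_.count(widen(word)) != 0;
    }
    bool check(const unsigned int * word) const {
        return words_.count(widen(word)) != 0;
    }

private:
    // iso8859-1 bytes map one-to-one onto the first 256 code points.
    // The cast through unsigned char avoids sign extension.
    static CodePoints from_latin1(const char * s) {
        CodePoints out;
        for (; *s != 0; ++s) out.push_back(static_cast<unsigned char>(*s));
        return out;
    }

    // Copies a zero-terminated string of 16 or 32 bit units.
    template <typename Unit>
    static CodePoints widen(const Unit * s) {
        CodePoints out;
        for (; *s != 0; ++s) out.push_back(static_cast<unsigned int>(*s));
        return out;
    }

    std::set<CodePoints> words_;
};
```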
If the word is not correct then the suggest method can be used to come
up with likely replacements:
PspellWordList & suggestions = spell_checker->suggest(<word>);
PspellStringEmulation * elements = suggestions.elements();
const char * word;
while ( (word = elements->next()) != NULL )
// add to suggestion list
delete elements;
(It is also possible to access elements as const unsigned short *, or
const unsigned int *. See the class reference section for how to do
so.)
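The iteration loop above can be tried out against a small mock. In this
sketch MockStringEmulation and collect are hypothetical names; the mock
only imitates the next and at_end methods described in the class
reference, backed by a std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for PspellStringEmulation.
class MockStringEmulation {
public:
    explicit MockStringEmulation(const std::vector<std::string> & words)
        : words_(words), pos_(0) {}

    // Returns the next word, or a null pointer once the list is exhausted.
    const char * next() {
        if (pos_ == words_.size()) return nullptr;
        return words_[pos_++].c_str();
    }

    bool at_end() const { return pos_ == words_.size(); }

private:
    std::vector<std::string> words_;
    std::size_t pos_;
};

// Drains an emulation into a vector and deletes it, mirroring the loop
// in the text. Each word is copied before next() is called again.
std::vector<std::string> collect(MockStringEmulation * elements) {
    std::vector<std::string> out;
    const char * word;
    while ((word = elements->next()) != nullptr)
        out.push_back(word);
    delete elements;
    return out;
}
```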
Once a replacement is made the store_repl method should be used to
communicate the replacement pair back to the spell checker (see
section 5.1 for why). Its usage is as follows:
spell_checker->store_repl(<misspelled word>, <correctly spelled word>);
If the user decides to add the word to the session or personal
dictionary, the word can be added using the add_to_session or
add_to_personal methods respectively, like so:
spell_checker->add_to_session(<word>);
spell_checker->add_to_personal(<word>);
It is better to let the spell checker manage these words rather than
doing it yourself so that the words have a chance of appearing in the
suggestion list.
Finally, when the document is closed the PspellManager object should
be deleted like so:
delete spell_checker;
Do not use free as it is not allocated with malloc.
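Because the object is allocated with new and must be released with
delete, a caller using a later C++ standard than this draft assumes
could hand the pointer to std::unique_ptr, whose default deleter calls
delete. The sketch below uses a hypothetical MockManager with a live
object counter in place of the proposed PspellManager.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for PspellManager that counts live objects.
struct MockManager {
    static int live;
    MockManager() { ++live; }
    ~MockManager() { --live; }
};
int MockManager::live = 0;

// Plays the role of new_pspell_manager: returns an object made with new.
MockManager * new_mock_manager() { return new MockManager; }

// Owns the manager for the duration of one "document" and reports how
// many managers were alive while it was open.
int live_during_document() {
    std::unique_ptr<MockManager> checker(new_mock_manager());
    return MockManager::live;
}   // checker goes out of scope here and delete is called
```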
4 Class Reference
Methods that return a bool generally return false on error and true
otherwise. To find out what went wrong use the error_num and
error_message methods. Unless otherwise stated, methods that return a
const char * will return null on error. The character string returned
is only valid until the next method which returns a const char * is
called.
STRING is used to represent one of const char *, const unsigned short
*, or const unsigned int *.
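The error convention described above can be sketched like this.
MockObject, set_language, try_language, and the error number 1 are all
hypothetical; only the shape (bool result, then error_num and
error_message) follows the text.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a class derived from PspellObject.
class MockObject {
public:
    MockObject() : error_num_(0) {}

    // Returns false on error and true otherwise.
    bool set_language(const std::string & lang) {
        if (lang.empty()) return fail(1, "language name is empty");
        lang_ = lang;
        return true;
    }

    int error_num() const { return error_num_; }
    const char * error_message() const { return error_message_.c_str(); }

private:
    // Records the error details and returns false for the caller to pass on.
    bool fail(int num, const std::string & message) {
        error_num_ = num;
        error_message_ = message;
        return false;
    }

    int error_num_;
    std::string error_message_;
    std::string lang_;
};

// Copies the message at once, since the returned pointer is only valid
// for a limited time.
std::string try_language(MockObject & obj, const std::string & lang) {
    if (obj.set_language(lang)) return "";
    return std::string(obj.error_message());
}
```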
4.1 PspellObject
PspellObject * clone() const
void assign(const PspellObject *)
If the two objects are not of the exact same type, the behavior of the
assign method is undefined.
int error_num()
const char * error_message()
The string is valid until the next error.
virtual ~PspellObject()
4.2 PspellConfig
public PspellObject
The PspellConfig class is used to hold configuration information. It
has a set of keys which it will accept. Inserting or even trying to
look at a key that it does not know will produce an error. Extra
accepted keys can be added with the set_extra method.
void set_extra(const PspellKeyInfo * begin, const PspellKeyInfo * end)
const PspellKeyInfo * keyinfo(const char * key) const
PspellKeyInfoEmulation * possible_elements(bool include_extra = true) const
const char * get_default(const char * key) const
PspellStringPairEmulation * elements() const
bool insert(const char * key, const char * value)
Insert will NOT overwrite an existing entry
bool replace(const char * key, const char * value)
bool remove(const char * key)
All the retrieve methods will:
1. return the default if the value is not set
2. give an error if the key is not a known key
3. give an error if the value is not in the right format
const char * retrieve (const char * key) const
const char * retrieve_list (const char * key) const
bool retrieve_list (const char * key, PspellMutableContainer &) const
int retrieve_bool(const char * key) const
returns -1 on error, 0 if false, 1 if true
int retrieve_int(const char * key) const
returns -1 on error
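The insert, replace, and retrieve rules listed above can be modelled
with a small mock. Everything here is an assumption made for
illustration: MockKeyedConfig is hypothetical, the key "ignore-case"
and both default values are invented, and having insert return false
when the key is already set is one possible reading of "will NOT
overwrite".

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for PspellConfig with two known keys.
class MockKeyedConfig {
public:
    MockKeyedConfig() {
        defaults_["lang"] = "english";       // invented default
        defaults_["ignore-case"] = "false";  // invented key and default
    }

    // Fails for unknown keys and never overwrites an existing entry.
    bool insert(const std::string & key, const std::string & value) {
        if (!known(key) || values_.count(key) != 0) return false;
        values_[key] = value;
        return true;
    }

    // Fails for unknown keys, otherwise always stores the value.
    bool replace(const std::string & key, const std::string & value) {
        if (!known(key)) return false;
        values_[key] = value;
        return true;
    }

    // Returns the set value, the default if unset, or null if unknown.
    const char * retrieve(const std::string & key) const {
        if (!known(key)) return nullptr;
        auto it = values_.find(key);
        const std::string & v =
            it != values_.end() ? it->second : defaults_.find(key)->second;
        return v.c_str();
    }

    // Returns -1 on error, 0 if false, 1 if true.
    int retrieve_bool(const std::string & key) const {
        const char * v = retrieve(key);
        if (v == nullptr) return -1;
        const std::string s(v);
        if (s == "true") return 1;
        if (s == "false") return 0;
        return -1;  // value is not in the right format
    }

private:
    bool known(const std::string & key) const {
        return defaults_.count(key) != 0;
    }

    std::map<std::string, std::string> defaults_;
    std::map<std::string, std::string> values_;
};
```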
PspellConfig * new_pspell_config()
returns a new config class for setting things up before a manager
class is created
4.3 PspellManager
public PspellObject
This class is responsible for keeping track of the dictionaries,
coming up with suggestions, and the like. Its methods are NOT meant to
be used by multiple threads and/or documents.
Almost all of the manipulation of options is done via the PspellConfig
class, thus this class has precious few methods.
PspellConfig & config()
const PspellConfig & config() const
The config returned is NOT the same object as the one you pass in.
const char * lang_name() const
bool check(STRING) const
bool add_to_personal(STRING)
bool add_to_session(STRING)
PspellWordList & master_word_list() const
PspellWordList & personal_word_list() const
PspellWordList & session_word_list() const
Because the word lists may potentially have to convert from
non-Unicode to Unicode or vice versa, the pointer returned by the
emulation is only valid until the next call.
bool save_all_wls()
void clear_session()
PspellWordList & suggest(STRING)
the suggestion list and the elements in it are only valid until the
next call to suggest.
bool store_repl(STRING mis, STRING cor)
PspellManager * new_pspell_manager(const PspellConfig * config)
returns a new manager object, allocated with new, based on the
settings in config
4.4 PspellWordList
public PspellObject
bool empty() const
int size() const
PspellStringEmulation * elements() const
PspellShortUniStringEmulation * short_uni_elements() const
PspellUniStringEmulation * uni_elements() const
4.5 PspellMutableWordList
public PspellWordList
bool add(STRING)
bool clear_all()
bool save()
PspellMutableWordList * new_pspell_personal_word_list(PspellConfig *)
returns a new personal word list so that you can manage it
4.6 Pspell*Emulation
public PspellObject
All emulations have the following two methods.
<type> next()
bool at_end() const
where <type> is specific to the particular emulation, as given by the
following table:
Name                            Type
PspellStringEmulation           const char *
PspellShortUniStringEmulation   const unsigned short *
PspellUniStringEmulation        const unsigned int *
PspellKeyInfoEmulation          PspellKeyInfo *
PspellStringPairEmulation       PspellStringPair
4.7 Other minor classes
class PspellMutableContainer {
public:
virtual void insert(const char *) = 0;
virtual void remove(const char *) = 0;
virtual void clear() = 0;
virtual ~PspellMutableContainer() {}
};
enum PspellKeyInfoType {Bool, String, Int, List};
struct PspellKeyInfo {
const char * name;
PspellKeyInfoType type;
const char * def;
const char * desc; // null if internal value
};
struct PspellStringPair {
const char * first;
const char * second;
};
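An application would typically supply its own PspellMutableContainer to
receive list values, for instance from retrieve_list. The sketch below
repeats the interface so that it compiles on its own, with a virtual
destructor added as an assumption. VectorContainer and fill_from_list
are hypothetical, and the comma separated format is invented for the
example; this draft does not say how list values are stored.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Interface as listed above; the virtual destructor is an assumption.
class PspellMutableContainer {
public:
    virtual void insert(const char *) = 0;
    virtual void remove(const char *) = 0;
    virtual void clear() = 0;
    virtual ~PspellMutableContainer() {}
};

// Hypothetical container that keeps the items in a std::vector.
class VectorContainer : public PspellMutableContainer {
public:
    void insert(const char * s) override { items.push_back(s); }
    void remove(const char * s) override {
        // Erase every element equal to s.
        items.erase(std::remove(items.begin(), items.end(), std::string(s)),
                    items.end());
    }
    void clear() override { items.clear(); }

    std::vector<std::string> items;
};

// Hypothetical producer: splits a comma separated value and inserts each
// non-empty piece into the container.
void fill_from_list(const std::string & csv, PspellMutableContainer & out) {
    std::string::size_type start = 0;
    while (start <= csv.size()) {
        std::string::size_type comma = csv.find(',', start);
        if (comma == std::string::npos) comma = csv.size();
        if (comma > start)
            out.insert(csv.substr(start, comma - start).c_str());
        start = comma + 1;
    }
}
```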
5 Rationale
5.1 store_repl method
This method is needed because Aspell (http://aspell.sourceforge.net/)
is able to learn from users' misspellings. For example, on the first
pass a user misspells beginning as beging, so Aspell suggests:
begging, begin, being, Beijing, bagging, ....
However the user then tries "begning" and Aspell suggests
beginning, beaning, begging, ...
so the user selects beginning. However, later on in the document the
user misspells it as begng (NOT beging). Normally Aspell would
suggest:
began, begging, begin, begun, ....
However, because it knows the user misspelled beginning as beging, it
will instead suggest:
beginning, began, begging, begin, begun ...
I myself often misspelled beginning (and still do) as something close
to begging, and too many times wind up writing sentences such as
"begging with ....".
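The effect described in this section can be imitated with a toy model.
This is NOT how Aspell ranks suggestions; it is a deliberately simple
stand-in that remembers replacement pairs and, when a new misspelling
is within a Levenshtein distance of 2 of a remembered one, moves the
remembered correction to the front. ToyReplacementMemory, rerank, and
the threshold of 2 are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Plain Levenshtein distance using a single row of the DP table.
std::size_t edit_distance(const std::string & a, const std::string & b) {
    std::vector<std::size_t> row(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) row[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        std::size_t diag = row[0];  // value for (i-1, j-1)
        row[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t up = row[j];  // value for (i-1, j)
            std::size_t subst = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
            row[j] = std::min(std::min(up + 1, row[j - 1] + 1), subst);
            diag = up;
        }
    }
    return row[b.size()];
}

// Toy memory of (misspelling, correction) pairs.
class ToyReplacementMemory {
public:
    void store_repl(const std::string & mis, const std::string & cor) {
        pairs_[mis] = cor;
    }

    // Takes the checker's normal suggestions and promotes any remembered
    // correction whose misspelling is close to the new word.
    std::vector<std::string> rerank(const std::string & word,
                                    std::vector<std::string> base) const {
        for (const auto & p : pairs_) {
            if (edit_distance(word, p.first) > 2) continue;
            base.erase(std::remove(base.begin(), base.end(), p.second),
                       base.end());
            base.insert(base.begin(), p.second);
        }
        return base;
    }

private:
    std::map<std::string, std::string> pairs_;
};
```

With the pair (beging, beginning) stored, reranking the normal
suggestions for begng puts beginning first, which is the behavior the
example above describes.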
6 Feedback
As always feedback is most appreciated. I can be contacted at
address@hidden
7 Other Formats
This document is available in several other formats:
Format   Location
HTML     http://pspell.sourceforge.net/interface.html
Text     http://pspell.sourceforge.net/interface.txt
TEX      http://pspell.sourceforge.net/interface.tex
PS       http://pspell.sourceforge.net/interface.ps
Dvi      http://pspell.sourceforge.net/interface.dvi
LyX      http://pspell.sourceforge.net/interface.lyx
About this document ...
Portable Spell Checker Interface Library
The translation was initiated by Kevin Atkinson on 2000-03-05
----------------------------------------------------------------------
Kevin Atkinson 2000-03-05
---
Kevin Atkinson
address@hidden
http://metalab.unc.edu/kevina/