[aspell] Portable Spell Checker Interface Library (take 2)


From: Kevin Atkinson
Subject: [aspell] Portable Spell Checker Interface Library (take 2)
Date: Sun, 5 Mar 2000 04:44:35 -0500 (EST)

Here is a better description of my proposed generic spell checker library
for those who are interested.  

Feedback most welcome.


               Portable Spell Checker Interface Library               
                                                                      
                            Kevin Atkinson                            
                          address@hidden                           

                            March 5, 2000                             
                            (First Draft)                             


1 Goal

The goal of the library is to provide a generic interface to the spell
checker libraries installed on the system.

2 Overview

The Pspell library contains two main classes and several helper
classes. The two main classes are PspellConfig and PspellManager. The
PspellConfig class is used to set initial defaults and to change spell
checker specific options. The PspellManager class does most of the
real work. It is responsible for managing the dictionaries, checking
whether a word is in the dictionary, and coming up with suggestions,
among other things. There are many helper classes; the important ones
are PspellWordList, PspellMutableWordList, Pspell*Emulation, and
PspellObject. The PspellWordList class is used for accessing the
suggestion list, as well as the personal and session word lists
currently in use. The PspellMutableWordList class is used to manage
the personal, and perhaps other, word lists. The Pspell*Emulation
classes are used for iterating through a list. Finally, the
PspellObject class is simply a base class which most other classes
inherit from. It contains methods for cloning objects as well as
accessing error messages.

3 Usage

When your application first starts you should get a new configuration
class with the command:

    PspellConfig * spell_config = new_pspell_config();

which will create a new PspellConfig class. It is allocated with new
and it is your responsibility to delete it with delete (not free).
Once you have the config class you should set some variables. The most
important one is the language variable. To do so use the command:

    spell_config->replace("lang", "english");

which will set the default language to English. Other things you
might want to set are the preferred spell checker to use, the scratch
path for dictionaries, and the like.

Whenever a new document is created a new PspellManager class should
also be created. There should be one manager class per document. To
create a new manager class use the command:

    PspellManager * spell_checker = new_pspell_manager(spell_config);

which will create a new PspellManager class using the defaults found
in spell_config. If for some reason you want to use different defaults
simply clone spell_config and change the setting like so:

    PspellConfig * spell_config2 = spell_config->clone();
    spell_config2->replace("lang","dutch");
    PspellManager * spell_checker = new_pspell_manager(spell_config2);
     
    delete spell_config2;

Once the manager class is created you can use the check method to see
if a word in the document is correct like so:

    bool correct = spell_checker->check(<word>);

<word> can be any one of const char *, const unsigned short *, or
const unsigned int *. Strings of const char * are expected to use
iso8859-1 or some other 8-bit (256-character) character set as
determined by the current language in use. Strings of const unsigned
short * and const unsigned int * are expected to be in Unicode.
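
The three string types can be reduced to one internal form, because
the 256 characters of iso8859-1 have the same values as the first 256
Unicode code points. The following is only a sketch of that idea: the
to_code_points overloads are hypothetical helpers, not part of the
proposed interface, and the 16-bit overload assumes that no character
needs more than one 16-bit unit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helpers, not part of the proposed interface.
// Each overload turns a zero-terminated string into Unicode code points.

// iso8859-1 input: every byte value equals its Unicode code point.
// The cast through unsigned char avoids negative values where plain
// char is signed.
std::vector<unsigned int> to_code_points(const char * s)
{
  assert(s != nullptr);
  std::vector<unsigned int> out;
  for (; *s != '\0'; ++s)
    out.push_back(static_cast<unsigned char>(*s));
  return out;
}

// 16-bit input: assumes one unit per character (an assumption of
// this sketch), so each unit is simply widened.
std::vector<unsigned int> to_code_points(const unsigned short * s)
{
  assert(s != nullptr);
  std::vector<unsigned int> out;
  for (; *s != 0; ++s)
    out.push_back(*s);
  return out;
}

// 32-bit input: already one code point per element.
std::vector<unsigned int> to_code_points(const unsigned int * s)
{
  assert(s != nullptr);
  std::vector<unsigned int> out;
  for (; *s != 0; ++s)
    out.push_back(*s);
  return out;
}
```

With helpers like these, a method taking STRING could be written once
against code points and exposed through three thin overloads.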

If the word is not correct then the suggest method can be used to come
up with likely replacements:

    PspellWordList & suggestions = spell_checker->suggest(<word>);
    PspellStringEmulation * elements = suggestions.elements();
    const char * word;
    while ( (word = elements->next()) != NULL ) {
      // add word to the suggestion list
    }
    delete elements;

(It is also possible to access elements as const unsigned short *, or
const unsigned int *. See the class reference section for how to do
so.)

Once a replacement is made the store_repl method should be used to
communicate the replacement pair back to the spell checker (see
section 5.1 for why). Its usage is as follows:

    spell_checker->store_repl(<misspelled word>,
                              <correctly spelled word>);

If the user decides to add the word to the session or personal
dictionary, the word can be added using the add_to_session or
add_to_personal methods respectively like so:

    spell_checker->add_to_session(<word>);    // or
    spell_checker->add_to_personal(<word>);

It is better to let the spell checker manage these words rather than
doing it yourself so that the words have a chance of appearing in the
suggestion list.

Finally, when the document is closed the PspellManager class should be
deleted like so:

    delete spell_checker;

Do not use free as it is not allocated with malloc.
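
Because the objects are allocated with new, a C++ application can let
a smart pointer issue the matching delete. The following is a minimal
sketch of the "one manager per document" rule under that assumption.
Manager, new_manager, and Document are hypothetical stand-ins, not
Pspell classes; the instance counter exists only to make the effect
observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for PspellManager. It counts live instances
// so that the effect of delete can be observed.
struct Manager {
  static int live;
  Manager()  { ++live; }
  ~Manager() { --live; }
};
int Manager::live = 0;

// Stand-in for a factory that, like new_pspell_manager, returns an
// object allocated with new.
Manager * new_manager() { return new Manager; }

// One manager per document. The unique_ptr calls delete (never free)
// on the manager when the document is destroyed.
class Document {
public:
  Document() : checker_(new_manager()) { assert(checker_ != nullptr); }
  Manager & checker() { return *checker_; }
private:
  std::unique_ptr<Manager> checker_;
};
```

Closing a document then releases its manager without an explicit
delete in application code.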

4 Class Reference

Methods that return a bool generally return false on error and true
otherwise. To find out what went wrong use the error_num and
error_message methods. Unless otherwise stated, methods that return a
const char * will return null on error. The character string returned
is only valid until the next method which returns a const char * is
called.
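
The error convention can be sketched as follows. Object and
SessionList are hypothetical stand-ins, and both the error number and
the rule that an empty word is rejected are invented for illustration;
the draft does not define specific error codes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical base class mirroring error_num/error_message.
class Object {
public:
  Object() : error_num_(0) {}
  int error_num() const { return error_num_; }
  // The returned string is valid until the next error is recorded.
  const char * error_message() const { return error_message_.c_str(); }
protected:
  // Records the error and returns false so callers can write
  // "return fail(...)".
  bool fail(int num, const std::string & msg) {
    error_num_ = num;
    error_message_ = msg;
    return false;
  }
private:
  int error_num_;
  std::string error_message_;
};

// Hypothetical word list whose add method follows the bool convention.
class SessionList : public Object {
public:
  bool add(const char * word) {
    assert(word != nullptr);
    if (*word == '\0')
      return fail(1, "empty word");  // invented error code and rule
    words_.push_back(word);
    return true;
  }
private:
  std::vector<std::string> words_;
};
```

A caller tests the bool first and consults error_num and error_message
only when it is false.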

STRING is used to represent one of const char *, const unsigned short
*, or const unsigned int *.

4.1 PspellObject

PspellObject * clone() const

void assign(const PspellObject *)

If the two objects are not of the exact same type the behavior of the
assign method is undefined.

int error_num()

const char * error_message()

The string is valid until the next error.

Object()

4.2 PspellConfig

public PspellObject

The PspellConfig class is used to hold configuration information. It
has a set of keys which it will accept. Inserting or even trying to
look at a key that it does not know will produce an error. Extra
accepted keys can be added with the set_extra method.

void set_extra(const PspellKeyInfo * begin, const PspellKeyInfo * end)

const PspellKeyInfo * keyinfo(const char * key) const

PspellKeyInfoEmulation * possible_elements(bool include_extra = true)
const

const char * get_default(const char * key) const

PspellStringPairEmulation * elements() const

bool insert(const char * key, const char * value)

Insert will NOT overwrite an existing entry

bool replace(const char * key, const char * value)

bool remove(const char * key)

All the retrieve methods will

 1. return the default if the value is not set
 2. give an error if the key is not known
 3. give an error if the value is not in the right format

const char * retrieve (const char * key) const

const char * retrieve_list (const char * key) const

bool retrieve_list (const char * key, PspellMutableContainer &) const

int retrieve_bool(const char * key) const

return -1 on error, 0 if false, 1 if true

int retrieve_int(const char * key) const

return -1 on error

PspellConfig * new_pspell_config()

returns a new config class for setting things up before a manager
class is created
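
The difference between insert and replace, and the fallback to
defaults in the retrieve methods, can be sketched as follows. Config
is a hypothetical stand-in reduced to string and boolean keys. The key
names, the defaults, the "true"/"false" spelling of booleans, and the
choice to return false when insert finds an existing entry are all
assumptions of this sketch.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for PspellConfig.
class Config {
public:
  Config() {
    defaults_["lang"] = "english";       // invented default
    defaults_["ignore-case"] = "false";  // invented key
  }

  bool known(const char * key) const {
    assert(key != nullptr);
    return defaults_.count(key) != 0;
  }

  // Adds the entry only if the key is known and not already set.
  bool insert(const char * key, const char * value) {
    if (!known(key) || values_.count(key) != 0) return false;
    values_[key] = value;
    return true;
  }

  // Sets the entry, overwriting any existing value.
  bool replace(const char * key, const char * value) {
    if (!known(key)) return false;
    values_[key] = value;
    return true;
  }

  // Returns the set value, else the default, else null when the key
  // is unknown.
  const char * retrieve(const char * key) const {
    std::map<std::string, std::string>::const_iterator i = values_.find(key);
    if (i != values_.end()) return i->second.c_str();
    i = defaults_.find(key);
    return i != defaults_.end() ? i->second.c_str() : nullptr;
  }

  // Returns -1 on error, 0 if false, 1 if true.
  int retrieve_bool(const char * key) const {
    const char * v = retrieve(key);
    if (v == nullptr) return -1;
    if (std::strcmp(v, "true") == 0) return 1;
    if (std::strcmp(v, "false") == 0) return 0;
    return -1;  // value is not in the right format
  }

private:
  std::map<std::string, std::string> defaults_;
  std::map<std::string, std::string> values_;
};
```

In this sketch a plain copy of Config plays the role of clone, since
the class holds only value types.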

4.3 PspellManager

public PspellObject

This class is responsible for keeping track of the dictionaries,
coming up with suggestions, and the like. Its methods are NOT meant to
be used by multiple threads and/or documents.

Almost all of the manipulation of options is done via the Config
class, thus this class has precious few methods.

PspellConfig & config()
const PspellConfig & config() const

The config returned is NOT the same object as the one you pass in.

const char * lang_name() const

bool check(STRING) const

bool add_to_personal(STRING)

bool add_to_session(STRING)

PspellWordList & master_word_list() const
PspellWordList & personal_word_list() const
PspellWordList & session_word_list() const

Because the word lists may potentially have to convert from
non-Unicode to Unicode or vice versa, the pointer returned by the
emulation is only valid until the next call.

bool save_all_wls()

void clear_session()

PspellWordList & suggest(STRING)

The suggestion list and the elements in it are only valid until the
next call to suggest.

bool store_repl(STRING mis, STRING cor)

PspellManager * new_pspell_manager(const PspellConfig * config)

Returns a new manager class, allocated with new, based on the settings
in config.

4.4 PspellWordList

public PspellObject

bool empty() const

int size() const

PspellStringEmulation * elements() const

PspellShortUniStringEmulation * short_uni_elements() const

PspellUniStringEmulation * uni_elements() const

4.5 PspellMutableWordList

public PspellWordList

bool add(STRING)

bool clear_all()

bool save()

PspellMutableWordList * new_pspell_personal_word_list(PspellConfig *)

returns a new personal word list so that you can manage it

4.6 Pspell*Emulation

public PspellObject

All emulations have the following two methods.

<type> next()

bool at_end() const

where <type> is specific to the particular emulation, as given by the
following table



        Name                           Type                           
        PspellStringEmulation          const char *                   
        PspellShortUniStringEmulation  const unsigned short *         
        PspellUniStringEmulation       const unsigned int *           
        PspellKeyInfoEmulation         PspellKeyInfo *                
        PspellStringPairEmulation      PspellStringPair               
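
The next/at_end protocol for the string case can be sketched as
follows. StringEmulation is a hypothetical stand-in backed by a vector,
and collect is an invented helper. Since pointers handed out by a real
emulation may only stay valid until the next call, the helper copies
each word immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for PspellStringEmulation: walks a list of
// words and returns null once the list is exhausted.
class StringEmulation {
public:
  explicit StringEmulation(const std::vector<std::string> & words)
    : words_(words), pos_(0) {}

  // Returns the next word, or null at the end of the list.
  const char * next() {
    if (at_end()) return nullptr;
    return words_[pos_++].c_str();
  }

  bool at_end() const { return pos_ == words_.size(); }

private:
  std::vector<std::string> words_;
  std::size_t pos_;
};

// Drains an emulation, copying each word right away so the result
// does not depend on how long the returned pointers stay valid.
std::vector<std::string> collect(StringEmulation & e) {
  std::vector<std::string> out;
  const char * word;
  while ((word = e.next()) != nullptr)
    out.push_back(word);
  assert(e.at_end());
  return out;
}
```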




4.7 Other minor classes.

    class PspellMutableContainer {
    public:
      virtual void insert(const char *) = 0;
      virtual void remove(const char *) = 0;
      virtual void clear() = 0;
      PspellMutableContainer();
    };
     
    enum PspellKeyInfoType {Bool, String, Int, List};
     
    struct PspellKeyInfo {
      const char * name;
      PspellKeyInfoType  type;
      const char * def;
      const char * desc; // null if internal value
    };
     
    struct PspellStringPair {
      const char * first;
      const char * second;
    };
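
An application would implement PspellMutableContainer to receive the
values of a list-valued key. The following is a sketch only: the
interface is repeated so the example is self-contained, with a virtual
destructor added as an assumption. StringSetContainer and
fill_from_list are hypothetical, and the comma-separated list format
is invented for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Interface as given in the draft, with a virtual destructor added
// here as an assumption so subclasses can be destroyed safely.
class PspellMutableContainer {
public:
  virtual void insert(const char *) = 0;
  virtual void remove(const char *) = 0;
  virtual void clear() = 0;
  virtual ~PspellMutableContainer() {}
};

// Hypothetical adapter that collects list values into a std::set.
class StringSetContainer : public PspellMutableContainer {
public:
  void insert(const char * s) override { assert(s != nullptr); items.insert(s); }
  void remove(const char * s) override { assert(s != nullptr); items.erase(s); }
  void clear() override { items.clear(); }
  std::set<std::string> items;
};

// Stand-in for a list-valued retrieve: empties the container, then
// inserts each non-empty field of a comma-separated value.
void fill_from_list(const std::string & value, PspellMutableContainer & out) {
  out.clear();
  std::string::size_type start = 0;
  while (start <= value.size()) {
    std::string::size_type end = value.find(',', start);
    if (end == std::string::npos) end = value.size();
    if (end > start)
      out.insert(value.substr(start, end - start).c_str());
    start = end + 1;
  }
}
```

Keeping the container abstract lets the library hand back list values
without committing the application to one collection type.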

5 Rationale


5.1 store_repl method

This method is needed because Aspell (http://aspell.sourceforge.net/)
is able to learn from users' misspellings. For example, on the first
pass a user misspells beginning as beging, so aspell suggests:

    begging, begin, being, Beijing, bagging, ....

However, the user then tries "begning" and aspell suggests:

    beginning, beaning, begging, ...

so the user selects beginning. However, later on in the document the
user misspells it as begng (NOT beging). Normally aspell will
suggest:

    began, begging, begin, begun, ....

However, because it knows the user misspelled beginning as beging it
will instead suggest:

    beginning, began, begging, begin, begun ...
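
The effect described above can be sketched as follows. This is not
Aspell's actual algorithm, which this document does not describe.
ReplacementMemory and its rerank method are hypothetical: they store
replacement pairs and move a learned correction to the front of the
list when the new misspelling is within a small edit distance of one
seen before.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein edit distance (insertions, deletions, substitutions).
std::size_t edit_distance(const std::string & a, const std::string & b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min(subst, std::min(prev[j], cur[j - 1]) + 1);
    }
    prev.swap(cur);  // prev now holds the row just computed
  }
  return prev[b.size()];
}

// Hypothetical replacement memory, for illustration only.
class ReplacementMemory {
public:
  void store_repl(const std::string & mis, const std::string & cor) {
    assert(!cor.empty());
    pairs_.push_back(std::make_pair(mis, cor));
  }

  // Returns the suggestions with any learned correction moved to the
  // front, provided `word` is within max_dist edits of a stored
  // misspelling.
  std::vector<std::string> rerank(const std::string & word,
                                  std::vector<std::string> suggestions,
                                  std::size_t max_dist = 1) const {
    for (std::size_t i = 0; i < pairs_.size(); ++i) {
      if (edit_distance(word, pairs_[i].first) > max_dist) continue;
      const std::string & cor = pairs_[i].second;
      suggestions.erase(
        std::remove(suggestions.begin(), suggestions.end(), cor),
        suggestions.end());
      suggestions.insert(suggestions.begin(), cor);
    }
    return suggestions;
  }

private:
  std::vector<std::pair<std::string, std::string> > pairs_;
};
```

Here begng is one edit away from beging, so once the pair (beging,
beginning) has been stored, beginning is placed ahead of the ordinary
suggestions for begng.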

I myself often misspelled beginning (and still do) as something close
to begging and too many times wind up writing sentences such as
"begging with ....".

6 Feedback

As always feedback is most appreciated. I can be contacted at
address@hidden

7 Other Formats

This document is available in several other formats:



         Format  Location                                             
         HTML    http://pspell.sourceforge.net/interface.html         
         Text    http://pspell.sourceforge.net/interface.txt          
         TEX     http://pspell.sourceforge.net/interface.tex          
         PS      http://pspell.sourceforge.net/interface.ps           
         Dvi     http://pspell.sourceforge.net/interface.dvi          
         LyX     http://pspell.sourceforge.net/interface.lyx          




---
Kevin Atkinson
address@hidden
http://metalab.unc.edu/kevina/



