|Subject:||[aspell-devel] Persian (Farsi) spell checking|
|Date:||Thu, 11 Aug 2005 15:06:45 +0430|
Along with the development of a Farsi dictionary for aspell, we have developed a special utility that we would like to add to the aspell source files. This email introduces the topic and asks for your initial suggestions.
As I tried to explain in my earlier emails on the topic, we believe that spelling tools for the Persian language must take some morphological rules into account, because Persian builds many words from a single stem. For this reason, no spelling tool for Persian can ignore these rules. There are two ways in which these rules can come into the picture.
1. The morphological rules are applied while building the dictionary file. In this approach, the dictionary is still just a database of correct words, but special language-aware tools help to build it. These tools work as expanders: they accept a single stem and produce the many forms that can be built from that stem.
2. The morphological rules are applied on the fly while the actual spell checking is done. In this approach, the dictionary is more like a lexicon that contains only the stems. During spell checking, each word is analyzed with a stemming tool, and it is accepted as correct if it is built correctly by a morphological rule and its stem is found in the lexicon.
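To make the second approach concrete, here is a minimal sketch in Python, not code from our utility. The suffix list, the transliterated stems, and the function name `is_correct` are all assumptions chosen for illustration; a real Persian analyzer would need a much richer rule set.

```python
# Hypothetical, transliterated suffixes used only for illustration
# (e.g. "ha" as a plural ending); this is not the real rule set.
SUFFIXES = ["ha", "i"]

# The lexicon holds stems only, as in approach 2.
LEXICON = {"ketab", "deraxt"}


def is_correct(word, lexicon=LEXICON, suffixes=SUFFIXES):
    """Accept a word if it is a stem, or a known stem plus one known suffix."""
    if word in lexicon:
        return True
    for suffix in suffixes:
        # Strip the suffix and look the remaining stem up in the lexicon.
        if word.endswith(suffix) and word[:-len(suffix)] in lexicon:
            return True
    return False
```

With these assumed rules, `is_correct("ketabha")` is accepted because "ketab" is in the lexicon and "ha" is a known suffix, while `is_correct("ketabx")` is rejected.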
Based on these considerations, we planned to add support for morphological rules to aspell in two separate steps, one for each of the methods above. We felt that the amount of source code and changes required by the second method might be more than you expect, so we decided to start with the first method. We have therefore developed a special word expander that accepts Persian (Farsi) stems annotated with controlling flags. Based on these flags, the expander expands the stems into a standard dictionary file.
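The idea of a flag-driven expander can be sketched roughly as follows. This is only an illustration under stated assumptions: the `stem/FLAGS` entry format, the flag letters `P` and `I`, the transliterated suffixes, and the function names are hypothetical and do not describe the actual input format or rules of our tool.

```python
# Hypothetical flag table: each flag names the suffixes it allows on a stem.
FLAG_RULES = {
    "P": ["ha"],  # assumed plural flag
    "I": ["i"],   # assumed indefinite flag
}


def expand_entry(entry, rules=FLAG_RULES):
    """Expand one 'stem/FLAGS' entry into the stem plus every allowed form."""
    stem, _, flags = entry.partition("/")
    forms = [stem]
    for flag in flags:
        # Unknown flags are ignored in this sketch.
        for suffix in rules.get(flag, []):
            forms.append(stem + suffix)
    return forms


def expand_word_list(entries, rules=FLAG_RULES):
    """Expand all entries into a sorted, duplicate-free plain word list."""
    words = []
    for entry in entries:
        entry = entry.strip()
        if entry:  # skip blank lines
            words.extend(expand_entry(entry, rules))
    return sorted(set(words))
```

For example, under these assumptions the entry `ketab/PI` expands to `ketab`, `ketabha`, and `ketabi`, and the resulting plain word list could then be fed to the usual dictionary-building step.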
Now we are concerned with how this code can be added to the standard aspell code. We would like the patch to be added to aspell so that it extends the capabilities of the IMPORT switch.
Please let us know what you think about all of this.
Gostareh Negar & Pariansoft
Tel: +98 (21) 8731192
Fax: +98(21) 8731191