Re: [Aramorph-users] tagging
From: Pierrick Brihaye
Subject: Re: [Aramorph-users] tagging
Date: Thu, 11 May 2006 14:54:10 +0200
User-agent: Thunderbird 1.5.0.2 (Windows/20060308)
Hi,
[message switched from HTML to plain text]
yousef Elarian wrote:
> We need a tagger to tag the output (stemmed by AraMorph's Analyzer and
> Stemmer) of the words that Buckwalter's analyzer couldn't analyze to add
> them to the database. Any suggestions?
Sure. Such stems are already tagged with the NO_RESULT token type, as
explained here:
http://www.nongnu.org/aramorph/english/lucene.html
Many of these unanalyzed words are common typos that can easily be
eliminated by normalizing the remaining ALEF variants to bare ALEF.
You should consider working on the feedAlternativeSpellings() method.
See
http://cvs.savannah.nongnu.org/viewcvs/aramorph/src/java/gpl/pierrick/brihaye/aramorph/AraMorph.java?annotate=1.9&root=aramorph
(lines 562 ff.).
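
To make the ALEF normalization idea concrete, here is a minimal sketch. It is
not the code of feedAlternativeSpellings(); the class AlefNormalizer and its
method toBareAlef() are hypothetical names made up for illustration, and the
sketch assumes the input is plain Unicode Arabic rather than whatever internal
representation AraMorph uses at that point.

```java
// Hypothetical helper, not part of AraMorph: maps ALEF variants to bare ALEF.
final class AlefNormalizer {

    // ARABIC LETTER ALEF (bare alef)
    private static final char BARE_ALEF = '\u0627';

    private AlefNormalizer() {
        // static utility, no instances
    }

    /**
     * Returns a copy of the word in which every ALEF variant handled below
     * is replaced by bare ALEF. All other characters are left untouched.
     */
    static String toBareAlef(String word) {
        StringBuilder out = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            switch (c) {
                case '\u0622': // ALEF WITH MADDA ABOVE
                case '\u0623': // ALEF WITH HAMZA ABOVE
                case '\u0625': // ALEF WITH HAMZA BELOW
                case '\u0671': // ALEF WASLA
                    out.append(BARE_ALEF);
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

A word that came back as NO_RESULT could be passed through
AlefNormalizer.toBareAlef() and looked up again, so that a hamza typo on the
ALEF no longer prevents a match. Whether that retry belongs inside
feedAlternativeSpellings() or in front of it is a design choice this sketch
does not settle.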
Cheers,
p.b.