Re: [Aspell-user] Hyphens and apostrophes in words
From: Lars Aronsson
Subject: Re: [Aspell-user] Hyphens and apostrophes in words
Date: Tue, 21 May 2013 22:40:27 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130510 Thunderbird/17.0.6
On 05/21/2013 04:55 PM, Kevin Atkinson wrote:
> Aspell does not support what I think you are after very well. That is
> why it is not enabled by default in the English dictionary. For some
> insight on why see:
> http://aspell.net/man-html/Words-With-Symbols-in-Them.html
I think that you started out with the goal of making a
spell checker. For that you need tokenization, so you
need to distinguish letters, spaces and punctuation.
But for grammar checking or translation, you need to be
able to find phrases and patterns, not just tokens.
Perhaps the basic layer of a larger, more generic natural
language processing library should have a matching
algorithm that isn't based on tokenization.
I have no idea if such libraries exist or what the current
state of the art is. But I think it's natural for a dictionary
to contain both words (cat, dog) and phrases (kill two
birds with one stone), so that "kill two dogs with one
stone" will generate a warning that perhaps the user
meant to write "birds" there. That would certainly be
a different type of application from Aspell, but perhaps
a more useful one.
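
To make the phrase idea concrete, here is a minimal sketch in Python.
Everything in it is an assumption for illustration: the PHRASES list, the
near_miss_phrases function and the "differs by exactly one word" rule are
hypothetical and have nothing to do with how Aspell works. For brevity it
still splits the text into words, so it does not show the tokenization-free
matching I speculate about above.

```python
import re

# Hypothetical phrase dictionary; a real one would hold many phrases
# alongside the ordinary single-word entries.
PHRASES = ["kill two birds with one stone"]


def near_miss_phrases(text, phrases=PHRASES):
    """Return (found, expected, phrase) for each stretch of text that
    differs from a known phrase in exactly one word."""
    # Crude word splitting, only good enough for this illustration.
    words = re.findall(r"[a-z']+", text.lower())
    warnings = []
    for phrase in phrases:
        target = phrase.split()
        n = len(target)
        # Slide a window of the phrase's length over the text.
        for i in range(len(words) - n + 1):
            window = words[i:i + n]
            diffs = [(w, t) for w, t in zip(window, target) if w != t]
            if len(diffs) == 1:
                found, expected = diffs[0]
                warnings.append((found, expected, phrase))
    return warnings
```

Called on "You can kill two dogs with one stone.", this reports that "dogs"
stands where the phrase has "birds". The correctly written phrase produces
no warning.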
--
Lars Aronsson (address@hidden)
Aronsson Datateknik - http://aronsson.se