Re: Splitting text into words and non-words
From: Kevin Atkinson
Subject: Re: Splitting text into words and non-words
Date: Sat, 02 Jan 1999 23:42:08 +0000
Asger Alstrup Nielsen wrote:
> > And you can't always count words between two punctuation characters as
> > code because then a string like
> >
> > . Howevr,
> >
> > would mark Howevr as code when it is clearly part of a sentence.
>
> No, since punctuation is dropped, this would be spellchecked.
Sorry, I meant to say: And treating punctuation characters as code symbols
won't work either...
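
To make the ". Howevr," case concrete, here is a minimal sketch in Python.
It is not taken from any actual spellchecker; the function name and the
exact rule (a word counts as code when the nearest non-space character on
each side is punctuation) are assumptions made only for illustration.

```python
import re
import string

PUNCT = set(string.punctuation)

def words_flagged_as_code(text):
    """Hypothetical naive rule: a word is treated as code when the nearest
    non-space character on each side of it is a punctuation character."""
    flagged = []
    for m in re.finditer(r"[A-Za-z]+", text):
        before = text[:m.start()].rstrip()  # everything left of the word
        after = text[m.end():].lstrip()     # everything right of the word
        if before and after and before[-1] in PUNCT and after[0] in PUNCT:
            flagged.append(m.group())
    return flagged

# The misspelled word sits between "." and ",", so the naive rule skips it.
print(words_flagged_as_code("It works. Howevr, it is slow."))  # ['Howevr']
```

Under this rule the misspelling is never handed to the spellchecker, even
though it is plainly part of a sentence.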
>
>
> My approach to the problem was that I wanted the spellchecker to only throw
> away "words" that are known to be non-words, if possible, and keep others that
> we are not sure of. But mostly, I just wanted to give you some feedback
> because you asked for it. Of course it is possible to do better. It seems
> your approach is suitable for that.
Sorry if I belittled you; that was not my intention. I just meant to
suggest an alternate approach.
And thanks for the feedback.
--
Kevin Atkinson
address@hidden
http://metalab.unc.edu/kevina/