aspell-user

Re: [Aspell-user] Hyphens and apostrophes in words


From: Ciarán Ó Duibhín
Subject: Re: [Aspell-user] Hyphens and apostrophes in words
Date: Sun, 19 May 2013 15:56:52 +0100

Thanks for replying, Carlo.

I will repeat my tests, beginning with problem no. 2. Remember that I am using aspell 0.50.3 for Windows.

I open a command prompt window, go to the folder C:\Program Files\Aspell\data, and in "en.dat" change the line "special ' -*-" to "special ' ***".
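
For reference, the three flags after the character in a "special" line stand for begin, middle and end of a word, with "*" meaning the character is allowed in that position and "-" meaning it is not. The Python sketch below is only a simplified model of that rule, not aspell's own tokenizer; the function names parse_special and trim_token are hypothetical and exist only for this illustration.

```python
def parse_special(line):
    """Parse a line such as "special ' -*-" into {char: (begin, middle, end)}.

    Each flag is True when the character is allowed in that position.
    """
    fields = line.split()
    if not fields or fields[0] != "special":
        raise ValueError("not a 'special' line: %r" % line)
    rest = fields[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected <char> <flags> pairs: %r" % line)
    rules = {}
    for char, flags in zip(rest[0::2], rest[1::2]):
        if len(char) != 1 or len(flags) != 3 or set(flags) - set("-*"):
            raise ValueError("bad entry: %r %r" % (char, flags))
        rules[char] = tuple(flag == "*" for flag in flags)
    return rules


def trim_token(token, rules):
    """Drop leading/trailing special characters that are not allowed there.

    Assumption: this mimics only the begin/end flags; real tokenizers
    apply further conditions that are not modelled here.
    """
    start, end = 0, len(token)
    while start < end and token[start] in rules and not rules[token[start]][0]:
        start += 1
    while end > start and token[end - 1] in rules and not rules[token[end - 1]][2]:
        end -= 1
    return token[start:end]


if __name__ == "__main__":
    for line in ("special ' -*-", "special ' ***"):
        print(line, "->", trim_token("'twas", parse_special(line)))
```

Under this model, the original setting reduces 'twas to twas before lookup, while the changed setting keeps the leading apostrophe, which is the behaviour the edit to "en.dat" is meant to produce.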

I put the words "some odd words 'twas more" in a text file "variant.dat" and run "aspell --lang en create master ./variant < variant.dat" followed by "copy variant. ..\dict\variant".

I put some test text into "testaspell.txt": some odd words xtekt 'twas more

Now I run "aspell check --master variant testaspell.txt" and two words are queried: xtekt and 'twas. For 'twas, the suggested replacement is 'twas !

This is the same result I got before. Have I left something out? Would this work differently on other systems or versions?

Ciarán Ó Duibhín.

----- Original Message ----- From: "Carlo Traverso" <address@hidden>
To: <address@hidden>
Cc: <address@hidden>
Sent: Sunday, May 19, 2013 11:19 AM
Subject: Re: [Aspell-user] Hyphens and apostrophes in words



"ciaran" == Ciarán Ó Duibhín writes:

   ciaran> I'd like to know which, if any, spellcheckers can be
   ciaran> configured to act like this.  (The examples are from
   ciaran> English but the real need comes from other languages.)
   ciaran> Asking here about aspell particularly, of course.

   ciaran> First, if necessary, allow the dictionary to contain words
   ciaran> with apostrophe "'" and hyphen "-" in any position. (I am
   ciaran> aware of the side-effects of this and am not worried by
   ciaran> them.)

   ciaran> Now, when checking text:

   ciaran> 1. Accept a word containing a hyphen if EITHER the
   ciaran> dictionary contains the whole word including the hyphen
   ciaran> ("hotch-potch") OR if the dictionary contains both parts
   ciaran> separately ("half-moon").

   ciaran> 2. With a dictionary containing "'twas" but not "twas",
   ciaran> accept "'twas".

   ciaran> 3. With a dictionary containing "well" but not "'well",
   ciaran> do not accept "'well".

aspell can do 2 and 3, but you have to recompile the English
dictionary after changing the handling of ' in the .dat file and, of
course, add the acceptable words; this is the aspell way to handle
your "First" point.

For 1, you should modify the .dat file again to allow - in the middle
of a word, add the compound words, and run the spell-checker twice:
once with the modified dictionary (to accept the words with -) and
once with the original one (or rather the one modified in the first
step) to accept the two components. The first pass will reject the
hyphenated words that are not in the dictionary; the second pass will
split them into their components and check those.
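
The logic of the two passes can be modelled in a few lines of Python. This is only a sketch of the idea using plain sets as stand-ins for the two compiled dictionaries; it does not call aspell, and the name two_pass_check and both parameter names are hypothetical.

```python
def two_pass_check(words, hyphen_dict, base_dict):
    """Return the words that neither pass accepts.

    hyphen_dict: stand-in for the dictionary built with - allowed inside
                 words (contains whole compounds such as "hotch-potch").
    base_dict:   stand-in for the dictionary where - separates words.
    Lookups are case-sensitive in this simplified model.
    """
    rejected = []
    for word in words:
        # Pass 1: the hyphen is part of the word, so look up the whole token.
        if word in hyphen_dict:
            continue
        # Pass 2: the hyphen is a separator, so every component must be known.
        parts = word.split("-")
        if all(part and part in base_dict for part in parts):
            continue
        rejected.append(word)
    return rejected


if __name__ == "__main__":
    base = {"half", "moon"}
    with_hyphens = base | {"hotch-potch"}
    text = ["hotch-potch", "half-moon", "hotch-moon"]
    print(two_pass_check(text, with_hyphens, base))
```

Here "hotch-potch" is accepted by the first pass, "half-moon" by the second, and "hotch-moon" by neither, since "hotch" is not a word on its own.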

I don't think it is possible to do this in one pass by combining the
two dictionaries in one .multi file, since the .dat files have to be
different (and hence the word tokens will be different).

Carlo Traverso



