aspell-user

Re: [aspell] Affix or something else


From: Trond Eivind Glomsrød
Subject: Re: [aspell] Affix or something else
Date: 31 Jan 2001 11:07:30 -0500
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.0.96

Kevin Atkinson <address@hidden> writes:

> Subject: [aspell] The Deal on Affix Compression
> Date: Fri, 12 Mar 1999 19:39:29 -0500
> 
> I realize that affix compression is important for languages with a
> lot of affixes; however, it is not vital.  The reason is that
> without affix compression all you have to do is list all of the
> possible combinations.  I realize that this wastes space; however, it
> is doable.
> 
> For example the word list that comes with Aspell has
>   70,598 words
> After running it through the munchlist script it has
>   30,953 words
> Which leads to a ratio of
>   2.3
> 
> Now a Polish word list has the numbers
>  1,041,430 words
>    146,626 words after munchlist
>   7.1 ratio
> 
> Which means that for Polish, affix compression saves about 3.1
> times more space than it does for the English dictionary.  Not that
> big of a deal.
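
The ratios quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the word counts are the ones given in the quoted message, while the function and variable names are illustrative and not part of Aspell.

```python
# Word counts taken from the quoted message (before and after munchlist).
english_full, english_munched = 70_598, 30_953
polish_full, polish_munched = 1_041_430, 146_626


def compression_ratio(full, munched):
    """Expanded word count divided by the affix-compressed word count."""
    return full / munched


english_ratio = compression_ratio(english_full, english_munched)
polish_ratio = compression_ratio(polish_full, polish_munched)

# How much more Polish gains from affix compression than English does.
relative_gain = polish_ratio / english_ratio

print(f"English: {english_ratio:.1f}")
print(f"Polish:  {polish_ratio:.1f}")
print(f"Polish vs English: {relative_gain:.1f}x")
```

Rounded to one decimal place, the three values match the 2.3, 7.1 and 3.1 figures in the quoted message.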


You're downplaying the significance of it:

address@hidden i386]# ls -l /usr/lib/aspell/english*
-rw-r--r--    1 root     root      2424832 nov 30 19:09 /usr/lib/aspell/english-lrg-only
-rw-r--r--    1 root     root      2355200 nov 30 19:09 /usr/lib/aspell/english-med-only
lrwxrwxrwx    1 root     root           14 jan 26 11:39 /usr/lib/aspell/english.multi -> american.multi
address@hidden i386]# ls -l /usr/lib/aspell/polish  
-rw-r--r--    1 root     root     35622912 aug 20 11:52 /usr/lib/aspell/polish
address@hidden i386]# ls -l /usr/lib/aspell/czech 
-rw-r--r--    1 root     root     64434176 aug 20 11:26 /usr/lib/aspell/czech
address@hidden i386]#

This size has made us leave multiple languages out for the time
being: Polish, Czech, and Esperanto.
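
To put the installed sizes in the listing above in proportion, here is a rough Python sketch comparing the Polish and Czech files with the largest English file shown. The byte counts are copied from the `ls -l` output; using english-lrg-only as the baseline is an assumption made for illustration, since the English dictionary is split across several files.

```python
# Installed sizes in bytes, copied from the ls -l output above.
sizes = {
    "english-lrg-only": 2_424_832,
    "polish": 35_622_912,
    "czech": 64_434_176,
}

# Assumption: the largest English file shown serves as the baseline.
baseline = sizes["english-lrg-only"]

# Size of each file relative to the baseline.
relative = {name: size / baseline for name, size in sizes.items()}

for name, factor in relative.items():
    mib = sizes[name] / 2**20
    print(f"{name:18s} {mib:6.1f} MiB  {factor:5.1f}x")
```

On these numbers the Polish file is more than 14 times the size of the English one, and the Czech file more than 26 times.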

Here are the compressed sizes:

address@hidden i386]# ls -l aspell-*
-rw-r--r--    3 root     root      3271749 aug 30 18:13 aspell-0.32.5-1.i386.rpm
-rw-r--r--   92 root     root      3597773 aug 30 18:13 aspell-ca-0.1-6.i386.rpm
-rw-r--r--    3 root     root     30218438 aug 30 18:13 aspell-cs-0.2-3.i386.rpm
-rw-r--r--   92 root     root      5911215 aug 30 18:13 aspell-da-0.2-3.i386.rpm
-rw-r--r--   92 root     root      4535531 aug 30 18:13 aspell-de-0.1.1-7.i386.rpm
-rw-r--r--    3 root     root       600163 aug 30 18:13 aspell-devel-0.32.5-1.i386.rpm
-rw-r--r--    3 root     root        52426 aug 30 18:13 aspell-en-ca-0.32.5-1.i386.rpm
-rw-r--r--    3 root     root        52486 aug 30 18:13 aspell-en-gb-0.32.5-1.i386.rpm
-rw-r--r--    3 root     root     10192261 aug 30 18:14 aspell-eo-0.1-6.i386.rpm
-rw-r--r--    3 root     root      7324582 aug 30 18:14 aspell-es-0.1-8.i386.rpm
-rw-r--r--   92 root     root      3059498 aug 30 18:14 aspell-fr-0.3-6.i386.rpm
-rw-r--r--   92 root     root       728573 aug 30 18:14 aspell-it-0.1-6.i386.rpm
-rw-r--r--   92 root     root      3346648 aug 30 18:14 aspell-nl-0.1-6.i386.rpm
-rw-r--r--   92 root     root      5261253 aug 30 18:14 aspell-no-0.1-8.i386.rpm
-rw-r--r--    3 root     root     16603359 aug 30 18:14 aspell-pl-0.1-6.i386.rpm
-rw-r--r--   92 root     root      1903164 aug 30 18:14 aspell-sv-0.1-8.i386.rpm
address@hidden i386]#


-- 
Trond Eivind Glomsrød
Red Hat, Inc.
