[aspell-devel] aspell 0.60 prezip.c & compress.c improvements


From: Jose Da Silva
Subject: [aspell-devel] aspell 0.60 prezip.c & compress.c improvements
Date: Fri, 17 Sep 2004 17:41:03 -0700
User-agent: KMail/1.6.1

Hi,
Please accept these additional fixes; an explanation of each follows.
Thanks.

---1--prezip.c---
Speed improvement:
There is no need to test w[l] != '\0' once p[l] != '\0' has been tested, 
because the next test is p[l] == w[l]: if p[l] is non-zero and equal to w[l], 
then w[l] is necessarily non-zero as well.
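
The redundancy can be seen in a minimal sketch of a common-prefix loop. The
function name common_prefix_len and its signature are hypothetical, chosen for
illustration; only the names p, w and l follow the description above, and this
is not the code from prezip.c itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from prezip.c: returns the length of the
 * common prefix of the previous word p and the current word w. */
static size_t common_prefix_len(const char *p, const char *w)
{
    size_t l = 0;

    assert(p != NULL && w != NULL);

    /* Testing w[l] != '\0' as well would be redundant: when p[l] is
     * non-zero and p[l] == w[l] holds, w[l] is non-zero too.  If w ends
     * first, w[l] is '\0' while p[l] is not, so the equality test fails
     * and the loop stops without reading past the end of w. */
    while (p[l] != '\0' && p[l] == w[l])
        ++l;

    return l;
}
```

For example, common_prefix_len("apple", "apply") yields 4 with or without the
extra test, so dropping it only saves one comparison per character.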

---2--prezip.c---
An fflush is needed to flush out the remaining binary output, since there is 
no trailing '\n'.

---3--prezip.c---
Autosensing and removal of a trailing CR:
Currently, compress is immune to the carriage-return/linefeed differences 
between DOS-based text lists and Linux/Mac/other text lists, but prezip 
processes the carriage return as part of the word. If you mix lists, DOS-based 
versions of prezip are therefore going to create error-filled or dirty text 
lists.
The added code should take care of input from mixed DOS-based and 
non-DOS-based lists, while still being able to work on wordlists that use an 
internal CR as a valid character.
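
One way to drop a DOS line ending while keeping an internal CR is to look only
at the last character of each line once the '\n' has been removed. The sketch
below is an assumption about the general approach, not the code in the attached
patch; strip_trailing_cr is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the attached prezip.c: removes a
 * single trailing CR from a word of length len (the '\n' is assumed to be
 * stripped already) and returns the new length.  A CR anywhere else in the
 * word is left untouched. */
static size_t strip_trailing_cr(char *word, size_t len)
{
    assert(word != NULL);

    if (len > 0 && word[len - 1] == '\r') {
        --len;
        word[len] = '\0';
    }
    return len;
}
```

With this shape, "word\r" becomes "word", while a word such as "wo\rrd" keeps
its internal CR.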

Test samples: one file with CR and one without CR; both produce 60-byte files:
prezip -z <q_cr.txt >q1.pwl
prezip -z <q_no_cr.txt >q2.pwl

diff q1.pwl q2.pwl   = no differences = what you want  :-)

prezip -d <q1.pwl >q.txt    = 84 bytes for Linux-based aspell
prezip -d <q1.pwl >q.txt    = 91 bytes for DOS-based aspell


---1--compress.c---
compress.c needs an additional fix:
        #define BUFSIZE 256

should become something like:
        /* BUFSIZE must be 256 to work correctly */
        #define BUFSIZE 256

...so that later modifications don't set BUFSIZE to anything other than 256, 
which would introduce potential errors:
(1) a value higher than 256 will introduce an error when compressing words 
longer than 255 characters, since the length is encoded as 1 char = {0...255}
(2) a value lower than 256 will be a potential problem when uncompressing 
long words; for example, BUFSIZE = 10 will fail on a word that is 20 chars 
long.
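
The limit in (1) comes down to how much a single byte can hold. The sketch
below assumes an 8-bit unsigned char; encode_len is a hypothetical name and
does not reproduce the actual on-disk format of compress.c. The #if guard is
one possible way, in addition to the comment, to make the build fail if
BUFSIZE is ever changed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BUFSIZE 256

/* Optional compile-time guard (an illustration, not part of the fix
 * proposed above): refuse to build with any other value. */
#if BUFSIZE != 256
#error "BUFSIZE must be 256 to work correctly"
#endif

/* Hypothetical helper: store a length in one byte.  The assert documents
 * the limit; without it, a length above 255 would silently wrap modulo 256
 * on an 8-bit char (for example, 300 would be stored as 44). */
static unsigned char encode_len(size_t n)
{
    assert(n <= UCHAR_MAX);
    return (unsigned char)n;
}
```

A buffer of 256 bytes holds a 255-character word plus its terminating '\0',
which matches the largest length a single byte can encode.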

---other, misc.---
The Canadian English spelling list is missing "blonde".

Attachment: q_no_cr.txt
Description: Text document

Attachment: q_cr.txt
Description: Text document

Attachment: diff_prezip.txt
Description: Text document

Attachment: prezip.c
Description: Text Data

