Re: expand/unexpand: add tests, refactor common code
From: Assaf Gordon
Subject: Re: expand/unexpand: add tests, refactor common code
Date: Sat, 16 Jul 2016 21:52:08 -0400
Hello,
> On Jun 27, 2016, at 06:56, Pádraig Brady <address@hidden> wrote:
>
> On 27/06/16 06:17, Assaf Gordon wrote:
>> Hello Pádraig and all,
>>
>>> On Jun 25, 2016, at 07:20, Pádraig Brady <address@hidden> wrote:
>>>
>>> As part of this, or at least before looking at multibyte changes,
>>> it would be worth considering this proposal for changing the
>>> unexpand algorithm: http://bugs.gnu.org/23335
>>
>> The above bug-report addresses this TODO item:
>> ===
>> unexpand: [http://www.opengroup.org/onlinepubs/007908799/xcu/unexpand.html]
>> printf 'x\t \t y\n'|unexpand -t 8,9 should print its input, unmodified.
>> printf 'x\t \t y\n'|unexpand -t 5,8 should print "x\ty\n"
>> ===
>
> I think the second command is wrong there actually?
> Surely it should print "x\t\t y\n"
Digging a bit deeper into various 'unexpand' implementations, it seems there
are more differences.
Attached is a summary of most of coreutils' unexpand tests on various systems.
The trivial cases give the same results, but the trickier cases (e.g. the
'blanks' and 'posix' tests) do differ.
The test script is here: http://files.housegordon.org/tmp/test-unexpand-2.sh
(the last 'ff' octet for AIX can be ignored; I suspect a bug in AIX's unexpand
when lines are not '\n' terminated).
Example (the inputs are 'blanks-1' and 'blanks-11' from
<coreutils>/tests/misc/unexpand.pl):
blanks-1   AIX-1                  09 62 09 09 63 09 09 09 64
blanks-1   Darwin-14.4.0          20 62 09 20 63 09 09 20 64
blanks-1   FreeBSD-10.1-RELEASE   20 62 09 20 63 09 09 20 64
blanks-1   Linux-3.16.0-4-amd64   09 62 09 09 63 09 09 09 64
blanks-1   SunOS-5.11             20 62 20 20 63 20 20 20 64
blanks-11  AIX-1                  09 09 34
blanks-11  Darwin-14.4.0          09 34
blanks-11  FreeBSD-10.1-RELEASE   09 34
blanks-11  Linux-3.16.0-4-amd64   09 09 34
blanks-11  SunOS-5.11             09 20 34
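For anyone who wants to produce rows like these on another system, something
along the following lines should work. It is only a minimal sketch, not the
test script linked above: the helper names to_hex and row are made up for
illustration, and the system label is assumed to be `uname -s` and `uname -r`
joined by a dash. The two inputs are the commands from the TODO item quoted
earlier, not the 'blanks' inputs, and no expected output is shown because the
result depends on the local unexpand.

```shell
#!/bin/sh
# Print stdin as space-separated hex octets on a single line.
to_hex() {
    od -An -v -tx1 | tr -s ' \n' ' ' | sed -e 's/^ *//' -e 's/ *$//'
}

# Print one row: test name, system label, hex octets of unexpand's output.
# Usage: row NAME INPUT UNEXPAND-ARGS...
# INPUT is used as a printf format string, so '\t' and '\n' are interpreted.
# The "uname -s"-"uname -r" label is an assumption about the row format.
row() {
    name=$1
    input=$2
    shift 2
    printf '%s %s-%s %s\n' "$name" "$(uname -s)" "$(uname -r)" \
        "$(printf "$input" | unexpand "$@" | to_hex)"
}

# The two cases from the TODO item.
row todo-8,9 'x\t \t y\n' -t 8,9
row todo-5,8 'x\t \t y\n' -t 5,8
```

Comparing octets rather than the raw output makes the tab/space differences
between implementations visible at a glance.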
And so I wonder if it's best to leave unexpand's algorithm as-is, for the sake
of backwards compatibility (in case someone relies on coreutils' current
behavior),
and then focus back on multibyte character processing in 'expand' (with or
without using the refactoring patches).
unexpand-comparison.txt.xz
Description: Binary data
regards,
- assaf