Re: cut with multibyte support for delimiter
From: Sebastian Kisela
Subject: Re: cut with multibyte support for delimiter
Date: Fri, 22 Sep 2017 15:05:15 +0200
Hi Assaf!
> One thing I noticed is that the tests fail on my computer.
> I see things like:
> ====
> cut-multibyte.pl: test mbd-newline-24: stdout mismatch, comparing
> mbd-newline-24.2 (expected) and mbd-newline-24.O (actual)
> *** mbd-newline-24.2 Mon Sep 18 20:02:46 2017
> --- mbd-newline-24.O Mon Sep 18 20:02:46 2017
> ***************
> *** 1 ****
> ! aꝤb
> --- 1 ----
> ! a$Ꝥb
> ====
> It is the extra dollar sign before the multibyte character which hints
> to me it is related to the interaction between Perl
> (which converts \xNN sequences) and the shell command line
> (where you've used the $'\xNN' syntax).
>
> The test was:
> ['mbd-newline-24', "-d'\n'", '-f1,2', "--ou=\$'\xEA\x9D\xA4'",
> {IN=>"a\nb\n"}, {OUT=>"a\xEA\x9D\xA4b\n"}],
>
>
Thanks for testing! I was able to reproduce it, and it should be fine
with the -d'\x{NN}' form mentioned below, which I used.
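
For illustration, here is a minimal sketch of the likely cause. It assumes
the failing runs used a POSIX sh without $'...' (ANSI-C quoting) support;
such a shell keeps the leading "$" as a literal character, which matches
the "a$Ꝥb" output above. The variable names are hypothetical and not part
of the patch. The sketch builds the same three bytes (EA 9D A4) with
printf octal escapes, which POSIX specifies for the printf format string:

```shell
#!/bin/sh
# Build the UTF-8 encoding of the delimiter (bytes EA 9D A4) with
# POSIX printf octal escapes: \352 = 0xEA, \235 = 0x9D, \244 = 0xA4.
delim=$(printf '\352\235\244')

# Expected test output: fields "a" and "b" joined by the delimiter.
joined=$(printf 'a%sb' "$delim")

# What a shell without $'...' support would pass instead (assumption):
# the "$" stays literal and only the quotes are removed.
broken=$(printf 'a$%sb' "$delim")

printf '%s\n' "$joined"
printf '%s\n' "$broken"
```

The first line printed is the expected result and the second reproduces
the mismatch, so no $'...' construct is needed to get the bytes.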
> Also,
> I'm not sure if coreutils currently allows the newer $'\xNN' construct
> in tests - this might be too new to be supported everywhere (comments,
> anyone? I'll also try to look for them in other tests).
>
> In any case, Perl itself can easily generate UTF-8 characters and send
> them as-is to the program being tested, I think that will suffice.
>
>
> Planning ahead, since this is going to be a large addition,
> we'll need to ask you for copyright assignment for your code contributions.
>
> You can read more about it here:
> https://www.gnu.org/licenses/why-assign.en.html
> https://www.fsf.org/licensing/assigning.html
>
> To begin the process, please fill the information here:
> https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future
>
> and send it to address@hidden .
>
Thanks, I will.
Attached is the patch with the fixed tests. (I should probably add even
more tests anyway.)
Sebastián.
cut-multibyte-delimiter.tar.gz
Description: GNU Zip compressed data