bug#24015: [PATCH] sort: make -h work with -k and blank used as thousand
From: Pádraig Brady
Subject: bug#24015: [PATCH] sort: make -h work with -k and blank used as thousands separator
Date: Sun, 17 Jul 2016 20:51:48 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
On 17/07/16 17:02, Kamil Dudka wrote:
> * src/sort.c (find_unit_order): Allow to skip only one occurrence
> of thousands_sep to avoid finding the unit in the next column in case
> thousands_sep matches as blank and is used as column delimiter.
> * tests/misc/sort-h-thousands-sep.sh: Add regression test for this bug.
> * tests/local.mk: Reference the test.
> Reported at https://bugzilla.redhat.com/1355780
> ---
> src/sort.c | 12 ++++++----
> tests/local.mk | 1 +
> tests/misc/sort-h-thousands-sep.sh | 45 ++++++++++++++++++++++++++++++++++++++
> 3 files changed, 54 insertions(+), 4 deletions(-)
> create mode 100755 tests/misc/sort-h-thousands-sep.sh
>
> diff --git a/src/sort.c b/src/sort.c
> index f717604..a2cadda 100644
> --- a/src/sort.c
> +++ b/src/sort.c
> @@ -1904,12 +1904,16 @@ find_unit_order (char const *number)
>       to be lacking in units.
>       FIXME: add support for multibyte thousands_sep and decimal_point.  */
>
> -  do
> +  while (ISDIGIT (ch = *p++))
>      {
> -      while (ISDIGIT (ch = *p++))
> -        nonzero |= ch - '0';
> +      nonzero |= ch - '0';
> +
> +      /* Allow to skip only one occurrence of thousands_sep to avoid finding
> +         the unit in the next column in case thousands_sep matches as blank
> +         and is used as column delimiter.  */
> +      if (*p == thousands_sep)
> +        ++p;
>      }
> -  while (ch == thousands_sep);
This is an improvement.
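To make the change concrete, here is a minimal standalone sketch of the two
scan loops. It is not the code from src/sort.c: unit_char_old, unit_char_new
and is_digit are hypothetical names, the separator is passed as a parameter
(assumed to be a non-NUL single-byte character) instead of being read from
the locale, and each function only returns the character at which the scan
stops, i.e. the one that would then be looked up as a unit suffix.

```c
/* Sketch only: these helpers are hypothetical and are not part of
   src/sort.c.  thousands_sep is assumed to be a non-NUL single byte.  */

/* Locale-independent digit test.  */
static int
is_digit (unsigned char c)
{
  return '0' <= c && c <= '9';
}

/* Pre-patch loop: after each run of digits, continue for as long as the
   stopping character is the separator, so any number of consecutive
   separators is skipped.  Returns the character that ends the scan.  */
char
unit_char_old (char const *p, char thousands_sep)
{
  char ch;
  do
    {
      while (is_digit (ch = *p++))
        continue;
    }
  while (ch == thousands_sep);
  return ch;
}

/* Patched loop: at most one separator is skipped after each digit, so a
   second consecutive separator ends the scan.  */
char
unit_char_new (char const *p, char thousands_sep)
{
  char ch;
  while (is_digit (ch = *p++))
    {
      if (*p == thousands_sep)
        ++p;
    }
  return ch;
}
```

With a blank as separator and two blanks between columns, as in "1  1k",
unit_char_old runs on into the next column and stops at 'k', while
unit_char_new stops at the second blank. For an ordinary grouped number such
as "1,234K" with ',' as separator, both stop at 'K'.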
Though I now also see an existing inconsistency in how we treat trailing
blanks in this case, i.e. the output differs between these two locales:
$ printf '%s\n' '1 M' '2 K' | LANG=en_US git/coreutils/src/sort -h
1 M
2 K
$ printf '%s\n' '1 M' '2 K' | LANG=sv_SE git/coreutils/src/sort -h
2 K
1 M
We should probably not consider a blank after the last digit to be
part of the number here. I.e., the first output is correct, as it
treats the input as two separate fields.
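As an illustration of that remaining case, the following standalone sketch
compares the loop from the patch with one possible stricter variant that
skips a separator only when a digit follows it. This is an assumption about
how the rule could be expressed, not a fix taken from this thread or from
src/sort.c; unit_char_patched, unit_char_strict and is_digit are
hypothetical names, and thousands_sep is assumed to be a non-NUL single-byte
character passed in by the caller.

```c
/* Sketch only: hypothetical helpers, not part of src/sort.c.
   thousands_sep is assumed to be a non-NUL single byte.  */

/* Locale-independent digit test.  */
static int
is_digit (unsigned char c)
{
  return '0' <= c && c <= '9';
}

/* Loop as in the patch: one separator may be skipped after any digit,
   even if no digit follows the separator.  Returns the character that
   ends the scan.  */
char
unit_char_patched (char const *p, char thousands_sep)
{
  char ch;
  while (is_digit (ch = *p++))
    {
      if (*p == thousands_sep)
        ++p;
    }
  return ch;
}

/* Hypothetical stricter variant: a separator is skipped only when it is
   immediately followed by a digit, so a blank between the last digit and
   a letter ends the number.  */
char
unit_char_strict (char const *p, char thousands_sep)
{
  char ch;
  while (is_digit (ch = *p++))
    {
      if (*p == thousands_sep && is_digit (p[1]))
        ++p;
    }
  return ch;
}
```

For the input "1 M" with a blank separator, unit_char_patched stops at 'M'
(the sv_SE behaviour shown above), whereas unit_char_strict stops at the
blank, matching the en_US result. A grouped number such as "4 003M" still
ends at 'M' in both. A blank followed by a digit, as in "1 1k", remains
ambiguous and is still read as one grouped number by either loop.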
> diff --git a/tests/misc/sort-h-thousands-sep.sh b/tests/misc/sort-h-thousands-sep.sh
> new file mode 100755
> index 0000000..a1e02de
> --- /dev/null
> +++ b/tests/misc/sort-h-thousands-sep.sh
> @@ -0,0 +1,45 @@
> +#!/bin/sh
> +# exercise 'sort -h' in locales where thousands separator is blank
> +
> +# Copyright (C) 2016 Free Software Foundation, Inc.
> +
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 3 of the License, or
> +# (at your option) any later version.
> +
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
> +print_ver_ sort
> +
> +tee in > exp1 << _EOF_
> +1 1k 4 003 1M
> +2k 2M 4 002 2
> +3M 3 4 001 3k
> +_EOF_
> +
> +cat > exp2 << _EOF_
> +3M 3 4 001 3k
> +1 1k 4 003 1M
> +2k 2M 4 002 2
> +_EOF_
> +
> +cat > exp3 << _EOF_
> +3M 3 4 001 3k
> +2k 2M 4 002 2
> +1 1k 4 003 1M
> +_EOF_
> +
A test for the case I highlighted would be good.
> +for i in 1 2 3; do
> + LC_ALL="sv_SE.utf8" sort -h -k $i "in" > "out${i}" || fail=1
> + compare "exp${i}" "out${i}" || fail=1
> +done
We'd have to skip_ the test if sv_SE wasn't available.
Maybe something like:
  test "$(LC_ALL=sv_SE locale thousands_sep)" = ' ' ||
    skip_ 'The Swedish locale with blank thousands separator is unavailable'
This deserves an entry in NEWS also.
thanks!
Pádraig