--- Begin Message ---
Subject: |
[PATCH] sort: fix a bug with sort -u and xmemcoll0, and tune keycompare |
Date: |
Thu, 15 Jul 2010 16:33:10 -0700 |
User-agent: |
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.10) Gecko/20100527 Thunderbird/3.0.5 |
Well, after my brave words about xmemcoll0 unlikely to be misused in 'sort'
I found a misuse of it. With sort -u, write_bytes replaces the trailing NUL
with a newline but it doesn't change it back after writing. In some cases
"sort -u" will output a line and then later use it as an argument for
comparison, which will mess up if the trailing NUL has been replaced.
I pushed the following patch to work around the problem.
Chen, can you please verify that "sort -u" does not access the same
line from multiple threads, even saved lines? If it does, then even this patch
is not enough: we would need to alter write_bytes so that it does not
modify its argument at all.
This patch also tunes keycompare slightly. I should have
broken it up into a different patch, but it sort of snuck in.
By the way, any objection if I put 'const' after the types that it
qualifies, e.g., "char const *" rather than "const char *"? That
was the usual style here.
Thanks.
>From 5a31febb1b363d9a3a59ed60d5f2051d520dd407 Mon Sep 17 00:00:00 2001
From: Paul R. Eggert <address@hidden>
Date: Thu, 15 Jul 2010 16:24:03 -0700
Subject: [PATCH] sort: fix a bug with sort -u and xmemcoll0, and tune keycompare
* src/sort.c (keycompare): Use xmemcoll0, as it avoids
a couple of stores.
(write_bytes): Leave the buffer the way we found it,
as it might be used again for a later comparison,
if -u is used.
---
src/sort.c | 12 +++++++-----
1 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/src/sort.c b/src/sort.c
index 7d31878..960df74 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2497,7 +2497,9 @@ keycompare (const struct line *a, const struct line *b,
bool show_debug)
}
}
- diff = xmemcoll (copy_a, new_len_a, copy_b, new_len_b);
+ copy_a[new_len_a++] = '\0';
+ copy_b[new_len_b++] = '\0';
+ diff = xmemcoll0 (copy_a, new_len_a, copy_b, new_len_b);
if (sizeof buf < size)
free (copy_a);
@@ -2664,13 +2666,11 @@ write_bytes (struct line const *line, FILE *fp, char
const *output_file)
{
char *buf = line->text;
size_t n_bytes = line->length;
-
- *(buf + n_bytes - 1) = eolchar;
+ char *ebuf = buf + n_bytes;
/* Convert TABs to '>' and \0 to \n when -z specified. */
if (debug && fp == stdout)
{
- char const *ebuf = buf + n_bytes;
char const *c = buf;
while (c < ebuf)
@@ -2678,7 +2678,7 @@ write_bytes (struct line const *line, FILE *fp, char
const *output_file)
char wc = *c++;
if (wc == '\t')
wc = '>';
- else if (wc == 0 && eolchar == 0)
+ else if (c == ebuf)
wc = '\n';
if (fputc (wc, fp) == EOF)
die (_("write failed"), output_file);
@@ -2688,8 +2688,10 @@ write_bytes (struct line const *line, FILE *fp, char
const *output_file)
}
else
{
+ ebuf[-1] = eolchar;
if (fwrite (buf, 1, n_bytes, fp) != n_bytes)
die (_("write failed"), output_file);
+ ebuf[-1] = '\0';
}
}
--
1.7.1
--- End Message ---
--- Begin Message ---
Subject: |
Re: bug#6643: [PATCH] sort: fix a bug with sort -u and xmemcoll0, and tune keycompare |
Date: |
Sun, 17 Apr 2011 11:09:41 +0200 |
Paul Eggert wrote:
> On 07/15/10 17:46, Pádraig Brady wrote:
>> I'll add a few basic tests I think to run under $mb_locale
>
> Thanks. Jim also reminded me that I should add at least one test for
> this bug, so I just pushed the one enclosed below. The test is
> locale-dependent and is not guaranteed to catch the bug on every host,
> but it did catch the bug in my RHEL 5 build host and that's good
> enough.
Thanks. Closing.
--- End Message ---