bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming collision
From: Paul Eggert
Subject: bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming collision with coreutils)
Date: Sun, 08 Aug 2010 23:21:29 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.11) Gecko/20100713 Thunderbird/3.0.6

On 08/08/10 05:24, Bruno Haible wrote:
> sort: reduce number of strxfrm calls
Thanks for that suggestion. Amusingly enough, it made 'sort -R'
slower on the first benchmark I tried it on, which was 'sort -R *'.
But that's an unfair benchmark, since '*' expanded to executables and
other non-text files. Overall, it's a good idea. However, the code
need not be quite that long, since there's no need to do size_t
overflow checking. I pushed this:
From 0061819c7e1bbc26586cc5977ea96da016f7cea2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Sun, 8 Aug 2010 23:14:38 -0700
Subject: [PATCH] sort: speed up -R with long lines in hard locales
* src/sort.c (compare_random): Guess that the output will be
3X the input. This avoids the overhead of calling strxfrm
twice on typical implementations. Suggested by Bruno Haible.
---
src/sort.c | 18 +++++++++++++-----
1 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/src/sort.c b/src/sort.c
index dcfd24f..148ed3e 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2024,6 +2024,7 @@ compare_random (char *restrict texta, size_t lena,
char stackbuf[4000];
char *buf = stackbuf;
size_t bufsize = sizeof stackbuf;
+ void *allocated = NULL;
uint32_t dig[2][MD5_DIGEST_SIZE / sizeof (uint32_t)];
struct md5_ctx s[2];
s[0] = s[1] = random_md5_state;
@@ -2047,6 +2048,16 @@ compare_random (char *restrict texta, size_t lena,
/* Store the transformed data into a big-enough buffer. */
+ /* A 3X size guess avoids the overhead of calling strxfrm
+ twice on typical implementations. Don't worry about
+ size_t overflow, as the guess need not be correct. */
+ size_t guess_bufsize = 3 * (lena + lenb) + 2;
+ if (bufsize < guess_bufsize)
+ {
+ bufsize = MAX (guess_bufsize, bufsize * 3 / 2);
+ buf = allocated = xrealloc (allocated, bufsize);
+ }
+
size_t sizea =
(texta < lima ? xstrxfrm (buf, texta, bufsize) + 1 : 0);
bool a_fits = sizea <= bufsize;
@@ -2062,9 +2073,7 @@ compare_random (char *restrict texta, size_t lena,
bufsize = sizea + sizeb;
if (bufsize < SIZE_MAX / 3)
bufsize = bufsize * 3 / 2;
- buf = (buf == stackbuf
- ? xmalloc (bufsize)
- : xrealloc (buf, bufsize));
+ buf = allocated = xrealloc (allocated, bufsize);
if (texta < lima)
strxfrm (buf, texta, sizea);
if (textb < limb)
@@ -2119,8 +2128,7 @@ compare_random (char *restrict texta, size_t lena,
diff = xfrm_diff;
}
- if (buf != stackbuf)
- free (buf);
+ free (allocated);
return diff;
}
--
1.7.2
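
For readers who want the idea in isolation, here is a minimal standalone
sketch of the same guess-then-retry pattern. It is not the coreutils code:
xfrm_with_guess and its parameters are hypothetical names chosen for
illustration, it handles one string rather than two, and it uses plain
malloc/realloc and strxfrm instead of xrealloc and xstrxfrm. It assumes
only standard C behavior: strxfrm returns the length it needs, realloc
(NULL, n) acts like malloc (n), and free (NULL) is a no-op.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of coreutils.  Transform SRC for the
   current LC_COLLATE locale.  Use STACKBUF (of STACKSIZE bytes) when a
   3X size guess fits there; otherwise use heap memory, which is
   recorded in *ALLOCATED.  Set *CALLS to the number of strxfrm calls
   made.  Return the transformed string, or NULL if allocation fails.
   The caller ends with free (*ALLOCATED), which is safe even when the
   stack buffer was used, because *ALLOCATED is then NULL.  */
static char *
xfrm_with_guess (const char *src, char *stackbuf, size_t stacksize,
                 void **allocated, int *calls)
{
  assert (src && stackbuf && allocated && calls);

  char *buf = stackbuf;
  size_t bufsize = stacksize;
  *allocated = NULL;

  /* The guess need not be correct: if it is too small, the retry
     below repairs it, so wraparound is not checked here.  */
  size_t guess = 3 * strlen (src) + 1;
  if (bufsize < guess)
    {
      *allocated = malloc (guess);
      if (!*allocated)
        return NULL;
      buf = *allocated;
      bufsize = guess;
    }

  /* strxfrm reports the length it needs even if the buffer is too small.  */
  size_t need = strxfrm (buf, src, bufsize) + 1;
  *calls = 1;

  if (bufsize < need)
    {
      /* The guess was too small; grow to the exact size and redo.
         realloc (NULL, n) behaves like malloc (n), so no special case
         is needed for the stack-buffer path.  */
      void *p = realloc (*allocated, need);
      if (!p)
        {
          free (*allocated);
          *allocated = NULL;
          return NULL;
        }
      buf = *allocated = p;
      strxfrm (buf, src, need);
      *calls = 2;
    }

  return buf;
}
```

A caller would pass a local array such as char stackbuf[4000], compare
or hash the returned string, and then call free on the allocated pointer
unconditionally. Keeping the heap pointer in its own variable is what
removes the "buf != stackbuf" tests that the patch above deletes.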