coreutils
Re: [PATCH] Speedup wc -l


From: Bernhard Voelker
Subject: Re: [PATCH] Speedup wc -l
Date: Wed, 18 Mar 2015 18:24:20 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

On 03/18/2015 04:57 PM, Pádraig Brady wrote:
+  wc -l is now up to 6 times faster with short lines.

diff --git a/src/wc.c b/src/wc.c
index 8cb5163..8125100 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -264,6 +264,8 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, off_t current_pos)
      {
        /* Use a separate loop when counting only lines or lines and bytes --
           but not chars or words.  */
+      bool long_lines = false;
+      bool check_len = true;
        while ((bytes_read = safe_read (fd, buf, BUFFER_SIZE)) > 0)
          {
            char *p = buf;
@@ -275,12 +277,39 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, off_t current_pos)
                break;
              }

+          char *end = p + bytes_read;
+          char *line_start = p;
+
+          /* Avoid function call overhead for shorter lines.  */
+          if (check_len)
+            while (p != end)
+              {
+                lines += *p++ == '\n';
+                /* If there are more than 300 chars in the first 10 lines,
+                   then use memchr, where system specific optimizations
+                   may outweigh any function call overhead.  */
+                if (lines <= 10)
+                  {
+                    if (p - line_start > 300)
+                      {
+                        long_lines = true;
+                        break;
+                      }
+                  }
+              }
+          else if (! long_lines)
+            while (p != end)
+              lines += *p++ == '\n';
+

Doesn't control fall through into the memchr loop below in both cases?
The compiler seems to optimize the redundant call away, but it looks odd.

+          /* memchr is more efficient with longer lines.  */
            while ((p = memchr (p, '\n', (buf + bytes_read) - p)))
              {
                ++p;
                ++lines;
              }
+
            bytes += bytes_read;
+          check_len = false;
          }
      }
  #if MB_LEN_MAX > 1
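
To illustrate the fall-through question above, here is a minimal sketch of
the same heuristic with exactly one counting path per buffer. It is not the
code from the patch: `struct scan_state` and `count_newlines` are names made
up for this example, and the 300-byte / 10-line limits are simply copied
from the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state carried from one buffer to the next.
   The field names mirror the two flags in the patch.  */
struct scan_state
{
  bool long_lines;   /* set once the first buffer looks "long-lined" */
  bool check_len;    /* true only while the first buffer is sampled */
};

/* Count the newlines in BUF[0..LEN).  */
static size_t
count_newlines (char const *buf, size_t len, struct scan_state *st)
{
  assert (buf != NULL && st != NULL);

  char const *p = buf;
  char const *end = buf + len;
  size_t lines = 0;

  if (st->check_len)
    {
      /* First buffer: byte loop, giving up once more than 300 bytes
         have been consumed while at most 10 newlines were seen.  */
      while (p != end)
        {
          lines += *p++ == '\n';
          if (lines <= 10 && p - buf > 300)
            {
              st->long_lines = true;
              break;
            }
        }
      st->check_len = false;
    }

  if (st->long_lines)
    {
      /* Long lines: memchr continues from wherever P stopped.  */
      while ((p = memchr (p, '\n', end - p)))
        {
          ++p;
          ++lines;
        }
    }
  else
    {
      /* Short lines: plain byte loop (no iterations if P == END).  */
      while (p != end)
        lines += *p++ == '\n';
    }

  return lines;
}
```

A caller would start with `{ .long_lines = false, .check_len = true }` and
pass the same state for every buffer read from the file. Whether this is
any faster than letting memchr run on a zero-length remainder is untested.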

In my first tests I can confirm the speedup for short lines.

Where did you get the magic 300 from?
Tests here show that the effect reverses for line lengths
between 10 and ~27, i.e., the new version is ~80% slower there.
Beyond that, both versions behave the same, as expected.
I'll have to test more tonight.
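
For anyone who wants to reproduce the crossover, something along these
lines could be used to time the two strategies on synthetic input. This is
only a sketch: all function names are invented for the example, it uses
clock() rather than anything more precise, and an optimizing compiler may
inline or vectorize the loops, so results depend heavily on flags and libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Plain byte loop, as in the short-line path of the patch.  */
static size_t
count_byte_loop (char const *buf, size_t len)
{
  size_t lines = 0;
  for (char const *p = buf, *end = buf + len; p != end; p++)
    lines += *p == '\n';
  return lines;
}

/* memchr loop, as in the existing wc code.  */
static size_t
count_memchr (char const *buf, size_t len)
{
  char const *p = buf;
  char const *end = buf + len;
  size_t lines = 0;
  while ((p = memchr (p, '\n', end - p)))
    {
      ++p;
      ++lines;
    }
  return lines;
}

/* Fill BUF with 'x' and put a newline at the end of every LINE_LEN
   bytes.  Return the number of newlines written.  */
static size_t
fill_lines (char *buf, size_t len, size_t line_len)
{
  assert (line_len > 0);
  memset (buf, 'x', len);
  size_t n = 0;
  for (size_t i = line_len - 1; i < len; i += line_len)
    {
      buf[i] = '\n';
      n++;
    }
  return n;
}

/* Run COUNTER over BUF REPS times and return the CPU seconds used.
   The summed count is stored in *TOTAL so the work is not dead code.  */
static double
time_counter (size_t (*counter) (char const *, size_t),
              char const *buf, size_t len, int reps, size_t *total)
{
  clock_t start = clock ();
  size_t sum = 0;
  for (int i = 0; i < reps; i++)
    sum += counter (buf, len);
  *total = sum;
  return (double) (clock () - start) / CLOCKS_PER_SEC;
}
```

Calling fill_lines() with line lengths from, say, 2 to 64 and timing both
counters on the same buffer should show where one overtakes the other on a
given machine.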

Thanks & have a nice day,
Berny


