From: Jim Meyering
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin but last P/E bytes
Date: Mon, 27 May 2013 03:47:35 +0200

Jim Meyering wrote:

> Pádraig Brady wrote:
>
>> unarchive 13530
>> stop
>
> Thanks.
>
> ...
>>> @@ -358,6 +356,14 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
>>>
>>>            if (buffered_enough)
>>>              {
>>> +              if (n_elide_0 != n_elide)
>>> +                {
>>> +                  error (0, 0, _("memory exhausted while reading %s"),
>>> +                         quote (filename));
>>> +                  ok = false;
>>> +                  goto free_mem;
>>> +                }
>>> +
> ...
>> Oh right, it's coming back to me a bit now.
>> So removing these upfront checks only makes sense if the program could,
>> given enough free memory, fulfil the operation up to the specified limit.
>> In this case, though, the program could never fulfil the request,
>> so it's better to fail early, as the code currently does?
>
> Well, it *can* fulfill the request whenever the request is degenerate,
> i.e., when the size of the input is smaller than N and also small enough
> to be read into memory.
>
> Technically, we could handle this case the same way we handle it
> in tac.c: read data from nonseekable FD and write it to a temporary file.
> I'm not sure it's worth the effort here, though.
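
For what it's worth, here is a minimal standalone sketch of that temp-file
approach (illustration only, not the actual tac.c code; the helper name,
template and buffer size are made up): spool a non-seekable descriptor into
an unlinked temporary file, after which the copy can be lseek'd like a
regular file.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Copy everything from IN_FD (possibly a pipe) into an unlinked
   temporary file and return a descriptor positioned at offset 0,
   or -1 on failure.  The resulting descriptor is seekable.  */
static int
spool_to_temp_file (int in_fd)
{
  char tmpl[] = "/tmp/spoolXXXXXX";
  int tmp_fd = mkstemp (tmpl);
  if (tmp_fd < 0)
    return -1;
  unlink (tmpl);        /* the file disappears once TMP_FD is closed */

  char buf[BUFSIZ];
  ssize_t n;
  while ((n = read (in_fd, buf, sizeof buf)) > 0)
    {
      if (write (tmp_fd, buf, n) != n)
        {
          close (tmp_fd);
          return -1;
        }
    }
  if (n < 0 || lseek (tmp_fd, 0, SEEK_SET) < 0)
    {
      close (tmp_fd);
      return -1;
    }
  return tmp_fd;
}

int
main (void)
{
  int fd = spool_to_temp_file (STDIN_FILENO);
  if (fd < 0)
    {
      perror ("spool_to_temp_file");
      return EXIT_FAILURE;
    }
  /* ... a seeking implementation of --bytes=-N could now work on FD ... */
  close (fd);
  return EXIT_SUCCESS;
}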

FYI, here's an updated patch.
Added comments saying some of the above.

From cdc8b879df442ac945565ec9d77d3d1154d0b820 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sun, 26 May 2013 09:49:22 -0700
Subject: [PATCH] head: with --bytes=-N don't fail early for large N on 32-bit
 systems
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/head.c (elide_tail_bytes_pipe): This command would fail
unnecessarily on 32-bit systems, where SIZE_MAX < 1E:
  head --bytes=-E < /dev/null
Remove the test that would make us fail for N larger than SIZE_MAX.
Instead, when that happens, pretend we've been given an N slightly
smaller than SIZE_MAX, and if we actually manage to read/allocate
that much data, fail only at the end.
* tests/misc/head-c.sh: Use $OFF_T_MAX, rather than "E" (an exabyte).
The former exposed a bug in an early version of this patch.
Thanks to Pádraig Brady for the test improvement.
---
 src/head.c           | 32 +++++++++++++++++++++++---------
 tests/misc/head-c.sh |  3 ++-
 2 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/src/head.c b/src/head.c
index 00e1be1..a7f3aa0 100644
--- a/src/head.c
+++ b/src/head.c
@@ -202,13 +202,18 @@ copy_fd (int src_fd, FILE *o_stream, uintmax_t n_bytes)
 static bool
 elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
 {
-  size_t n_elide = n_elide_0;
-
 #ifndef HEAD_TAIL_PIPE_READ_BUFSIZE
 # define HEAD_TAIL_PIPE_READ_BUFSIZE BUFSIZ
 #endif
 #define READ_BUFSIZE HEAD_TAIL_PIPE_READ_BUFSIZE

+  /* If n_elide_0 is too large, then we cannot always allocate enough memory
+     to do the requested job.  In that case, set n_elide to a value near
+     SIZE_MAX but small enough so that below, when we round it up, it does
+     not overflow.  */
+  size_t n_elide = MIN (n_elide_0, (SIZE_MAX - READ_BUFSIZE
+                                    - SIZE_MAX % READ_BUFSIZE));
+
   /* If we're eliding no more than this many bytes, then it's ok to allocate
      more memory in order to use a more time-efficient algorithm.
      FIXME: use a fraction of available memory instead, as in sort.
@@ -221,13 +226,6 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
   "HEAD_TAIL_PIPE_BYTECOUNT_THRESHOLD must be at least 2 * READ_BUFSIZE"
 #endif

-  if (SIZE_MAX < n_elide_0 + READ_BUFSIZE)
-    {
-      char umax_buf[INT_BUFSIZE_BOUND (n_elide_0)];
-      error (EXIT_FAILURE, 0, _("%s: number of bytes is too large"),
-             umaxtostr (n_elide_0, umax_buf));
-    }
-
   /* Two cases to consider...
      1) n_elide is small enough that we can afford to double-buffer:
         allocate 2 * (READ_BUFSIZE + n_elide) bytes
@@ -358,6 +356,22 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)

           if (buffered_enough)
             {
+              if (n_elide_0 != n_elide)
+                {
+                  /* This happens when:
+                     - size_t is smaller than uintmax_t
+                     - we've been told to elide more than ~SIZE_MAX bytes
+                       from a non-seekable input,
+                     - we have actually buffered nearly SIZE_MAX bytes of input
+                     Technically, we could write the buffered data to a
+                     temporary file and continue on, but that'd involve
+                     a lot more code for relatively little gain.  */
+                  error (0, 0, _("memory exhausted while reading %s"),
+                         quote (filename));
+                  ok = false;
+                  goto free_mem;
+                }
+
               if (fwrite (b[i_next], 1, n_read, stdout) < n_read)
                 {
                   error (0, errno, _("write error"));
diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
index 37a86ce..8b4df5c 100755
--- a/tests/misc/head-c.sh
+++ b/tests/misc/head-c.sh
@@ -19,6 +19,7 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ head
 require_ulimit_v_
+getlimits_

 # exercise the fix of 2001-08-18, based on test case from Ian Bruce
 echo abc > in || framework_failure_
@@ -31,6 +32,6 @@ esac
 # Only allocate memory as needed.
 # Coreutils <= 8.21 would allocate memory up front
 # based on the value passed to -c
-(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
+(ulimit -v 20000; head --bytes=-$OFF_T_MAX < /dev/null) || fail=1

 Exit $fail
--
1.8.3
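
As a sanity check on the clamp arithmetic in the patch, here is a throwaway
sketch, assuming a 32-bit size_t and READ_BUFSIZE == 8192; the round-up below
only mimics what the comment in head.c describes, not its exact allocation
code.  It shows that the clamped n_elide, once rounded up to the next
READ_BUFSIZE boundary, still fits in 32 bits, i.e. the round-up cannot
overflow.

#include <stdint.h>
#include <stdio.h>

#define READ_BUFSIZE 8192u
#define SIZE_MAX_32 UINT32_MAX        /* stand-in for SIZE_MAX on 32-bit */

int
main (void)
{
  uint64_t n_elide_0 = UINT64_MAX;    /* e.g. --bytes=-E, far above SIZE_MAX */

  /* The clamp from the patch: near SIZE_MAX, but a multiple of
     READ_BUFSIZE with one spare buffer of headroom.  */
  uint32_t clamp = SIZE_MAX_32 - READ_BUFSIZE - SIZE_MAX_32 % READ_BUFSIZE;
  uint32_t n_elide = n_elide_0 < clamp ? (uint32_t) n_elide_0 : clamp;

  /* Round up to the next READ_BUFSIZE multiple, in 64 bits so we can
     see whether the result would still fit in a 32-bit size_t.  */
  uint64_t rounded = (uint64_t) n_elide
                     + (READ_BUFSIZE - n_elide % READ_BUFSIZE);

  printf ("n_elide = %lu\n", (unsigned long) n_elide);
  printf ("rounded = %llu (fits in 32 bits: %s)\n",
          (unsigned long long) rounded,
          rounded <= SIZE_MAX_32 ? "yes" : "no");
  return 0;
}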


