[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: csplit 'write error' missing errno?
From: |
Pádraig Brady |
Subject: |
Re: csplit 'write error' missing errno? |
Date: |
Fri, 23 Oct 2015 01:08:44 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 |
On 23/10/15 00:40, Pádraig Brady wrote:
> On 22/10/15 21:01, Jim Meyering wrote:
>> On Thu, Oct 22, 2015 at 12:22 PM, Assaf Gordon <address@hidden> wrote:
>>> Hello,
>>>
>>> In a long-running pipeline, we've encountered the following error message
>>> with 'csplit':
>>> csplit: write error for ‘[FILENAME]`
>>>
>>> It is very likely stemming from out-of-disk-space situation, but because
>>> there was no errno printed, it's hard to know for sure, making
>>> troubleshooting a bit frustrating.
>>>
>>> The error is printed in 'close_output_file()', in response to checking
>>> 'ferror()'.
>>>
>>> I wonder if it would be beneficial to add the errno information to the
>>> error message.
>>>
>>> >From a cursory look it seems 'close_output_file()' is (almost?) always
>>> >called after
>>> 'dump_rest_of_file()' or 'save_line_to_file()' which themselves use
>>> 'fwrite()'.
>>>
>>> So it could perhaps be assumed that if if 'ferror()' indicates an error on
>>> the output stream,
>>> then the last operation was fwrite?
>>> The downside is that if my assumption doesn't hold, it could print a
>>> misleading errno information.
>>>
>>> The following patch is a suggestion (not fully tested):
>>>
>>> ===
>>> diff --git a/src/csplit.c b/src/csplit.c
>>> index d966df5..d2a0228 100644
>>> --- a/src/csplit.c
>>> +++ b/src/csplit.c
>>> @@ -1004,7 +1004,7 @@ close_output_file (void)
>>> {
>>> if (ferror (output_stream))
>>> {
>>> - error (0, 0, _("write error for %s"), quote (output_filename));
>>> + error (0, errno, _("write error for %s"), quote
>>> (output_filename));
>>> output_stream = NULL;
>>> cleanup_fatal ();
>>> }
>>> ===
>>
>> Hi Assaf,
>>
>> Thanks for investigating.
>>
>> That's a fundamental limitation of streams.
>> When detecting that error via ferror, the errno value is not
>> guaranteed to be useful. From what I recall of the fwrite
>> specification, even immediately after a failed fwrite, errno is not
>> guaranteed to be useful, but in practice, it always is,
>> so coreutils programs do rely on that.
>
> Right, and whether to check all writes is a
> a trade off between maintainability and precise errors.
> Checking every output function in addition to checking ferror on close
> (as is done in close_stdout for example) will ensure _precise_ errors in
> all edge cases, while just relying on ferror will always diagnose errors,
> though in certain edge cases not with a precise error message. as you've seen.
> Details at:
> http://www.gnu.org/ghm/2011/paris/slides/jim-meyering-goodbye-world.pdf
>
>
>>> I'll be able to troubleshoot and provide more information in couple of days.
>>> A more elaborate solution is to save the errno after 'fwrite()' in a
>>> variable,
>>> and print that variable. I can send such a patch if that's a better idea.
>>
>> Yes, this sounds better.
>
> Yes the tradeoff discussed above is a bad one in csplit's case
> as it only has a single fwrite() call. We should check the output
> of that and call into cleanup_fatal() to exit early to avoid redundant
> processing when one can't write the current output file. This could
> be tested with something like: timeout 10 yes | csplit ... /dev/full
Actually it's a bit awkward to test. You might have to resort
to tests using require_gcc_shared_ and wrap write() to return ENOSPC.
cheers,
Pádraig.