Re: some concern about the fix of " tail: consistently output all data f
From: Reuti
Subject: Re: some concern about the fix of " tail: consistently output all data for truncated files"
Date: Wed, 9 Nov 2016 20:42:47 +0100
On 09.11.2016 at 09:00, Zizka, Jan (Nokia - CZ/Prague) wrote:
>> -----Original Message-----
>> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>> Sent: Wednesday, November 09, 2016 8:51 AM
>> To: Zizka, Jan (Nokia - CZ/Prague) <address@hidden>; Lian, George
>> (Nokia - CN/Hangzhou) <address@hidden>; Pádraig Brady
>> <address@hidden>; address@hidden
>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
>> (Nokia - CN/Hangzhou) <address@hidden>
>> Subject: RE: some concern about the fix of " tail: consistently output all
>> data
>> for truncated files"
>>
>>> Can you name any real use case where the changed tail behaviour would fail
>>> and print old content as you describe? I mean a real use case, not the
>>> behaviour caused by the GlusterFS bug.
>>
>> Not found in a real environment, but we can design a program that does this:
>> a program writes a log file and wants to always keep its first 1 KB.
>> When the file reaches its limit (e.g. 10 KB), the program truncates it to
>> 1 KB and then starts writing again.
>>
>> In this case, with the new version, tail will print the first 1 KB of data
>> again every time the truncation happens.
>
> Yes, I'm sure one can always find some artificial case, but can you think of
> any real use case? I could not think of any.
In the past I used `tail -f` to feed the logfile of IBM's Tivoli Storage
Manager to a remote syslog.
ITSM can truncate the logfile by keeping only e.g. the last 8 days (no
rotation), hence the file gets shorter at some point in time.
(Nowadays I have implemented this in syslog-ng directly, which reads the files
and forwards them to a remote syslog-ng server. And yes: syslog-ng also outputs
the first part of the file again when it gets truncated. But as I look at the
output only when there is a problem, that wasn't a reason for me to switch
back.)
-- Reuti
(A more sophisticated behavior would be to remember the lines already output
and, when the file gets shorter, to scan for a block of at least N matching
lines to synchronize again - no double output, no missing lines.)
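
A minimal sketch of that resynchronization idea, in Python rather than in
tail's own C. It is not taken from tail or syslog-ng; the function name
resync_index and the block size N = 3 are hypothetical. It assumes that lines
are removed only from the front of the file, as in the ITSM case above, and
that the remembered block is distinctive enough to identify the right place.

```python
from collections import deque


def resync_index(remembered, current_lines):
    """Return the index of the first line in current_lines that has not
    been output yet, or None if the remembered block cannot be found."""
    block = list(remembered)
    n = len(block)
    if n == 0 or n > len(current_lines):
        return None
    # Search backwards so that the most recent occurrence of the block wins.
    for start in range(len(current_lines) - n, -1, -1):
        if current_lines[start:start + n] == block:
            return start + n
    return None


# The follower keeps the last N lines it has printed (N = 3 here).
remembered = deque(["day2", "day3", "day4"], maxlen=3)

# The file lost its oldest lines at the front and gained one line at the end.
after_truncation = ["day2", "day3", "day4", "day5"]

new_lines = after_truncation[resync_index(remembered, after_truncation):]
```

When resync_index returns None (for example because the remembered lines
themselves were removed), the caller still has to fall back to one of the two
simple behaviors discussed in this thread.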
> Moreover, with the old design and file rotation, part of the data can be
> missing from the tail output. And that is a real use case.
>
> Jan
>
>>
>>
>> Br, Jimmy
>>
>> -----Original Message-----
>> From: Zizka, Jan (Nokia - CZ/Prague)
>> Sent: Wednesday, November 09, 2016 3:41 PM
>> To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
>> Lian, George (Nokia - CN/Hangzhou) <address@hidden>; Pádraig
>> Brady <address@hidden>; address@hidden
>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
>> (Nokia - CN/Hangzhou) <address@hidden>
>> Subject: RE: some concern about the fix of " tail: consistently output all
>> data
>> for truncated files"
>>
>>> -----Original Message-----
>>> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>>> Sent: Wednesday, November 09, 2016 8:19 AM
>>> To: Zizka, Jan (Nokia - CZ/Prague) <address@hidden>; Lian, George
>>> (Nokia - CN/Hangzhou) <address@hidden>; Pádraig Brady
>>> <address@hidden>; address@hidden
>>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
>>> (Nokia - CN/Hangzhou) <address@hidden>
>>> Subject: RE: some concern about the fix of " tail: consistently output all
>> data
>>> for truncated files"
>>>
>>> Hi,
>>>
>>> Let's not mix 2 problems here.
>>
>> yes and I was not mixing the two :)
>>
>>>
>>> 1. glusterfs problem => We'll continue the investigation.
>>>
>>> 2. tail problem: let's discuss it separately from the glusterfs bug, purely
>>> on its own design.
>>> New version: when the file size is found to have shrunk, print the content
>>> from offset 0 to the reduced size.
>>> Old version: when the file size is found to have shrunk, stay at the end of
>>> the reduced size and wait for new content.
>>> Both approaches have limitations; neither is perfect or precise.
>>> Here I just want to say that, in my understanding, the old version is
>>> better than the new one.
>>> According to the man page, the '-f' option is designed to print a file that
>>> is being appended to, not a file that might be truncated.
>>> "tail" should focus on what is added, not on data from the part of the
>>> file that was already printed.
>>
>> Yes, exactly. And when a file is truncated or replaced, tail has to assume
>> that it now holds new content which was added.
>>
>> Can you name any real use case where the changed tail behaviour would fail
>> and print old content as you describe? I mean a real use case, not the
>> behaviour caused by the GlusterFS bug.
>>
>> Jan
>>
>>> =============================
>>> # man tail
>>> TAIL(1)                     User Commands                     TAIL(1)
>>>
>>> NAME
>>>        tail - output the last part of files
>>> ...
>>>        -f, --follow[={name|descriptor}]
>>>               output appended data as the file grows;
>>> ...
>>> =============================
>>>
>>> Br, Jimmy
>>>
>>> -----Original Message-----
>>> From: Zizka, Jan (Nokia - CZ/Prague)
>>> Sent: Wednesday, November 09, 2016 3:08 PM
>>> To: Zhang, Bingxuan (Nokia - CN/Hangzhou) <address@hidden>;
>>> Lian, George (Nokia - CN/Hangzhou) <address@hidden>; Pádraig
>>> Brady <address@hidden>; address@hidden
>>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Bao, Xiaohui
>>> (Nokia - CN/Hangzhou) <address@hidden>
>>> Subject: RE: some concern about the fix of " tail: consistently output all
>> data
>>> for truncated files"
>>>
>>>> -----Original Message-----
>>>> From: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>>>> Sent: Wednesday, November 09, 2016 6:36 AM
>>>> To: Lian, George (Nokia - CN/Hangzhou) <address@hidden>;
>> Pádraig
>>>> Brady <address@hidden>; address@hidden
>>>> Cc: Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan
>>>> (Nokia - CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia -
>>>> CN/Hangzhou) <address@hidden>
>>>> Subject: RE: some concern about the fix of " tail: consistently output all
>>> data
>>>> for truncated files"
>>>>
>>>> Hi,
>>>>
>>>> I wonder about the original requirement of "tail": what is the purpose of
>>>> this tool?
>>>> Referring to:
>>>> tail - output the last part of files
>>>>
>>>> When "tail" finds that a file has become shorter, does it really need to
>>>> print the old content?
>>>
>>> But tail cannot know whether that is old content. The truncation detection
>>> was added to handle the case where someone overwrites the file being
>>> tailed, in which case tail should indeed start dumping the file from the
>>> beginning.
>>>
>>>> My opinion is that ignoring that old content is the better alternative.
>>>
>>> OK, but how would you do that, as tail doesn't know that it is old content ...
>>>
>>>>
>>>> It is possible that the "old content" is newly written (e.g. truncate to
>>>> 0, then write a small amount of content).
>>>> It is also possible that the "old content" is really old (e.g. truncate
>>>> to a smaller size).
>>>>
>>>> So "tail" cannot have a perfect design here that traces every piece of
>>>> data written to the file.
>>>> It should focus only on the data at the end of the file as it currently is.
>>>>
>>>> So in my opinion, reverting to the previous design is a better choice than
>>>> the current one.
>>>> What do you think?
>>>
>>> If the change is reverted, you will get regressions in the cases for which
>>> it was added, so that is definitely not an option.
>>>
>>> What should be fixed is GlusterFS, instead of trying to add workarounds for
>>> its misbehaviour. As Pádraig also noted:
>>>
>>>> This stale st_size behavior, giving a smaller value _after_ a read,
>>>> seems quite problematic to lots of apps though, not just tail(1).
>>>
>>> This will affect other applications and tools, not only tail. If you add
>>> some kind of workaround to tail for this and GlusterFS is not fixed, the
>>> problem will stay hidden and will hit some other application sooner or
>>> later.
>>>
>>> Jan
>>>
>>>
>>>>
>>>>
>>>> Br, Jimmy
>>>>
>>>> -----Original Message-----
>>>> From: Lian, George (Nokia - CN/Hangzhou)
>>>> Sent: Wednesday, November 09, 2016 9:36 AM
>>>> To: Pádraig Brady <address@hidden>; address@hidden
>>>> Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>> <address@hidden>;
>>>> Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan
>> (Nokia
>>> -
>>>> CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia - CN/Hangzhou)
>>>> <address@hidden>
>>>> Subject: RE: some concern about the fix of " tail: consistently output all
>>> data
>>>> for truncated files"
>>>>
>>>> Hi,
>>>>> What network file system type is this?
>>>>
>>>> The file system is Red Hat's GlusterFS.
>>>>
>>>>> This stale st_size behavior, giving a smaller value _after_ a read, seems
>>>>> quite problematic to lots of apps though, not just tail(1).
>>>> I agree, but I still suppose most applications get st_size first and then
>>>> seek and read, so they never read past the size of the file.
>>>>
>>>> We have also submitted the issue to the GlusterFS community, but so far
>>>> they have not found the root cause in glusterfs.
>>>>
>>>> I still have a complaint about the "tail" application: even if there is an
>>>> issue in glusterfs, "tail" eats all the space on the disk (through repeated
>>>> pseudo-truncation of a large syslog file), so I suggest "tail" could be
>>>> changed to prevent that.
>>>>
>>>> Thanks & Best Regards,
>>>> George
>>>>
>>>> -----Original Message-----
>>>> From: Pádraig Brady [mailto:address@hidden]
>>>> Sent: Tuesday, November 08, 2016 7:29 PM
>>>> To: Lian, George (Nokia - CN/Hangzhou) <address@hidden>;
>>>> address@hidden
>>>> Cc: Zhang, Bingxuan (Nokia - CN/Hangzhou)
>> <address@hidden>;
>>>> Li, Deqian (Nokia - CN/Hangzhou) <address@hidden>; Zizka, Jan
>> (Nokia
>>> -
>>>> CZ/Prague) <address@hidden>; Bao, Xiaohui (Nokia - CN/Hangzhou)
>>>> <address@hidden>
>>>> Subject: Re: some concern about the fix of " tail: consistently output all
>>> data
>>>> for truncated files"
>>>>
>>>> On 08/11/16 02:50, Lian, George (Nokia - CN/Hangzhou) wrote:
>>>>> Hi,
>>>>>>> One more suggestion: if we do not have a perfect solution that covers
>>>>>>> all truncation cases, could we add an option to tail, such as
>>>>>>> tail -no-truncate?
>>>>>>> If tail is run with this option, it does not consider any truncation
>>>>>>> case.
>>>>>>>
>>>>>>> For example, I assume the syslog output file will never be truncated in
>>>>>>> our environment, so tail could use the option to avoid the
>>>>>>> mis-truncated case?
>>>>>
>>>>>> Note for case 2) above, we only update fspec->size _after_ the read,
>>>>>> so I'm not sure how practical the race with reading a _smaller_ st_size
>>>>>> after that is?
>>>>>> I.e. the heuristic is fairly good I think,
>>>>>> so an option may be overkill.
>>>>>> We'd have to see a demonstrable issue to consider such an option.
>>>>>
>>>>> We now have an issue when tailing a syslog file stored on a network-based
>>>>> file system. An automated case needs to tail the syslog for about one hour
>>>>> to capture the syslog of that period.
>>>>> In that hour, the unexpected file truncation issue happened 6 times, so
>>>>> the tail output contains the full syslog file 6 times; the output file is
>>>>> therefore huge and eats all of the disk.
>>>>> The network-based file system may not be easy to change to suit the
>>>>> current implementation of the "tail" application.
>>>>> So I need your help :)
>>>>>
>>>>> And what do you mean by demonstrable? The issue we encounter is easy to
>>>>> reproduce. Maybe the file system is not as strict as ext4, but I still
>>>>> suggest the "tail" application could be changed to adapt to this kind of
>>>>> network-based file system?
>>>>
>>>> It's important info that you have seen the issue.
>>>> What network file system type is this?
>>>> We might just revert this change if the issue is widespread enough.
>>>>
>>>> This stale st_size behavior, giving a smaller value _after_ a read,
>>>> seems quite problematic to lots of apps though, not just tail(1).
>>>>
>>>> thanks,
>>>> Pádraig.
>
>
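
The two behaviours debated in this thread, for the moment a follower notices
that st_size has shrunk, can be summarized in a short Python sketch. This is a
simplification for illustration only, not the coreutils code; the function
resume_offset and the policy names "old" and "new" are hypothetical.

```python
def resume_offset(prev_size, new_size, policy):
    """Return the offset from which a follower would continue reading
    after a stat() call reports new_size for a file last seen at prev_size."""
    if new_size >= prev_size:
        # Normal append (or no change): continue where reading stopped.
        return prev_size
    if policy == "old":
        # Old design: stay at the reduced end and wait for new content.
        return new_size
    if policy == "new":
        # New design: assume the content is new and re-read from offset 0.
        return 0
    raise ValueError("unknown policy: " + policy)


# Jimmy's example: a 10 KB log truncated back to its first 1 KB.
old_start = resume_offset(10 * 1024, 1024, "old")
new_start = resume_offset(10 * 1024, 1024, "new")
```

With the "old" policy, data written between the truncation and the next check
can be missed if the file has already grown back; with the "new" policy, the
retained first 1 KB is printed again on every truncation.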