coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] stat: don't explicitly request file size for filenames


From: Andreas Dilger
Subject: Re: [PATCH] stat: don't explicitly request file size for filenames
Date: Tue, 9 Jul 2019 21:43:34 +0000

On Jul 5, 2019, at 04:18, Jeff Layton <address@hidden> wrote:
>
> On Thu, 2019-07-04 at 21:18 +0000, Andreas Dilger wrote:
>> On Jul 4, 2019, at 04:43, Jeff Layton <address@hidden> wrote:
>>> On Thu, 2019-07-04 at 06:37 -0400, Jeff Layton wrote:
>>>> On Wed, 2019-07-03 at 20:24 +0000, Andreas Dilger wrote:
>>>>> When calling 'stat -c %N' to print the filename, don't explicitly
>>>>> request the size of the file via statx(), as it may add overhead on
>>>>> some filesystems.  The size is only needed to optimize an allocation
>>>>> for the relatively rare case of reading a symlink name, and the worst
>>>>> effect is a somewhat-too-large temporary buffer may be allocated for
>>>>> areadlink_with_size(), or internal retries if buffer is too small.
>>>>>
>>>>> The file size will be returned by statx() on most filesystems, even
>>>>> if not requested, unless the filesystem considers this to be too
>>>>> expensive for that file, in which case the tradeoff is worthwhile.
>>>>>
>>>>> * src/stat.c: Don't explicitly request STATX_SIZE for filenames.
>>>>> Start with a 1KB buffer for areadlink_with_size() if st_size unset.
>>>>> ---
>>>>> src/stat.c | 6 ++++--
>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/src/stat.c b/src/stat.c
>>>>> index ec0bb7d..c887013 100644
>>>>> --- a/src/stat.c
>>>>> +++ b/src/stat.c
>>>>> @@ -1282,7 +1282,7 @@ fmt_to_mask (char fmt)
>>>>>  switch (fmt)
>>>>>    {
>>>>>    case 'N':
>>>>> -      return STATX_MODE|STATX_SIZE;
>>>>> +      return STATX_MODE;
>>>>>    case 'd':
>>>>>    case 'D':
>>>>>      return STATX_MODE;
>>>>> @@ -1491,7 +1491,9 @@ print_stat (char *pformat, size_t prefix_len, 
>>>>> unsigned int m,
>>>>>      out_string (pformat, prefix_len, quoteN (filename));
>>>>>      if (S_ISLNK (statbuf->st_mode))
>>>>>        {
>>>>> -          char *linkname = areadlink_with_size (filename, 
>>>>> statbuf->st_size);
>>>>> +          /* if statx() didn't set size, most symlinks are under 1KB */
>>>>> +          char *linkname = areadlink_with_size (filename, 
>>>>> statbuf->st_size ?:
>>>>> +                                                1023);
>>>>>          if (linkname == NULL)
>>>>>            {
>>>>>              error (0, errno, _("cannot read symbolic link %s"),
>>>>
>>>> Looks reasonable to me.
>>>>
>>>> Reviewed-by: Jeff Layton <address@hidden>
>>>
>>> Actually, I'll retract that just yet...
>>>
>>> I'm not sure that statbuf->st_size will be initialized to 0 if statx
>>> didn't fill out the stx_size field. You may need to accompany this patch
>>> with one that zeroes out the stx struct in do_stat.
>>
>> The kernel will explicitly zero out struct kstat at the beginning of the 
>> callchain
>> so that only valid values filled in by the filesystem are returned, in order 
>> to
>> avoid leaking any kernel memory to userspace:
>>
>>    int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
>>                          u32 request_mask, unsigned int query_flags)
>>    {
>>            struct inode *inode = d_backing_inode(path->dentry);
>>
>>            memset(stat, 0, sizeof(*stat));
>>            stat->result_mask |= STATX_BASIC_STATS;
>>            request_mask &= STATX_ALL;
>>            query_flags &= KSTAT_QUERY_FLAGS;
>>
>>
>> Both the kernel generic_fillattr() and statx_to_stat() copy all of the 
>> fields,
>> regardless of whether the corresponding bits are set in the request_mask or 
>> not,
>> so this should be totally safe and there is no benefit to zero the struct in
>> userspace first.
>
> I don't think it's wise to rely on undocumented behavior. That could
> easily change in the future, and what if (e.g.) BSD grows a statx
> implementation that doesn't do that?
>
> I'd suggest at least zeroing out the stx_size field first.

I looked at this, but there isn't much value to zero out the field since it
is unconditionally copied from the kernel upon return.  For the unusual case
that the kernel doesn't touch the stx_size (or other) field then I've added
a zero initializer for the statx struct.  Updated patch attached.

Even so, it isn't a huge deal if the stx_size is random garbage.  At worst it
will be too small and need a few readlink() iterations to get the right size,
and if too large it will be capped by areadlink_with_size(), so no harm is
done.  With the separate changes to areadlink_with_size() to shrink the buffer
after allocation even this would not be significant.

Cheers, Andreas
--
Andreas Dilger






Attachment: 0001-stat-don-t-explicitly-request-file-size-for-filename.patch
Description: 0001-stat-don-t-explicitly-request-file-size-for-filename.patch


reply via email to

[Prev in Thread] Current Thread [Next in Thread]