bug-coreutils

Re: Support in sort for human-readable numbers


From: Vitali Lovich
Subject: Re: Support in sort for human-readable numbers
Date: Wed, 7 Jan 2009 02:17:39 -0500

On Tue, Jan 6, 2009 at 8:01 PM, Pádraig Brady <address@hidden> wrote:
> Vitali Lovich wrote:
>> On Tue, Jan 6, 2009 at 12:26 PM, Pádraig Brady <address@hidden> wrote:
>>> Vitali Lovich wrote:
>>>> On Tue, Jan 6, 2009 at 10:19 AM, Pádraig Brady <address@hidden> wrote:
>>>>> I like the idea.
>>>>>
>>>>> So it doesn't support sorting these correctly for example:
>>>>>
>>>>> 999MB
>>>>> 998MiB
>>>>> 1GiB
>>>>> 1030MiB
>>>>>
>>>>> I.E. a mixture of ^2 and ^10 are not supported,
>>>>> nor overlapping number ranges.
>>> I'm not complaining about the above. Just clarifying.
>>>
>>>>> +  /* FIXME: maybe add option to check for longer suffixes (i.e. gigabyte) */
>>>>>
>>>>> You should allow at least G, GiB and GB formats.
>>>>> Probably should print error if more than one of those
>>>>> formats used, since that's not supported.
>> Perhaps - but for sort, at least from my thinking of how I would
>> implement this, the additional logic (at least to behave correctly on
>> all inputs) would be somewhat complicated.
>
> I thought it would be easy just to ignore a trailing i?B?
>
>> Can you please explain why
>> you believe this belongs in sort
>
> because I think it's a common enough format and getting
> more common since it's an IEC defined standard.
>
>> and wouldn't be better served by
>> pre-processing the text before sort & post-processing it after as
>> necessary?
>
> that's a little awkward and inefficient.
>
>> Supporting all the various ways the human_readable can be output is
>> just not practical or even useful
>
> just ignore an optional trailing iB is all I'm suggesting.
> If it's difficult or inefficient then don't worry about it.
Right, but then you also have to deal with whatever terminates the
suffix and whatnot.  It's obviously not super difficult; I'm just
wondering why that logic belongs in sort at all.  The rule of thumb
is: the less code you write, the fewer bugs you'll have.
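
To make the terminator point concrete, here is a rough sketch of what
"ignore a trailing i?B?" could look like.  This is not code from the
patch: suffix_order and its unit table are names made up for
illustration, and it assumes the caller has already skipped the digits
and that a key ends at NUL or whitespace.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not from the patch).  P points just past the
   digits of a number.  Return the order of magnitude of the suffix
   (0 = none, 1 = K, 2 = M, 3 = G, ...), or -1 if something other than
   an optional "i" and an optional "B" follows the unit letter.  */
static int
suffix_order (const char *p)
{
  static const char units[] = "KMGTPEZY";
  int order = 0;

  /* Accept lowercase 'k' as well, since it is a common way to write kilo.  */
  char c = (*p == 'k') ? 'K' : *p;
  const char *u = (c != '\0') ? strchr (units, c) : NULL;

  if (u != NULL)
    {
      order = (int) (u - units) + 1;
      p++;
      if (*p == 'i')            /* optional IEC marker, as in "GiB" */
        p++;
    }
  if (*p == 'B')                /* optional byte marker, as in "GB" or "512B" */
    p++;

  /* The extra work: the key has to end here, otherwise something like
     "Gigabyte" would silently be treated as "G".  */
  if (*p != '\0' && !isspace ((unsigned char) *p))
    return -1;
  return order;
}
```

With this, "G", "GB" and "GiB" all map to the same order (3), and
"Gigabyte" is rejected rather than misread.  What to do with the -1
case (error out, or fall back to a plain comparison) is exactly the
kind of extra decision I'd rather not have in sort.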

>
>>>>> Yep if you're not supporting overlapping number ranges then
>>>>> you can skip the number comparison totally if the suffixes don't match.
>> Debatable.  You'd still have to scan the string to find the end of the
>> number to find the suffix.  And if you get a miss (i.e. same
>> suffix-level), then you'll have to scan the strings again, performing
>> the comparison.
>
> don't worry about this for the moment.
Well, as it stands there's absolutely no benefit, since finding the
suffix means scanning the number anyway.  Sure, if I knew where the
suffix was ahead of time, that would be an obvious optimization.  It
might make sense to keep track of the suffix between comparisons
(the strings are compared multiple times, so caching this value might
actually pay off, although the architecture isn't there for this).
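
Roughly what I have in mind with caching, as a standalone sketch only.
Nothing here matches how sort.c is organized today: struct human_key
and both functions are hypothetical.  It also assumes non-negative
integers without leading zeros, uppercase unit letters, and no
overlapping ranges (so 1030M still sorts before 1G).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-key cache, filled in once before sorting.  */
struct human_key
{
  const char *text;     /* start of the number */
  size_t ndigits;       /* length of the leading run of digits */
  int order;            /* 0 = no suffix, 1 = K, 2 = M, 3 = G, ... */
};

/* Scan S a single time, remembering where the digits end and which
   unit letter (if any) follows them.  */
static void
human_key_init (struct human_key *k, const char *s)
{
  static const char units[] = "KMGTPEZY";
  const char *p = s;

  while (*p >= '0' && *p <= '9')
    p++;

  k->text = s;
  k->ndigits = (size_t) (p - s);

  const char *u = (*p != '\0') ? strchr (units, *p) : NULL;
  k->order = (u != NULL) ? (int) (u - units) + 1 : 0;
}

/* Compare two cached keys.  The digits are only examined when the
   suffixes match, which is the shortcut suggested above.  */
static int
human_key_cmp (const struct human_key *a, const struct human_key *b)
{
  if (a->order != b->order)
    return a->order < b->order ? -1 : 1;

  /* Same suffix: without leading zeros, more digits means larger.  */
  if (a->ndigits != b->ndigits)
    return a->ndigits < b->ndigits ? -1 : 1;

  int c = memcmp (a->text, b->text, a->ndigits);
  return (c > 0) - (c < 0);
}
```

The scan happens once per key instead of once per comparison, and the
suffix check then costs a single integer compare.  The price is extra
per-line storage, which is why I say the architecture isn't there.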
>
> thanks,
> Pádraig.
>
