Re: Support in sort for human-readable numbers

From: Vitali Lovich
Subject: Re: Support in sort for human-readable numbers
Date: Wed, 7 Jan 2009 04:22:25 -0500

Oh, and if my prior e-mail seems a bit abrupt, I apologize.  I don't
mean to be rude.  It is getting late, and I probably should've waited
until morning to compose my reply.

On Wed, Jan 7, 2009 at 4:21 AM, Vitali Lovich <address@hidden> wrote:
> On Wed, Jan 7, 2009 at 2:52 AM, Paul Eggert <address@hidden> wrote:
>> I looked at the patch
>> <http://launchpadlibrarian.net/20858726/sort_human_v3.diff> and have a
>> few comments:
>> * There's no documentation, either in the .texi file or in the --help
>>  string.  That's often the hardest part to get right.
> I agree - that's why I'm waiting to get the implementation stabilized
> before trying to write it (along with the necessary test cases).  That
> will be done, but there's no point documenting right now since the
> behaviour hasn't been finalized.
>> * The patch slows down "sort -n" a bit, and (worse) complicates the code
>>  for "sort -n".  There are a lot of gotos.
> Oh no - scary gotos.  Seriously - gotos have their place, and there is
> appropriate usage, especially when it comes to minimizing code
> duplication within a function (don't believe me?  go look at the Linux
> kernel or any significant piece of C code).  In fact, gotos are
> actually quite fast (if you know CPUs, you'll realize an unconditional
> goto is a simple jmp instruction, which does nothing to the CPU
> pipeline - conditional jumps are far more problematic).  Some of those
> gotos are unnecessary (not removing code duplication), but I think
> they make the code easier to read, and I'm quite positive that any
> decent compiler will optimize away any unnecessary jumps.  And code
> readability is far more important than that last 0.1% improvement.
> I challenge you to show me one benchmark where the behaviour is
> demonstrably slower by even a significant percentage (let's say 3%)
> and is not caused by the branching of which comparison to use.
>> Better, I think, would be
>>  to (1) search for a trailing suffix character, and compare based on
>>  that, and then (2) fall back on numeric comparison only if the
>>  trailing suffix characters match.  This will certainly make plain
>>  "sort -n" faster and will result in a less-intrusive change to the
>>  existing code; arguably it will make the new code faster in many
>>  cases, too.
> Did you read my explanation above for your exact "optimization"?
> Please explain to me, in a somewhat detailed manner, how this improves
> behaviour because I'm failing to see the algorithm.
> Here's what I think you're suggesting:
> for (each character c in string a)
>   if (c is not a digit)
>     break
> for (each character d in string b)
>   if (d is not a digit)
>     break
> if (suffix starting at c != suffix starting at d)
>   return which one is greater
> perform the regular sort -n comparison between a & b
> You must be thinking that that last step is somehow magical and free,
> because I see it as once again looping over the string.  You may be
> thinking to yourself - yeah, but there's no expensive comparison
> between the two strings.  Ok - I'll grant you that.  So you've sped up
> the best case where the suffixes don't match (more on this later).
> Now let's look at the worst case - the suffixes match.  Thus you have
> to scan over the string again but this time performing your numeric
> comparison.  So you've made your worst case slower.  I have a feeling
> that this worst case is also likely to be the average case.
> Also, let's take it that you have managed to speed up the best case.
> But have you really?  What have you saved?  The cost of a subtraction
> n times (n is the length of the number) over a span of x comparisons.
> Do you honestly believe that n * x will be sufficiently big in any
> reasonable test case that this saving will actually be realized?  x,
> by the way, will be approximately m log m (where m is the number of
> strings).  So your cost savings over the course of your call to sort
> will be about n * m log m subtractions, for m strings of length n.
> Let's take a reasonable n = 3 (after all, a longer number would on
> average roll over to the next suffix anyway) and say you are sorting
> a million strings.  That's about 3 * 1 000 000 * log2(1 000 000), or
> roughly 60 million subtractions.  Let's be generous here and say that
> a subtraction takes 20 clock cycles (I don't feel like digging up my
> FPGA work to figure out how many clock cycles my implementation took -
> let alone the far better ones used in modern processors).  On a
> modern 2.0 GHz machine, that's 10 ns each.  So every million strings,
> your optimization, in the best case, will shave off a whopping 0.6
> seconds.  Excellent.  Oh - and m has a much larger impact, so sorting
> more strings will be more of a factor than sorting longer strings.
> Now repeat the above analysis for the worst case, and I'm quite
> confident you'll see a much larger hit (although probably still not
> significant at all) in the worst case (suffixes equal).
> Hopefully I didn't embarrass myself with a glaring mistake - it's really late.
>> * Please don't take the address of an often-used local variable; this
>>  hurts optimization, since it often inhibits optimizing that variable
>>  into a register.
> Could you please provide me with context about where this is happening
> (I can't find it in the code).  A line number in the patch would be
> great.
>> * I suggest ignoring any suffix after the 'k' or 'M' or 'G' or whatever;
>>  it's simpler.
> And extremely incorrect - please read the discussion above.
> Essentially, you then run into the problem of malformed input -
> 2Klingons would be treated the same as 2K Klingons.  Granted, this is
> a contrived example, but I hope you see my point.
>> * I also suggest allowing only the upper-case version (e.g., 'P', not
>>  'p'), as this will allow later support for SI prefixes like 'p' (for
>>  "pico") better.  (It's OK to allow both 'k' and 'K' to mean kilo,
>>  though.)
> Agreed, although in this context they are numeric suffixes (pico is
> the unit prefix, but units play no role in this discussion).  The
> problem then becomes that I suddenly have to worry about which
> suffixes can be both and which must be one or the other.
> Additionally, you've got problems like mu for micro, which I'm not
> even sure can be represented in ASCII.  Also, h & da are hardly used,
> I think (and, even worse, da breaks the pattern of 1-character
> suffixes).  So supporting the full set seems questionable.
>> * exponent_order should be a simple indexed lookup like "table[a]";
>>  there shouldn't be a need for a loop checking for "inverse_table[i] ==
>>  a".
> Is the memory-speed tradeoff really worth it?  The table isn't that
> big to begin with.  So you're trading off 256 bytes for (in the best
> case) a savings of, like what, 5 subtractions per suffix comparison?
> I'd probably concede your point on this tomorrow after a night's sleep.
>> Hope this helps.
