Re: [bug-gawk] asort/asorti documentation issues


From: Andrew J. Schorr
Subject: Re: [bug-gawk] asort/asorti documentation issues
Date: Mon, 18 Nov 2013 12:05:18 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

Hi Arnold,

On Fri, Nov 15, 2013 at 11:15:02AM +0200, Aharon Robbins wrote:
> > > How does this look as a patch?
> >
> > This (obviously) looks much better, since it's no longer incorrect. :-).  
> > Thanks for coming up with a patch. 
> >
> > In addition to saying that "@unsorted" makes no sense in this context,
> > would it be helpful to point out that only the "@val_.*" choices are
> > useful?  In other words, the "@ind_.*" and "@unsorted" values should
> > all be avoided.
> 
> Good point.

Actually, I was wrong about that.  The actual situation is quite a bit
more complicated.

> > Particularly when using asorti, one might be tempted to use an argument of
> > "@ind_.*", but it appears not to work for me.  Or am I confused about this?
> 
> Can you investigate this a bit?  I'm up to my ears in Real Life,
> and you're familiar enough w/the code that you should be able to see
> what's really going on.  John and Pat did that stuff back during the
> 4.0 development and I haven't really delved into it.

I was indeed confused.

First of all, I should note that the documentation in the "Array Sorting
Functions" section is closer to the truth than the text under "String
Functions".  I wonder if it is wise to try to document these functions
in both places...

Having examined the code, I think this is how it works:

- If a 3rd argument is not supplied, default to "@val_type_asc" for asort
  and "@ind_str_asc" for asorti.  Side note: I don't see this documented
  anywhere.
- Call assoc_list to do the sorting.  This function first calls the "alist"
  method to flatten the associative array into a linear C array of
  <index>, <value> pairs.  It then calls qsort to sort that array.  So
  the sort function is free to examine the <index> or <value> or both!
- The asort_actual function then extracts the values it wants from the
  sorted array returned by assoc_list.  For regular asort, it takes the
  values, and for asorti, it takes the indices.

This has interesting implications.  One can call asorti with a 3rd argument
that examines the values, not the indices.  The returned array will contain
the index values, but they will be in the sort order of the values.  This
capability was very useful for me in a script I just wrote.
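
The three steps above, and the consequence just described, can be modelled
with a small C sketch.  This is a toy model under stated assumptions, not
gawk's actual code: struct pair, the two comparators, sort_pairs,
take_indices and take_values are all names invented here for illustration,
and element values are simplified to doubles.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One element of the flattened array.  Both halves stay visible to the
 * comparator, which is why a sort can look at index, value, or both. */
struct pair {
    const char *index;
    double value;
};

/* Hypothetical comparator: order by value, break ties by index string. */
int by_value_then_index(const void *a, const void *b)
{
    const struct pair *x = a;
    const struct pair *y = b;

    if (x->value < y->value)
        return -1;
    if (x->value > y->value)
        return 1;
    return strcmp(x->index, y->index);
}

/* Hypothetical comparator: order by index string only. */
int by_index(const void *a, const void *b)
{
    const struct pair *x = a;
    const struct pair *y = b;

    return strcmp(x->index, y->index);
}

/* Sorting step: qsort the flattened pairs with the chosen comparator. */
void sort_pairs(struct pair *list, size_t n,
                int (*cmp)(const void *, const void *))
{
    assert(n == 0 || list != NULL);
    qsort(list, n, sizeof list[0], cmp);
}

/* Extraction step, asorti-style: keep only the indices, in sorted order. */
void take_indices(const struct pair *list, size_t n, const char **out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = list[i].index;
}

/* Extraction step, asort-style: keep only the values, in sorted order. */
void take_values(const struct pair *list, size_t n, double *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = list[i].value;
}
```

In this model, flattening an array with apple = 30, pear = 10 and fig = 20,
sorting it with by_value_then_index and then calling take_indices yields
pear, fig, apple: index strings, but in value order.  The choice of
comparator and the choice of what to extract are independent of each other.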

> And if you want to take my doc patch and finish it off, I'd be
> doubly grateful...

This is a big job. :-)  As I now understand it, both asort and asorti can
sort on the indices or the values or some combination.  The returned array
will contain the values (for asort) or the indices (for asorti), but that
says nothing about what was compared during the sort.  In other words, it is
not accurate to say that "asorti sorts the indices".  The array is sorted
(and the sort function may be comparing values or indices or both), and then
the indices are returned.

So, for example, under "Sorting Array Values and Indices with gawk", it
now says:

   Often, what's needed is to sort on the values of the _indices_
   instead of the values of the elements.  To do that, use the `asorti()'
   function.

I do not think that is an accurate description of what's going on.

Does this all make sense?  The sorting capabilities are quite powerful,
but the documentation is lagging badly.  Thoughts?

Regards,
Andy


