findutils: some possible enhancements

bug-findutils

From:	Wolfgang Friebel
Subject:	findutils: some possible enhancements
Date:	Thu, 16 Aug 2001 16:31:46 +0200 (MET DST)

A long time ago I made some proposals to enhance the findutils package.
Looking at the latest release, 4.1.7, I see that at least one of those
ideas (a sorting find) is still on the TODO list.

Since I believe that the changes I made to the findutils package (4.1) are
at least worth considering, I would like to list my ideas once more.
If any of them sound interesting to you, I can try to provide patches
against the current release.

My proposals were (context diffs against 4.1 are in
ftp://ftp.ifh.de/pub/unix/gnu/findutils-4.1.enhanced.tar.gz):

   locate: The -d option in locate should come with a default path
        (a compiled-in default and an environment variable).
        Otherwise the option is of little use to the average user, as the
        location of the databases is not generally known and takes a long
        time to type in. (We have a convention of keeping one database per
        machine, so that locate -d hostname ... searches for files on a
        given machine.)
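
A minimal sketch of that lookup, in Python rather than the C of the actual
patch. LOCATE_PREFIX is the variable named in the change list further down;
the default directory, the function name, and the rule that the environment
variable takes precedence over the compiled-in default are assumptions made
for illustration only.

```python
import os

# Hypothetical compiled-in default; a real build would set this at configure time.
DEFAULT_PREFIX = "/var/lib/locatedb"


def resolve_database(name, environ=None):
    """Map the argument of `locate -d` to a database file path."""
    environ = os.environ if environ is None else environ
    # An argument that contains a path separator is treated as an explicit path.
    if os.sep in name:
        return name
    # Assumption: the environment variable, when set, wins over the default.
    prefix = environ.get("LOCATE_PREFIX", DEFAULT_PREFIX)
    return os.path.join(prefix, name)
```

With such a rule, locate -d hostname would resolve to the file named
hostname inside the prefix directory, while a full path keeps working as
before.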

   bigram/code (old method):
        Most of the code of the bigram program is already contained in the
        code utility. The idea is to have code recalculate the bigram
        frequencies on the fly while it encodes the database with the old
        bigram table, and then to replace the old bigram file with the new
        one. This is justified by the fact that the database contents tend
        to be rather stable. Even an empty table can serve as a starting
        point: either the coding step is simply repeated, or the database
        stays somewhat (40%) larger. This would make the bigram utility
        obsolete and the coding process more transparent. It also removes
        the sort command as a source of failures in the bigram calculation.
        The sorted output of find can then be piped directly into the code
        command without creating huge temporary file lists, which are a
        further possible point of failure.
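
The counting half of this idea might look roughly like the sketch below. It
is not the code from the patch: the function names are made up, the table
size of 128 is an assumption about the old database format, and the step
that would actually write each entry using the old table is only marked by
a comment.

```python
from collections import Counter


def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def recount_bigrams(sorted_paths, table_size=128):
    """Return the most frequent bigrams seen in one pass over sorted paths."""
    counts = Counter()
    previous = ""
    for path in sorted_paths:
        # Only the part not shared with the previous entry would be stored,
        # so only that part is counted.
        tail = path[common_prefix_len(previous, path):]
        for i in range(0, len(tail) - 1, 2):
            counts[tail[i:i + 2]] += 1
        # (A real encoder would emit the entry here, using the OLD table.)
        previous = path
    return [bigram for bigram, _ in counts.most_common(table_size)]
```

The returned list would be saved as the new bigram file and used on the
next run, which is why an empty starting table only costs one extra pass
or a larger database.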

   find: The last remaining weak point in the updatedb shell script is the
        use of sort to get a sorted file list. Especially on large file
        servers, sort runs "out of sort space". The simplest way to avoid
        this step is to generate an already sorted list of files, by
        sorting each directory on request within find and then descending
        the sorted tree. To control this behaviour I introduced two new
        find directives, -sort and -isort (for case-insensitive sorting),
        and added sort code to find. This further simplifies the updatedb
        script.
        In conjunction with AFS it is useful to have an option that
        directs find to stay on an AFS volume. I have added an option
        -xvol, in analogy to -xdev (stay on device).
        For some tasks it was useful to have sorted find output in which,
        within a given directory, files are printed first and subdirectories
        only afterwards. While this might sound esoteric, I nevertheless
        added an option -dirslast.
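
To illustrate the traversal order, here is a small Python sketch of sorting
each directory before descending. It only mimics the behaviour described for
-sort, -isort and -dirslast; the function and parameter names are
hypothetical and the patch itself is C code inside find. Note that ordering
by directory entries is not always identical to a byte-wise sort of the full
path strings, because "/" sorts after some characters that may appear in
file names.

```python
import os


def sorted_walk(root, ignore_case=False, dirs_last=False):
    """Yield root and everything below it, sorting each directory first."""
    key = str.lower if ignore_case else None
    yield root
    yield from _walk(root, key, dirs_last)


def _walk(directory, key, dirs_last):
    try:
        names = sorted(os.listdir(directory), key=key)
    except OSError:
        return  # unreadable directory: skip it
    deferred = []
    for name in names:
        path = os.path.join(directory, name)
        is_dir = os.path.isdir(path) and not os.path.islink(path)
        if is_dir and dirs_last:
            deferred.append(path)  # print plain files first
            continue
        yield path
        if is_dir:
            yield from _walk(path, key, dirs_last)
    for path in deferred:
        yield path
        yield from _walk(path, key, dirs_last)
```

Only one directory listing per level is held in memory at a time, which is
the point of the proposal: no global sort over the complete file list is
needed.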

Major changes in my modified findutils version 4.1

* find:
** New options -isort and -sort to get sorted output.

* locate:
** updatedb takes advantage of options -sort and -print0 of find.
** optional compression of database by gzip.
** locate understands gzipped databases.
** A default search path for databases can be specified by environment
   variable LOCATE_PREFIX.
** Added code to make consistency checks for old database format.
** Option -0 or --null introduced to be compliant with xargs --null
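
Two of the locate items above can be sketched briefly in Python. This is an
illustration under assumptions, not the patched C code: the function names
are hypothetical, gzip detection is done here by checking the two gzip magic
bytes, and the second helper only shows what NUL-terminated output for
xargs --null would look like.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_database(path):
    """Open a database for reading, whether or not it is gzip-compressed."""
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rb")
    return open(path, "rb")


def format_matches(matches, null_terminated=False):
    """Join matching file names (bytes) with newline or NUL terminators."""
    terminator = b"\0" if null_terminated else b"\n"
    return b"".join(match + terminator for match in matches)
```

NUL termination keeps file names containing spaces or newlines intact when
the output is passed on to xargs --null.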


Best regards
-- 
Wolfgang Friebel
    Deutsches Elektronen-Synchrotron DESY |  Phone:  +49 33762 77372  |
    Platanenallee 6                       |  Fax:    +49 33762 77216  |
    D-15738 Zeuthen  Germany              |  E-Mail: address@hidden   |
