
SoC application Updated

From: Walter Mundt
Subject: SoC application Updated
Date: Mon, 08 May 2006 16:03:26 -0400
User-agent: Mail/News 1.5 (X11/20060228)

I've updated my Summer of Code application.

This is the last day for submissions, so if anyone has any final items they think I need to address, now is the time to bring them up. Updated application text follows:

Project Name: GNU findutils - slocate compatibility and other enhancements

  Enhance locate
    1. Enhance locate to understand the database format used by
         Implement a replacement for the current updatedb shell script
           which does pretty much the same thing but is less ugly.
           Don't introduce a dependency on anything not in the base
           system install (i.e. /bin/sh and C are OK, but Perl
           probably isn't).
         Add updatedb functionality to traverse the filesystem as
           root, preserving enough permissions information to allow
           us to provide the same functionality as slocate. Use the
           same database format as slocate unless there is a reason
           not to.
  Enhance find
    Add tests which allow [acm]time to be compared against a specified
      timestamp, as opposed to the timestamp of a file (-newer) or an
      age (-mtime). Add relevant tests to the test suite and document
      the changes.
    Instrument find to allow us to improve the guesses that parser.c
      makes for struct predicate . est_success_rate. Measure the (lack
      of) performance increase in find 4.3.x with optimisation turned
  Enhance xargs
    1. Implement an optional feature in which xargs figures out how
      long a command line it can pass to exec() without necessarily
      believing ARG_MAX (because for example with the Linux kernel
      this can be an underestimate).
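
To illustrate the xargs item: one way to avoid trusting ARG_MAX is to probe, trying longer and longer command lines and keeping the longest one that exec() accepts. The sketch below is in Python rather than C, purely for brevity. The names find_max and exec_accepts are hypothetical, the probe program /bin/true is an assumption, and the real patch may well work differently.

```python
import errno
import subprocess


def find_max(try_exec, lo, hi):
    """Return the largest n in [lo, hi] for which try_exec(n) is true.

    Assumes try_exec(lo) is true and that once a size is rejected,
    every larger size is rejected as well.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if try_exec(mid):
            lo = mid          # mid fits, so the answer is mid or larger
        else:
            hi = mid - 1      # mid is too long, so the answer is smaller
    return lo


def exec_accepts(n_args, program="/bin/true"):
    """Try to exec `program` with n_args arguments of 1 KiB each.

    Assumption: `program` exists and ignores its arguments.
    Returns False only when the kernel rejects the argument list (E2BIG).
    """
    try:
        subprocess.run([program] + ["x" * 1023] * n_args, check=False)
        return True
    except OSError as e:
        if e.errno == errno.E2BIG:
            return False
        raise
```

Calling find_max(exec_accepts, 1, 1 << 16) would then report roughly how many 1 KiB arguments the running kernel accepts, which can be compared against the advertised ARG_MAX. Each probe costs a fork and an exec, which is one reason to keep the feature optional.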

Benefits to the Community

Each of these enhancements has its own benefits:
slocate compatibility: slocate compatibility will reduce user
  confusion and add important new capabilities to a tool that
  is installed by default on a vast number of Linux distributions.
updatedb replacement: a new updatedb will be easier to maintain,
  and may be faster as well.  Adding new capabilities or locate
  database enhancements will be less of a chore once the tool
  that produces the database is better-designed.
xargs enhancement: this appears to be primarily a performance
  improvement for large-scale xargs usage.  Because it is also
  the easiest enhancement to implement, the smaller benefit is
  acceptable.

  - NOTE: all patches are to include updates to all relevant
  - Patch to xargs to add optional automated ARG_MAX recalculation.
  - Patch for find to add new options for checks of [acm]time vs.
    a particular time/date.  Names and syntax to be discussed with
    project mentor(s).  To include test cases as needed.
  - Patch to find to add est_success_rate
  - New updatedb, either a C program or a clean shell script.  The new
    version will be capable of generating both current locate and
    slocate-style databases.
  - Patch for locate to add slocate compatibility and (in the presence
    of a slocate-style database) functionality.
  - Extra: if all of the above get done with time to spare, work on an
    additional patch to locate/updatedb to add ACL and support to the
    security-checking mechanism.
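
As a sketch of what the [acm]time patch listed above would test: the comparison is between one of a file's three timestamps and an absolute date, rather than another file's timestamp or an age. The function names and the timestamp format below are placeholders, since the actual option names and syntax are still to be discussed, and Python is used only to keep the example short.

```python
import os
from datetime import datetime

# Map the [acm] letter to the corresponding stat field.
_FIELDS = {"a": "st_atime", "c": "st_ctime", "m": "st_mtime"}


def parse_timestamp(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' as local time into seconds since the epoch.

    The format is an assumption made for this example only.
    """
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S").timestamp()


def time_newer_than(path, which, text):
    """Return True if the chosen timestamp of `path` is later than `text`.

    `which` is 'a', 'c' or 'm'.  lstat() is used so that a symbolic
    link is tested itself rather than its target.
    """
    st = os.lstat(path)
    return getattr(st, _FIELDS[which]) > parse_timestamp(text)
```

For example, time_newer_than("parser.c", "m", "2006-05-08 16:03:26") is true only if parser.c was modified after that moment. The comparison is strict, so a file stamped at exactly that second does not match.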

Implementation Plan:

Start with the xargs patch, which should be relatively quick.  Discuss
and prototype the new find predicates next; if there are issues, work
on this in parallel with the next item.  Continue working with find,
on the est_success_rate improvements.  After that, compare the locate
and slocate updatedb implementations and decide exactly how to
approach writing the new updatedb.  Finally, write the new version of
updatedb and the supporting changes to locate in parallel.
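
For the est_success_rate step, the instrumentation amounts to counting how often each predicate is evaluated and how often it succeeds, then comparing the observed rate with the built-in guess. A minimal sketch of that idea follows, in Python rather than find's C; the class and its fields other than est_success_rate are hypothetical and not taken from parser.c.

```python
class InstrumentedPredicate:
    """Wrap a test function and record how often it succeeds."""

    def __init__(self, name, test, est_success_rate):
        self.name = name
        self.test = test
        self.est_success_rate = est_success_rate  # the a-priori guess
        self.evaluations = 0
        self.successes = 0

    def __call__(self, entry):
        # Count every evaluation and every success.
        self.evaluations += 1
        ok = bool(self.test(entry))
        if ok:
            self.successes += 1
        return ok

    def observed_rate(self):
        """Fraction of evaluations that succeeded, or None if never run."""
        if self.evaluations == 0:
            return None
        return self.successes / self.evaluations
```

Running a wrapped predicate over a representative file tree and printing est_success_rate next to observed_rate() would show which guesses are furthest from reality.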

Post-Completion Plans:
How much of a role I take with the findutils project afterwards will depend largely on how much I enjoy working with the code over the summer. However, as a minimum I will maintain the new updatedb and database-reading code for at least 6-8 months after the SoC, or until any issues seem to be ironed out, whichever is longer.


I find this project appealing because, as a regular user of all of
these tools, I can really see myself making use of them.  They're
also in a domain of which I have a very good understanding.

