GNU Summer of Code application
Wed, 03 May 2006 05:11:44 -0400
Thunderbird 184.108.40.206 (Windows/20060308)
Earlier I submitted a SoC application to work on the suggested findutils
Leslie Polzer (who I presume is reviewing many of the GNU apps since she
doesn't appear in the bug-findutils archives) submitted a question on
How much of a role I take afterwards will depend largely on how much I
enjoy working with the findutils code over the summer. However, I'm
certainly willing to commit to maintaining the new updatedb and
database-reading code for at least 6-8 months after the SoC, or until
any issues seem to be ironed out, whichever is longer.
What are your plans when you have finished this project?
Will you take a maintenance role and/or developer role?
I've decided to put the app-as-submitted up on the mailing list for
discussion/critique, so that I can make improvements before I resubmit
with that answer (unless I get some feedback indicating that this won't
Here is the application as I submitted it, except for some line-wrapping
Project Name: GNU findutils - slocate compatibility and other enhancements
1. Enhance locate to understand the database format used by slocate.
Implement a replacement for the current updatedb shell script
which does pretty much the same thing but is less ugly. Don't
introduce a dependency on anything not in the base system
install (i.e. /bin/sh and C are OK, but Perl probably isn't).
Add updatedb functionality to traverse the filesystem as root,
preserving enough permissions information to allow us to
provide the same functionality as slocate. Use the same
database format as slocate unless there is a reason not to.
Add tests which allow [acm]time to be compared against a specified
timestamp, as opposed to the timestamp of a file (-newer) or an
age (-mtime). Add relevant tests to the test suite and document
Instrument find to allow us to improve the guesses that parser.c
makes for struct predicate . est_success_rate. Measure the (lack
of) performance increase in find 4.3.x with optimisation turned
1. Implement an optional feature in which xargs figures out how long
a command line it can pass to exec() without necessarily believing
ARG_MAX (because for example with the Linux kernel this can be an
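
To illustrate the xargs item, here is a rough sketch of one way to probe the usable command-line length instead of trusting ARG_MAX. It assumes that acceptance is monotone in the total argument size and binary-searches for the largest size a probe accepts. The names find_max_accepted, probe_with_exec, probe_fixed_limit and CHUNK are hypothetical and do not come from xargs. The byte count ignores the environment and per-pointer overhead, which the kernel also counts, so the result is only an approximation.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback: returns 1 if `total` bytes of arguments were accepted. */
typedef int (*probe_fn)(size_t total, void *ctx);

/* Largest size in [lo, hi] that the probe accepts, assuming acceptance is
 * monotone (every size below an accepted size is also accepted).
 * Returns 0 if even `lo` is rejected. */
size_t find_max_accepted(size_t lo, size_t hi, probe_fn probe, void *ctx)
{
    if (lo > hi || !probe(lo, ctx))
        return 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;  /* round up so the loop ends */
        if (probe(mid, ctx))
            lo = mid;       /* mid works: the answer is mid or larger */
        else
            hi = mid - 1;   /* mid fails: the answer is below mid */
    }
    return lo;
}

#define CHUNK 1024  /* bytes per argument including the NUL (assumption) */

/* Real probe: fork and try to exec "true" with about `total` bytes of
 * arguments. Accepted only if the child ran and exited with status 0. */
int probe_with_exec(size_t total, void *ctx)
{
    (void)ctx;
    size_t nargs = total / CHUNK;
    char **argv = calloc(nargs + 2, sizeof *argv);
    char *filler = malloc(CHUNK);
    int status = 0, ok = 0;

    if (!argv || !filler) {
        free(argv);
        free(filler);
        return 0;
    }
    memset(filler, 'x', CHUNK - 1);
    filler[CHUNK - 1] = '\0';
    argv[0] = "true";
    for (size_t i = 0; i < nargs; i++)
        argv[i + 1] = filler;   /* every argument shares one buffer */
    argv[nargs + 1] = NULL;

    pid_t pid = fork();
    if (pid == 0) {
        execvp("true", argv);
        _exit(errno == E2BIG ? 2 : 3);  /* only reached if exec failed */
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    free(argv);
    free(filler);
    return ok;
}

/* Stand-in probe for dry runs: accepts sizes up to the limit in *ctx. */
int probe_fixed_limit(size_t total, void *ctx)
{
    return total <= *(const size_t *)ctx;
}
```

A caller would run find_max_accepted(CHUNK, upper, probe_with_exec, NULL) once, with some generous upper bound, and then stay a safety margin below the result.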
Benefits to the Community
Each of these enhancements will have its own benefits, so:
slocate compatibility: slocate compatibility will reduce user
confusion and add important new capabilities to a tool that
is installed by default on a vast number of Linux distributions.
updatedb replacement: a new updatedb will be easier to maintain,
and may be faster as well. Adding new capabilities or locate
database enhancements will be less of a chore once the tool
that produces the database is better-designed.
xargs enhancement: this is primarily a performance improvement
for large-scale xargs usage. The benefit is smaller than the
others, but this is also the easiest enhancement to implement,
so the modest payoff is acceptable.
- NOTE: all patches are to include updates to all relevant
- Patch to xargs to add optional automated ARG_MAX recalculation.
- Patch for find to add new options for checks of [acm]time vs.
a particular time/date. Names and syntax to be discussed with
- Patch to find to add est_success_rate instrumentation/improvements.
- New updatedb, either a C program or a clean shell script. The new
version will be capable of generating both current locate and
- Patch for locate to add slocate compatibility and (in the presence
of a slocate-style database) functionality.
- Extra: if all of the above get done with time to spare, work on an
additional patch to locate/updatedb to add ACL and support to the
security-checking mechanism. Alternately/additionally, add an
option to locate to attempt to stat database hits to check for
“real” access if all the database-supported permission checks pass.
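
As a minimal sketch of the optional "stat the hits" check mentioned in the last item, the filter below drops any database hit that the calling user cannot lstat() right now. The function names hit_is_visible and print_visible_hits are hypothetical, not part of locate. Treating a successful lstat() as "visible" is an assumption about what the final check would mean: it only shows that the path exists and that every parent directory is searchable by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if the calling user can currently lstat() `path`, else 0.
 * lstat() fails if the file is gone or a parent directory is not
 * searchable by the caller, so stale or hidden hits are rejected. */
int hit_is_visible(const char *path)
{
    struct stat st;
    return lstat(path, &st) == 0;
}

/* Prints only the hits that pass the check; returns how many were shown. */
size_t print_visible_hits(const char *const *hits, size_t n, FILE *out)
{
    size_t shown = 0;
    for (size_t i = 0; i < n; i++) {
        if (hit_is_visible(hits[i])) {
            fprintf(out, "%s\n", hits[i]);
            shown++;
        }
    }
    return shown;
}
```

This costs one system call per hit, which is presumably why the item describes it as an option to run only after the cheaper database-supported checks pass.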
Start with the xargs patch, which should be relatively quick. Discuss
and prototype the new find predicates next; if there are issues, work
on this in parallel with the next item. Continue working with find,
on the est_success_rate improvements. After that, compare the locate
and slocate updatedb implementations and decide exactly how to
approach writing the new updatedb. Finally, write the new version of
updatedb and the supporting changes to locate in parallel.
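
For the est_success_rate step, the instrumentation could look roughly like the sketch below: count how often each predicate is evaluated and how often it succeeds, then print the measured rate next to the guess. The struct pred_stats and its functions are hypothetical illustrations, not the actual struct predicate from find. Only the est_success_rate field name comes from the application text.

```c
#include <stdio.h>

/* Hypothetical per-predicate record, for illustration only. */
struct pred_stats {
    const char *name;          /* predicate name, e.g. "-name" */
    float est_success_rate;    /* the parser's guess, between 0 and 1 */
    unsigned long visits;      /* how many times it was evaluated */
    unsigned long successes;   /* how many evaluations returned true */
};

/* Call once per evaluation with the predicate's result. */
void record_eval(struct pred_stats *p, int matched)
{
    p->visits++;
    if (matched)
        p->successes++;
}

/* Observed success rate, or 0.0 if the predicate never ran. */
double measured_rate(const struct pred_stats *p)
{
    return p->visits ? (double)p->successes / (double)p->visits : 0.0;
}

/* One line per predicate: the guess next to the measurement. */
void report(const struct pred_stats *p, FILE *out)
{
    fprintf(out, "%-12s est=%.3f measured=%.3f (%lu/%lu)\n",
            p->name, (double)p->est_success_rate, measured_rate(p),
            p->successes, p->visits);
}
```

Running a report like this over a few representative directory trees would show where the guesses are furthest from the measurements.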
I find this project appealing because, as a regular user of all of
these tools, I can really see myself making use of the enhancements.
The tools are also in a domain that I understand very well.
I'm suited to work on this project because I have a thorough
understanding of C and shell scripting, as well as experience
in using these tools. I'm also a competent generalist programmer:
I competed in this year's International Collegiate Programming
Contest finals. Finally, I do have some experience with Free
Software: I worked on the TWiki collaboration tool. (see twiki.org)
- GNU Summer of Code application,
Walter Mundt <=