Re: [bug-recutils] GSoC: Ideas for Recutils

From: Michał Masłowski
Subject: Re: [bug-recutils] GSoC: Ideas for Recutils
Date: Sat, 24 Mar 2012 14:07:25 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

> I've read that you need a student to add support for indexes to
> recutils. I think I'm enough experienced to do it. But I also consider
> it's not a very difficult task. Is there maybe some problems to realize
> this ability in recutils?

I don't know whether GSoC projects are expected to be difficult.  I see
many possible designs to implement and measure for this task, so it
might take as much time as a more difficult one.

For complex queries there are many ways to use indices, and tree and
hash indices have different performance benefits.  Which is better
depends on the data.  Maybe the index could be built in a way optimized
for previously run queries, without any manual specification of what to
store there.

Since any write practically requires rewriting the database (indices are
optional), index formats which need a complete rebuild on every change
might not be too slow for use with recutils, although they aren't used
in traditional database systems.
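
To make the rebuild-on-change idea concrete, here is a minimal sketch of
such an index: a sorted array of (key, offset) pairs that is simply
re-sorted after every write and searched by binary search.  The names
index_entry, index_rebuild and index_lookup are my own and are not part
of recutils or librec; a real index would also need an on-disk format.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: a field value and the byte offset of the
   record that contains it.  Not a recutils type. */
struct index_entry {
    const char *key;
    long offset;
};

/* Order entries by key so that bsearch() can be used on them. */
static int compare_entries(const void *a, const void *b)
{
    const struct index_entry *ea = a;
    const struct index_entry *eb = b;
    return strcmp(ea->key, eb->key);
}

/* Complete rebuild: sort every entry again.  This costs O(n log n) and
   would run after each write to the database. */
void index_rebuild(struct index_entry *entries, size_t count)
{
    assert(entries != NULL || count == 0);
    if (count > 1)
        qsort(entries, count, sizeof *entries, compare_entries);
}

/* Lookup by exact key in O(log n).  Returns the record offset, or -1
   if no entry has this key. */
long index_lookup(const struct index_entry *entries, size_t count,
                  const char *key)
{
    struct index_entry probe;
    const struct index_entry *hit;

    assert(key != NULL);
    if (count == 0)
        return -1;
    probe.key = key;
    probe.offset = 0;
    hit = bsearch(&probe, entries, count, sizeof *entries, compare_entries);
    return hit != NULL ? hit->offset : -1;
}
```

A sorted array like this cannot be updated cheaply in place, which is
why traditional systems prefer B-trees, but that cost does not matter
much if the whole database file is rewritten on every change anyway.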

Writing good performance tests, which approximate what a real, useful
program does with a big database, is probably necessary for this task.
I don't know of any existing uses of recutils with databases large
enough for this task to matter.

The only problem I have found so far is that the database is completely
read and parsed before use.  Changing this would be needed to make
indices useful with recsel.  I don't expect this to be more difficult
than the other parts of the task.
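
As a rough illustration of what avoiding the full parse could look like,
the sketch below seeks to a byte offset (as an index would supply) and
reads only the one record found there, relying on the rec format's rule
that a blank line ends a record.  read_record_at is a hypothetical
helper of mine, not an existing librec function, and it returns the raw
text without parsing fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of librec: copy the text of the record
   starting at byte `offset` into `buf`.  Reading stops at the first
   blank line or at end of file.  A record longer than the buffer is
   silently truncated.  Returns 0 on success, -1 if seeking fails. */
int read_record_at(FILE *fp, long offset, char *buf, size_t size)
{
    size_t used = 0;

    assert(fp != NULL && buf != NULL);
    if (size == 0 || fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    buf[0] = '\0';

    /* Append one line at a time while there is room left. */
    while (used + 1 < size
           && fgets(buf + used, (int)(size - used), fp) != NULL) {
        if (buf[used] == '\n') {
            /* A blank line separates records: drop it and stop. */
            buf[used] = '\0';
            break;
        }
        used += strlen(buf + used);
    }
    return 0;
}
```

With this, a query answered from the index touches only the bytes of the
matching records instead of the whole file.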

The ideas page mentions determining whether the index is up to date.  I
don't see any practical solution other than using the filesystem
metadata of the database file (checksumming the file contents should be
much slower than doing a simple query using a tree index).
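
A minimal sketch of the metadata check, assuming a POSIX system: the
index stores the size and modification time of the database file at
build time, and is treated as stale when either value differs.  The
names index_stamp, index_stamp_read and index_is_current are
hypothetical, and which fields to compare is my assumption, not
something the ideas page specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical stamp saved alongside the index when it is built. */
struct index_stamp {
    off_t size;    /* size of the database file in bytes */
    time_t mtime;  /* last modification time of the database file */
};

/* Fill `out` from the file's current metadata.
   Returns 0 on success, -1 if stat() fails. */
int index_stamp_read(const char *path, struct index_stamp *out)
{
    struct stat st;

    assert(path != NULL && out != NULL);
    if (stat(path, &st) != 0)
        return -1;
    out->size = st.st_size;
    out->mtime = st.st_mtime;
    return 0;
}

/* Returns 1 if the database still matches the saved stamp, 0 if it
   changed or can no longer be examined. */
int index_is_current(const char *db_path, const struct index_stamp *saved)
{
    struct index_stamp now;

    assert(saved != NULL);
    if (index_stamp_read(db_path, &now) != 0)
        return 0;
    return now.size == saved->size && now.mtime == saved->mtime;
}
```

This costs one stat() call per query.  It can miss a change that keeps
the same size within the timestamp resolution, which is the price paid
for not checksumming the contents.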

(I'm writing this as a student interested in implementing this.  I
don't have practical experience in implementing databases, but I know C
and I can implement the data structures useful for indices.)
