monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Re: Monotone speedup by adding additional database


From: Markus Schiltknecht
Subject: Re: [Monotone-devel] Re: Monotone speedup by adding additional database indices?
Date: Fri, 12 Oct 2007 12:37:27 +0200
User-agent: Icedove 1.5.0.12 (X11/20070730)

Hi,

Lapo Luchini wrote:
Taking a better look at indexes of table revision_certs:

CREATE TABLE revision_certs (
    hash not null unique, -- hash of remaining fields separarated by ":"
    id not null,          -- joins with revisions.id
    name not null,        -- opaque string chosen by user
    value not null,       -- opaque blob
    keypair not null,     -- joins with public_keys.id
    signature not null,   -- RSA/SHA1 signature of "address@hidden:val]"
    unique(name, id, value, keypair, signature)
);
CREATE INDEX revision_certs__id_name_value ON revision_certs (id, name,
value);
CREATE INDEX revision_certs__name_value ON revision_certs (name, value);

Those are not two indexes, they are four indeed:
UNIQUE(hash)
UNIQUE(name, id, value, keypair, signature)
KEY   (id);
KEY   (name, value);

Question is: is there a reason the "big unique index" has that order?
I guess so. If that's not the case, we could simply change it to:
UNIQUE(id, name, value, keypair, signature)
and avoid both KEY(id) and the proposed KEY(id, name, value).

AFAICT, the only difference an ordering change of a unique constraint makes, is the difference in the structure of the underlying index. So we'll have to check which indices we really need.

It looks like we mostly need '(id)' and '(name, value)'. I'm going to quickly try these indices, which seem to satisfy most cases:

.. UNIQUE(name, value, id, keypair, signature)
CREATE INDEX revision_certs__id (id);

Side questions: why an UNIQUE on "all data" and an UNIQUE on "hash of
all data"? Assuming the hash doesn't casually collide (and we assume it
all the time) and the ':' separator is adequate they are a bit redundant.

Why do we store that hash in the database at all? Is it so terribly expensive to calculate that we need to cache it in the database? Am I right assuming that we only need that hash during netsync?

I tend to agree with Lapo, that we don't need another unique constraint on that hash.

Regards

Markus





reply via email to

[Prev in Thread] Current Thread [Next in Thread]