[Monotone-devel] Re: [PATCH] pre-cache database file to improve response

From: Joe Wilson
Subject: [Monotone-devel] Re: [PATCH] pre-cache database file to improve response time
Date: Tue, 21 Mar 2006 19:50:41 -0800 (PST)

Hi Nathaniel,

Don't worry, I have given up trying to convince you of the benefits
of pre-loading the db file into the OS cache. I just wanted to clear
up a few points.

On Mon, 20 Mar 2006 23:12:18 -0800 Nathaniel Smith wrote:
> Some other considerations:
>   -- some people have quite large databases; significantly larger than
>      100MB.  Even significantly larger than main memory, in entirely
>      plausible cases.

In such cases you could pre-cache only when the database is under
a certain size threshold, as determined by the user, or only for
certain categories of commands that are likely to hit most pages.
The majority of projects would likely have a database of roughly
100 MB, so why pessimize the common case?
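
To make the threshold idea concrete, here is a minimal sketch in Python
(not the actual patch, which is not shown in this message). The function
name, the 256 MB default and the 1 MB chunk size are my own assumptions
for illustration. It simply reads the file sequentially so the OS has a
chance to cache its pages, and skips files over the limit.

```python
import os
import tempfile


def precache_if_small(path, max_bytes=256 * 1024 * 1024, chunk_size=1 << 20):
    """Read `path` sequentially so the OS can cache its pages.

    Hypothetical helper: files larger than `max_bytes` are skipped.
    Returns the number of bytes read (0 if the file was skipped).
    """
    if os.path.getsize(path) > max_bytes:
        return 0
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)  # data is discarded; only the read matters
    return total


# Demonstration on a throwaway 4 KB file standing in for a database.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 4096)
read_bytes = precache_if_small(tmp.name)                      # under the limit
skipped_bytes = precache_if_small(tmp.name, max_bytes=1024)   # over the limit
os.remove(tmp.name)
```

Whether the pages actually stay in the cache is up to the OS and the
amount of free RAM; the sketch only shows where a user-set threshold
would plug in.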

>   -- there's really no reason to have any confidence that this will
>      help more than it hurts; certainly it helps you on your
>      benchmark, but there are a vast number of situations out there,
>      and my experience with this sort of trick is that it's pretty
>      random and unreliable; sometimes it can cause really
>      disproportionate slowdowns.  For this case, how do we know?
>      (Random example of a different situation: running monotone on a
>      multi-user system.)
>   -- 'checkout' is the best case for this patch, and a fairly uncommon
>      operation; the patch actually pessimizes for 'status', 'diff',
>      'commit', etc., which are far more common.  (Because these are
>      bound by reading the workspace, not the db, and this patch
>      will tend to trash the OS's cache of the workspace.)

I honestly see no pessimization of the other commands you've listed.
I see some (admittedly marginal) improvement even with a cold cache
and cold filesystem. Once the db file is in the OS cache there is
no noticeable difference.

>   -- we're putting significant work into making monotone's disk
>      format less seeky, which is the real solution to these sorts of

I do agree that monotone could strongly benefit from a strategy 
to increase the locality of reference of its data.

>      issues, so in the best case this is only a short-term patch
>      anyway... if that turns out not to be the case, then hacks like
>      this can be re-evaluated.
>   -- it's "bad style" -- which, regardless of whether it works or not,
>      tends to give people a bad taste in their mouth, and bias them
>      against monotone.  I think there's a reasonable chance that
>      people would actually rather use a slower program that doesn't
>      do things they perceive as offensive, than a faster one that
>      does...

I find this to be a strange and unconvincing argument.
Monotone developers may find it offensive, but many of its
users who currently have issues with the speed of monotone would not.
If you have the free RAM available, you might as well put it to
use to speed up your operation. Let the user have the freedom
to decide. Why should a user needlessly wait for a command if
they've got 1 or 2 gigs of RAM at their disposal?

> Have you tried running 'vacuum' and 'analyze' on your db?  It would be
> interesting to know if you still see similar speedup.
> You might also try the 8192 page_size... that was committed to
> mainline recently as well.  (Old db's won't be automatically migrated
> and will continue to work fine, anyone who cares can do a dump/reload
> to get the new page size.)

I'm familiar with the PRAGMA page_size=8192 thing. I've been using
it since I first suggested it to this mailing list. Naturally, it
performs better than the 1K page size, but on cold databases it is
still much slower than the pre-cache hack.
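
For anyone who wants to try the page size setting outside monotone,
here is a small sketch using Python's sqlite3 module. The function name
and the `demo` table are made up for the example. It assumes a brand-new
database file, since the pragma has to be issued before the first table
is created.

```python
import os
import sqlite3
import tempfile


def create_db_with_page_size(path, page_size=8192):
    """Create a new SQLite database at `path` with the given page size.

    Hypothetical helper: returns the page size SQLite reports afterwards.
    """
    conn = sqlite3.connect(path)
    try:
        # Must run before any table exists in the new database file.
        conn.execute(f"PRAGMA page_size = {int(page_size)}")
        conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, data BLOB)")
        conn.commit()
        return conn.execute("PRAGMA page_size").fetchone()[0]
    finally:
        conn.close()


# Demonstration in a temporary directory that is cleaned up afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    reported_page_size = create_db_with_page_size(os.path.join(tmp_dir, "demo.db"))
```

An existing database keeps its old page size, which is why the
dump/reload mentioned above is needed for old monotone databases.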

Ironically, ANALYZE and VACUUM just serve to effectively 
pre-cache the monotone database file as well. That's largely 
why monotone users see a speedup. (Okay, VACUUM does reduce 
the size of the file, which is good).

I was going to suggest using CROSS JOINs and manual FROM clause
table ordering as an alternative to relying on SQLite's query
optimizer/ANALYZE command, but when looking at the Monotone
database code I did not see any SELECTs involving more than
a single table. If this is the case, then ANALYZE will not help
monotone whatsoever. Try running ANALYZE and then dropping
the sqlite_stat1 table and confirm that you've made your
very own offensive pre-cacher!  ;-)
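
The experiment above can be sketched with Python's sqlite3 module. The
`files` table and its index are stand-ins I made up, not monotone's
schema. The sketch only shows the mechanics: ANALYZE creates the
sqlite_stat1 table, and dropping it discards the statistics again. Any
timing comparison would have to be done against a real database file.

```python
import sqlite3


def stat_table_exists(conn):
    """Return True if the sqlite_stat1 statistics table is present."""
    row = conn.execute(
        "SELECT count(*) FROM sqlite_master "
        "WHERE type = 'table' AND name = 'sqlite_stat1'"
    ).fetchone()
    return row[0] == 1


# Autocommit mode keeps the example free of transaction bookkeeping.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX files_name ON files(name)")
conn.executemany(
    "INSERT INTO files (name) VALUES (?)",
    [(f"file{i}",) for i in range(100)],
)

conn.execute("ANALYZE")                  # scans the data, writes sqlite_stat1
present_after_analyze = stat_table_exists(conn)

conn.execute("DROP TABLE sqlite_stat1")  # throw the statistics away again
present_after_drop = stat_table_exists(conn)
conn.close()
```

If a cold database is still faster after the drop, the gain came from
ANALYZE having read the file, not from the statistics.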

For the record, I am using Windows on an NTFS filesystem. I saw
some speculation on IRC that I was using a Mac. Sadly, this is 
not the case.
