monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] net.venge.monotone.experiment.performance


From: Nathaniel Smith
Subject: Re: [Monotone-devel] net.venge.monotone.experiment.performance
Date: Fri, 4 Aug 2006 23:07:40 -0700
User-agent: Mutt/1.5.12-2006-07-14

On Wed, Aug 02, 2006 at 05:38:58PM -0700, Eric Anderson wrote:
> Nathaniel Smith writes:
>  > Could you say a few more words to convince me of the correctness of
>  > your approach?  I don't totally understand the existing pending write
>  > stuff to comment more knowledgeably, but sqlite already has
>  > bounded-memory write buffering, so if we're not using it we probably
>  > have some reason, and this code makes it so we do silently use it in
>  > some cases, and not in others. 
> 
> The change to buffer pending writes was introduced in
> 54bba40d63b8f4bd8103cf1049b3045a162e540a with a comment that it gives
> about 20% performance gain on initial pull.  This is because (I
> assume) 1) it gets to batch together multiple updates to the database,
> 2) because anything that gets canceled before being written is almost
> free, and 3) multiple writes of the same thing can be handled faster
> in the pending write map than in the database.

The main benefit, actually, is that the common case for writing
files/rosters during netsync is:
  we get new version V
  we delete old version U's full-text from the db, write a backwards
    delta from V -> U, compress V's fulltext, and write that
    compressed fulltext to the db
  we get new version W
  we delete old version V's full-text from the db, write a backwards
    delta from W -> V, compress W's fulltext, and write that
    compressed fulltext to the db
  (repeat many times)

By buffering up writes like this, we reduce the cost of these
insert/delete/insert/delete cycles -- if the old full-text hasn't been
written out yet when the next arrives, we can reduce copying and
stuff.  I wonder if we could defer the compression until that point
too...

> I'm not sure what you mean about "this code makes it so we do silently
> use it in some cases, and not in others".  The patch makes it so that
> we won't ever buffer more than constants::db_max_pending_writes_bytes
> in memory.

Right, I had a chance to actually read over the code today, and I see
I was confused about what was going on.  Your code was in fact fine.
:-)

I merged it to mainline, plus made some formatting fixups (for GNU
style braces), factoring otu of duplicated code, added a comment on
the precondition for cancel_pending_write (and simplified the code,
which was a bit over-defensive -- note that safe_erase I()'s that the
item is in the map), and removed the comments about the write buffer
size vs. the netsync commit interval, which are logically separate.
(The netsync commit interval is mostly tuned to choose how much we
spend waiting for the disk to fsync(), while the write buffer size is
tuned to choose how much stuff we end up with in memory; also, they're
not even measuring the same bytes, really -- netsync looks at bytes
transferred, while the write cache includes reconstructed full
texts...)

While I was there, I went ahead and applied your xdelta.cc
string-growing stuff from .performance as well.  There were a few
braces I had to fix up by hand here too.

In the future, it would be convenient if you could commit different
changes in parallel rather than sequence (possibly even as different
branches), so that they could be merged individually; as it was, I had
to use pluck and remove some extraneous changes, so you may have some
extra work merging back to .performance.

Cheers,
-- Nathaniel

-- 
"...these, like all words, have single, decontextualized meanings: everyone
knows what each of these words means, everyone knows what constitutes an
instance of each of their referents.  Language is fixed.  Meaning is
certain.  Santa Claus comes down the chimney at midnight on December 24."
  -- The Language War, Robin Lakoff




reply via email to

[Prev in Thread] Current Thread [Next in Thread]