monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Netsync performance improvement patch


From: Eric Anderson
Subject: Re: [Monotone-devel] Netsync performance improvement patch
Date: Mon, 15 Aug 2005 01:36:24 -0700

Nathaniel Smith writes:
 > On Sun, Aug 14, 2005 at 10:16:59PM -0700, Eric Anderson wrote:
 > >  > I assume that a boost::circular_buffer isn't suitable since
 > >  > that doesn't have contiguous storage?
 > > 
 > > Yes, the rest of the monotone code makes assumes that buffers are
 > > contigous that the various buffers are all contigous.  In theory
 > > switching to entirely non-contigous buffers would be even better from
 > > a copy/memory usage standpoint, but that fix was going to be very
 > > invasive, and if this patch couldn't get in, one that changes way more
 > > seemed even less likely to get in.
 > 
 > This is a bit tangential, but could you give a pointer to some
 > examples of the code you mean?  I'm just curious what you mean here.

The best example is in hmac.cc which casts str.data() to a
Botan::byte; this cast assumes contiguousness of the data.

I was imprecise in my statement that monotone assumes buffers are
contiguous.  Monotone uses strings to represent buffers and data.  C++
strings are contigous, and the code takes advantage of this in that
operations like indexing into the string are fast enough to be free.

For example, the functions in netio.hh that use string[variable] could
work correctly if the buffer was not contigous, but substantially more
slowly.  Tracking the offset through the buffer as an integer offset
and extracting strings relative to the position would also go more
slowly.

The main problem as I saw it with a non-contigous implementation was
that it would be more complex than the contigous one, and that
complexity would only give a benefit if the non-contigous structure
was pushed much further into the code.  Examples of complexity: []
now has to step through the list, decrementing the offset as
appropriate, substr() has to manually construct the return string by
copying bit by bit into the output string.  

Netcmd::read extracts the payload as a contigous string, which would
end the benefit of storing the network data in a non-contigous manner.
The benefit could be extended if the payload was represented as a
reference to the non-contigous structure, and any conversion to a
contigous structure was delayed.  However, this delay would require
reference counting on the non-contigous structure so that it could be
properly shared.  

In the long term, using a non-contigous data-structure to represent
"strings" throughout the code is probably the best way to structurally
reduce the memory usage, and could result in a performance boost if the
lack of fast random access is taken into account.

However, in the short term, it seemed that closely matching the
current assumption that buffers are strings, was a better choice.
This choice reduced the code changes to just changes in the variable
types.  The implementation with a character array and pointers to the
front and the back with copying and resizing at the appropriate points
also seemed like a simpler implementation.

Hopefully this was a clearer explanation of what I was trying to say.
        -Eric





reply via email to

[Prev in Thread] Current Thread [Next in Thread]