
Re: Use of CVS on large scales


From: Paul Sander
Subject: Re: Use of CVS on large scales
Date: Fri, 8 Jun 2001 12:45:55 -0700

--- Forwarded mail from address@hidden

>Do you have specific experience with CVS 'breaking down and becoming unusable'?
>If so, please share that experience here so others may learn from it.

>'Developing at the same time' is a misnomer in this context.

>CVS' transitory file locking (the # files that get left hanging around after a
>transient network blip) occurs during commit only (right Larry or Derek?), and
>thus would be a factor if and only if all developers were attempting to commit
>to the same file at the same time.

Nope.  Read locks are placed in the repository to prevent concurrent write
locks from being created.  If a read lock is left abandoned, subsequent
commits to that directory are disallowed.  If a write lock is left behind,
all access to that directory is lost.
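
For anyone cleaning up after a crashed client, here is a minimal sketch
(Python) that reports abandoned locks in a repository tree.  It assumes
the stock lock names (the #cvs.lock directory, plus #cvs.rfl.* read locks
and #cvs.wfl.* write locks) and an arbitrary 15-minute staleness
threshold; verify both against your CVS version before deleting anything.

    #!/usr/bin/env python
    # Sketch: list CVS locks that look abandoned.  Lock names and the
    # staleness threshold are assumptions -- confirm before acting.
    import os
    import sys
    import time

    MAX_AGE = 15 * 60  # seconds; anything older is suspect (assumption)

    def find_stale_locks(repo_root):
        now = time.time()
        for dirpath, dirnames, filenames in os.walk(repo_root):
            # #cvs.lock is a directory; read/write locks are files.
            for name in dirnames + filenames:
                if name == "#cvs.lock" or \
                   name.startswith(("#cvs.rfl", "#cvs.wfl")):
                    path = os.path.join(dirpath, name)
                    age = now - os.stat(path).st_mtime
                    if age > MAX_AGE:
                        yield path, age

    if __name__ == "__main__":
        for path, age in find_stale_locks(sys.argv[1]):
            print("%s (idle %d min)" % (path, age // 60))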

>In practice, this isn't going to occur on large projects.  In fact, as projects
>grow, the likelihood of [read this as Risk of] commit time contention 
>decreases.
>This is due to the naturally occurring segmentation of work that occurs as
>projects scale up.  If multiple developers are committing against the same
>file(s) - especially at the same time - the project is operating in a
>communication void and is doomed for reasons that have nothing to do with
>version control.

Some shops define large modules, and assign specific parts to individual
developers.  Sometimes developers misuse CVS and perform commits from the
top of the module when it's not needed.  But sometimes merges are done,
and they touch vast portions of the code.  In that case, it's necessary
to lock the entire module.
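
Since CVS locks the repository one directory at a time, the footprint of
a commit is just the set of directories it touches.  A rough sketch
(Python; the file list below is hypothetical) of why a post-merge,
top-of-module commit ends up locking nearly everything:

    import os

    def lock_footprint(changed_files):
        """Repository directories a CVS commit would need to lock:
        one lock per directory containing a changed file."""
        return set(os.path.dirname(f) or "." for f in changed_files)

    # A merge typically touches files scattered across the module, so
    # the commit locks nearly every directory in it:
    merge_commit = ["src/parse.c", "src/lex.c", "lib/util.c",
                    "doc/spec.txt"]
    print(sorted(lock_footprint(merge_commit)))
    # -> ['doc', 'lib', 'src']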

While it could be argued that modules should be small, in practice there are
many cases where that just isn't practical.

>I argue that CVS scales very well.  I have yet to see or hear concrete evidence
>that contradicts this.  My personal experience with CVS shows that it performs
>as expected up to 70 developers.  I have yet to encounter a project that was
>larger [in number of developers].  In contrast, I have seen Clearcase perform
>poorly with as few as three developers.  Why is this?

I have used CVS with as many as 250 developers on a single project, and
it was functional for the most part.  I've used both CVS and ClearCase on
projects involving about 75 developers.  In my experience, ClearCase
performs well.  I find that network bandwidth and latency are the biggest
factors in performance, followed by the load on the view servers.  Use
a slow network and a heavily loaded, underpowered server, and it follows
naturally that ClearCase will perform poorly.

I've also had people complain about the speed at which CVS populates a
user's sandbox.  This happens for the same reasons.

The only other place where people complain about performance is with large
builds performed in workspaces stored on remote machines.  This happens both
with CVS and ClearCase.  Builds always run faster when everything the user
needs is on a local filesystem.

If ClearCase performs poorly with only three users, it's not due to ClearCase.
It's a problem with the network, servers, or possibly high-overhead triggers
that the shop has installed.

>- CC's system overhead [mvfs, view server process, vob server comm] makes
>otherwise trivial operations time consuming.   This frustrates developers and
>they then try to circumvent the system.

Again, the network is the biggest factor here, followed by server load.
Using snapshot views will speed builds.  If you're after metadata, then
performance is comparable to that of any relational database system (and
is perhaps better, because the Raima engine has been optimized for
ClearCase).  CVS is way slower than ClearCase in that regard, if CVS can
even deliver the kind of info you want.

>- The concept of views is poorly understood.  It is confusing and misleading to
>NOT see changes that have been committed to the repository as soon as they are
>committed (or when you do an update).  This particular 'feature' has cost
>projects untold man hours in lost productivity.  Your version control system
>should enable you to make better decisions, not force you to behave in
>unnatural ways to obtain accurate information.

It sounds like there was a poor process implemented at that shop, accompanied
by poor training.  The CVS user model isn't exactly easy to understand for
most developers, either.

By default, the ClearCase user model delivers source code to the users at
the moment it's checked in.  You actually have to work to give the developers
a static working environment, but the effort is not great.

>- The extremely delicate nature of the VOB makes me shudder.  Virtually any
>event outside my control - network transients, power spikes, dips, or loss,
>disk drive latency - could damage the VOB's structure during a write operation.
>You then have to manually weave together a restore of the VOB (and maybe views,
>if they were stored on the VOB server) etc. etc. etc.  This process takes hours
>and hours, and maybe days.  I have specific experience with exactly this
>nightmare - twice in six years.  Losing a VOB is a reality.  That's why the
>cottage industry of CC administration has developed.  As a manager, you want
>your crack team of hired guns ready to deal with this when it happens.

Keep in mind that hardware failures affect CVS' operation as easily as they
affect ClearCase's.  However, I've experienced all of these problems multiple
times over six years and have yet to have a database corruption.  I have had
two source container corruptions, akin to RCS file corruptions, but they were
very easy to fix using a procedure that was documented in the ClearCase admin
guide.

--- End of forwarded message from address@hidden



