From: Mark D. Baushke
Subject: Re: more cvs performance questions (I think they are at least interesting though!)
Date: Tue, 28 Oct 2003 20:15:24 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Richard Pfeiffer <address@hidden> writes:

> BASICS:
>
> We are running cvs-1.11.  I did migrate us to 1.11.9, but it turned out it
> does not mesh with Eclipse, which is what our developers use.  The latest
> upgrade Eclipse can use is 1.11.6.  From what I read, that has its own
> problems, so 1.11.5 would be the latest we could use.

Have you reported the problems to the Eclipse folks? A separate e-mail
on what problems you ran into may be useful to help other folks some
day.

> 1)
>
> Should cvs even be able to handle this kind of load?  To some of us, it's
> amazing and a credit to cvs that this thing hasn't crashed already.  But,
> to avoid a crash, when we did the metrics and saw what our percentages on
> cpu, switching, kernel, etc., and especially load (46) were, we shut down
> inetd.conf, waited for some cvs processes to complete and the load drop to
> 10 before starting inetd.conf back up.
>
> a)     should we be splitting up our repository and giving each project
> their own?

Well, that might help you scale a bit more.

> b)     is there a way to limit the number of pserver calls made at any one
> time?

I am not aware of any on the solaris inetd. However, you could probably
borrow code from the public tcp_wrapper software and have it check the
load on your system and then refuse connections to the pserver for a
time.
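The load check itself is simple. A minimal sketch in Python, not a tcp_wrapper patch: the function names and the default threshold of 10 (the load you waited for before restarting inetd) are assumptions for illustration only.

```python
import os


def should_accept(load_1min, max_load=10.0):
    """Return True if a new pserver connection should be let through.

    max_load is a hypothetical threshold; tune it for your machine.
    """
    return load_1min < max_load


def current_load():
    """1-minute load average as reported by the OS (Unix only)."""
    return os.getloadavg()[0]


if __name__ == "__main__":
    # A wrapper would exit non-zero (refuse) instead of exec'ing cvs.
    print("accept" if should_accept(current_load()) else "refuse")
```

A wrapper built on this would be started by inetd in place of cvs and would only hand the connection on to `cvs pserver` when the check passes.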

> c) Should we be going to a 4x4 machine rather than our current 2x2?

How fast are you growing? Will it give you enough room? If not, then you
might end up needing to shed some of the projects to another machine as
well as moving to more processors on your current hardware.

To be honest, I think you might be better off dropping another 14GB of
memory into the system to see if that improves your performance.

>
> 2)
>
> Context switching seems to be excessive, especially when we have more than
> 2 or 3 cvs ops running together. In the mornings, it's hitting as much as
> 12K per second, which is definitely a killer on a 2-processor system.
>
> a)     Is this normal?

To be honest, I have not benchmarked cvs in this situation.

>
> b)     Is cvs setup with a ping parameter or some kind of *am I alive*
> setting that hits every 1, 2 or 5 seconds?  If so, can it be reset?

No. The :pserver: client/server protocol assumes a tcp connection, but
to the best of my understanding does not send any kind of a keep-alive
over the link.

>
> 3)
>
> Is there any kind of performance bug where just a few processes take up a
> lot of CPU - especially branch commands?  We were getting CPU time
> readings of 41 on one sub-branch process.

I am sure that there are many bugs that remain in cvs, but I am not
aware of any particular performance problems. To create a branch tag,
each file being tagged is mmap()ed into memory, modified to have the
new tag near the head of the file, and written out as ',filename,'.
That temporary file is renamed to 'filename,v' when the operation is
complete, and only then does cvs move along to the next file in the
list to be tagged.
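The write-then-rename pattern can be sketched as follows. This is an illustration in Python, not the cvs implementation: the function name and the `transform` callback are hypothetical, and a plain read stands in for the mmap().

```python
import os


def rewrite_via_temp(rcs_path, transform):
    """Rewrite 'name,v' by writing ',name,' beside it and renaming.

    transform is a hypothetical callback that takes the old file text
    and returns the new text (for example, with a tag added).
    """
    directory, name = os.path.split(rcs_path)
    base = name[:-2] if name.endswith(",v") else name
    tmp_path = os.path.join(directory, "," + base + ",")

    with open(rcs_path, "r") as src:
        new_text = transform(src.read())

    with open(tmp_path, "w") as tmp:
        tmp.write(new_text)
        tmp.flush()
        os.fsync(tmp.fileno())  # make sure the data is on disk first

    # The rename is atomic on POSIX filesystems: readers see either
    # the old file or the new one, never a half-written file.
    os.replace(tmp_path, rcs_path)
```

Because the original is only replaced at the final rename, a process killed part-way through leaves a stray ',name,' file behind rather than a damaged 'name,v'.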

>
> 4)
>
> In the doc, I read about setting the LockDir=directory in CVSROOT, where I
> assume I create my own dir in the repository (LockDir=TempLockFiles).
>
> We DO NOT have this set as yet, but I think I might like to try it for
> speed sake.  All our developers need write access to the repository, but
> the doc states:
>
> It can also be used to put the locks on a very fast in-memory file system
> to speed up locking and unlocking the repository.
>
>
>
> a)     Just what is an in-memory file system?

Some operating systems have a way to create a mfs (memory file system).
I believe that the closest that Solaris comes is the use of a swap
filesystem which may be memory resident for much of the time.

>
> b)     Is speed garnered because all the lock files are in one directory
> and cvs does not need to traverse the project repository?

No, there are still multiple directories using a LockDir and a traversal
is still done. The difference is that the operations are typically
handled much faster.

The creation of cvs locks is a multi-step process that ends with a
#cvs.lock directory being created for the duration of the lock and then
being removed. For some operations, the creation of the read file, the
creation of the lock file and reading the contents of the directory and
any files needed from the repository and the removal of the lock
directory can take milliseconds. Being able to improve the performance
of lock creation and deletion will improve the overall access time of
the repository.
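The reason a directory is used for the final step is that mkdir either succeeds or fails as a single operation. A minimal sketch of just that step, in Python: it leaves out the read and write lock files that cvs also creates, and the helper name is hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def directory_lock(dir_path, lock_name="#cvs.lock"):
    """Hold a lock on dir_path by creating a lock directory inside it.

    os.mkdir raises FileExistsError if someone else holds the lock.
    """
    lock_path = os.path.join(dir_path, lock_name)
    os.mkdir(lock_path)  # atomic: exactly one caller can succeed
    try:
        yield lock_path
    finally:
        os.rmdir(lock_path)  # release the lock
```

Each acquire and release is a directory create and remove, so the cost of the lock is the cost of those two filesystem operations on whatever medium holds the locks.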

> c)     Is the speed increase significant?

If you are able to write to memory faster than to your repository, then
the difference in speed between those two media is how much faster you
will be able to create your lock. I would guess that in most cases of
a repository over NFS or on slow local disks, the use of a memory
filesystem would be faster. The use of a swap system that is always
being paged out to disk may actually be slower if the page disk is slow.

>
> d)     Will there be any problems with having lock files from multiple
> different projects  in the repository flooding this same directory?

No. That is not how things work. Think of the LockDir as a tree that has
the same structure as your repository, but potentially allowing faster
creation and removal of zero length files and empty directories.

The LockDir also allows you to have different access controls than the
repository itself.
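The mirrored layout can be illustrated with a small path mapping. This is a sketch in Python; the function name and the /tmp/cvslocks location in the usage note are assumptions, not anything cvs defines.

```python
import os


def lock_dir_for(repo_root, lock_root, repo_dir):
    """Map a repository directory to its counterpart under LockDir."""
    rel = os.path.relpath(repo_dir, repo_root)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("%s is not inside %s" % (repo_dir, repo_root))
    # Same relative path, different root.
    return os.path.normpath(os.path.join(lock_root, rel))
```

For example, with a repository at /usr/cvsroot and a hypothetical LockDir of /tmp/cvslocks, locks for /usr/cvsroot/PROJ1/src would live in /tmp/cvslocks/PROJ1/src, so each project's locks stay in their own subtree.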

> If I need to search for errant locks, the way we are currently set up, I
> can go to the project where I know they exist and do a find for them.  In
> this LockDir case, we are going to have lock files from multiple different
> projects all in one dir. It appears by the statement:  *You need to create
> directory, but CVS will create subdirectories of directory as it needs
> them* that the full path is still used, correct?  (So, it would still be
> an easy search?)

Same search, different root.
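A sketch of such a search in Python, which works against either root. The function name and the age cutoff are hypothetical; it matches anything whose name starts with "#cvs.", which covers the #cvs.lock directory and the per-process lock files.

```python
import os
import time


def find_old_locks(root, min_age_seconds, now=None):
    """Return paths under root named '#cvs.*' older than the cutoff."""
    now = time.time() if now is None else now
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not name.startswith("#cvs."):
                continue
            path = os.path.join(dirpath, name)
            age = now - os.lstat(path).st_mtime
            if age >= min_age_seconds:
                hits.append(path)
    return sorted(hits)
```

Pass the repository root if LockDir is not set, or the LockDir root if it is. Check that no cvs process is still running against a directory before removing anything this reports.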

>
>
> I then read the link: 10.5-Several developers simultaneously attempting to
> run CVS, that goes along with LockDir.
>
> The beginning states that cvs will try every 30 seconds  to see if it
> still needs to wait for lock.

The backoff if two processes try to lock the same directory at the same
time can be expensive in delay time, but those processes are just doing
a sleep, so it should not horribly impact the load on your machine.

> e)     Any chance this is a parameter that can be decreased - or would
> it's checking more often just create more overhead and slow things down?

I have not played much with it. The value you want to muck with is in
src/cvs.h

#define CVSLCKSLEEP     30              /* wait 30 seconds before retrying */

it is called from lock_wait().
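The retry behaviour amounts to a sleep loop. A minimal sketch in Python, not the actual lock_wait() code: `try_lock`, `max_attempts` and the injectable `sleep` are hypothetical names added so the loop can be exercised without waiting.

```python
import time


def wait_for_lock(try_lock, retry_seconds=30, max_attempts=None,
                  sleep=time.sleep):
    """Call try_lock() until it returns True, sleeping between tries.

    Returns the number of attempts made. retry_seconds plays the role
    of CVSLCKSLEEP; max_attempts=None means retry forever.
    """
    attempts = 0
    while True:
        attempts += 1
        if try_lock():
            return attempts
        if max_attempts is not None and attempts >= max_attempts:
            raise TimeoutError("lock not acquired after %d attempts"
                               % attempts)
        sleep(retry_seconds)  # the waiter is idle here, not using CPU
```

A shorter interval cuts the worst-case wait after a lock is released, at the cost of more lock attempts against the lock directory while the lock is still held.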

>
>
>
> In the end, it states
>
> if someone runs
>
>   cvs ci a/two.c b/three.c
>
> and someone else runs cvs update at the same time, the person running
> update might get only the change to `b/three.c' and not the change to
> `a/two.c'.
>
> f)     I assume this does not relate only to when LockDir is set.  This is
> the case period, correct?

Yes.

>
> The developers do have to communicate a bit.  But, I guess that's also why
> we have 77 developers running updates all the time.
>
>
>
> 5)
>
> Is it possible/feasible to have multiple pserver sessions, each then
> having it's own port and each going to the same repository, but going one
> level past that and each going to its own project?  (It wouldn't be two
> repositories, though it might look like it, because only one init was ever
> done.)  Would having each project on its own port help in the interest of
> performance?

It seems unlikely.

>
> Example:
>
>
>
>  2401  stream  tcp  nowait  root  /usr/local/bin/cvs
>
>  cvs -f --allow-root=/usr/cvsroot/PROJ1 pserver
>
>
>
>  2402  stream  tcp  nowait  root  /usr/local/bin/cvs
>
> cvs -f --allow-root=/usr/cvsroot/PROJ2 pserver
>
>
>
> Or, switching that around, would there be any benefit to having two
> repositories and connecting both of them to one pserver?

Many folks have multiple --allow-root options on one pserver invocation
to allow multiple disjoint cvs repositories to be served by one server.

I do not believe it to cause any difference with regard to performance.

>
> Finally, but not related to performance:
>
> 6)
>
> If a cvs command is killed uncleanly by a crash or by a kill -9, this
> could leave errant locks. I know how to search and remove errant locks to
> get going again.
>
> a)     But, does this also corrupt the project repository you were working
> on?

On the server side, killing a process is unlikely to cause a corruption.
Only stable files are renamed from their modified ,file, form to the
real file,v file. Halting the machine could leave the filesystem in an
inconsistent state, and that could cause the repository to have
corrupted files.

If you do get a corrupted repository file, there is a good possibility
that the corruption will remain unnoticed unless you try to checkout all
of the file versions. This is what the contrib/check_cvs script does.

On the client side, I think it unlikely that you would be able to find
a file that cvs thought was not modified in some way and be able to do
a diff to see the difference.


> b)     If so, how can one find what was corrupted and are there steps one
> can take to refresh or update to the last uncorrupted file?

Well, try using check_cvs to see if the repository looks okay, try
building your sources at a particular version or time to see if it
matches a known good build.

> c)     Or, do you just have to revert to the last backup taken?

I have used backups to fix ,v files that were on a disk that crashed,
but that requires a fairly intimate understanding of rcs file format
internals and is not for the faint of heart.

> If you revert to the last backup, and that occurred at noon, you might
> have a developer who was in the process of checkin when backup occurred.
> You would again have errant locks.  Again, easy enough to remove.  But
> what, if anything, needs to be done to get to the last valid check-in of
> that file?  Does deleting the lock keep any half changes from registering
> in the * ,v * files or elsewhere and automatically give us back the link
> to the uncorrupted revision before ci halted?  Or, are there step one can
> or must take to find this file(s) and get the last uncorrupted revision?

This is a difficult situation indeed. You might want to consider using
something like the cvslock program (on sourceforge.net I think) to lock
your repository for a time while you then do a backup of your repository.

Or you might consider having "hot backups" using rsync or CVSup or
(in the case of a netapp) a .snapshot of the filesystem.

        Good luck,
        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQE/nz7c3x41pRYZE/gRAlYgAKCaKcnNatpUGFwnXeeo6Vl28/m6lACeNc+K
a/83617CfCBAE0yUZw7cO6o=
=KlBF
-----END PGP SIGNATURE-----



