Re: [Savannah-users] Savannah outage v2

From: Sylvain Beucler
Subject: Re: [Savannah-users] Savannah outage v2
Date: Sun, 31 May 2009 14:57:40 +0200
User-agent: Mutt/1.5.18 (2008-05-17)


A bit of news on our current status.


- During the night from Thursday to Friday, a disk failed and the RAID
  array malfunctioned, so Savannah was shut down for safety.

- On Friday, after getting physical access to the (distant) colocation
  and preparing a backup, we replaced the faulty disk.  After some
  checks, the system appeared fine.  Ward mentioned he had already
  seen a single failed disk bring down a whole RAID array.

- On Friday night the RAID array malfunctioned again and we shut
  Savannah down again.

- On Saturday, a new expedition to the colocation found that the
  filesystem was corrupted.  Attempts to recover it failed, to the
  point that we now need to reinstall everything.  The cause of this
  corruption is still unknown.  Suggestions are welcome.

- The current disks were put aside for further recovery attempts.
  We've now reinstalled the base system on 2 new disks, and are
  reinstalling a partial service.


Now this is getting gory.

The last backup was performed while the RAID array was malfunctioning,
and lots of files were reported missing, in particular for
CVS/SVN/Git/Hg. Hence the last backup is incomplete.

And our last full backup from tape is from the end of April. Normally
tape backups are more recent, but there were independent backup
issues. We haven't discussed those in detail yet, as we're focusing on
recovering the data asap.

So, while the base of the system and data is there, we're missing part
of the data from May.

Current status

We're reinstalling a partial service.

The frontend can be restored from its state on the 29th at 02:00
GMT. It will probably be available today.

sftp-based services should be OK too, but will probably come later.

The missing data is essentially CVS/SVN/Git/Hg.

For Git/Hg: we plan to install an empty service (maybe today), where
you'll be able to import the last state of your project with a
classic 'push' command. We'll also make available the data from the
April backup (not before tomorrow). You can prepare by having a look
at how 'push' works, for example the '--all' option in Git.
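
As a rough sketch of what such an import could look like for Git,
assuming you have a complete local clone: the remote name
'savannah-new', the URL argument and the 'reimport_project' helper
below are placeholders for illustration, not actual Savannah addresses
or tools. Since '--all' only covers branches, tags are pushed in a
separate step.

```shell
# Illustrative helper (hypothetical name): push every local branch and
# tag from the current clone to a fresh, empty remote repository.
# $1 is the URL of the new empty repository (placeholder, supplied by you).
reimport_project() {
    remote_url="$1"
    # Register the new repository under a placeholder remote name.
    git remote add savannah-new "$remote_url" &&
    # Push all local branches (refs/heads/*).
    git push --all savannah-new &&
    # Push all tags (refs/tags/*); '--all' does not include them.
    git push --tags savannah-new
}
```

You would run it from inside your working clone, passing the URL of
the new empty repository once it is announced.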

For CVS/SVN: since you probably don't have a backup of the repo, this
is more difficult. When we get the April backup tomorrow, we'll make
it available, so you can check it and agree to reimport it. Meanwhile
we're trying to see if we can recover May from the corrupted disks.

In parallel, we're investigating DRBD to have better protection next
time.

The Savannah Hackers
