[Savannah-hackers-public] [gnu.org #622071] colonialone: disk 'sdd' failed
Peter Olson via RT
[Savannah-hackers-public] [gnu.org #622071] colonialone: disk 'sdd' failed
Thu, 07 Oct 2010 14:53:22 -0400
> [beuc - Wed Oct 06 15:21:47 2010]:
> On Wed, Oct 06, 2010 at 03:05:04PM -0400, Peter Olson via RT wrote:
> > > [beuc - Wed Oct 06 14:46:46 2010]:
> > >
> > > Hi,
> > >
> > > Disk 'sdd' is not available anymore at colonialone.
> > >
> > > Smartmontools detected an issue, and mdadm removed it from the
> > > array.
> > >
> > > Can you investigate and possibly replace the failed disk?
> > >
> > > Btw, did you receive the failure notifications?
> > >
> > > Thanks,
> > We took the failed disk out of the RAID array because it appears to
> > be a hard failure rather than a glitch (all partitions on the disk
> > degraded at the same time). The array contained 4 members and now
> > contains 3 members, all in service. We expect to replace it when we
> > next make a trip to the colo.
> > colonialone:~# cat /proc/mdstat
> > Personalities : [raid1]
> > md3 : active raid1 sda6 sdb6 sdc6
> >       955128384 blocks [3/3] [UUU]
> > md2 : active raid1 sda5 sdb5 sdc5
> >       19534976 blocks [3/3] [UUU]
> > md1 : active raid1 sda2 sdb2 sdc2
> >       2000000 blocks [3/3] [UUU]
> > md0 : active raid1 sda1 sdb1 sdc1
> >       96256 blocks [3/3] [UUU]
> > unused devices: <none>
> I'm worried that 'dmesg' shows lots of ext3 errors.
> How can a failed disk in a RAID1x4 array cause *filesystem*-level
> errors? Do we need a fsck or something?
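As an aside, the [n/m] counts and [UUU] flags in that mdstat output can be
checked mechanically. A rough sketch in Python (this parsing of the
/proc/mdstat text format is my own, not an mdadm interface, and the degraded
sample below is a hypothetical pre-removal state, not captured output):

```python
import re

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status line reports a missing
    member: fewer active devices than slots, or a '_' in the [UUU] flags."""
    bad, current = [], None
    for line in mdstat_text.splitlines():
        m = re.match(r'(md\d+)\s*:', line)
        if m:
            current = m.group(1)
            continue
        m = re.search(r'\[(\d+)/(\d+)\]\s+\[([U_]+)\]', line)
        if m and current is not None:
            slots, active, flags = int(m.group(1)), int(m.group(2)), m.group(3)
            if active < slots or '_' in flags:
                bad.append(current)
    return bad

# The post-removal output above parses as fully healthy:
healthy = "md3 : active raid1 sda6 sdb6 sdc6\n      955128384 blocks [3/3] [UUU]\n"
print(degraded_arrays(healthy))    # []

# Hypothetical state while sdd6 was still a failed member of a 4-slot array:
degraded = "md3 : active raid1 sda6 sdb6 sdc6\n      955128384 blocks [4/3] [UUU_]\n"
print(degraded_arrays(degraded))   # ['md3']
```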
Here are some of the errors from dmesg:
[20930306.805714] ext3_orphan_cleanup: deleting unreferenced inode 86646
[20930306.805714] ext3_orphan_cleanup: deleting unreferenced inode 85820
[20930306.822520] ext3_orphan_cleanup: deleting unreferenced inode 86643
[20930306.829335] ext3_orphan_cleanup: deleting unreferenced inode 86645
[20930306.840398] EXT3-fs: dm-5: 30 orphan inodes deleted
[20930306.840542] EXT3-fs: recovery complete.
[20930307.015205] EXT3-fs: mounted filesystem with ordered data mode.
I found some discussion on the Net that says these messages are a normal
byproduct of making an LVM snapshot. Are you doing this as part of your
backup procedure?
I wrote a script to convert dmesg timestamps to wall clock. These
messages are issued every morning between 07:58 and 08:15 (or sometimes
as late as 08:27).
FSF Senior Systems Administrator
#! /usr/bin/env python
# Convert dmesg timestamps (seconds since boot) to wall-clock time.
import sys
import datetime

# Wall-clock time and the machine's uptime in seconds, sampled together.
dt = datetime.datetime(2010, 10, 7, 14, 29, 26)
uptime = 20952599

while True:
    line = sys.stdin.readline()
    if not line:
        break
    # Pull the integer seconds out of the "[20930306.805714]" prefix.
    curtime = int(line.split('.')[0].split('[')[1])
    delta = datetime.timedelta(0, curtime - uptime)
    dt2 = dt + delta
    print dt2.isoformat(' '), line,
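As a sanity check of that conversion, the same reference pair maps the first
ext3 dmesg line above into the reported morning window (Python 3 syntax here;
the window bounds are the ones quoted above):

```python
import datetime

# Reference pair from the script above: this wall-clock time corresponds
# to this many seconds of uptime.
REF_WALL = datetime.datetime(2010, 10, 7, 14, 29, 26)
REF_UPTIME = 20952599

def to_wallclock(dmesg_seconds):
    """Map a dmesg timestamp (seconds since boot) to wall-clock time."""
    return REF_WALL + datetime.timedelta(seconds=dmesg_seconds - REF_UPTIME)

# First ext3 message above, [20930306.805714]:
when = to_wallclock(20930306)
print(when.isoformat(' '))    # 2010-10-07 08:17:53
print(datetime.time(7, 58) <= when.time() <= datetime.time(8, 27))  # True
```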