Re: [RFD] diskfilter stale RAID member detection vs. lazy scanning

From: Andrei Borzenkov
Subject: Re: [RFD] diskfilter stale RAID member detection vs. lazy scanning
Date: Thu, 16 Jul 2015 06:42:01 +0300

On Wed, 15 Jul 2015 20:05:56 +0200,
Vladimir 'φ-coder/phcoder' Serbinenko <address@hidden> wrote:

> On 28.06.2015 20:06, Andrei Borzenkov wrote:
> > I was looking at implementing detection of outdated RAID members.
> > Unfortunately it appears to be fundamentally incompatible with lazy
> > scanning as implemented currently by GRUB. We simply cannot stop
> > scanning for other copies of metadata once "enough" have been seen, because
> > any other disk may contain a more recent copy which invalidates
> > everything seen up to this point.
> > 
> > So basically either we officially admit that GRUB is not able to detect
> > stale members or we drop lazy scanning.
> > 
> > Comments, ideas?
> > 
> We don't need to see all disks to decide that there is no staleness. If
> you have an array with N devices and you can lose at most K of them,
> then you can check for staleness after you have seen max(K+1, N-K)
> drives. Why?

That is not the problem. The problem is what to do if you see a disk with
generation N+1 after you have assembled an array with generation N. This can
mean that what we have seen so far is an old copy and we should throw it away
and start collecting a new one. If I read the Linux MD code correctly, that is
what it actually does. And this means we cannot stop scanning even
after the array is complete.

An extreme example is a three-way mirror, where each piece is perfectly
valid and usable by itself, so losing two of them still means
we can continue to work with the remaining one.
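
To make the restart rule above concrete, here is a minimal sketch in Python.
It is not GRUB's diskfilter code and not the Linux MD code; the class name,
method name and return strings are hypothetical and only illustrate the
"newer generation invalidates what was collected" behaviour described above.

```python
class ArrayCollector:
    """Hypothetical collector for members of one array (illustration only)."""

    def __init__(self):
        self.generation = None  # generation of the members collected so far
        self.members = []       # names of members at that generation

    def add_member(self, name, generation):
        if self.generation is None:
            # First member seen: start collecting at its generation.
            self.generation = generation
            self.members = [name]
            return "started"
        if generation > self.generation:
            # Newer metadata: everything collected so far is an old copy,
            # so throw it away and start a new collection.
            self.generation = generation
            self.members = [name]
            return "restarted"
        if generation < self.generation:
            # Older than what we already have: treat as stale, do not add.
            return "stale"
        self.members.append(name)
        return "added"
```

With the three-way mirror, seeing two pieces at generation 10 already gives a
usable array, yet a third piece at generation 11 still makes `add_member`
return "restarted" and drop the first two, which is why the scan cannot stop
early under this rule.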

> Let those disks have generation numbers g_0,...,g_{N-1}. Our goal is to
> find the largest number G s.t. number of indices with
> g_i >= G is at least N-K.
> In most common case when you have seen K+1 disks all of them will have
> the same generation number
> g_0=g_1=...=g_{K}
> Then we know that
> G<=g_0
> Suppose not; then all of 0,...,K are stale and we have lost K+1 drives,
> which contradicts the assumption that we can lose at most K.
> On the other hand when we have seen N-K devices we know that
> G>=min(g_0,...,g_{N-K-1})
> as with G=min(g_0,...,g_{N-K-1}) we already have N-K disks.
> In cases other than mirror usually K+1<=N-K and so we don't even need to
> scan for more disks to detect staleness.

Yes, that was my idea initially as well. Unfortunately the problem here
is not math. 
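
For reference, the counting argument quoted above can be sketched as follows.
This is only an illustration in Python under the quoted definitions (N
devices, at most K may be lost, G is the largest value backed by at least N-K
members); the function names are hypothetical and nothing here is taken from
the GRUB sources.

```python
def newest_consistent_generation(generations, n, k):
    """Return the largest G such that at least n - k entries are >= G.

    `generations` holds the counters of all members that were found.
    Returns None if fewer than n - k members are available at all.
    """
    needed = n - k
    if len(generations) < needed:
        return None
    # The needed-th largest counter is the largest G that is still
    # backed by `needed` members.
    return sorted(generations, reverse=True)[needed - 1]


def generation_bounds(seen, n, k):
    """Bounds on G after a partial scan, as (lower, upper); None = unknown."""
    lower = upper = None
    if len(seen) >= n - k:
        # n - k members are already at or above this value, so G >= lower.
        lower = sorted(seen, reverse=True)[n - k - 1]
    if len(seen) >= k + 1:
        # If G exceeded every seen counter, k + 1 members would be stale.
        upper = max(seen)
    return lower, upper
```

For a four-disk array that tolerates one failure (n=4, k=1), three disks at
the same generation pin G down completely. For the three-way mirror (n=3,
k=2), a single disk gives only a lower bound, and the upper bound appears
only after all three have been seen, matching max(K+1, N-K) = 3.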

> The code will be slightly tricky as it has to handle tolerating
> staleness if there are too few disks, but it's totally feasible. Let
> me figure out the rest of the math and write a prototype.
