bug-parted

Re: [Evms] EVMS Conference call (04/10/01) minutes


From: Andreas Dilger
Subject: Re: [Evms] EVMS Conference call (04/10/01) minutes
Date: Wed, 25 Apr 2001 09:48:16 -0600 (MDT)

Andrew writes:
> For FAT (on Windows), the size of the partition must equal
> the size of the file system.  Otherwise, Windows will silently
> destroy your file system.

Yuk, I knew they would break it somehow.

> Why is major/minor an issue?  Who uses major/minor?

Well, LILO for instance.  The LVM user tools too (not that they are
relevant to this issue).  Basically, anything that wants to check
whether a block device belongs to a given class (SCSI, IDE, etc.).
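
To make the "given class" check concrete, here is a rough sketch of
what such a tool does with the major number.  The table holds only
three well-known Linux majors (3 and 22 for the first two IDE
interfaces, 8 for SCSI disks); the function name and the idea of
a lookup table are my own illustration, not how LILO or the LVM tools
are actually written.

```python
import os

# A few well-known Linux block majors; deliberately incomplete.
KNOWN_MAJORS = {
    3: "IDE",    # first IDE interface (hda, hdb)
    22: "IDE",   # second IDE interface (hdc, hdd)
    8: "SCSI",   # SCSI disks (sda, sdb, ...)
}

def device_class(rdev: int) -> str:
    """Classify a device number (e.g. os.stat(path).st_rdev) by its major."""
    return KNOWN_MAJORS.get(os.major(rdev), "unknown")

# Usage: build a device number by hand instead of touching a real device.
sda1 = os.makedev(8, 1)
print(device_class(sda1))
```

Anything that classifies devices this way gets confused as soon as a
volume shows up under a major it does not know about.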

> Andreas writes:
> > AIX LVM did this totally correctly:
> > - Each PV in a VG has a full copy of the VGDA (if only 1 PV in the
> >   VG, it has 2 copies).  A quorum of VGDA copies must agree before
> >   automatic VG activation is possible.
> > - Each VGDA has a timestamp at the beginning and end (possibly even
> >   at the beginning and end of each major data struct).  This ensures
> >   that the data is known to be invalid if the beginning and end time
> >   stamps don't match.  In this case, we _always_ have at least one
> >   other copy (updates made synchronously in sequence) which is known
> >   good.
> 
> What do the timestamps represent?  Last access, or something?
> (If it wasn't part of the quorum, it doesn't get used, or something?)

I believe the timestamps represent the last modification time of the VGDA.
They are used (as an opaque cookie on a single PV) to ensure that all
parts of the VGDA were successfully written to disk.  This is otherwise
difficult to ensure, because hard drives only guarantee that a single sector
makes it to disk atomically, and the VGDA spans many sectors.
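
A minimal sketch of the begin/end stamp idea, assuming a made-up
layout (an 8-byte stamp, then the body, then the same stamp again).
This is not the real AIX on-disk format, and the names pack_vgda and
vgda_is_complete are hypothetical.

```python
import struct

STAMP = struct.Struct("<Q")  # hypothetical 8-byte stamp, not the AIX layout

def pack_vgda(stamp: int, body: bytes) -> bytes:
    """Frame the body with the same stamp at the start and at the end."""
    return STAMP.pack(stamp) + body + STAMP.pack(stamp)

def vgda_is_complete(raw: bytes) -> bool:
    """True if the leading and trailing stamps match."""
    if len(raw) < 2 * STAMP.size:
        return False
    (head,) = STAMP.unpack_from(raw, 0)
    (tail,) = STAMP.unpack_from(raw, len(raw) - STAMP.size)
    return head == tail

old = pack_vgda(100, b"A" * 1024)
new = pack_vgda(101, b"B" * 1024)

# Simulate a crash after only the first 512 bytes of the update hit the disk:
# the head carries the new stamp, the tail still carries the old one.
torn = new[:512] + old[512:]
```

Here the stamp is only compared for equality, which is the "opaque
cookie" use: the torn copy is rejected without caring which stamp is
the larger one.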

They are used as timestamps between PVs to determine which VGDAs can make
up the quorum.  For a VG with a single PV, there are two VGDA copies on
that PV, so the timestamp is used to determine which one is complete/newer.
For a multi-PV VG, there are at least 3 VGDA copies, so we use the timestamp
to determine which are identical, and we need a quorum of identical VGDA
copies to determine what represents the state of the VG.  If we have half
of the PVs with one VGDA, and half with a different VGDA, then the timestamp
is used (like in the single PV case) to determine which is newer.
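
Roughly, the selection could look like the sketch below.  It is only
my reading of the rules described here, not the actual AIX algorithm:
each copy is reduced to its stamp (None if its begin/end stamps did not
match), a stamp needs at least half of all copies behind it, and an
exact half/half split goes to the newer stamp.  The function name is
hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence

def choose_vgda(stamps: Sequence[Optional[int]]) -> Optional[int]:
    """Pick the stamp of the VGDA to trust, or None if there is no quorum.

    stamps has one entry per VGDA copy; None marks an incomplete copy.
    """
    total = len(stamps)
    counts = Counter(s for s in stamps if s is not None)
    # Keep stamps held by at least half of all copies.  A strict majority
    # leaves exactly one candidate; a half/half split leaves two.
    candidates = [s for s, n in counts.items() if 2 * n >= total]
    if not candidates:
        return None  # no quorum, so no automatic activation
    return max(candidates)  # on a split, the newer stamp wins

# Usage: single PV with two copies, one of them torn.
print(choose_vgda([101, None]))
```

Note that a lone newer copy does not override an older stamp that
still has half of the copies behind it; whether the real implementation
behaves that way is an assumption on my part.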

> Actually, the data might not need to be moved, because with
> LVM, it should be possible to "insert" more space at arbitrary
> locations in the volume.  But, the granularity of the inserted
> space is usually the size of a physical extent... (although
> this could probably be hacked, by discarding part of it...
> but this probably complicates things too much)

Correct.  For (any) LVM, the LV contents are always logically contiguous
no matter how they are laid out on disk.  You never have to touch
the contents, regardless of what you are doing to the actual on-disk
layout.
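
As an illustration only, the indirection can be pictured as a table
from logical extents to (PV, physical extent) pairs.  The 4 MiB extent
size, the class and its method names are all made up for this sketch
and do not correspond to any particular LVM implementation.

```python
EXTENT_SIZE = 4 * 1024 * 1024  # hypothetical 4 MiB extent

class LogicalVolume:
    def __init__(self, extent_map):
        # extent_map[i] = (pv_name, physical_extent_index) for logical extent i
        self.extent_map = list(extent_map)

    def resolve(self, logical_offset):
        """Translate a byte offset in the LV to (pv_name, byte offset on that PV)."""
        le, within = divmod(logical_offset, EXTENT_SIZE)
        pv, pe = self.extent_map[le]
        return pv, pe * EXTENT_SIZE + within

    def remap(self, le, pv, pe):
        """Record that logical extent le now lives elsewhere (after a copy)."""
        self.extent_map[le] = (pv, pe)

# Usage: two logical extents scattered over two PVs.
lv = LogicalVolume([("pv0", 7), ("pv1", 2)])
before = lv.resolve(EXTENT_SIZE + 10)
lv.remap(1, "pv0", 9)   # the extent was moved on disk
after = lv.resolve(EXTENT_SIZE + 10)
```

The logical offset used by the filesystem is the same before and
after the move; only the table entry changes.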

> So, I still think we need resize-the-start, for metadata
> at the front.

I think this will be difficult (if not impossible) to do in a general
way.  There are lots of different filesystems, and databases and such
often use raw device access, so there is no hope to "resize the start"
with such beasts.

Given that you previously said that Windows will corrupt an MSDOS
filesystem which is not the same size as the device, I think the
only way to reasonably support a wide range of situations is to have
either compatibility volumes which we don't touch, or to (hopefully
transparently) migrate the data into an LVM-style virtual volume which
allows us to move the data around easily without touching the contents.

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert


