gnu-arch-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gnu-arch-users] [BUG] FEATURE PLAN: two stage commit


From: Tom Lord
Subject: Re: [Gnu-arch-users] [BUG] FEATURE PLAN: two stage commit
Date: Mon, 12 Jul 2004 12:31:09 -0700 (PDT)


    > From: Zenaan Harkness <address@hidden>

    > On Mon, 2004-07-12 at 15:35, Tom Lord wrote:
    > ...
    > > After those two phases we can clean up a bit.   We can note that B, C,
    > > D, ... are no longer "half committed" or "conditionally committed".
    > > We can mark them as fully committed so that reader don't have to
    > > verify that A has been committed.   But this is just paperwork... it's
    > > the two phases described above that really matter.

    > > Note that the clean-up step is just an optimization.   A reader who
    > > encounters a half-committed B can't decide if that's a completed
    > > commit or not without observing whether or not the corresponding
    > > commit of A is complete.

    > This is fine and satisfies the 'transactional' atomicity requirement.
    > Ideally however, no clients have to be involved in "checking whether a
    > half-commit is actually a full commit". Ie., clients of the database
    > (tla) should only see either:

    > * archive at ZERO, or

    > * archive at ZERO + A + B + ...

    > and nothing inbetween. This might be obvious or meant to be anyway, but
    > I thought I'd clarify the point.

That is an ideal but it's not an ideal that can be achieved
perfectly.

Suppose, as usual, we want to commit (in the same txn) revisions A, B,
C, etc.  That txn can only be implemented by a series of archive
mutations each of which is individually atomic.  So we get to a state
before the 2nd phase commits in which:

        B is marked half-committed
        C is marked half-committed
        ....
        A does not yet exist

A client who observes only B, at that point in time, has made an
ambiguous observation.   It might be the situation illustrated above
or it might be:

        B is marked half-committed
        C is marked half-committed
        ....
        A is committed

In the second case, B's "half-committed" state should be treated as if
it were fully committed.  In the former case, B's "half-committed"
state should be treated like a revision lock (B isn't really committed
yet).

To disambiguate, a client has to check the state of A.

That's what the bookkeeping step is for.   After A is committed,
it goes back and marks the half-committed revisions as full commits.
Strictly speaking, this is an optional step.   The result afterwards
is:  

        B is committed
        C is committed
        ....
        A is committed

and any client who observes only B has now made an unambiguous
observation and can be certain that B is really and truly committed.
Client's no longer have to double-check A.

So, there's is a period of time there, before or during the
bookkeeping step, when clients have to do extra work to read 
the archive.


    > > While B is "half committed" but before A is committed, B functions
    > > like a lock.   Nobody else should commit ahead of B.   The commit of B
    > > has to be canceled or completed before someone else can commit to that
    > > same version.   So between phase 1 and phase 2, the "half commit" of B
    > > is basically an ordinary version lock.

    > I see - you are using the changes as locks.

    > A change is a change.

    > A lock is a different concept.

    > Transactions require locks.

    > Overloading a change to be a lock might be a convenience of
    > implementation, performance, or somethe other reason, but is not the
    > 'ideal interface'.

I considered that.  It would work this way:

Let's suppose that the algorithm were modified so that it really had
three phases:

    1. obtain a lock which we'll call A-lock

    2. half-commit B, C, D, ...

    3. commit A, release A-lock

The commit of A in step 3 must instead fail if, before step 3, A-lock
has been broken.   It'd be natural to make A-lock the revision lock 
for A.   So the algorithm then becomes:

        1. Lock rev A
        2. half-commit B, C, D
        3. finish committing A

A client who wants to kill the txn and force a roll-back of B, C, D,
etc. can do that just by successfully breaking the lock for revision
A.

That's different from the algorithm I posted in which, to kill a txn,
a client would have to create its _own_ revision A.

I rejected that idea because it would change the abstract semantics of
revision locks (although it wouldn't change their de facto semantics
on dumb filesystem implementations).

Currently, at least in theory (e.g., with a smart server), you only
know if your lock for a revision has been broken or not by committing.
The abstract interface permits a trace like this:


        client 1:               client 2:

  1.    lock A                  -
  2.    -                       lock A
  3.    send A data             send A data
  4.    -                       finish (commit) A (succeeds)
  5.    try to finish A (fails)

and an alternative history trace, identical up to step 4, is also 
permitted:


        client 1:               client 2:

  1.    lock A                  -
  2.    -                       lock A
  3.    send A data             send A data
  4.    finish (commit) A       -
  5.    -                       try to finish A (fails)

In other words: that client 2 is able to obtain the lock in step 2
says nothing about whether client 1 is precluded from committing A.  A
server is permitted to speculatively tell client 2 that, yes, it
obtained the lock even if that server told the same thing to client 1:
just so long as when the whole txn for the revision commits only one
locking client actually succeeds.

It's desirable to permit locks to work that way so that, for example,
highly contentious revisions (lots of clients want to create them) can
be managed by a "first client to get all of his data here wins"
strategy (and variations on that).  (Consider a busy wiki or pqm, for
example.)

Given that, my reasoning is that, in practice, composite commits will
usually succeed (if that isn't the case, the infrastructure has flaws
that ought to be fixed).  Consequently, the number of revisions
created just for the sole purpose of killing a composite transaction
will be diminishingly small.  A minor inconvenience in return for a
more flexible archive semantics.


    >> Let's say we want to break that lock.   We can't do that just locally
    >> to B.   Some other client may be convinced that B is half committed
    >> and, on the basis of that assumption, commit A.   If A is committed
    >> but the B commit broken, then we've had an atomic commit of only A, C,
    >> D, E, .... leaving out B: a bug.

    > Lock detection between clients is another reason to hide in-progress
    > transactions from clients and minimize the need for intelligent clients
    > (forget for a moment that each client is a server for its own repository
    > - that's not the point here).

They really can't be hidden unless you presume smart servers.


    >> In terms of traces, a client who wants to atomically commit A, B, C,
    >> ... will do:

    >>       half-commit B
    >>       half-commit C
    >>       half-commit D
    >>       half-commit E
    >>         ....
    >>         commit A              [*]
    > ...

    > What I think you want to do to simplify this is have a server solely
    > responsible for commits to its own repository (this might be implicit
    > already), and have clients take locks within the borders of a
    > transaction.

    > This is another of the ACID properties - isolation. We want to isolate
    > transactions, for which we need locks to provide the needed isolation
    > between clients (please forgive me if I've stuffed up the terminology -
    > I think it's right but I'm no DB guru - yet :):

Me either, but:  we do have isolation, it's just that some of the
computation that implements isolation has to take place client-side
because of our commitment to supporting "dumb servers".

    > Two clients can commit sets of changes that should be mutually exclusive
    > and each "either write or not write" each of its "transactions" changes.

    > What we want is for each client to isolate its transaction against each
    > other transaction.

    > So what we want is to define the locks needed:

    >  - change locks
    >  - branch/ tree locks
    >  - whole repository lock

    > ??? (Just guessing here..)

I'm unable to extract from that what specifically you have in mind.


    > There are rules on locks - see C J Date's classic for the details (I
    > have it in a box somewhere - still unpacking at this point sorry), such
    > as ways to take locks (always take highest level lock that you will or
    > might need first, and finer grained locks subsequently, lock minimal
    > required (obvious), take minimal lock needed (eg. each of above category
    > locks can be READ lock or WRITE lock (and I think there might even be
    > another type of lock).

You're talking, afaict, about things like techniques for deadlock
prevention.  We don't have any (solvable) problems with deadlock and,
if we wanted to use ordered (or hierarchical) locks we'd be in
trouble.  Suppose two clients concurrently want to do a composite
commit:

        client 1 wants to commit A B C D E
        client 1 wants to commit D E X Y Z

if dead-lock occurs, clients detect that automatically, any client can
discover it, and rollback is trivial.    A trace might be:

        c1:                     c2:

        half-B                  half-D
        half-C                  half-E
        fail at half-D
        (txn aborted)

We can't prevent that deadlock in the first place because our set of
revisions, A..E and X..Z can be distributed across multiple archives.
There is no central server or central location in which, for example,
client 2 could atomically lock all of D, E, X, Y, and Z.

Full distribution (no central service) means that we can not prevent
deadlock.  We can detect it and trivially roll it back and that's as
good as can be done.  (There is, I think, a
physics/information-theory-based proof of this.... but I could be
wrong, I guess, since I haven't tried to produce said proof.)


    >>> My current thoughts are just to modify commit & lock-revision to
    >>> be able to leave a all-but-committed changeset in place, and to
    >>> renamed-to-fully-committed from the command line. 

    >> That approach, if I understand what you mean, will fail to have
    >> ACID properties and will therefore be a
    >> _MAJOR_HUGE_VERY_VERY_BIG_AND_SERIOUS_ regression.

    >> Ok, the word "regression" is slightly misleading since it wouldn't
    >> break existing functionality.   But it would add new functionality
    >> that is incredibly broken in a way that existing functionality
    >> carefully avoids.

    > The only problem with a separate locking mechanism is exactly that - it
    > would be 'separate' or added functionality to the existing system.

If I understand the idea of just exposing the half-committed mechanism
and letting people use that directly the problem is that the outcome
of a composite transaction could wind up being that half of the
transaction had effect and half was rolled back.  "Well behaved"
scripts using the raw functionality could, of course, implement
exactly the robust algorithm I've been talking about but they could
also freely screw up.  Why not, then, build that algorithm in, prevent
clients from screwing it up, and avoid adding a new user-visible
revision state ("half committed") to user-visible archive semantics?
(It's ok if the half-committed state is exposed in an advisory way,
the same way that revision locks currently are;  it's just that 
it shouldn't go deeper than that.)


    > Anyway, hopefully the above will contribute in some way.

Of course.  It helps to get "many eyeballs" on the design before it's
finalized.  In this case, it got me to remember and write down why I
decided that a commit was needed to kill a composite transaction
rather than just a `tla lock-revision --break'.

-t





reply via email to

[Prev in Thread] Current Thread [Next in Thread]