how to kill object servers

Date: Wed, 12 Sep 90 04:43:38 PDT
   From: xanadu!michael (Michael McClary)

   [For clarity of response, the contents of Michael's message have
   been rearranged, and parts have been left out.]

   > Note for our concurrent future: the guarantee is
   > made as of the time of reception of the fsync request, not the time of
   > response, but the guarantee isn't in force until the response is sent.

   That doesn't make sense to me.  How can we be said to have made a "guarantee"
   when the request arrives, before we do anything to "write the policy"?  How
   can we "make" a "guarantee" that isn't "in force"?  (Are you saying that the
   request's position in the request stream is a declaration of what is to be

Yes, I am saying that.  Sorry it wasn't clear.

   Also: we have given the user no additional confidence unless we don't
   send the response until after the data hits the disk, and no additional
   security unless we cause the data to be written immediately (or earlier
   than we otherwise would have written it).

I was suggesting that in fact we don't send the response until the
data hits the disk.  Whether the fsync request causes the data to be
written earlier than it otherwise would be is an issue like whether a
"register" variable actually ends up in a register.  I.e., a server may
also treat an fsync as a hint to write stuff.

   If the "fsync" actually waits until things hit the disk, it does not
   strengthen the guarantee.  It just means that the loss of the transaction
   will be less probable, because the transaction will survive those crashes
   that leave the disk partition usable.  But failure of a single medium still
   causes the transaction to revert.

   It's worse than NFS.  You must complete writes to TWO disks, or a disk-and-
   a-tape, before acknowledging the request is complete.  The transactions that
   precede the commit request and its acknowledgement must survive even a total
   medium failure.

But note that my original letter said:

   Date: Tue, 11 Sep 90 13:17:04 EST
   From: mark


   Note that the guarantee may still be violated if we restore the server
   from a backup tape.

However, I'd forgotten about the separate-media transaction log.  What
I really had in mind was something like: The guarantee only applies
given no media failure.  However, I see no reason that the delayed-
response aspect of my proposed fsync couldn't also be applied to the
disk + the separate-media transaction log.  We could have (let's say)
a tsync which didn't send a response till all requests preceding it
were safe against any single media failure.  fsync & tsync should
either have the same protocol or, preferably, just be different
parameterizations of one request.

The example of tsync does point out (because it can take so long to
signal success) that the interface I had in mind isn't very good.
Instead of having the fooSync request wait directly (which would
prevent other activity in a single-threaded front-end), the fooSync
operation should instead pass-by-proxy a Sensor which will be rung
(sent a no-wait message) later when all requests preceding the fooSync
request are as safe as what the fooSync request requested.  Here's a
start at a proposal:

enum SafetyDegree { ON_DISK, ON_DISK_AND_LOG, .... };
  /* what kinds of SafetyDegree we actually define will obviously */
  /*  need some work.  The above is only intended to be suggestive */

CLASS(....,Heaper) {
    NOWAIT fooSync (SafetyDegree, SafetySensor * sensor) DEFERRED_SUBR;

CLASS(SafetySensor,Heaper) {

With this spec & interface, issuing a fooSync should impose no overhead
or delay on the front-end (other than a bit more record keeping in the
backend).  The sync itself need not be at all quick.
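
To make the record keeping concrete, here is a rough sketch in plain C++
rather than in our macro dialect.  Everything in it beyond the names
fooSync, SafetyDegree and SafetySensor is an assumption made up for
illustration: the Backend class, the request/noteOnDisk/noteInLog calls,
and the idea of tracking safety as a count of requests are all
hypothetical.  It only shows that fooSync can return at once while the
sensor is rung later, and that the guarantee covers exactly the requests
that preceded the fooSync in the request stream.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum class SafetyDegree { ON_DISK, ON_DISK_AND_LOG };

// Hypothetical stand-in for a SafetySensor: ring() plays the no-wait message.
struct SafetySensor {
    int rings = 0;
    void ring() { ++rings; }
};

// Hypothetical backend that only does the bookkeeping, no real I/O.
class Backend {
public:
    // An ordinary request arrives and takes the next position in the stream.
    void request() { ++received_; }

    // Returns without waiting for I/O.  The sensor is rung once every
    // request received before this call has reached the requested degree.
    void fooSync(SafetyDegree degree, SafetySensor* sensor) {
        waiting_.push_back({received_, degree, sensor});
        ringReady();
    }

    // Assumed callbacks from the storage layer: the first n requests are
    // now on disk, or now in the separate-media log.
    void noteOnDisk(std::size_t n) { onDisk_ = std::max(onDisk_, n); ringReady(); }
    void noteInLog(std::size_t n)  { inLog_ = std::max(inLog_, n);  ringReady(); }

private:
    struct Waiter {
        std::size_t covers;   // number of requests that preceded the fooSync
        SafetyDegree degree;
        SafetySensor* sensor;
    };

    // How many leading requests are currently safe to the given degree.
    std::size_t safeCount(SafetyDegree d) const {
        return d == SafetyDegree::ON_DISK ? onDisk_ : std::min(onDisk_, inLog_);
    }

    // Ring and drop every waiter whose guarantee is now in force.
    void ringReady() {
        std::vector<Waiter> still;
        for (const Waiter& w : waiting_) {
            if (safeCount(w.degree) >= w.covers) w.sensor->ring();
            else still.push_back(w);
        }
        waiting_.swap(still);
    }

    std::size_t received_ = 0, onDisk_ = 0, inLog_ = 0;
    std::vector<Waiter> waiting_;
};
```

In this sketch a front-end would issue its requests, call fooSync with a
sensor, and carry on; requests issued after the fooSync do not delay the
ring, since only the count recorded at the time of the call matters.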

What do you think?