Jonathan S. Shapiro wrote:
When I first read about persistence in EROS I thought it was great, but
there is still the problem that after a crash you are transported back
in time. Doesn't this create more trouble than it solves? Especially in
networked environments. I'm thinking about the case where my 'files' are
stored on another computer, in particular. I also read EROS/KeyKOS allow
programs to force a snapshot,
I think you need to think of a modern-day database. When a database
crashes, one possible way for it to recover is with a transaction log
that stores all the information from transactions that were taking
place. This way, databases can recover almost right up to the
point in time at which the crash occurred.
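
To make the log idea concrete, here is a minimal sketch in Python. It is only an illustration of log replay in general, not how any particular database (or KeyKOS/EROS) does it. The names TinyStore, put, and recover are made up for this example, and a plain list stands in for an append-only log file on disk.

```python
import json


class TinyStore:
    """Toy key-value store with a write-ahead log (hypothetical, for illustration)."""

    def __init__(self):
        self.data = {}  # in-memory state, lost on a crash
        self.log = []   # stands in for an append-only log on disk

    def put(self, key, value):
        # Record the change in the log first, then apply it in memory.
        self.log.append(json.dumps({"key": key, "value": value}))
        self.data[key] = value

    @classmethod
    def recover(cls, snapshot, log):
        # Start from the last snapshot and replay every logged change in order.
        store = cls()
        store.data = dict(snapshot)
        store.log = list(log)
        for line in log:
            record = json.loads(line)
            store.data[record["key"]] = record["value"]
        return store


store = TinyStore()
store.put("a", 1)

# Take a snapshot and remember where the log stood at that moment.
snapshot = dict(store.data)
log_position = len(store.log)

# More changes arrive after the snapshot, then the machine "crashes".
store.put("b", 2)
store.put("a", 3)
surviving_log = store.log[log_position:]

# Recovery: old snapshot plus the log entries written since then.
recovered = TinyStore.recover(snapshot, surviving_log)
```

The snapshot alone would put you back at the state {"a": 1}; replaying the surviving log entries brings the recovered store forward to the last logged change.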
If the OS did something similar (maybe not logs but something else... I
need to read the papers), then you could get something similar to
database recovery in the OS. Technically, I don't think this would be
difficult to do. I haven't thought about all the details, but in my
mind you would just have to have some way to store all the process
(task) contexts (servers+data/registers) as they run and keep
checkpointing them. Jon Shapiro or anyone else who has read the
KeyKOS/EROS papers on single-level stores can probably answer this
question better.
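
As a rough sketch of the "store all the task contexts and keep checkpointing them" idea, assuming nothing about how KeyKOS/EROS actually implement it: TaskContext and Checkpointer below are hypothetical names, and a deep copy in memory stands in for writing the contexts to disk.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class TaskContext:
    """Hypothetical per-task state: saved registers plus the task's data."""
    registers: dict = field(default_factory=dict)
    memory: dict = field(default_factory=dict)


class Checkpointer:
    """Keeps one saved copy of every task context (illustration only)."""

    def __init__(self):
        self.tasks = {}  # live contexts, lost on a crash
        self.saved = {}  # last checkpoint, assumed to survive a crash

    def checkpoint(self):
        # Copy every live context so later changes do not leak into the checkpoint.
        self.saved = copy.deepcopy(self.tasks)

    def restore(self):
        # After a crash, every task resumes from its state at the last checkpoint.
        self.tasks = copy.deepcopy(self.saved)


system = Checkpointer()
system.tasks["server"] = TaskContext(registers={"pc": 100}, memory={"count": 1})
system.checkpoint()

# The task keeps running after the checkpoint...
system.tasks["server"].registers["pc"] = 140
system.tasks["server"].memory["count"] = 2

# ...then the machine crashes and comes back from the checkpoint.
system.restore()
```

After the restore, the "server" task is back at pc 100 with count 1, which is exactly the "transported back in time" effect from the question: anything done since the last checkpoint is gone unless something like a log covers the gap.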
if this happens frequently doesn't this
greatly reduce performance?
I don't know how KeyKOS/EROS handle persistence, but I'm quite curious
how they do it without incurring lots of overhead from saturating
the disk bus bandwidth (e.g. IDE, SCSI, etc). Perhaps on-the-fly
compression is used? If this is the case, perhaps hardware could be
created specifically for this kind of OS work to speed up
compression/decompression. Perhaps optimized device drivers help
keep this overhead low, since checkpointing becomes a first-class
function in the OS. I really don't know.