l4-hurd

Re: self-paging


From: Jonathan S. Shapiro
Subject: Re: self-paging
Date: Mon, 05 Dec 2005 11:22:44 -0500

On Mon, 2005-12-05 at 12:16 +0100, Bas Wijnen wrote:
> On Sun, Dec 04, 2005 at 01:49:34PM -0500, Jonathan S. Shapiro wrote:
> > The background fractal problem may be a
> > motivator for a high priority, but it is *not* a motivator for a real time
> > guarantee.
> 
> Oh, I wasn't talking about real-time guarantees.  I was just saying that the
> user should have the option to specify that one process is the most important
> of all, and it should not be hindered in the sense that it is swapped out.
> There should be similar settings for processor time, but I wasn't talking
> about them.  If the user specifies limits which are so high that the movie
> cannot get a real-time guarantee anymore, then the movie shouldn't get it
> (until the user changes the settings).

We're getting closer, but your answer leads me to think that we may have
a terminology problem.

When you dig down deep, all of the real-time scheduling mechanisms that
I have seen are scheduling a "hard" resource. The resource may be
provisioned probabilistically, but whatever the details, there is an
assumption that the RT scheduler has some amount of hard guarantee that
it is dividing.

It follows immediately that nothing else can be permitted to violate the
*scheduler's* guarantee. In practice, this tends to mean that *anything*
that schedules hard resource commitments is real time (in the sense that
it needs to talk to the RT scheduler to get the guarantee).

When you talk about not swapping a process out, you are talking about
pinning its pages. This is a hard guarantee, so indeed you are proposing
a real-time situation.

> In my proposal, every process (or address space, really) has a physical memory
> quota.  It may be 0, which means it is fully swapped out.  A process cannot
> have more pages in memory than it has quota.  When the quota shrinks, pages
> are immediately swapped out (if the process was using its full quota).
> Because we don't want to ask the process which pages those should be (and in
> particular, we don't want to wait for an answer), the answer has to be
> prepared beforehand.  That's what the list is all about.

You need to pick one system or the other. Either the process makes the
decisions or the system does. If the list is provided in advance, it
needs to be provided *to the kernel*, which becomes very tricky in the
face of sharing.
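For concreteness, the quota-plus-list scheme described in the quoted
paragraph might be sketched as follows. This is a user-level model of
the idea only, not the kernel interface under discussion; all names
are hypothetical.

```python
# Hypothetical sketch of a per-address-space physical memory quota with
# a pre-declared eviction order ("the list"): when the quota shrinks or
# a new page comes in, victims are taken from the front of the list
# without asking the process. Names are illustrative, not a real API.

def swap_out(page):
    pass  # placeholder for the actual eviction mechanism

class AddressSpace:
    def __init__(self, quota):
        self.quota = quota          # max pages resident in physical memory
        self.eviction_order = []    # resident pages, least important first

    def touch(self, page):
        """Bring a page in, evicting per the pre-declared list if full."""
        if page in self.eviction_order:
            return
        assert self.quota > 0, "quota 0 means fully swapped out"
        while len(self.eviction_order) >= self.quota:
            swap_out(self.eviction_order.pop(0))  # least important first
        # For simplicity, new pages go to the back (most important);
        # the real proposal lets the process pick the position.
        self.eviction_order.append(page)

    def set_quota(self, new_quota):
        """Shrinking the quota evicts immediately; no need to ask."""
        self.quota = new_quota
        while len(self.eviction_order) > self.quota:
            swap_out(self.eviction_order.pop(0))
```

The point of the pre-declared order is exactly the one made in the
quote: the answer to "which page goes?" is prepared before the quota
ever shrinks, so no round trip to the process is needed.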

> Why must this be without user intervention?  Because it'd be very annoying for
> the user to turn the knobs all the time.  What the user wants to do is give
> the game high priority.

Nonsense. What the user wants to do is say "make that one run better".
The user has probably never heard of priority. And in any case, this
*is* user intervention. Now we are just arguing about the UI.

The point I'm trying to make is that tuning knobs is actually a good
metaphor. The user wants to say "make that one go faster", but the
problem with this is that they don't really know why it is slow. Turning
up the CPU allotment on a swap-bound process won't help and vice versa.
Ideally, we don't even want to have to talk to the user about this; we
want to just have the user's statement that a certain thing is
important.

> ...  Then the system should just give it the memory it
> wants when it asks for it (at the cost of others, which have a lower
> priority).

I think you are confusing two things here: the allotment vs. the
allocation. The part that is tricky from the scheduling point of view is
the allotment. That was done when the user changed the tuning knob (at
least indirectly, in the sense that the user has told the long-term
scheduler how to rebalance). The allocation is then done according to
process demand.
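The allotment/allocation split might be sketched like this (a
hypothetical model, with illustrative names): the long-term scheduler
converts the user's importance settings into allotments once, when the
knob is turned; allocation then simply follows demand up to that
ceiling.

```python
# Hypothetical sketch: allotment is recomputed when the user turns the
# tuning knob; allocation tracks process demand, capped by allotment.

def rebalance(importances, total_frames):
    """Long-term scheduler: turn importance weights into frame allotments."""
    total = sum(importances.values())
    return {p: (w * total_frames) // total for p, w in importances.items()}

def allocate(demand, allotment):
    """Short-term: grant what is asked for, never exceeding the allotment."""
    return min(demand, allotment)

allotments = rebalance({"game": 3, "movie": 1}, 400)
resident = allocate(500, allotments["game"])   # demand above allotment: capped
```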

> > I understand that you want these things to be fast. That is not the same as
> > wanting them to be real-time.
> 
> Indeed.  I wasn't talking about real-time (although giving it all the memory
> it asks for may hinder other applications which do want real-time)...

Please re-read your sentence. Do you now agree that you *were* talking
about real time?

> The difference is between ping-time and throughput on a network line.  I was
> talking about giving the process high throughput, not about giving it a low
> ping-time.

This is a good goal. There are about a billion papers on this, none of
them conclusive. It's a deep research problem, and at this point it seems
safe to speculate that there simply *isn't* a general solution. I think
that our short term challenge is to come up with the right kernel
mechanisms so that people can experiment successfully with scheduling.


> > > During this thread I got more and more the idea that it isn't cumbersome
> > > at all, really. :-)  The program needs to request a page anyway.  So add
> > > an extra argument to that call which specifies where that page is in the
> > > list...
> > 
> > I think that you are not considering the kernel-level implications well
> > enough.
> 
> I wasn't thinking kernel-level at all, I'd think this would all be in user
> space (in physmem and some global pager).  Is it just that those tasks need to
> be in the kernel for you, or are we misunderstanding each other?  I'll assume
> the former for now, that you need physmem and the global pager in the kernel.
> I'd be interested to know why though.

In EROS/Coyotos, the eviction decisions are made in the kernel guided by
application-defined policy. This is largely because of checkpoint. In
practice, it doesn't seem to restrict the feasible policies.

The problem with your approach is that we must now ask what keeps the
*pager* in core (because the whole point here was response time). Your
answer will be "well, the pager's pager makes this guarantee...", and
it's "turtles all the way down". Somewhere, something in the kernel has
to be in on the joke.

> >   2. The design that you propose is tricky, because it involves attaching
> >   ordering state to each frame. This is not as easy as it sounds. What
> >   happens when you and I share the same frame, and you say "put it in
> >   position 4" and I say "put it in position 10"? Where does the position
> >   info get stored? Hint: can't be done in the kernel, and then we are going
> >   to get a reductio problem here.
> 
> First of all, note that there is a list of pages per process.  So "position 4"
> is meaningless to the process that puts it at position 10 in its own list.  The
> page will be stored in two lists, at its own position in each.

Yes, the problem is that the kernel is now going to have to ask two
parties for an eviction policy, and (whichever one is chosen) it will
pick the wrong one to ask...
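The sharing conflict can be made concrete with a tiny sketch (all names
hypothetical): two address spaces rank the same frame differently in
their own eviction lists, so any single eviction decision must consult
one party's list and ignore the other's.

```python
# Two processes share a frame but place it at different positions in
# their own eviction lists. Whichever list the kernel consults for the
# eviction decision, the other party's preference is overridden.
# Names are illustrative.

proc_a_list = {"frame42": 4}    # A ranks the shared frame at position 4
proc_b_list = {"frame42": 10}   # B ranks the same frame at position 10

def eviction_position(frame, consulted_list):
    """The kernel must pick one party to ask; the answers disagree."""
    return consulted_list[frame]
```

There is no single "correct" position for the shared frame, which is
the problem the paragraph above is pointing at.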

> Every space bank has a certain number of "active" pages.  Those can be of
> three types:
> - currently in memory
> - currently in swap
> - currently nonexistent ("swapped-out" cache)
> The third one exists because the process doesn't need to reallocate it when it
> is "swapped out"; it can just remap it.  However, there's no guarantee about the
> contents.

This has nothing whatsoever to do with space banks! Space banks allocate
disk storage, and the space bank data structure is a disk data
structure!
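Setting aside the space-bank framing, the three page states in the
quoted scheme might be modeled as follows (a hypothetical sketch; the
names are illustrative, not any real interface):

```python
from enum import Enum, auto

class PageState(Enum):
    IN_MEMORY = auto()   # resident in a physical frame
    IN_SWAP   = auto()   # contents preserved on backing store
    DISCARDED = auto()   # mapping kept, contents lost ("swapped-out" cache)

def contents_preserved(state):
    """Only the first two states guarantee the page's contents survive."""
    return state in (PageState.IN_MEMORY, PageState.IN_SWAP)
```

The third state is what makes the quoted scheme cheap: the frame can be
reclaimed without writing anything to disk, at the cost of losing the
contents.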



I'll reply to the rest later -- an appointment just showed up.

shap




