qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 3/7] support UFFD write fault processing in ram_save_itera


From: Andrey Gruzdev
Subject: Re: [PATCH v3 3/7] support UFFD write fault processing in ram_save_iterate()
Date: Tue, 24 Nov 2020 11:02:09 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 24.11.2020 00:34, Peter Xu wrote:
On Fri, Nov 20, 2020 at 07:53:34PM +0300, Andrey Gruzdev wrote:
On 20.11.2020 19:43, Peter Xu wrote:
On Fri, Nov 20, 2020 at 07:15:07PM +0300, Andrey Gruzdev wrote:
Yeah, I think we can re-use the postcopy queue code for faulting pages. I'm
worring a little about some additional overhead dealing with urgent request
semaphore. Also, the code won't change a lot, something like:

[...]
          /* In case of 'write-tracking' migration we first try
           * to poll UFFD and sse if we have write page fault event */
          poll_fault_page(rs);

          again = true;
          found = get_queued_page(rs, &pss);

          if (!found) {
              /* priority queue empty, so just search for something dirty */
              found = find_dirty_block(rs, &pss, &again);
          }
[...]

Could I ask what's the "urgent request semaphore"?  Thanks,


These function use it (the correct name is 'rate_limit_sem'):

void migration_make_urgent_request(void)
{
     qemu_sem_post(&migrate_get_current()->rate_limit_sem);
}

void migration_consume_urgent_request(void)
{
     qemu_sem_wait(&migrate_get_current()->rate_limit_sem);
}

They are called from ram_save_queue_pages and unqueue_page, accordingly, to
control migration rate limiter.

bool migration_rate_limit(void)
{
[...]
         /*
          * Wait for a delay to do rate limiting OR
          * something urgent to post the semaphore.
          */
         int ms = s->iteration_start_time + BUFFER_DELAY - now;
         trace_migration_rate_limit_pre(ms);
         if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
             /*
              * We were woken by one or more urgent things but
              * the timedwait will have consumed one of them.
              * The service routine for the urgent wake will dec
              * the semaphore itself for each item it consumes,
              * so add this one we just eat back.
              */
             qemu_sem_post(&s->rate_limit_sem);
             urgent = true;
         }
[...]
}


Hmm... Why its overhead could be a problem?  If it's an overhead that can be
avoided, then postcopy might want that too.

The thing is I really feel like the snapshot logic can leverage the whole idea
of existing postcopy (like get_queued_page/unqueue_page; it's probably due to
the fact that both of them want to "migrate some more urgent pages than the
background migration, due to either missing/wrprotected pages"), but I might
have something missing.

Thanks,


I don't think this overhead itself is a problem since its negligible compared to other stuff.. You're undoubtly correct about using postcopy idea in case when wr-fault pages arrive from the separate thread or external source. Then we really need to decouple that separate thread or external requestor from the migration thread.

In this patch series wr-faults arrive in the same migration loop with normal scan, so I don't see direct reason to pass it through the queue, but I might have missed something from your proposal.

Do you mean that if we use postcopy logic, we'll allocate memory and make copies of faulting pages and then immediately un-protect them? Or we just put faulting page address to the queued item and release protection after page content has been saved.

Thanks,

--
Andrey Gruzdev, Principal Engineer
Virtuozzo GmbH  +7-903-247-6397
                virtuzzo.com



reply via email to

[Prev in Thread] Current Thread [Next in Thread]