qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 3/6] rdma: core logic


From: Michael R. Hines
Subject: Re: [Qemu-devel] [PATCH 3/6] rdma: core logic
Date: Fri, 28 Jun 2013 10:11:01 -0400
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130329 Thunderbird/17.0.5

On 06/28/2013 03:33 AM, Paolo Bonzini wrote:
Il 28/06/2013 00:44, address@hidden ha scritto:
+ *
+ * This will have a terrible impact on migration performance, so until future
+ * workload information or LRU information is available, do not attempt to use
+ * this feature except for basic testing.
+ */
+//#define RDMA_ALLOW_REGISTRATION_WINDOW
Out of curiosity, how terrible is the terrible impact?

(I'll be sure to put this in the cover letter next time - sorry about that).

Here's the breakdown for a 14 GB (proprietary) memory intensive
micro-benchmark that was use here in IBM. (14 is the largest I can
create on my older hardware):

These numbers are *after* the bulk phase:

1. x-rdma-pin-all enabled: ~26 gbps (PCI limit, hardware is a few years old). 2. x-rdma-pin-all disabled, unpin disabled: ~15 gbps (fluctuates as high as 20, but usually 15) 3. x-rdma-pin-all disabled, unpin enabled: ~13 gbps (very stable at 13, little or no deviation)

I have to admit that the difference between #2 and #3 is much smaller than I expected.

I suspect this is because of the very large chunk sizes (1MB).

The larger the chunk size, the less of a difference in unpinning costs.


+ * Perform a non-optimized memory registration after every transfer
+ * for demonsration purposes, only if pin-all is not requested.
+ */
+static int qemu_rdma_unregister_waiting(RDMAContext *rdma) {
+#ifdef RDMA_ALLOW_REGISTRATION_WINDOW
IIRC  function can be always enabled, it will just do nothing if
no one registers anything.  It is just this bit that needs the
#ifdef:

No, this is an entirely different unpinning function. It only
has one user, which is the new RDMA_ALLOW_REGISTRATION_WINDOW.

But adding an extra #ifdef doesn't hurt anyway.

+
+        if (!rdma->pin_all) {
+            /*
+             * FYI: If one wanted to signal a specific chunk to be unregistered
+             * using LRU or workload-specific information, this is the function
+             * you would call to do so. That chunk would then get 
asynchronously
+             * unregistered later.
+             */
    >> #ifdef RDMA_ALLOW_REGISTRATION_WINDOW
+            qemu_rdma_signal_unregister(rdma, index, chunk, wc.wr_id);
    >> #endif
+        }

Then qemu_rdma_signal_unregister would be unused for
!RDMA_ALLOW_REGISTRATION_WINDOW, so you would need to put that
function within an #ifdef (the whole function, not just the
body).

At this point RDMA_ALLOW_REGISTRATION_WINDOW becomes a slightly
wrong name.  What about RDMA_UNREGISTRATION_EXAMPLE?

Sounds fine to me. Will change it.

+        if(test_bit(chunk, block->transit_bitmap)) {
+            DDPRINTF("Cannot unregister inflight chunk: %" PRIu64 "\n", chunk);
+            continue;
+        }
Nobody will process this unregistration after this.


Good catch - the clear_bit() was too late.


Should you move this


+        // perhaps test_and_clear_bit here?  and exit if the bit is zero
+        clear_bit(chunk, block->unregister_bitmap);
+
+        ret = ibv_dereg_mr(block->pmr[chunk]);
+        block->pmr[chunk] = NULL;
+        block->remote_keys[chunk] = 0;
+
+        if(ret != 0) {
+            perror("unregistration chunk failed");
+            return -ret;
+        }
+        rdma->total_registrations--;
+
+        reg.key.chunk = chunk;
+        ret = qemu_rdma_exchange_send(rdma, &head, (uint8_t *) &reg,
+                                &resp, &resp_idx, NULL);
+        if (ret < 0) {
+            return ret;
+        }
+
+        DDPRINTF("Unregister for chunk: %" PRIu64 " complete.\n", chunk);

to a separate function and check the unregister_bitmap when you get a
transfer completed message from IB?

We would need a new thread for that. The migration is currently
single threaded. The new thread would listen asynchronously
for the unregistration complete messages. Based on the last couple
years of emails I've seen, introducing a new thread is not
a simple review task.

With respect, now that the implementation is available, I'd like
to move on, as we do not have an LRU for this migration and
I do not have the authorized manpower to implementing one.

There is a sufficient body of (compile-time-enabled) code here
for anyone to take up should overcommitment become a problem
for someone in the middle of an RDMA migration.


Also, you do not need to receive the unregistration completed response
now, do you?  You can have another bitmap for "unregistration in progress"
and only wait for completion before re-pinning.  This way, the impact
of unregistration is low if the page is not going to be changed soon.

Same answer above applies to this question.

_However_, please make this new functionality a separate patch so that we
can discuss it better.

Will do.

- Michael




reply via email to

[Prev in Thread] Current Thread [Next in Thread]