[Qemu-devel] [RFC] find_next_bit optimizations


From: Peter Lieven
Subject: [Qemu-devel] [RFC] find_next_bit optimizations
Date: Mon, 11 Mar 2013 14:44:03 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130221 Thunderbird/17.0.3

Hi,

For some time now I have had a few VMs that are very hard to migrate because of heavy
memory I/O. I found that locating the next dirty bit
seems to be one of the culprits (apart from the locking, which Paolo is
working on removing).

I have the following proposal, which seems to help a lot in my case, and I just wanted
to get some feedback.
I applied the same unrolling idea as in buffer_is_zero().
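
To illustrate the idea outside the patch context, here is a minimal standalone
sketch (the function name first_nonzero_word and the surrounding scaffolding are
my own, purely illustrative, and not QEMU code): as long as a group of four
bitmap words is all zero the scan skips the whole group, and only then does it
fall back to a word-by-word search.

#include <stddef.h>
#include <stdio.h>

/* Return the index of the first non-zero word in p, or nwords if all
 * words are zero.  Four words are read per iteration of the unrolled
 * loop, so long all-zero stretches cost one branch per four words
 * instead of one branch per word. */
static size_t first_nonzero_word(const unsigned long *p, size_t nwords)
{
    size_t i = 0;

    /* Unrolled fast path: skip groups of four zero words at a time. */
    while (i + 4 <= nwords) {
        unsigned long d0 = p[i];
        unsigned long d1 = p[i + 1];
        unsigned long d2 = p[i + 2];
        unsigned long d3 = p[i + 3];
        if (d0 | d1 | d2 | d3) {
            break;  /* at least one word in this group is non-zero */
        }
        i += 4;
    }

    /* Word-by-word: pinpoint the hit within the group, or finish the tail. */
    while (i < nwords && p[i] == 0) {
        i++;
    }
    return i;
}

int main(void)
{
    unsigned long bitmap[64] = { 0 };

    bitmap[37] = 1UL << 5;      /* pretend one page became dirty */
    printf("first non-zero word at index %zu\n",
           first_nonzero_word(bitmap, 64));
    return 0;
}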

Peter

--- a/util/bitops.c
+++ b/util/bitops.c
@@ -24,12 +24,13 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
     const unsigned long *p = addr + BITOP_WORD(offset);
     unsigned long result = offset & ~(BITS_PER_LONG-1);
     unsigned long tmp;
+    unsigned long d0,d1,d2,d3;

     if (offset >= size) {
         return size;
     }
     size -= result;
-    offset %= BITS_PER_LONG;
+    offset &= (BITS_PER_LONG-1);
     if (offset) {
         tmp = *(p++);
         tmp &= (~0UL << offset);
@@ -43,6 +44,18 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
         result += BITS_PER_LONG;
     }
     while (size & ~(BITS_PER_LONG-1)) {
+        while (!(size & (4*BITS_PER_LONG-1))) {
+            d0 = *p;
+            d1 = *(p+1);
+            d2 = *(p+2);
+            d3 = *(p+3);
+            if (d0 || d1 || d2 || d3) {
+                break;
+            }
+            p+=4;
+            result += 4*BITS_PER_LONG;
+            size -= 4*BITS_PER_LONG;
+        }
         if ((tmp = *(p++))) {
             goto found_middle;
         }
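
A side note on the first hunk: replacing the modulo with a mask relies on
BITS_PER_LONG being a power of two, which it is on the hosts QEMU supports
(unsigned long has 32 or 64 bits). A tiny standalone check of that identity,
illustrative only and not part of the patch:

#include <assert.h>
#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* Stand-in for BITS_PER_LONG: the number of bits in an unsigned long. */
    const unsigned long bits_per_long = sizeof(unsigned long) * CHAR_BIT;

    for (unsigned long offset = 0; offset < 4 * bits_per_long; offset++) {
        /* x % 2^k == x & (2^k - 1) because 2^k is a power of two */
        assert(offset % bits_per_long == (offset & (bits_per_long - 1)));
    }
    printf("offset %% BITS_PER_LONG matches offset & (BITS_PER_LONG-1)\n");
    return 0;
}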


