Re: [Qemu-devel] [RfC PATCH 0/4] make display updates thread safe.

From: Mark Cave-Ayland
Subject: Re: [Qemu-devel] [RfC PATCH 0/4] make display updates thread safe.
Date: Mon, 3 Apr 2017 16:06:46 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.6.0

On 03/04/17 13:42, Gerd Hoffmann wrote:

> On Mo, 2017-04-03 at 13:24 +0100, Mark Cave-Ayland wrote:
>> On 03/04/17 13:03, Gerd Hoffmann wrote:
>>>   Hi,
>>>> I checked the branch - is bitmap_copy_and_clear_atomic correct when you
>>>> have partial updates?  Maybe you need to handle partial updates of the
>>>> first and last word?
>>> Should not be a problem.  We might clear some more bits, but these are
>>> outside the visible area so they should not cause visible corruption (and
>>> if the visible area changes the display code needs to do a full refresh
>>> anyway).
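
Ah, okay - so the copy-and-clear effectively works a whole long at a time,
and the worst case is that a few extra bits sharing the first/last long with
the range get cleared too. Something like this rough sketch is how I picture
it (not the real implementation, just my reading of it):

/* Rough sketch only: copy-and-clear at long granularity.  Bits that merely
 * share the first or last long with the requested range are copied and
 * cleared as well, which is harmless since they lie outside the visible
 * area. */
static void copy_and_clear_longs(unsigned long *dst, unsigned long *src,
                                 long nbits)
{
    while (nbits > 0) {
        *dst++ = *src;      /* the real helper would use an atomic xchg */
        *src++ = 0;
        nbits -= 64;        /* BITS_PER_LONG on a 64-bit host */
    }
}
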
>> Right. Also is there a reason that
>> cpu_physical_memory_snapshot_and_clear_dirty() uses BITS_PER_LEVEL when
>> calculating the alignment used for the first/last addresses?
> The copy code doesn't operate on bits but on bitmap longs, so I can just
> copy the whole long instead of fiddling with bits.  So I round down to
> a bitmap long start.  On 64-bit hosts those are units of 64 pages.
> Actually querying the copied bitmap operates at page granularity, of
> course.
>> Otherwise you end up expanding the range of your last address
>> considerably beyond the next page alignment.
> Yes, it's larger, but as it expands to the start of the long word I
> expect the penalty is almost zero (same cache line) and being able to
> just copy the whole long instead of dealing with individual bits should
> be a win.
>> And also in the main page
>> loop why is BITS_PER_LEVEL used? I see that several of the internal
>> bitmap routines appear to use BITS_PER_LONG for compressing into the
>> bitmap which might be more appropriate?
> Shift by BITS_PER_LEVEL is the same as division by BITS_PER_LONG.
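
As a concrete example of that arithmetic (illustrative names and values only,
assuming a 64-bit host where BITS_PER_LEVEL is log2(BITS_PER_LONG) == 6 and
TARGET_PAGE_BITS is 12):

#include <assert.h>
#include <stdio.h>

#define BITS_PER_LONG     64
#define BITS_PER_LEVEL    6                  /* log2(BITS_PER_LONG) */
#define TARGET_PAGE_BITS  12
#define ALIGN_DOWN(n, m)  (((n) / (m)) * (m))
#define ALIGN_UP(n, m)    ALIGN_DOWN((n) + (m) - 1, (m))

int main(void)
{
    /* Rounding the range down/up to a bitmap-long boundary, i.e. units of
     * 64 pages on a 64-bit host. */
    unsigned long align = 1UL << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
    unsigned long start = 0x12345678, length = 0x2000;
    printf("align=%#lx first=%#lx last=%#lx\n", align,
           ALIGN_DOWN(start, align), ALIGN_UP(start + length, align));

    /* Shifting a page number by BITS_PER_LEVEL is the same as dividing it
     * by BITS_PER_LONG, i.e. it gives the index of the long holding that
     * page's dirty bit. */
    unsigned long page = 12345;
    assert((page >> BITS_PER_LEVEL) == (page / BITS_PER_LONG));
    return 0;
}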

Right, hopefully I understand this now. The following diff appears to fix
CG3/TCX for me:

diff --git a/exec.c b/exec.c
index 000af18..66bdcf4 100644
--- a/exec.c
+++ b/exec.c
@@ -1071,7 +1071,7 @@ DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
      (ram_addr_t start, ram_addr_t length, unsigned client)
 {
     DirtyMemoryBlocks *blocks;
-    unsigned long align = 1 << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
+    unsigned long align = 1UL << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
     ram_addr_t first = QEMU_ALIGN_DOWN(start, align);
     ram_addr_t last  = QEMU_ALIGN_UP(start + length, align);
     DirtyBitmapSnapshot *snap;
@@ -1097,7 +1097,6 @@ DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty

         assert(QEMU_IS_ALIGNED(offset, (1 << BITS_PER_LEVEL)));
         assert(QEMU_IS_ALIGNED(num,    (1 << BITS_PER_LEVEL)));
-        offset >>= BITS_PER_LEVEL;

         bitmap_copy_and_clear_atomic(snap->dirty + dest,
                                      blocks->blocks[idx] + (offset >> BITS_PER_LEVEL),

There were two issues here: without the UL suffix on align I was getting
incorrect first/last addresses because the high bits of align weren't being
cleared, and then offset appeared to be shifted twice.
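
To make the first of those concrete, here is a made-up illustration of both
traps as I read them (arbitrary values, nothing to do with the actual QEMU
code; the int promotion only bites once the shift count reaches bit 31):

#include <stdio.h>

#define BITS_PER_LEVEL 6    /* log2(BITS_PER_LONG) on a 64-bit host */

int main(void)
{
    /* (1) Without the UL suffix the shift is evaluated as a 32-bit int;
     * once the count reaches bit 31 the intermediate value goes negative
     * (strictly, undefined) and sign-extends on assignment, so the high
     * bits of the unsigned long end up set instead of cleared. */
    int n = 31;
    unsigned long bad  = 1 << n;    /* typically 0xffffffff80000000 */
    unsigned long good = 1UL << n;  /* 0x80000000 as intended */
    printf("bad=%#lx good=%#lx\n", bad, good);

    /* (2) Shifting offset twice divides it by BITS_PER_LONG twice, so the
     * source pointer lands on the wrong long within the dirty bitmap. */
    unsigned long offset = 4096;
    unsigned long once  = offset >> BITS_PER_LEVEL;
    unsigned long twice = once >> BITS_PER_LEVEL;
    printf("once=%lu twice=%lu\n", once, twice);
    return 0;
}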

Does this look reasonable to you?


