From: Alex Bennée
Subject: Re: [Qemu-ppc] qemu-system-ppc video artifacts since "tcg: drop global lock during TCG code execution"
Date: Wed, 15 Mar 2017 16:20:40 +0000
User-agent: mu4e 0.9.19; emacs 25.2.9

Gerd Hoffmann <address@hidden> writes:

>   Hi,
>
>> Instead of having MMIO register spaces we use the dirty tracking
>> mechanism. Here regions are marked for dirty tracking, and when the
>> SoftMMU helper first comes to this bit of memory it will follow the slow
>> path and mark the region as visited.
>
> visited?  Do you mean dirty bit is set for the page?  Or is this
> something else?

Yes, the dirty page bit (or, on the SoftMMU side, the clearing of the
TLB_NOTDIRTY bit from the TLB entry).
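
To make the slow path / fast path split concrete, here is a toy
single-threaded model in Python. It is only a sketch and not QEMU code:
ToyRegion, ToyTLBEntry and guest_write are made-up names, and only
TLB_NOTDIRTY is taken from the discussion above.

```python
# Toy model of dirty tracking driven by a per-TLB-entry NOTDIRTY flag.
# All class and method names are hypothetical illustrations.
TLB_NOTDIRTY = 0x1


class ToyTLBEntry:
    def __init__(self):
        # Armed: the next write to this page must take the slow path.
        self.flags = TLB_NOTDIRTY


class ToyRegion:
    def __init__(self, npages):
        self.dirty = [False] * npages
        self.tlb = [ToyTLBEntry() for _ in range(npages)]
        self.slow_path_hits = 0

    def guest_write(self, page):
        entry = self.tlb[page]
        if entry.flags & TLB_NOTDIRTY:
            # Slow path: record the page as dirty, then disarm the flag
            # so later writes to the same page skip this branch.
            self.slow_path_hits += 1
            self.dirty[page] = True
            entry.flags &= ~TLB_NOTDIRTY
        # Fast path: the store itself is omitted in this model.


region = ToyRegion(npages=4)
region.guest_write(2)  # first write: slow path, sets the dirty bit
region.guest_write(2)  # second write: fast path, no bookkeeping
```

After the two writes only page 2 is dirty and the slow path has been
taken exactly once, which is the optimisation being described.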

>
>> Once done this bit is cleared and all
>> future writes to that page are written directly from the translated
>> code. This no longer has an implicit synchronisation from the BQL so
>> there is now a race and you can have memory being updated which might
>> miss this flagging.
>
> Not fully following what you are trying to say.  But pages being updated
> without dirty bit getting set (if clear) certainly is a problem for the
> vga emulation (and live migration too).
>
>> Note that KVM has some similar hacks to avoid trapping all writes to
>> video memory with its coalesced mmio mechanism however I'm not familiar
>> with all the details.
>
> Normal linear framebuffer access doesn't use this.

Ahh OK. As I said, I'm not very familiar with what coalesced mmio is
trying to achieve. I assume it is trying to avoid trapping on every
single MMIO access?

>
>> Now there are mechanisms we can use to ensure no races happen and to
>> return to the situation where the display is only updated when the
>> TCG cores are not running.
>
> tcg and display updates running in parallel isn't a problem, we have
> that with kvm anyway.  Dirty bit handling must be correct though.
>
> With kvm, at the start of each display update vga fetches the dirty
> bitmap from the kernel (memory_region_sync_dirty_bitmap).  Then it goes
> on to use memory_region_get_dirty to figure out which pages have been
> touched.
>
> When memory_region_sync_dirty_bitmap is called the kernel will clear the
> memory bitmap of the region and also map all pages read-only.  Next
> guest update will pagefault and the kernel can set the dirty bit for the
> page (maybe there is a more efficient way with EPT available).
>
> I suspect the memory_region_sync_dirty_bitmap call on tcg should reset
> the fast path optimization, so the slow path can update the dirty bits
> correctly.

Sure. And for the low-level cputlb implementation we can clear those
bits atomically. However, when the memory region is synced we also need
to flush any entries in the TLB that have already been hit and had
TLB_NOTDIRTY cleared, so we can trigger the slow path again. This is
tricky from outside of a vCPU context because we can't just queue the
work and exit the vCPU run loop to reach a known CPU state.
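
The stale-entry problem can be shown with a small Python toy model. This
is a sketch under loud assumptions: ToyRegion, sync_naive and
sync_and_rearm are invented names, the model is single-threaded, and it
says nothing about how the real cross-thread TLB flush would be done.

```python
# Toy model: a sync that clears the dirty bitmap but leaves TLB entries
# disarmed loses later writes; re-arming the entries catches them.
# All names are hypothetical; only TLB_NOTDIRTY comes from the thread.
TLB_NOTDIRTY = 0x1


class ToyRegion:
    def __init__(self, npages):
        self.dirty = [False] * npages
        self.tlb_flags = [TLB_NOTDIRTY] * npages

    def guest_write(self, page):
        if self.tlb_flags[page] & TLB_NOTDIRTY:
            # Slow path: mark dirty and disarm the entry.
            self.dirty[page] = True
            self.tlb_flags[page] &= ~TLB_NOTDIRTY

    def sync_naive(self):
        # Hand back the bitmap and clear it, but keep stale TLB entries.
        snapshot, self.dirty = self.dirty, [False] * len(self.dirty)
        return snapshot

    def sync_and_rearm(self):
        # Same as above, then re-arm every entry (stand-in for a flush).
        snapshot = self.sync_naive()
        self.tlb_flags = [TLB_NOTDIRTY] * len(self.tlb_flags)
        return snapshot


def second_sync_result(sync_name):
    region = ToyRegion(2)
    region.guest_write(0)          # slow path, page 0 marked dirty
    getattr(region, sync_name)()   # first display update
    region.guest_write(0)          # guest draws to page 0 again
    return getattr(region, sync_name)()


missed = second_sync_result("sync_naive")
caught = second_sync_result("sync_and_rearm")
```

With sync_naive the second update sees no dirty pages even though the
guest wrote to page 0 again; with sync_and_rearm the write is reported.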

The RFC PATCH I sent earlier solves this by ensuring the update runs in
a quiescent period (i.e. when the vCPUs are not running) but it is
sub-optimal as it means the display code has to have a basic knowledge
of vCPUs and when they run.
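
As a rough illustration of what "runs in a quiescent period" means, the
Python sketch below parks toy vCPU threads while an exclusive piece of
work runs. It is not the RFC patch and makes no claim about how that
patch is implemented; Quiescer, vcpu_enter, vcpu_exit and run_exclusive
are hypothetical names chosen for this example.

```python
import threading


class Quiescer:
    """Toy gate: exclusive work runs only while no vCPU is executing."""

    def __init__(self):
        self._cond = threading.Condition()
        self._running = 0        # vCPUs currently inside guest code
        self._exclusive = False  # True while exclusive work is pending

    def vcpu_enter(self):
        with self._cond:
            while self._exclusive:
                self._cond.wait()
            self._running += 1

    def vcpu_exit(self):
        with self._cond:
            self._running -= 1
            self._cond.notify_all()

    def run_exclusive(self, work):
        with self._cond:
            while self._exclusive:
                self._cond.wait()
            self._exclusive = True       # stop new vCPU entries
            while self._running:
                self._cond.wait()        # wait for running vCPUs to exit
        try:
            return work()
        finally:
            with self._cond:
                self._exclusive = False
                self._cond.notify_all()  # let the vCPUs back in


q = Quiescer()
observed = []


def vcpu():
    for _ in range(1000):
        q.vcpu_enter()   # a guest write would happen in here
        q.vcpu_exit()


threads = [threading.Thread(target=vcpu) for _ in range(2)]
for t in threads:
    t.start()
for _ in range(20):
    # Stand-in for a display update: record how many vCPUs are running.
    q.run_exclusive(lambda: observed.append(q._running))
for t in threads:
    t.join()
```

Every recorded value is zero, i.e. the "display update" never overlaps
with a running vCPU. The cost is visible in the structure: the caller
of run_exclusive has to know that vCPUs exist and wait for them.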

--
Alex Bennée


