[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 2/3] i386: hvf: In kick_vcpu use hv_vcpu_interrupt to force e
|
From: |
Phil Dennis-Jordan |
|
Subject: |
Re: [PATCH 2/3] i386: hvf: In kick_vcpu use hv_vcpu_interrupt to force exit |
|
Date: |
Mon, 16 Oct 2023 16:19:03 +0200 |
Hi Roman,
A quick update on my progress on this so far:
- I've managed to repro the graphical issue; it turns up with the
switch to hv_vcpu_run_until() (patch 3/3). I can't repro with just the
changes to the kick. (patch 2/3; It would have surprised me if that
had caused it.)
- Thanks for linking to the old mailing list threads for background on
the issues.
- I've rebased/ported your more sophisticated vCPU kick implementation
to the latest upstream master branch. This doesn't seem to cause any
obvious regressions, but it also doesn't appear to fix the issue with
hv_vcpu_run_until(). My branch is here,
https://github.com/pmj/qemu/tree/hvf-kick-modern and the updated
version of your patch here:
https://github.com/qemu/qemu/commit/fadc716c5bb15345a49e08eecf7ab1077782975c
- I've done some tracing, and I so far can't spot any undelivered or
slow kicks. Each kick is paired with a return from hvf_vcpu_exec with
reason EXCP_INTERRUPT, and this happens within less than a
millisecond, typically around 200µs. I'm seeing kicks being delivered
at all the various stages we're expecting them apart from VMX
preemption timer exits, at least so far. But those only target a tiny
window and so should be statistically quite rare.
- I'm starting to entertain the idea that hv_vcpu_run_until() exhibits
some other difference compared to hv_vcpu_run() which is causing the
graphical issues, rather than a problem with interrupt delivery.
Unfortunately, I'm not familiar with how drawing works on legacy
(S)VGA graphics, and what effect might cause it to end up with these
dropped scanline updates. I might try to find some kind of OSDev
example code that draws to (S)VGA and hopefully lets me repro and
perhaps debug the problem in isolation.
Any ideas/thoughts?
Thanks,
Phil
On Sun, 8 Oct 2023 at 21:30, Phil Dennis-Jordan <phil@philjordan.eu> wrote:
>
>
> On Sun, 8 Oct 2023 at 21:19, Roman Bolshakov <roman@roolebo.dev> wrote:
>>
>> > I assume that's with patch 3/3 applied as well? The fact you've
>> > repro'd it with just these patch would explain why I've not been able
>> > to fix it on the APIC side…
>> >
>>
>> Yes, I applied with patch 3/3 and then retested only with the first two
>> patches.
>
>
> OK, interesting that it would happen without patch 3 as well.
>
>>
>> > > FWIW. I recall a few years ago I submitted a similar patch that does
>> > > something similar but addresses a few more issues:
>> > >
>> > > https://patchwork.kernel.org/project/qemu-devel/patch/20200729124832.79375-1-r.bolshakov@yadro.com/
>> > >
>> > > I don't remember why it never got merged.
>> > >
>> >
>> > Looks like the VM kick might be a more complex undertaking than I was
>> > anticipating. I'll try to repro the problem you ran into, and then look
>> > over your original patch and make sense of it. Hopefully an updated version
>> > of your 'kick' implementation will work well in combination with the
>> > newer hv_vcpu_run_until() API from patch 3/3. I'll keep you posted.
>> >
>>
>> Apparently I left a note that some interrupts weren't delivered even
>> with my patch and I was not able figure out the reason back then. I had
>> another attempt to debug this two weeks later after I submitted v4 and I
>> can find a WIP branch on github where I added a Debug Registers support
>> patch and some tracepoints:
>>
>> https://github.com/qemu/qemu/compare/master...roolebo:qemu:hvf-debug-kick
>>
>> Perhaps that's where we should start from besides the obvious need of
>> rebase.
>
>
> Sounds good, I'll take a look at those changes and try to work out what to do
> next.
>
>>
>> With regards to hv_vcpu_run_until() I can find the following thread:
>> https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg09468.html
>>
>> > hv_vcpu_run_until() was also evaluated on macOS 10.15.5 but it degrades
>> > VM performance significantly compared to explicit setting of
>> > VMX-preepmtion timer value and hv_vcpu_run(). The performance issue was
>> > observed on Broadwell-based MacBook Air and Ivy Bridge-based MacBook
>> > Pro.
>> >
>> > macOS 11.0 Beta deprecated hv_vcpu_run() and introduced a special
>> > declaration for hv_vcpu_run_until(), that's not available 10.15 -
>> > HV_DEADLINE_FOREVER (UINT64_MAX, which is bigger than maximum value of
>> > VMX-preeemption counter). Perhaps the performance issue is addressed
>> > there.
>>
>> All discussion with Paolo might be helpful, particurlarly:
>> https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg09893.html
>>
>> > So, I've tried Big Sur Beta and it has exactly the same performance
>> > issue with hv_vcpu_run_until() while hv_vcpu_run() works as good as it
>> > worked on 10.15.5. I've submitted FB7827341 to Apple wrt the issue.
>>
>> In November 2020, Apple responded to FB7827341 that there's an issue on
>> QEMU side.
>
>
> Hmm, that's interesting. I'll need to work my way through that thread, but
> I'll point out that in my testing with SMP guests, I measured a performance
> *improvement* with the hv_vcpu_run_until() API (and the forever deadline)
> versus hv_vcpu_run(), as it significantly reduced BQL contention - with so
> many VMEXITs, vCPU threads were spending a lot of time waiting for the lock.
>
>