
Re: Proposal for a regular upstream performance testing


From: Lukáš Doktor
Subject: Re: Proposal for a regular upstream performance testing
Date: Mon, 21 Mar 2022 09:46:12 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

Dear qemu developers,

you might remember the "replied to" email from a bit over a year ago, which
raised a discussion about a qemu performance-regression CI. At KVM Forum I
presented some details about my testing pipeline:
https://www.youtube.com/watch?v=Cbm3o4ACE3Y&list=PLbzoR-pLrL6q4ZzA4VRpy42Ua4-D2xHUR&index=9
I think it is now stable enough to become part of the official CI, so people
can consume it, rely on it and hopefully even suggest configuration changes.

The CI consists of:

1. Jenkins pipeline(s) - internal, not available to developers, running daily 
builds of the latest available commit
2. Publicly available anonymized results: 
https://ldoktor.github.io/tmp/RedHat-Perf-worker1/
3. (optional) a manual gitlab pulling job which is triggered by the Jenkins
pipeline when that particular commit is checked

The (1) is described here:
https://run-perf.readthedocs.io/en/latest/jenkins.html and can be replicated
on other premises. The individual jobs can also be executed directly
(https://run-perf.readthedocs.io) on any Linux box using Fedora guests,
either via pip or via a container
(https://run-perf.readthedocs.io/en/latest/container.html).

As for (3), I made a testing pipeline available here:
https://gitlab.com/ldoktor/qemu/-/pipelines with one always-passing test and
one allowed-to-fail actual testing job. If you think such integration would be
useful, I can add it as another job to the official qemu repo. Note the
integration is a bit hacky: due to limited resources we cannot test all
commits but rather test on a daily basis, which is not officially supported
by gitlab.
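
For illustration, triggering such a pipeline for a freshly tested commit could
look roughly like the following python-gitlab sketch. The project path, the
token handling and the QEMU_SHA variable name are assumptions made for the
example, not the actual integration:

    #!/usr/bin/env python3
    import os
    import sys

    import gitlab  # python-gitlab

    # The commit sha the Jenkins job just finished testing, passed on
    # the command line (hypothetical interface).
    tested_commit = sys.argv[1]

    # Connect to gitlab.com; reading the token from the environment is
    # an assumed setup, adjust to your credential handling.
    gl = gitlab.Gitlab("https://gitlab.com",
                       private_token=os.environ["GITLAB_TOKEN"])
    project = gl.projects.get("ldoktor/qemu")

    # Kick off a pipeline and hand it the tested sha; "QEMU_SHA" is a
    # made-up variable name the pipeline jobs could consume.
    project.pipelines.create({
        "ref": "master",
        "variables": [{"key": "QEMU_SHA", "value": tested_commit}],
    })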

Note the aim of this project is to ensure that the performance of some very
basic system-level workflows stays the same, or that the differences are
described and ideally pinned to individual commits. It should not replace
thorough release testing or low-level performance tests.

Regards,
Lukáš


On 26. 11. 20 at 9:10, Lukáš Doktor wrote:
> Hello guys,
> 
> I have been around qemu on the Avocado-vt side for quite some time, and a while 
> ago I shifted my focus to performance testing. Currently I am not aware of 
> any upstream CI that continuously monitors upstream qemu performance, 
> and I'd like to change that. There is a lot to cover, so please bear with me.
> 
> Goal
> ====
> 
> The goal of this initiative is to detect system-wide performance regressions 
> as well as improvements early, ideally pinpoint the individual commits, and 
> notify people so they can fix things. All in upstream, and ideally with as 
> little human interaction as possible.
> 
> Unlike the recent work of Ahmed Karaman 
> (https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/), my focus is on 
> system-wide performance inside the guest (like fio, uperf, ...).
> 
> Tools
> =====
> 
> In house we have several different tools used by various teams, and I bet 
> there are tons of other tools out there that can do this. I cannot speak for 
> all teams, but over time many teams at Red Hat have come to like pbench 
> (https://distributed-system-analysis.github.io/pbench/) to run the tests and 
> produce machine-readable results, and use other tools (Ansible, scripts, ...) 
> to provision the systems and to generate the comparisons.
> 
> As for myself, I used Python for a PoC, and over the last year I pushed hard 
> to turn it into a usable and sensible tool which I'd like to offer: 
> https://run-perf.readthedocs.io/en/latest/ Anyway, I am open to suggestions 
> and comparisons. As I am using it downstream to watch for regressions, I do 
> plan to keep developing the tool as well as the pipelines (unless a better 
> tool is found that would replace it or its parts).
> 
> How
> ===
> 
> This is a tough question. Ideally this should be a standalone service that 
> would simply notify the author of the patch that caused the change, with a 
> bunch of useful data, so they can either address the issue or just be aware 
> of the change and mark it as expected.
> 
> Ideally the community should also have a way to submit their custom builds 
> in order to verify their patches, so they can debug and address issues 
> better than by just committing to qemu master.
> 
> The problem with those is that we cannot simply use travis/gitlab/... 
> machines for running those tests, because we are measuring actual in-guest 
> performance. We can't just stop the clock when the machine decides to 
> schedule another container/vm. I briefly checked the public bare-metal 
> offerings like Rackspace, but these are most probably not sufficient either 
> because (unless I'm wrong) they only give you a machine, with no guarantee 
> that it will be the same machine the next time. If we are to compare the 
> results, we don't need just the same model, we really need the very same 
> machine. Any change to the machine might lead to a significant difference 
> (disk replacement, even a firmware update...).
> 
> Solution 1
> ----------
> 
> Since I am already doing this for downstream builds, I can start doing it 
> for upstream as well. At this point I can offer a single pipeline, watching 
> only changes in qemu (downstream we check distro/kernel changes as well, but 
> that would require too much time at this point), on a single x86_64 machine. 
> I cannot offer public access to the testing machine, nor checking custom 
> builds (unless someone provides me publicly available machine(s) that I 
> could use for this). What I can offer is running the checks on the latest 
> qemu master, publishing the reports, bisecting issues and notifying people 
> about the changes. An example of a report can be found here: 
> https://drive.google.com/file/d/1V2w7QpSuybNusUaGxnyT5zTUvtZDOfsb/view?usp=sharing
> and documentation of the format is here: 
> https://run-perf.readthedocs.io/en/latest/scripts.html#html-results I can 
> also attach the raw pbench results if needed (as well as details about the 
> tests that were executed, their params and other details).
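> 
> To give an idea of what the bisection part can look like: below is a 
> minimal, hypothetical helper for "git bisect run" that rebuilds qemu and 
> fails when a single benchmark drops below a fixed threshold. The build 
> command, the benchmark runner and the threshold are all illustrative 
> assumptions, not the actual run-perf machinery:
> 
>     #!/usr/bin/env python3
>     import subprocess
>     import sys
> 
>     # Assumed baseline throughput minus the tolerated noise (MB/s).
>     THRESHOLD = 1000.0
> 
>     def main():
>         # Rebuild qemu at the commit being bisected; exit code 125
>         # tells "git bisect run" to skip commits that do not build.
>         if subprocess.run(["make", "-C", "build", "-j8"]).returncode != 0:
>             sys.exit(125)
>         # Run one benchmark; "./run-benchmark.sh" is a placeholder
>         # that is expected to print a single throughput number.
>         out = subprocess.run(["./run-benchmark.sh"], capture_output=True,
>                              text=True, check=True).stdout
>         throughput = float(out.strip())
>         # Exit 0 marks the commit "good", 1 marks it "bad".
>         sys.exit(0 if throughput >= THRESHOLD else 1)
> 
>     if __name__ == "__main__":
>         main()
> 
> It would then be used as "git bisect start <bad> <good>; git bisect run 
> ./bisect-perf.py". In practice the noise handling needs to be smarter than 
> a single fixed threshold, but the skeleton stays the same.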
> 
> Currently the covered scenarios would be a default libvirt machine with 
> qcow2 storage and a tuned libvirt machine (cpus, hugepages, numa, raw 
> disk...), running fio, uperf and linpack on the latest GA RHEL. In the 
> future I can add/tweak the scenarios as well as the test selection based on 
> your feedback.
> 
> Solution 2
> ----------
> 
> I can offer documentation: 
> https://run-perf.readthedocs.io/en/latest/jenkins.html and someone can fork 
> it (or take inspiration from it), set up the pipelines on their system, make 
> them available to the outside world, and add custom scenarios and variants. 
> Note the setup does not require Jenkins; it's just an example and could 
> easily be turned into a cronjob or whatever you choose.
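> 
> As a minimal sketch of the cronjob variant (the state-file location and the 
> pipeline command are assumptions): run the following from cron once a day, 
> and it only kicks off the testing when qemu master has moved:
> 
>     #!/usr/bin/env python3
>     import pathlib
>     import subprocess
> 
>     REPO = "https://gitlab.com/qemu-project/qemu.git"
>     # Arbitrary location for remembering the last tested commit.
>     STATE = pathlib.Path("/var/lib/perf-ci/last-sha")
> 
>     def latest_sha():
>         # Ask the remote for the current tip of master.
>         out = subprocess.run(
>             ["git", "ls-remote", REPO, "refs/heads/master"],
>             capture_output=True, text=True, check=True).stdout
>         return out.split()[0]
> 
>     def main():
>         sha = latest_sha()
>         last = STATE.read_text().strip() if STATE.exists() else ""
>         if sha == last:
>             return  # nothing new to test today
>         # Placeholder for the actual pipeline invocation
>         # (run-perf, compare-perf, report publishing, ...).
>         subprocess.run(["/usr/local/bin/run-nightly-perf", sha],
>                        check=True)
>         STATE.parent.mkdir(parents=True, exist_ok=True)
>         STATE.write_text(sha)
> 
>     if __name__ == "__main__":
>         main()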
> 
> Solution 3
> ----------
> 
> You name it. I bet there are many other ways to perform system-wide 
> performance testing.
> 
> Regards,
> Lukáš


