RE: Proposal for a regular upstream performance testing
From: Chenqun (kuhn)
Subject: RE: Proposal for a regular upstream performance testing
Date: Wed, 2 Dec 2020 08:23:45 +0000
> -----Original Message-----
> From: Qemu-devel
> [mailto:qemu-devel-bounces+kuhn.chenqun=huawei.com@nongnu.org] On
> Behalf Of Lukáš Doktor
> Sent: Thursday, November 26, 2020 4:10 PM
> To: QEMU Developers <qemu-devel@nongnu.org>
> Cc: Charles Shih <cheshi@redhat.com>; Aleksandar Markovic
> <aleksandar.qemu.devel@gmail.com>; Stefan Hajnoczi
> <stefanha@redhat.com>
> Subject: Proposal for a regular upstream performance testing
>
> Hello guys,
>
> I have been around qemu on the Avocado-vt side for quite some time, and a while
> ago I shifted my focus to performance testing. Currently I am not aware of any
> upstream CI that would continuously monitor upstream qemu performance,
> and I'd like to change that. There is a lot to cover, so please bear with me.
>
> Goal
> ====
>
> The goal of this initiative is to detect system-wide performance regressions, as
> well as improvements, early; ideally pin-pointing the individual commits and
> notifying the people who should fix things. All in upstream, and ideally with as
> little human interaction as possible.
>
> Unlike Ahmed Karaman's recent work on
> https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/ my aim is at the
> system-wide performance inside the guest (like fio, uperf, ...).
>
> Tools
> =====
>
> In house we have several different tools used by various teams, and I bet there
> are tons of other tools out there that can do this. I cannot speak for all
> teams, but over time many teams at Red Hat have come to like pbench
> https://distributed-system-analysis.github.io/pbench/ to run the tests and
> produce machine-readable results, and to use other tools (Ansible, scripts, ...)
> to provision the systems and to generate the comparisons.
>
> As for myself, I used python for a PoC and over the last year I pushed hard to
> turn it into a usable and sensible tool which I'd like to offer:
> https://run-perf.readthedocs.io/en/latest/ Anyway, I am open to suggestions
> and comparisons. As I am using it downstream to watch regressions, I do plan
> to keep developing the tool as well as the pipelines (unless a better tool is
> found that would replace it or its parts).
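>
> To give an idea of what the comparison step boils down to, here is a minimal
> sketch (this is not run-perf's actual API; the file layout, the metric and the
> 5% tolerance are made up for illustration):
>
>     #!/usr/bin/env python3
>     # Minimal sketch of a regression check: compare the mean throughput of
>     # the current build against a stored baseline and fail when it drops by
>     # more than a tolerance.  Real tools (run-perf/pbench) keep far richer
>     # data and use proper statistics.
>     import json
>     import statistics
>     import sys
>
>     TOLERANCE = 0.05  # allow up to a 5% drop
>
>     def mean_throughput(path):
>         """Load a list of per-iteration throughput samples from a JSON file."""
>         with open(path) as fd:
>             samples = json.load(fd)  # e.g. [3012.4, 2998.7, 3050.1, ...]
>         return statistics.mean(samples)
>
>     def main():
>         baseline = mean_throughput(sys.argv[1])
>         current = mean_throughput(sys.argv[2])
>         change = (current - baseline) / baseline
>         print(f"baseline={baseline:.1f} current={current:.1f} change={change:+.1%}")
>         sys.exit(1 if change < -TOLERANCE else 0)
>
>     if __name__ == "__main__":
>         main()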
>
> How
> ===
>
> This is a tough question. Ideally this should be a standalone service that would
> only notify the author of the patch that caused the change, with a bunch of
> useful data, so they can either address the issue or just be aware of the change
> and mark it as expected.
>
> Ideally the community should also have a way to issue their custom builds in
> order to verify their patches, so they can debug and address issues before
> they land in qemu master rather than only after.
>
> The problem with those is that we cannot simply use travis/gitlab/... machines
> for running those tests, because we are measuring actual in-guest
> performance. We can't just stop the clock when the machine decides to
> schedule another container/vm. I briefly checked the public bare-metal
> offerings like Rackspace, but these are most probably not sufficient either,
> because (unless I'm wrong) they only give you a machine, and it is not
> guaranteed that it will be the same machine the next time. If we are to
> compare the results we don't need just the same model, we really need the
> very same machine. Any change to the machine might lead to a significant
> difference (a disk replacement, even a firmware update...).
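>
> One thing that can at least be detected automatically is the machine changing
> under you: record a hardware fingerprint before each run and refuse to compare
> results across different fingerprints. A rough sketch (the chosen fields are
> only examples; a real pipeline would track disks, microcode, BIOS settings and
> more):
>
>     #!/usr/bin/env python3
>     # Rough sketch: hash a few identifying fields of the host so results
>     # coming from a different machine (or different firmware) are never
>     # compared.  Needs root for dmidecode.
>     import hashlib
>     import platform
>     import subprocess
>
>     def dmi(keyword):
>         """Read one DMI string, e.g. 'system-serial-number' or 'bios-version'."""
>         return subprocess.check_output(
>             ["dmidecode", "-s", keyword], text=True).strip()
>
>     def fingerprint():
>         fields = [
>             dmi("system-serial-number"),
>             dmi("bios-version"),
>             platform.release(),  # running kernel
>         ]
>         return hashlib.sha256("\n".join(fields).encode()).hexdigest()
>
>     if __name__ == "__main__":
>         print(fingerprint())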
Hi Lukáš,
It's nice to see a discussion of the QEMU performance topic.
If you need a CI platform and physical machine environments, maybe
compass-ci can help you.
Compass-ci is an open CI platform of the openEuler community and is growing.
Here's a brief readme:
https://gitee.com/wu_fengguang/compass-ci/blob/master/README.en.md
Thanks,
Chen Qun
>
> Solution 1
> ----------
>
> Doing this for downstream builds, I can start doing this for upstream as well. At
> this point I can offer a single pipeline watching only changes in qemu
> (downstream we are checking distro/kernel changes as well, but that would
> require too much time at this point) on a single x86_64 machine. I cannot offer
> public access to the testing machine, not even checking custom builds (unless
> someone provides me publicly available machine(s) that I would use for this).
> What I can offer is running the checks on the latest qemu master, publishing
> the reports, bisecting issues and notifying people about the changes. An
> example of a report can be found here:
> https://drive.google.com/file/d/1V2w7QpSuybNusUaGxnyT5zTUvtZDOfsb/view
> ?usp=sharing and the documentation of the format is here:
> https://run-perf.readthedocs.io/en/latest/scripts.html#html-results I can also
> attach the raw pbench results if needed (as well as details about the tests that
> were executed, the parameters and other details).
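>
> The bisection part can be left to git itself; the following is only a sketch of
> a script one could pass to 'git bisect run' inside a qemu checkout (the
> configure flags, the benchmark command and the threshold are placeholders):
>
>     #!/usr/bin/env python3
>     # Sketch of a bisection step: rebuild qemu, run one benchmark and compare
>     # the result with a known-good score.  Exit 0 = good, 1 = bad, 125 = skip,
>     # which is the convention 'git bisect run' expects.
>     import subprocess
>     import sys
>
>     GOOD_SCORE = 3000.0  # throughput of the known-good commit (placeholder)
>     THRESHOLD = 0.95     # fail when below 95% of the good score
>
>     def benchmark():
>         # Placeholder: boot the guest, run the workload, print one number.
>         out = subprocess.check_output("./run-benchmark.sh", shell=True, text=True)
>         return float(out.strip())
>
>     def main():
>         build = "./configure --target-list=x86_64-softmmu && make -j$(nproc)"
>         if subprocess.run(build, shell=True).returncode != 0:
>             sys.exit(125)  # skip commits that do not build
>         score = benchmark()
>         print(f"score={score:.1f} (good={GOOD_SCORE:.1f})")
>         sys.exit(0 if score >= GOOD_SCORE * THRESHOLD else 1)
>
>     if __name__ == "__main__":
>         main()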
>
> Currently the covered scenarios would be a default libvirt machine with qcow2
> storage and a tuned libvirt machine (cpus, hugepages, numa, raw disk...) running
> fio, uperf and linpack on the latest GA RHEL. In the future I can add/tweak the
> scenarios as well as the test selection based on your feedback.
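>
> For illustration, the tuned scenario roughly corresponds to libvirt domain XML
> fragments like the ones below (the exact pinning, nodeset and device paths are
> of course machine-specific examples):
>
>     <memoryBacking>
>       <hugepages/>                          <!-- back guest RAM by hugepages -->
>     </memoryBacking>
>     <numatune>
>       <memory mode='strict' nodeset='0'/>   <!-- keep memory on host node 0 -->
>     </numatune>
>     <cputune>
>       <vcpupin vcpu='0' cpuset='4'/>        <!-- pin vCPUs to host CPUs -->
>       <vcpupin vcpu='1' cpuset='5'/>
>     </cputune>
>     <disk type='block' device='disk'>
>       <driver name='qemu' type='raw'/>      <!-- raw disk instead of qcow2 -->
>       <source dev='/dev/nvme0n1'/>
>       <target dev='vdb' bus='virtio'/>
>     </disk>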
>
> Solution 2
> ----------
>
> I can offer documentation:
> https://run-perf.readthedocs.io/en/latest/jenkins.html and someone can fork it
> or take inspiration from it and set up the pipelines on their own system, make it
> available to the outside world, and add custom scenarios and variants. Note the
> setup does not require Jenkins; it's just an example and could easily be turned
> into a cronjob (see the sketch below) or whatever you choose.
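>
> As an example, the whole nightly pipeline could be as simple as one user
> crontab entry (the script path is just a placeholder for whatever wrapper
> drives the provisioning, run-perf and the report generation):
>
>     # run the performance pipeline every night at 02:00 and keep a log
>     0 2 * * * /usr/local/bin/qemu-perf-pipeline.sh >> /var/log/qemu-perf.log 2>&1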
>
> Solution 3
> ----------
>
> You name it. I bet there are many other ways to perform system-wide
> performance testing.
>
> Regards,
> Lukáš
>