RE: [PATCH 0/5] colo: Introduce resource agent and test suite/CI
From: Zhang, Chen
Subject: RE: [PATCH 0/5] colo: Introduce resource agent and test suite/CI
Date: Tue, 16 Jun 2020 01:42:45 +0000

> -----Original Message-----
> From: Lukas Straub <lukasstraub2@web.de>
> Sent: Sunday, June 7, 2020 3:00 AM
> To: Zhang, Chen <chen.zhang@intel.com>
> Cc: qemu-devel <qemu-devel@nongnu.org>; Alberto Garcia
> <berto@igalia.com>; Dr. David Alan Gilbert <dgilbert@redhat.com>; Jason
> Wang <jasowang@redhat.com>
> Subject: Re: [PATCH 0/5] colo: Introduce resource agent and test suite/CI
>
> On Mon, 18 May 2020 09:38:24 +0000
> "Zhang, Chen" <chen.zhang@intel.com> wrote:
>
> > > -----Original Message-----
> > > From: Lukas Straub <lukasstraub2@web.de>
> > > Sent: Monday, May 11, 2020 8:27 PM
> > > To: qemu-devel <qemu-devel@nongnu.org>
> > > Cc: Alberto Garcia <berto@igalia.com>; Dr. David Alan Gilbert
> > > <dgilbert@redhat.com>; Zhang, Chen <chen.zhang@intel.com>
> > > Subject: [PATCH 0/5] colo: Introduce resource agent and test
> > > suite/CI
> > >
> > > Hello Everyone,
> > > > These patches introduce a resource agent for fully automatic
> > > > management of colo and a test suite building upon the resource agent to
> > > > extensively test colo.
> > >
> > > Test suite features:
> > > > -Tests failover with peer crashing and hanging and failover during
> > > >  checkpoint
> > > > -Tests network using ssh and iperf3
> > > > -Quick test requires no special configuration
> > > > -Network test for testing colo-compare
> > > > -Stress test: failover all the time with network load
> > >
> > > Resource agent features:
> > > -Fully automatic management of colo
> > > -Handles many failures: hanging/crashing qemu, replication error,
> > > disk error, ...
> > > > -Recovers from hanging qemu by using the "yank" oob command
> > > > -Tracks which node has up-to-date data
> > > > -Works well in clusters with more than 2 nodes
> > >
> > > Run times on my laptop:
> > > Quick test: 200s
> > > Network test: 800s (tagged as slow)
> > > Stress test: 1300s (tagged as slow)
> > >
> > > The test suite needs access to a network bridge to properly test the
> > > network, so some parameters need to be given to the test run. See
> > > tests/acceptance/colo.py for more information.
> > >
> > > I wonder how this integrates in existing CI infrastructure. Is there
> > > a common CI for qemu where this can run or does every subsystem have
> > > to run their own CI?
> >
> > Wow~ Very happy to see this series.
> > I have checked the "how to" in tests/acceptance/colo.py, but it does not
> > look sufficient for users. Can you write an independent document for this
> > series?
> > It should include a test infrastructure ASCII diagram, the test case design,
> > a detailed how-to, and more information on the pacemaker cluster and
> > resource agent, etc.
>
> Hi,
> I quickly created a more complete howto for configuring a pacemaker cluster
> and using the resource agent, I hope it helps:
> https://wiki.qemu.org/Features/COLO/Managed_HOWTO
Hi Lukas,
I noticed you contributed some content to the QEMU COLO wiki.
For the Features/COLO/Manual HOWTO
https://wiki.qemu.org/Features/COLO/Manual_HOWTO
why not keep the Secondary side start command the same as in
qemu/docs/COLO-FT.txt?
If I understand correctly, adding the quorum-related command on the secondary
will support resuming replication.
Then we can add a primary/secondary resume step here.
Thanks
Zhang Chen
>
> Regards,
> Lukas Straub
>
> > Thanks
> > Zhang Chen
> >
> >
> > >
> > > Regards,
> > > Lukas Straub
> > >
> > >
> > > Lukas Straub (5):
> > > block/quorum.c: stable children names
> > > colo: Introduce resource agent
> > > colo: Introduce high-level test suite
> > > configure,Makefile: Install colo resource-agent
> > > MAINTAINERS: Add myself as maintainer for COLO resource agent
> > >
> > > >  MAINTAINERS                              |    6 +
> > > >  Makefile                                 |    5 +
> > > >  block/quorum.c                           |   20 +-
> > > >  configure                                |   10 +
> > > >  scripts/colo-resource-agent/colo         | 1429 ++++++++++++++++++++++
> > > >  scripts/colo-resource-agent/crm_master   |   44 +
> > > >  scripts/colo-resource-agent/crm_resource |   12 +
> > > >  tests/acceptance/colo.py                 |  689 +++++++++++
> > > >  8 files changed, 2209 insertions(+), 6 deletions(-)
> > > >  create mode 100755 scripts/colo-resource-agent/colo
> > > >  create mode 100755 scripts/colo-resource-agent/crm_master
> > > >  create mode 100755 scripts/colo-resource-agent/crm_resource
> > > >  create mode 100644 tests/acceptance/colo.py
> > >
> > > --
> > > 2.20.1