qemu-devel

Re: [PATCH 0/5] colo: Introduce resource agent and test suite/CI


From: Lukas Straub
Subject: Re: [PATCH 0/5] colo: Introduce resource agent and test suite/CI
Date: Sat, 6 Jun 2020 20:59:32 +0200

On Mon, 18 May 2020 09:38:24 +0000
"Zhang, Chen" <chen.zhang@intel.com> wrote:

> > -----Original Message-----
> > From: Lukas Straub <lukasstraub2@web.de>
> > Sent: Monday, May 11, 2020 8:27 PM
> > To: qemu-devel <qemu-devel@nongnu.org>
> > Cc: Alberto Garcia <berto@igalia.com>; Dr. David Alan Gilbert
> > <dgilbert@redhat.com>; Zhang, Chen <chen.zhang@intel.com>
> > Subject: [PATCH 0/5] colo: Introduce resource agent and test suite/CI
> > 
> > Hello Everyone,
> > These patches introduce a resource agent for fully automatic management of
> > colo and a test suite building upon the resource agent to extensively test 
> > colo.
> > 
> > Test suite features:
> > -Tests failover with peer crashing and hanging and failover during checkpoint
> > -Tests network using ssh and iperf3
> > -Quick test requires no special configuration
> > -Network test for testing colo-compare
> > -Stress test: failover all the time with network load
> > 
> > Resource agent features:
> > -Fully automatic management of colo
> > -Handles many failures: hanging/crashing qemu, replication error, disk error, ...
> > -Recovers from hanging qemu by using the "yank" oob command
> > -Tracks which node has up-to-date data
> > -Works well in clusters with more than 2 nodes
> > 
> > Run times on my laptop:
> > Quick test: 200s
> > Network test: 800s (tagged as slow)
> > Stress test: 1300s (tagged as slow)
> > 
> > The test suite needs access to a network bridge to properly test the network,
> > so some parameters need to be given to the test run. See
> > tests/acceptance/colo.py for more information.
> > 
> > I wonder how this integrates in existing CI infrastructure. Is there a common
> > CI for qemu where this can run or does every subsystem have to run their
> > own CI?
> 
> Wow~ Very happy to see this series.
> I have checked the "how to" in tests/acceptance/colo.py,
> but it does not look sufficient for users. Can you write an independent document for
> this series?
> It should include a test infrastructure ASCII diagram, the test case design, a detailed
> how-to, and more information on the pacemaker cluster and resource agent, etc.

Hi,
I quickly wrote up a more complete howto for configuring a pacemaker cluster and
using the resource agent. I hope it helps:
https://wiki.qemu.org/Features/COLO/Managed_HOWTO

Regards,
Lukas Straub

> Thanks
> Zhang Chen
> 
> 
> > 
> > Regards,
> > Lukas Straub
> > 
> > 
> > Lukas Straub (5):
> >   block/quorum.c: stable children names
> >   colo: Introduce resource agent
> >   colo: Introduce high-level test suite
> >   configure,Makefile: Install colo resource-agent
> >   MAINTAINERS: Add myself as maintainer for COLO resource agent
> > 
> >  MAINTAINERS                              |    6 +
> >  Makefile                                 |    5 +
> >  block/quorum.c                           |   20 +-
> >  configure                                |   10 +
> >  scripts/colo-resource-agent/colo         | 1429 ++++++++++++++++++++++
> >  scripts/colo-resource-agent/crm_master   |   44 +
> >  scripts/colo-resource-agent/crm_resource |   12 +
> >  tests/acceptance/colo.py                 |  689 +++++++++++
> >  8 files changed, 2209 insertions(+), 6 deletions(-)
> >  create mode 100755 scripts/colo-resource-agent/colo
> >  create mode 100755 scripts/colo-resource-agent/crm_master
> >  create mode 100755 scripts/colo-resource-agent/crm_resource
> >  create mode 100644 tests/acceptance/colo.py
> > 
> > --
> > 2.20.1  


