[PATCH v3 0/7] colo: Introduce resource agent and test suite/CI

From: Lukas Straub
Subject: [PATCH v3 0/7] colo: Introduce resource agent and test suite/CI
Date: Tue, 4 Aug 2020 12:46:29 +0200

Hello Everyone,
So here is v3. Patch 1 can already be merged independently of the others.
Please review.

Lukas Straub

Based-on: <cover.1596528468.git.lukasstraub2@web.de>
"Introduce 'yank' oob qmp command to recover from hanging qemu"


Changes:

v3:
 -resource-agent: Don't determine local qemu state by remote master-score, query
  directly via qmp instead
 -resource-agent: Add max_queue_size parameter for colo-compare
 -resource-agent: Fix monitor action on secondary returning error during
  clean shutdown
 -resource-agent: Fix stop action setting master-score to 0 on primary on
  clean shutdown

v2:
 -use new yank api
 -drop disk_size parameter
 -introduce pick_qemu_util function and use it


Hello Everyone,
These patches introduce a resource agent for fully automatic management of colo
and a test suite building upon the resource agent to extensively test colo.

Test suite features:
-Tests failover when the peer crashes or hangs, and failover during a checkpoint
-Tests network using ssh and iperf3
-Quick test requires no special configuration
-Network test for testing colo-compare
-Stress test: repeated failover under network load

Resource agent features:
-Fully automatic management of colo
-Handles many failures, including hanging/crashing qemu, replication errors and
 disk errors
-Recovers from hanging qemu by using the "yank" oob command
-Tracks which node has up-to-date data
-Works well in clusters with more than 2 nodes
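
To illustrate the "yank" recovery path mentioned above, here is a minimal
sketch of the QMP messages involved. It is not taken from the resource agent.
The helper names are hypothetical, and the argument layout of "yank"
(a list of instances with "type" and "node-name") is an assumption based on
the yank command as it was later merged into qemu; the version this series is
based on may use a different layout. What it shows is only the general shape:
out-of-band execution has to be enabled during capability negotiation, and the
command is then sent with "exec-oob" instead of "execute" so that it is not
stuck behind a blocked in-band command.

```python
import json


def qmp_capabilities_oob():
    """Capability negotiation message that enables out-of-band execution."""
    return {"execute": "qmp_capabilities", "arguments": {"enable": ["oob"]}}


def oob_command(name, arguments=None, cmd_id=None):
    """Build a QMP command that runs out-of-band ("exec-oob" key)."""
    msg = {"exec-oob": name}
    if arguments is not None:
        msg["arguments"] = arguments
    if cmd_id is not None:
        msg["id"] = cmd_id
    return msg


def yank_block_node(node_name, cmd_id="yank-1"):
    """Ask qemu to yank the connection of one block node.

    ASSUMPTION: the "instances" layout follows the yank command as later
    merged upstream; adjust it to the API of the qemu build in use.
    """
    arguments = {
        "instances": [{"type": "block-node", "node-name": node_name}],
    }
    return oob_command("yank", arguments, cmd_id)


def encode(msg):
    """Serialize one QMP message for writing to the monitor socket."""
    return (json.dumps(msg) + "\n").encode("utf-8")


if __name__ == "__main__":
    # "nbd0" is a placeholder node name used only for this example.
    for message in (qmp_capabilities_oob(), yank_block_node("nbd0")):
        print(encode(message).decode("utf-8"), end="")
```

A caller would write the encoded messages to the QMP monitor socket in this
order and then read the replies; socket handling and error checking are left
out here to keep the sketch short.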

Run times on my laptop:
Quick test: 200s
Network test: 800s (tagged as slow)
Stress test: 1300s (tagged as slow)

For the last two tests, the test suite needs access to a network bridge to
properly test the network, so some parameters need to be given to the test
run. See tests/acceptance/colo.py for more information.

Lukas Straub

Lukas Straub (7):
  block/quorum.c: stable children names
  avocado_qemu: Introduce pick_qemu_util to pick qemu utility binaries
  boot_linux.py: Use pick_qemu_util
  colo: Introduce resource agent
  colo: Introduce high-level test suite
  configure,Makefile: Install colo resource-agent
  MAINTAINERS: Add myself as maintainer for COLO resource agent

 MAINTAINERS                               |    6 +
 Makefile                                  |    5 +
 block/quorum.c                            |   20 +-
 configure                                 |   10 +
 scripts/colo-resource-agent/colo          | 1501 +++++++++++++++++++++
 scripts/colo-resource-agent/crm_master    |   44 +
 scripts/colo-resource-agent/crm_resource  |   12 +
 tests/acceptance/avocado_qemu/__init__.py |   15 +
 tests/acceptance/boot_linux.py            |   11 +-
 tests/acceptance/colo.py                  |  677 ++++++++++
 10 files changed, 2286 insertions(+), 15 deletions(-)
 create mode 100755 scripts/colo-resource-agent/colo
 create mode 100755 scripts/colo-resource-agent/crm_master
 create mode 100755 scripts/colo-resource-agent/crm_resource
 create mode 100644 tests/acceptance/colo.py

