qemu-devel

Re: runaway avocado


From: Cleber Rosa
Subject: Re: runaway avocado
Date: Thu, 11 Feb 2021 21:31:08 -0500

On Fri, Feb 12, 2021 at 12:59:23AM +0100, Philippe Mathieu-Daudé wrote:
> On 2/11/21 8:21 PM, Peter Maydell wrote:
> > On Thu, 11 Feb 2021 at 18:47, Cleber Rosa <crosa@redhat.com> wrote:
> >> On Thu, Feb 11, 2021 at 05:37:20PM +0000, Peter Maydell wrote:
> >>> I wonder if we could have avocado run all our acceptance cases
> >>> under a 'ulimit -f' setting that restricts the amount of disk
> >>> space they can use? That would restrict the damage that could
> >>> be done by any runaways. A CPU usage limit might also be good.
> > 
> >> To me that sounds a lot like Linux cgroups.
> > 
> > ...except that ulimits are a well-established mechanism that
> > is straightforward, works for any user and is cross-platform
> > for most Unixes, whereas cgroups are complicated, Linux specific,
> > and AIUI require root access to set them up and configure them.
> 
> I agree with Peter, being POSIX compliant is better than
> restricting to (recent) Linux. But also note we have users interested
> in running tests for Windows builds. See the Cirrus-CI.
> 

Sure, I feel like cgroups are more comprehensive, but they definitely
have the drawbacks you both listed.

> > 
> >> We can have a script setting up a cgroup as part of a
> >> gitlab-ci.{yml,d} job for the jobs that will run on the non-shared
> >> GitLab runners (such as the s390 and aarch64 machines owned by the
> >> QEMU project).
> >>
> >> Does this sound like a solution?
> > 
> > We want a solution that works for anybody running
> > "make check-acceptance" in any situation, not just for
> > the CI runners.
> 
> Indeed. Public CI time being limited, I expect users to run tests
> elsewhere. We don't mind about data loss on CI runners.
>

That was kind of my point.  We want to use all the resources the
GitLab CI shared runners give us, so enforcing extra limits there makes
no sense to me.  On my personal machines I also prefer faster test
turnarounds, so extra limits are not beneficial to me either.  YMMV, so
my opinion is that this should be opt-in, *not* enabled by default.

My initial take on this is that we can have a few pre-defined scripts
that set those limits.  Users get to activate those profiles by
name if, say, a given environment variable is set.  Something like:

  RESOURCE_LIMIT_PROFILE=low_cpu_4g_files
  if [ -n "$RESOURCE_LIMIT_PROFILE" ]; then
      # Run the given command under the limits set by the named profile.
      ./scripts/limit-resources/"$RESOURCE_LIMIT_PROFILE" "$@"
  fi
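
To make the idea a bit more concrete, a profile such as low_cpu_4g_files
could boil down to a couple of ulimit calls wrapped around the command.
This is only a rough sketch: the function name run_limited, the
LIMIT_* variable names and the default values are made up for
illustration, and no such script exists in the QEMU tree today.

```shell
#!/bin/sh
# Hypothetical sketch of a profile like scripts/limit-resources/low_cpu_4g_files.
# Names and default values below are assumptions, not existing QEMU or Avocado code.

# CPU time cap per process, in seconds.
LIMIT_CPU_SECONDS=${LIMIT_CPU_SECONDS:-600}
# Largest file a process may write, in ulimit blocks. POSIX sh counts
# 512-byte blocks (8388608 blocks = 4 GiB); bash outside POSIX mode
# counts 1024-byte blocks, so the number would need halving there.
LIMIT_FILE_BLOCKS=${LIMIT_FILE_BLOCKS:-8388608}

# Run "$@" with the limits applied. The function body is a subshell, so
# the limits apply to the command and its children, not to the caller.
run_limited() (
    ulimit -t "$LIMIT_CPU_SECONDS" || exit 1
    ulimit -f "$LIMIT_FILE_BLOCKS" || exit 1
    "$@"
)
```

A wrapper script would then end with something like run_limited "$@".
Note that "ulimit -f" caps the size of each individual file, not the
total disk space used, and that "ulimit -t" is widely supported by
shells but goes beyond the "-f" option that POSIX has traditionally
required.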

> FWIW, a similar complaint from last year:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg672277.html
>

The specific issue of Avocado's cache size should be addressed in this
development cycle, and a solution should be available in 86.0.  It's
being tracked here:

  https://github.com/avocado-framework/avocado/issues/4311

Now, in Peter's case, it was QEMU writing to a replay.bin file, and I
don't see a practical way that Avocado could limit the overall disk
space used by whatever gets run in a test unless disk quotas are
set.  I'm not sure this belongs in a test framework, though.

Cheers,
- Cleber.

> Regards,
> 
> Phil.
> 
