From MAILER-DAEMON Fri Jul 02 23:55:59 2021
From: 清水寛子 <hiroko07168@gmail.com>
Date: Sat, 3 Jul 2021 12:15:42 +0900
Subject: How do you set a value to a QOM register from GDB?
To: qemu-stable@nongnu.org, qemu-discuss@nongnu.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="00000000000084861e05c62f7d08"

--00000000000084861e05c62f7d08
Content-Type: text/plain; charset="UTF-8"

Hello everyone.

I made a simple QOM device which prints an error message when I read/write
its register values, as shown in the code at the bottom.

Now I'm trying to read/write the QOM register from GDB.
I can read the QOM register value using the "print" command, and I get the
error message in the QEMU monitor.
This means that "print" calls the test_read function.
(gdb) p *0x40000004
      999
(qemu) access test_read 0

However, I can't write a value to the QOM register with the "set" command.
Moreover, the "set" command doesn't seem to call the test_write function,
because I don't get the error message defined in test_write.
(gdb) set *((int *)0x40000004) = 100
(gdb) p *0x40000004
      999

I really want to solve this problem.
Can you suggest any way to set a value in the QOM register via GDB?

Best regards,
Hiroko
------------------------------------------------------------
static void test_reset(DeviceState *dev)
{
    TestState *s = TEST(dev);
    s->src = 444;           // address : 0x40000000
    s->fix_value = 999;     // address : 0x40000004
}

static uint64_t test_read(void *opaque, hwaddr offset,
                          unsigned size)
{
    error_report("access test_read %d", (int)offset);
    TestState *s = (TestState *)opaque;

    switch ((int)offset) {
    case 0:
        return s->src;
    case 4:
        return s->fix_value;
    default:
        error_report("bad offset : %d", (int)offset);
        return 0;
    }
}

static void test_write(void *opaque, hwaddr offset,
                       uint64_t value, unsigned size)
{
    error_report("access test_write %d %d", (int)offset, (int)size);
    TestState *s = (TestState *)opaque;

    if (offset == 0) {
        s->src = value;
    } else {
        qemu_log_mask(LOG_GUEST_ERROR, "test_write: can't change %x\n", (int)offset);
    }
}
----------------------------------------------------------------------------------------------------------

--00000000000084861e05c62f7d08
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000084861e05c62f7d08--

From MAILER-DAEMON Fri Jul 02 23:56:00 2021
From: 清水寛子 <hiroko07168@gmail.com>
Date: Sat, 3 Jul 2021 12:19:43 +0900
Subject: Re: How do you set a value to a QOM register from GDB?
To: qemu-stable@nongnu.org, qemu-discuss@nongnu.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000ef5aa205c62f8b24"

--000000000000ef5aa205c62f8b24
Content-Type: text/plain; charset="UTF-8"

Sorry, I made a mistake when I wrote
(qemu) access test_read 0

The correct output is as follows:
(qemu) access test_read 4

On Sat, 3 Jul 2021 at 12:15, 清水寛子 <hiroko07168@gmail.com> wrote:

> Hello everyone.
>
> I made a simple QOM device which prints an error message when I read/write
> its register values, as shown in the code at the bottom.
>
> Now I'm trying to read/write the QOM register from GDB.
> I can read the QOM register value using the "print" command, and I get the
> error message in the QEMU monitor.
> This means that "print" calls the test_read function.
> (gdb) p *0x40000004
>       999
> (qemu) access test_read 0
>
> However, I can't write a value to the QOM register with the "set" command.
> Moreover, the "set" command doesn't seem to call the test_write function,
> because I don't get the error message defined in test_write.
> (gdb) set *((int *)0x40000004) = 100
> (gdb) p *0x40000004
>       999
>
> I really want to solve this problem.
> Can you suggest any way to set a value in the QOM register via GDB?
>
> Best regards,
> Hiroko
> ------------------------------------------------------------
> static void test_reset(DeviceState *dev)
> {
>     TestState *s = TEST(dev);
>     s->src = 444;           // address : 0x40000000
>     s->fix_value = 999;     // address : 0x40000004
> }
>
> static uint64_t test_read(void *opaque, hwaddr offset,
>                           unsigned size)
> {
>     error_report("access test_read %d", (int)offset);
>     TestState *s = (TestState *)opaque;
>
>     switch ((int)offset) {
>     case 0:
>         return s->src;
>     case 4:
>         return s->fix_value;
>     default:
>         error_report("bad offset : %d", (int)offset);
>         return 0;
>     }
> }
>
> static void test_write(void *opaque, hwaddr offset,
>                        uint64_t value, unsigned size)
> {
>     error_report("access test_write %d %d", (int)offset, (int)size);
>     TestState *s = (TestState *)opaque;
>
>     if (offset == 0) {
>         s->src = value;
>     } else {
>         qemu_log_mask(LOG_GUEST_ERROR, "test_write: can't change %x\n", (int)offset);
>     }
> }
> ----------------------------------------------------------------------------------------------------------
>

--000000000000ef5aa205c62f8b24
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000ef5aa205c62f8b24--

From MAILER-DAEMON Sat Jul 03 13:51:52 2021
From: Peter Maydell <peter.maydell@linaro.org>
Date: Sat, 3 Jul 2021 18:51:09 +0100
Subject: Re: How do you set a value to a QOM register from GDB?
To: 清水寛子 <hiroko07168@gmail.com>
Cc: qemu-stable <qemu-stable@nongnu.org>, qemu-discuss <qemu-discuss@nongnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Sat, 3 Jul 2021 at 04:57, 清水寛子 <hiroko07168@gmail.com> wrote:
>
> Hello everyone.
>
> I made a simple QOM device which prints an error message when I read/write
> its register values, as shown in the code at the bottom.
>
> Now I'm trying to read/write the QOM register from GDB.
> I can read the QOM register value using the "print" command, and I get the
> error message in the QEMU monitor.
> This means that "print" calls the test_read function.
> (gdb) p *0x40000004
>       999
> (qemu) access test_read 0
>
> However, I can't write a value to the QOM register with the "set" command.
> Moreover, the "set" command doesn't seem to call the test_write function,
> because I don't get the error message defined in test_write.
> (gdb) set *((int *)0x40000004) = 100
> (gdb) p *0x40000004
>       999
>
> I really want to solve this problem.

I've just realized why this happens. The gdbstub cannot write
to devices, only to guest RAM/ROM. (It can read either RAM/ROM or
devices.) If you try to write to a device then the write will be
silently discarded. (The technical detail is that gdbstub.c calls
cpu_memory_rw_debug() which calls address_space_write_rom(), which
ignores writes to emulated devices.)

> Can you suggest any way to set a value in the QOM register via GDB?

You can't do this via GDB, I'm afraid. Only actual guest code can do that.

thanks
-- PMM

From MAILER-DAEMON Sun Jul 04 02:35:13 2021
From: 清水寛子 <hiroko07168@gmail.com>
Date: Sun, 4 Jul 2021 15:34:51 +0900
Subject: Re: How do you set a value to a QOM register from GDB?
To: Peter Maydell <peter.maydell@linaro.org>, qemu-stable <qemu-stable@nongnu.org>, qemu-discuss <qemu-discuss@nongnu.org>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000007035a505c64663de"

--0000000000007035a505c64663de
Content-Type: text/plain; charset="UTF-8"

Thank you very much for your confirmation and help!
I understand why I can't do that.

Best regards,
Hiroko

On Sun, 4 Jul 2021 at 2:51, Peter Maydell <peter.maydell@linaro.org> wrote:

> On Sat, 3 Jul 2021 at 04:57, 清水寛子 <hiroko07168@gmail.com> wrote:
> >
> > Hello everyone.
> >
> > I made a simple QOM device which prints an error message when I read/write
> > its register values, as shown in the code at the bottom.
> >
> > Now I'm trying to read/write the QOM register from GDB.
> > I can read the QOM register value using the "print" command, and I get the
> > error message in the QEMU monitor.
> > This means that "print" calls the test_read function.
> > (gdb) p *0x40000004
> >       999
> > (qemu) access test_read 0
> >
> > However, I can't write a value to the QOM register with the "set" command.
> > Moreover, the "set" command doesn't seem to call the test_write function,
> > because I don't get the error message defined in test_write.
> > (gdb) set *((int *)0x40000004) = 100
> > (gdb) p *0x40000004
> >       999
> >
> > I really want to solve this problem.
>
> I've just realized why this happens. The gdbstub cannot write
> to devices, only to guest RAM/ROM. (It can read either RAM/ROM or
> devices.) If you try to write to a device then the write will be
> silently discarded. (The technical detail is that gdbstub.c calls
> cpu_memory_rw_debug() which calls address_space_write_rom(), which
> ignores writes to emulated devices.)
>
> > Can you suggest any way to set a value in the QOM register via GDB?
>
> You can't do this via GDB, I'm afraid. Only actual guest code can do that.
>
> thanks
> -- PMM
>

--0000000000007035a505c64663de
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000007035a505c64663de--

From MAILER-DAEMON Mon Jul 05 04:40:28 2021
From: Philippe Mathieu-Daudé <philmd@redhat.com>
To: qemu-devel@nongnu.org
Cc: Mauro Matteo Cascella, Andrew Melnychenko, Alexander Bulekov, Jason Wang, Li Qiang, Dmitry Fleytman, Prasad J Pandit, Thomas Huth, Philippe Mathieu-Daudé, qemu-stable@nongnu.org
Subject: [PATCH] hw/net: Discard overly fragmented packets
Date: Mon, 5 Jul 2021 10:40:11 +0200
Message-Id: <20210705084011.814175-1-philmd@redhat.com>
X-Mailer: git-send-email 2.31.1
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Our infrastructure can handle fragmented packets up to
NET_MAX_FRAG_SG_LIST (64) pieces. This hard limit has been
proven enough in production for years. If it is reached, it
is likely an evil crafted packet. Discard it.

Include the qtest reproducer provided by Alexander Bulekov:

  $ make check-qtest-i386
  ...
  Running test qtest-i386/fuzz-vmxnet3-test
  qemu-system-i386: net/eth.c:334: void eth_setup_ip4_fragmentation(const void *, size_t, void *, size_t, size_t, size_t, _Bool): Assertion `frag_offset % IP_FRAG_UNIT_SIZE == 0' failed.

Cc: qemu-stable@nongnu.org
Reported-by: OSS-Fuzz (Issue 35799)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/460
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 hw/net/net_tx_pkt.c             |   8 ++
 tests/qtest/fuzz-vmxnet3-test.c | 195 ++++++++++++++++++++++++++++++++
 MAINTAINERS                     |   1 +
 tests/qtest/meson.build         |   1 +
 4 files changed, 205 insertions(+)
 create mode 100644 tests/qtest/fuzz-vmxnet3-test.c

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 1f9aa59eca2..77e9729a7ba 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -590,6 +590,14 @@ static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt *pkt,
         fragment_len = net_tx_pkt_fetch_fragment(pkt, &src_idx, &src_offset,
             fragment, &dst_idx);
 
+        if (dst_idx == NET_MAX_FRAG_SG_LIST && fragment_len > 0) {
+            /*
+             * The packet is too fragmented for our infrastructure
+             * (not enough iovec), don't even try to send.
+ */ + return false; + } + more_frags = (fragment_offset + fragment_len < pkt->payload_len); eth_setup_ip4_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base, diff --git a/tests/qtest/fuzz-vmxnet3-test.c b/tests/qtest/fuzz-vmxnet3-test.c new file mode 100644 index 00000000000..d69009bf5ce --- /dev/null +++ b/tests/qtest/fuzz-vmxnet3-test.c @@ -0,0 +1,195 @@ +/* + * QTest testcase for vmxnet3 device generated by fuzzer + * + * Copyright Red Hat + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" + +#include "libqos/libqtest.h" + +/* + * https://gitlab.com/qemu-project/qemu/-/issues/460 + */ +static void test_oss_35799_eth_setup_ip4_fragmentation(void) +{ + QTestState *s; + + s = qtest_init("-machine q35 -m 32M -display none -nodefaults " + "-device vmxnet3,netdev=net0 -netdev user,id=net0"); + qtest_outl(s, 0xcf8, 0x80000814); + qtest_outl(s, 0xcfc, 0xe0000000); + qtest_outl(s, 0xcf8, 0x80000804); + qtest_outw(s, 0xcfc, 0x06); + qtest_outl(s, 0xcf8, 0x80000812); + qtest_outl(s, 0xcfc, 0x2000); + qtest_outl(s, 0xcf8, 0x80000815); + qtest_outb(s, 0xcfc, 0x40); + qtest_bufwrite(s, 0x0, "\xe1", 0x1); + qtest_bufwrite(s, 0x1, "\xfe", 0x1); + qtest_bufwrite(s, 0x2, "\xbe", 0x1); + qtest_bufwrite(s, 0x3, "\xba", 0x1); + qtest_bufwrite(s, 0x28, "\xff", 0x1); + qtest_bufwrite(s, 0x29, "\xff", 0x1); + qtest_bufwrite(s, 0x2a, "\xff", 0x1); + qtest_bufwrite(s, 0x2b, "\xff", 0x1); + qtest_bufwrite(s, 0x2c, "\xff", 0x1); + qtest_bufwrite(s, 0x2d, "\xff", 0x1); + qtest_bufwrite(s, 0x2e, "\xff", 0x1); + qtest_bufwrite(s, 0x2f, "\xff", 0x1); + qtest_bufwrite(s, 0x37, "\x40", 0x1); + qtest_bufwrite(s, 0x3e, "\x01", 0x1); + qtest_bufwrite(s, 0xe0004020, "\x00\x00\xfe\xca", 0x4); + qtest_bufwrite(s, 0x9, "\x40", 0x1); + qtest_bufwrite(s, 0xd, "\x10", 0x1); + qtest_bufwrite(s, 0x12, "\x10", 0x1); + qtest_bufwrite(s, 0x19, "\x40", 0x1); + qtest_bufwrite(s, 0x1b, "\x21", 0x1); + qtest_bufwrite(s, 0x1d, "\x0c", 0x1); + qtest_bufwrite(s, 0x2d, "\x00", 0x1); + 
qtest_bufwrite(s, 0x10000c, "\x08", 0x1); + qtest_bufwrite(s, 0x10000e, "\x45", 0x1); + qtest_bufwrite(s, 0x100017, "\x11", 0x1); + qtest_bufwrite(s, 0x20000600, "\x00", 0x1); + qtest_bufwrite(s, 0x38, "\x01", 0x1); + qtest_bufwrite(s, 0x39, "\x40", 0x1); + qtest_bufwrite(s, 0x48, "\x01", 0x1); + qtest_bufwrite(s, 0x49, "\x40", 0x1); + qtest_bufwrite(s, 0x58, "\x01", 0x1); + qtest_bufwrite(s, 0x59, "\x40", 0x1); + qtest_bufwrite(s, 0x68, "\x01", 0x1); + qtest_bufwrite(s, 0x69, "\x40", 0x1); + qtest_bufwrite(s, 0x78, "\x01", 0x1); + qtest_bufwrite(s, 0x79, "\x40", 0x1); + qtest_bufwrite(s, 0x88, "\x01", 0x1); + qtest_bufwrite(s, 0x89, "\x40", 0x1); + qtest_bufwrite(s, 0x98, "\x01", 0x1); + qtest_bufwrite(s, 0x99, "\x40", 0x1); + qtest_bufwrite(s, 0xa8, "\x01", 0x1); + qtest_bufwrite(s, 0xa9, "\x40", 0x1); + qtest_bufwrite(s, 0xb8, "\x01", 0x1); + qtest_bufwrite(s, 0xb9, "\x40", 0x1); + qtest_bufwrite(s, 0xc8, "\x01", 0x1); + qtest_bufwrite(s, 0xc9, "\x40", 0x1); + qtest_bufwrite(s, 0xd8, "\x01", 0x1); + qtest_bufwrite(s, 0xd9, "\x40", 0x1); + qtest_bufwrite(s, 0xe8, "\x01", 0x1); + qtest_bufwrite(s, 0xe9, "\x40", 0x1); + qtest_bufwrite(s, 0xf8, "\x01", 0x1); + qtest_bufwrite(s, 0xf9, "\x40", 0x1); + qtest_bufwrite(s, 0x108, "\x01", 0x1); + qtest_bufwrite(s, 0x109, "\x40", 0x1); + qtest_bufwrite(s, 0x118, "\x01", 0x1); + qtest_bufwrite(s, 0x119, "\x40", 0x1); + qtest_bufwrite(s, 0x128, "\x01", 0x1); + qtest_bufwrite(s, 0x129, "\x40", 0x1); + qtest_bufwrite(s, 0x138, "\x01", 0x1); + qtest_bufwrite(s, 0x139, "\x40", 0x1); + qtest_bufwrite(s, 0x148, "\x01", 0x1); + qtest_bufwrite(s, 0x149, "\x40", 0x1); + qtest_bufwrite(s, 0x158, "\x01", 0x1); + qtest_bufwrite(s, 0x159, "\x40", 0x1); + qtest_bufwrite(s, 0x168, "\x01", 0x1); + qtest_bufwrite(s, 0x169, "\x40", 0x1); + qtest_bufwrite(s, 0x178, "\x01", 0x1); + qtest_bufwrite(s, 0x179, "\x40", 0x1); + qtest_bufwrite(s, 0x188, "\x01", 0x1); + qtest_bufwrite(s, 0x189, "\x40", 0x1); + qtest_bufwrite(s, 0x198, "\x01", 0x1); + 
qtest_bufwrite(s, 0x199, "\x40", 0x1); + qtest_bufwrite(s, 0x1a8, "\x01", 0x1); + qtest_bufwrite(s, 0x1a9, "\x40", 0x1); + qtest_bufwrite(s, 0x1b8, "\x01", 0x1); + qtest_bufwrite(s, 0x1b9, "\x40", 0x1); + qtest_bufwrite(s, 0x1c8, "\x01", 0x1); + qtest_bufwrite(s, 0x1c9, "\x40", 0x1); + qtest_bufwrite(s, 0x1d8, "\x01", 0x1); + qtest_bufwrite(s, 0x1d9, "\x40", 0x1); + qtest_bufwrite(s, 0x1e8, "\x01", 0x1); + qtest_bufwrite(s, 0x1e9, "\x40", 0x1); + qtest_bufwrite(s, 0x1f8, "\x01", 0x1); + qtest_bufwrite(s, 0x1f9, "\x40", 0x1); + qtest_bufwrite(s, 0x208, "\x01", 0x1); + qtest_bufwrite(s, 0x209, "\x40", 0x1); + qtest_bufwrite(s, 0x218, "\x01", 0x1); + qtest_bufwrite(s, 0x219, "\x40", 0x1); + qtest_bufwrite(s, 0x228, "\x01", 0x1); + qtest_bufwrite(s, 0x229, "\x40", 0x1); + qtest_bufwrite(s, 0x238, "\x01", 0x1); + qtest_bufwrite(s, 0x239, "\x40", 0x1); + qtest_bufwrite(s, 0x248, "\x01", 0x1); + qtest_bufwrite(s, 0x249, "\x40", 0x1); + qtest_bufwrite(s, 0x258, "\x01", 0x1); + qtest_bufwrite(s, 0x259, "\x40", 0x1); + qtest_bufwrite(s, 0x268, "\x01", 0x1); + qtest_bufwrite(s, 0x269, "\x40", 0x1); + qtest_bufwrite(s, 0x278, "\x01", 0x1); + qtest_bufwrite(s, 0x279, "\x40", 0x1); + qtest_bufwrite(s, 0x288, "\x01", 0x1); + qtest_bufwrite(s, 0x289, "\x40", 0x1); + qtest_bufwrite(s, 0x298, "\x01", 0x1); + qtest_bufwrite(s, 0x299, "\x40", 0x1); + qtest_bufwrite(s, 0x2a8, "\x01", 0x1); + qtest_bufwrite(s, 0x2a9, "\x40", 0x1); + qtest_bufwrite(s, 0x2b8, "\x01", 0x1); + qtest_bufwrite(s, 0x2b9, "\x40", 0x1); + qtest_bufwrite(s, 0x2c8, "\x01", 0x1); + qtest_bufwrite(s, 0x2c9, "\x40", 0x1); + qtest_bufwrite(s, 0x2d8, "\x01", 0x1); + qtest_bufwrite(s, 0x2d9, "\x40", 0x1); + qtest_bufwrite(s, 0x2e8, "\x01", 0x1); + qtest_bufwrite(s, 0x2e9, "\x40", 0x1); + qtest_bufwrite(s, 0x2f8, "\x01", 0x1); + qtest_bufwrite(s, 0x2f9, "\x40", 0x1); + qtest_bufwrite(s, 0x308, "\x01", 0x1); + qtest_bufwrite(s, 0x309, "\x40", 0x1); + qtest_bufwrite(s, 0x318, "\x01", 0x1); + qtest_bufwrite(s, 0x319, 
"\x40", 0x1); + qtest_bufwrite(s, 0x328, "\x01", 0x1); + qtest_bufwrite(s, 0x329, "\x40", 0x1); + qtest_bufwrite(s, 0x338, "\x01", 0x1); + qtest_bufwrite(s, 0x339, "\x40", 0x1); + qtest_bufwrite(s, 0x348, "\x01", 0x1); + qtest_bufwrite(s, 0x349, "\x40", 0x1); + qtest_bufwrite(s, 0x358, "\x01", 0x1); + qtest_bufwrite(s, 0x359, "\x40", 0x1); + qtest_bufwrite(s, 0x368, "\x01", 0x1); + qtest_bufwrite(s, 0x369, "\x40", 0x1); + qtest_bufwrite(s, 0x378, "\x01", 0x1); + qtest_bufwrite(s, 0x379, "\x40", 0x1); + qtest_bufwrite(s, 0x388, "\x01", 0x1); + qtest_bufwrite(s, 0x389, "\x40", 0x1); + qtest_bufwrite(s, 0x398, "\x01", 0x1); + qtest_bufwrite(s, 0x399, "\x40", 0x1); + qtest_bufwrite(s, 0x3a8, "\x01", 0x1); + qtest_bufwrite(s, 0x3a9, "\x40", 0x1); + qtest_bufwrite(s, 0x3b8, "\x01", 0x1); + qtest_bufwrite(s, 0x3b9, "\x40", 0x1); + qtest_bufwrite(s, 0x3c8, "\x01", 0x1); + qtest_bufwrite(s, 0x3c9, "\x40", 0x1); + qtest_bufwrite(s, 0x3d8, "\x01", 0x1); + qtest_bufwrite(s, 0x3d9, "\x40", 0x1); + qtest_bufwrite(s, 0x3e8, "\x01", 0x1); + qtest_bufwrite(s, 0x3e9, "\x40", 0x1); + qtest_bufwrite(s, 0x3f8, "\x01", 0x1); + qtest_bufwrite(s, 0x3f9, "\x40", 0x1); + qtest_bufwrite(s, 0xd, "\x10", 0x1); + qtest_bufwrite(s, 0x20000600, "\x00", 0x1); + qtest_quit(s); +} + +int main(int argc, char **argv) +{ + const char *arch = qtest_get_arch(); + + g_test_init(&argc, &argv, NULL); + + if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) { + qtest_add_func("fuzz/test_oss_35799_eth_setup_ip4_fragmentation", + test_oss_35799_eth_setup_ip4_fragmentation); + } + + return g_test_run(); +} diff --git a/MAINTAINERS b/MAINTAINERS index cb8f3ea2c2e..43e5050ad96 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2001,6 +2001,7 @@ S: Maintained F: hw/net/vmxnet* F: hw/scsi/vmw_pvscsi* F: tests/qtest/vmxnet3-test.c +F: tests/qtest/fuzz-vmxnet3-test.c Rocker M: Jiri Pirko diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build index b03e8541700..42add92e9d4 100644 --- 
a/tests/qtest/meson.build +++ b/tests/qtest/meson.build @@ -66,6 +66,7 @@ (config_all_devices.has_key('CONFIG_TPM_TIS_ISA') ? ['tpm-tis-swtpm-test'] : []) + \ (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) + \ (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) + \ + (config_all_devices.has_key('CONFIG_VMXNET3_PCI') ? ['fuzz-vmxnet3-test'] : []) + \ (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) + \ qtests_pci + \ ['fdc-test', -- 2.31.1 From MAILER-DAEMON Tue Jul 06 05:00:34 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m0gwU-0004up-0t for mharc-qemu-stable@gnu.org; Tue, 06 Jul 2021 05:00:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57258) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m0gwS-0004uU-El for qemu-stable@nongnu.org; Tue, 06 Jul 2021 05:00:32 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:44395) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m0gwP-0000ev-4u for qemu-stable@nongnu.org; Tue, 06 Jul 2021 05:00:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625562027; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pNCoQM6cIQuVQTeSdZXN96RR8OR6ZFYEDHnncOuuXWw=; b=K9JPR1wr6rGmO2hf/LMyHN3hKSzHb6XEYvior2OkAUCI8K3A/HLv+207jWGcqGSDrGne+6 JUzIoI2OgOYXiriv0XQABEIPbV2AislBXlkJBx4eOYQweEfyrCGoV6LCimxb4kSLx9amKx wCOQA+6EJY9ndutBT5manCveJuP5P5c= Received: from mail-pl1-f200.google.com (mail-pl1-f200.google.com [209.85.214.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-144-M-5SLDStOTWUA9mittoeLg-1; Tue, 06 Jul 2021 05:00:24 -0400 X-MC-Unique: 
M-5SLDStOTWUA9mittoeLg-1 Received: by mail-pl1-f200.google.com with SMTP id s23-20020a170902b197b029011aafb8fbadso7038563plr.19 for ; Tue, 06 Jul 2021 02:00:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=pNCoQM6cIQuVQTeSdZXN96RR8OR6ZFYEDHnncOuuXWw=; b=YeQKO++E/e/qTpKpEbeB0nbRsEBG5vhjta/fvcA2cn1sOeeOHqLV0P6B+89A8jlPWJ AmPk20q6awA1T9YrOkBXR6cbRjoI7TNKWLRWj9jev5cTV1CReeDuBbmw8xnRT0dxIeOQ sAuDdyZ5GY7LI49wdP9JFo9daU+AZTjFZEOCpSfMLQ5avL5HE5if8CwMcBQOetMLMGWO GIZLcFQjVBOy9lXkr7E4qzrPwsmkP6tUCtmskYQ4G2TXNArlJVgyPoS1RkY51kmRWpxw Wo/sM2jsGKAOnyBwRk51y1MdtXok8yrxvnYCPUvaGd9xkn8ospyHGJiAxJMIgot0FMqI aDUg== X-Gm-Message-State: AOAM531ZIjTi6sCXhtC8sEUfJTkn/KFSQb5gUgKZAeecm8MZ/z6SPiIi NRF9SGMH16HJbbh13iiYzCZFCQVHTrbpoRID/pPBa8i5+377Sf3HXptJhXlf67mh0k0il4Cx+Le LVDOEYIN7PMrLPBdQPSz8F6cEll4POqEn X-Received: by 2002:a17:90a:b284:: with SMTP id c4mr19993950pjr.213.1625562023542; Tue, 06 Jul 2021 02:00:23 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzThBsAeQZ/X14QkFD3j7Rdr8R3xP14vGf5yQUvubVjQhyhhPCn2NrIixc7ov/Aw4ZrAxQz0gXtAaTzABVXG3g= X-Received: by 2002:a17:90a:b284:: with SMTP id c4mr19993918pjr.213.1625562023149; Tue, 06 Jul 2021 02:00:23 -0700 (PDT) MIME-Version: 1.0 References: <20210705084011.814175-1-philmd@redhat.com> In-Reply-To: <20210705084011.814175-1-philmd@redhat.com> From: Mauro Matteo Cascella Date: Tue, 6 Jul 2021 11:00:12 +0200 Message-ID: Subject: Re: [PATCH] hw/net: Discard overly fragmented packets To: =?UTF-8?Q?Philippe_Mathieu=2DDaud=C3=A9?= Cc: QEMU Developers , Andrew Melnychenko , Alexander Bulekov , Jason Wang , Li Qiang , Dmitry Fleytman , Prasad J Pandit , Thomas Huth , qemu-stable@nongnu.org Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=mcascell@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; 
charset="UTF-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=170.10.133.124; envelope-from=mcascell@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.442, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Jul 2021 09:00:32 -0000 Hello Philippe, I think you don't need root privileges to craft such a highly fragmented packet from within the guest (tools like hping3 or nmap come to mind). Right? If so, we may consider allocating a CVE for this bug. If not, this is not CVE worthy - root does not need an assertion failure to cause damage to the system. On Mon, Jul 5, 2021 at 10:40 AM Philippe Mathieu-Daudé wrote: > > Our infrastructure can handle fragmented packets up to > NET_MAX_FRAG_SG_LIST (64) pieces. This hard limit has > been proven enough in production for years. If it is > reached, it is likely an evil crafted packet. Discard it. > > Include the qtest reproducer provided by Alexander Bulekov: > > $ make check-qtest-i386 > ... > Running test qtest-i386/fuzz-vmxnet3-test > qemu-system-i386: net/eth.c:334: void eth_setup_ip4_fragmentation(const void *, size_t, void *, size_t, size_t, size_t, _Bool): > Assertion `frag_offset % IP_FRAG_UNIT_SIZE == 0' failed.
> > Cc: qemu-stable@nongnu.org > Reported-by: OSS-Fuzz (Issue 35799) > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/460 > Signed-off-by: Philippe Mathieu-Daudé > --- > hw/net/net_tx_pkt.c | 8 ++ > tests/qtest/fuzz-vmxnet3-test.c | 195 ++++++++++++++++++++++++++++++++ > MAINTAINERS | 1 + > tests/qtest/meson.build | 1 + > 4 files changed, 205 insertions(+) > create mode 100644 tests/qtest/fuzz-vmxnet3-test.c > > diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c > index 1f9aa59eca2..77e9729a7ba 100644 > --- a/hw/net/net_tx_pkt.c > +++ b/hw/net/net_tx_pkt.c > @@ -590,6 +590,14 @@ static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt *pkt, > fragment_len = net_tx_pkt_fetch_fragment(pkt, &src_idx, &src_offset, > fragment, &dst_idx); > > + if (dst_idx == NET_MAX_FRAG_SG_LIST && fragment_len > 0) { > + /* > + * The packet is too fragmented for our infrastructure > + * (not enough iovec), don't even try to send. > + */ > + return false; > + } > + > more_frags = (fragment_offset + fragment_len < pkt->payload_len); > > eth_setup_ip4_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base, > diff --git a/tests/qtest/fuzz-vmxnet3-test.c b/tests/qtest/fuzz-vmxnet3-test.c > new file mode 100644 > index 00000000000..d69009bf5ce > --- /dev/null > +++ b/tests/qtest/fuzz-vmxnet3-test.c > @@ -0,0 +1,195 @@ > +/* > + * QTest testcase for vmxnet3 device generated by fuzzer > + * > + * Copyright Red Hat > + * > + * SPDX-License-Identifier: GPL-2.0-or-later > + */ > + > +#include "qemu/osdep.h" > + > +#include "libqos/libqtest.h" > + > +/* > + * https://gitlab.com/qemu-project/qemu/-/issues/460 > + */ > +static void test_oss_35799_eth_setup_ip4_fragmentation(void) > +{ > + QTestState *s; > + > + s = qtest_init("-machine q35 -m 32M -display none -nodefaults " > + "-device vmxnet3,netdev=net0 -netdev user,id=net0"); > + qtest_outl(s, 0xcf8, 0x80000814); > + qtest_outl(s, 0xcfc, 0xe0000000); > + qtest_outl(s, 0xcf8, 0x80000804); > +
qtest_outw(s, 0xcfc, 0x06); > + qtest_outl(s, 0xcf8, 0x80000812); > + qtest_outl(s, 0xcfc, 0x2000); > + qtest_outl(s, 0xcf8, 0x80000815); > + qtest_outb(s, 0xcfc, 0x40); > + qtest_bufwrite(s, 0x0, "\xe1", 0x1); > + qtest_bufwrite(s, 0x1, "\xfe", 0x1); > + qtest_bufwrite(s, 0x2, "\xbe", 0x1); > + qtest_bufwrite(s, 0x3, "\xba", 0x1); > + qtest_bufwrite(s, 0x28, "\xff", 0x1); > + qtest_bufwrite(s, 0x29, "\xff", 0x1); > + qtest_bufwrite(s, 0x2a, "\xff", 0x1); > + qtest_bufwrite(s, 0x2b, "\xff", 0x1); > + qtest_bufwrite(s, 0x2c, "\xff", 0x1); > + qtest_bufwrite(s, 0x2d, "\xff", 0x1); > + qtest_bufwrite(s, 0x2e, "\xff", 0x1); > + qtest_bufwrite(s, 0x2f, "\xff", 0x1); > + qtest_bufwrite(s, 0x37, "\x40", 0x1); > + qtest_bufwrite(s, 0x3e, "\x01", 0x1); > + qtest_bufwrite(s, 0xe0004020, "\x00\x00\xfe\xca", 0x4); > + qtest_bufwrite(s, 0x9, "\x40", 0x1); > + qtest_bufwrite(s, 0xd, "\x10", 0x1); > + qtest_bufwrite(s, 0x12, "\x10", 0x1); > + qtest_bufwrite(s, 0x19, "\x40", 0x1); > + qtest_bufwrite(s, 0x1b, "\x21", 0x1); > + qtest_bufwrite(s, 0x1d, "\x0c", 0x1); > + qtest_bufwrite(s, 0x2d, "\x00", 0x1); > + qtest_bufwrite(s, 0x10000c, "\x08", 0x1); > + qtest_bufwrite(s, 0x10000e, "\x45", 0x1); > + qtest_bufwrite(s, 0x100017, "\x11", 0x1); > + qtest_bufwrite(s, 0x20000600, "\x00", 0x1); > + qtest_bufwrite(s, 0x38, "\x01", 0x1); > + qtest_bufwrite(s, 0x39, "\x40", 0x1); > + qtest_bufwrite(s, 0x48, "\x01", 0x1); > + qtest_bufwrite(s, 0x49, "\x40", 0x1); > + qtest_bufwrite(s, 0x58, "\x01", 0x1); > + qtest_bufwrite(s, 0x59, "\x40", 0x1); > + qtest_bufwrite(s, 0x68, "\x01", 0x1); > + qtest_bufwrite(s, 0x69, "\x40", 0x1); > + qtest_bufwrite(s, 0x78, "\x01", 0x1); > + qtest_bufwrite(s, 0x79, "\x40", 0x1); > + qtest_bufwrite(s, 0x88, "\x01", 0x1); > + qtest_bufwrite(s, 0x89, "\x40", 0x1); > + qtest_bufwrite(s, 0x98, "\x01", 0x1); > + qtest_bufwrite(s, 0x99, "\x40", 0x1); > + qtest_bufwrite(s, 0xa8, "\x01", 0x1); > + qtest_bufwrite(s, 0xa9, "\x40", 0x1); > + qtest_bufwrite(s, 0xb8, "\x01", 
0x1); > + qtest_bufwrite(s, 0xb9, "\x40", 0x1); > + qtest_bufwrite(s, 0xc8, "\x01", 0x1); > + qtest_bufwrite(s, 0xc9, "\x40", 0x1); > + qtest_bufwrite(s, 0xd8, "\x01", 0x1); > + qtest_bufwrite(s, 0xd9, "\x40", 0x1); > + qtest_bufwrite(s, 0xe8, "\x01", 0x1); > + qtest_bufwrite(s, 0xe9, "\x40", 0x1); > + qtest_bufwrite(s, 0xf8, "\x01", 0x1); > + qtest_bufwrite(s, 0xf9, "\x40", 0x1); > + qtest_bufwrite(s, 0x108, "\x01", 0x1); > + qtest_bufwrite(s, 0x109, "\x40", 0x1); > + qtest_bufwrite(s, 0x118, "\x01", 0x1); > + qtest_bufwrite(s, 0x119, "\x40", 0x1); > + qtest_bufwrite(s, 0x128, "\x01", 0x1); > + qtest_bufwrite(s, 0x129, "\x40", 0x1); > + qtest_bufwrite(s, 0x138, "\x01", 0x1); > + qtest_bufwrite(s, 0x139, "\x40", 0x1); > + qtest_bufwrite(s, 0x148, "\x01", 0x1); > + qtest_bufwrite(s, 0x149, "\x40", 0x1); > + qtest_bufwrite(s, 0x158, "\x01", 0x1); > + qtest_bufwrite(s, 0x159, "\x40", 0x1); > + qtest_bufwrite(s, 0x168, "\x01", 0x1); > + qtest_bufwrite(s, 0x169, "\x40", 0x1); > + qtest_bufwrite(s, 0x178, "\x01", 0x1); > + qtest_bufwrite(s, 0x179, "\x40", 0x1); > + qtest_bufwrite(s, 0x188, "\x01", 0x1); > + qtest_bufwrite(s, 0x189, "\x40", 0x1); > + qtest_bufwrite(s, 0x198, "\x01", 0x1); > + qtest_bufwrite(s, 0x199, "\x40", 0x1); > + qtest_bufwrite(s, 0x1a8, "\x01", 0x1); > + qtest_bufwrite(s, 0x1a9, "\x40", 0x1); > + qtest_bufwrite(s, 0x1b8, "\x01", 0x1); > + qtest_bufwrite(s, 0x1b9, "\x40", 0x1); > + qtest_bufwrite(s, 0x1c8, "\x01", 0x1); > + qtest_bufwrite(s, 0x1c9, "\x40", 0x1); > + qtest_bufwrite(s, 0x1d8, "\x01", 0x1); > + qtest_bufwrite(s, 0x1d9, "\x40", 0x1); > + qtest_bufwrite(s, 0x1e8, "\x01", 0x1); > + qtest_bufwrite(s, 0x1e9, "\x40", 0x1); > + qtest_bufwrite(s, 0x1f8, "\x01", 0x1); > + qtest_bufwrite(s, 0x1f9, "\x40", 0x1); > + qtest_bufwrite(s, 0x208, "\x01", 0x1); > + qtest_bufwrite(s, 0x209, "\x40", 0x1); > + qtest_bufwrite(s, 0x218, "\x01", 0x1); > + qtest_bufwrite(s, 0x219, "\x40", 0x1); > + qtest_bufwrite(s, 0x228, "\x01", 0x1); > + qtest_bufwrite(s, 
0x229, "\x40", 0x1); > + qtest_bufwrite(s, 0x238, "\x01", 0x1); > + qtest_bufwrite(s, 0x239, "\x40", 0x1); > + qtest_bufwrite(s, 0x248, "\x01", 0x1); > + qtest_bufwrite(s, 0x249, "\x40", 0x1); > + qtest_bufwrite(s, 0x258, "\x01", 0x1); > + qtest_bufwrite(s, 0x259, "\x40", 0x1); > + qtest_bufwrite(s, 0x268, "\x01", 0x1); > + qtest_bufwrite(s, 0x269, "\x40", 0x1); > + qtest_bufwrite(s, 0x278, "\x01", 0x1); > + qtest_bufwrite(s, 0x279, "\x40", 0x1); > + qtest_bufwrite(s, 0x288, "\x01", 0x1); > + qtest_bufwrite(s, 0x289, "\x40", 0x1); > + qtest_bufwrite(s, 0x298, "\x01", 0x1); > + qtest_bufwrite(s, 0x299, "\x40", 0x1); > + qtest_bufwrite(s, 0x2a8, "\x01", 0x1); > + qtest_bufwrite(s, 0x2a9, "\x40", 0x1); > + qtest_bufwrite(s, 0x2b8, "\x01", 0x1); > + qtest_bufwrite(s, 0x2b9, "\x40", 0x1); > + qtest_bufwrite(s, 0x2c8, "\x01", 0x1); > + qtest_bufwrite(s, 0x2c9, "\x40", 0x1); > + qtest_bufwrite(s, 0x2d8, "\x01", 0x1); > + qtest_bufwrite(s, 0x2d9, "\x40", 0x1); > + qtest_bufwrite(s, 0x2e8, "\x01", 0x1); > + qtest_bufwrite(s, 0x2e9, "\x40", 0x1); > + qtest_bufwrite(s, 0x2f8, "\x01", 0x1); > + qtest_bufwrite(s, 0x2f9, "\x40", 0x1); > + qtest_bufwrite(s, 0x308, "\x01", 0x1); > + qtest_bufwrite(s, 0x309, "\x40", 0x1); > + qtest_bufwrite(s, 0x318, "\x01", 0x1); > + qtest_bufwrite(s, 0x319, "\x40", 0x1); > + qtest_bufwrite(s, 0x328, "\x01", 0x1); > + qtest_bufwrite(s, 0x329, "\x40", 0x1); > + qtest_bufwrite(s, 0x338, "\x01", 0x1); > + qtest_bufwrite(s, 0x339, "\x40", 0x1); > + qtest_bufwrite(s, 0x348, "\x01", 0x1); > + qtest_bufwrite(s, 0x349, "\x40", 0x1); > + qtest_bufwrite(s, 0x358, "\x01", 0x1); > + qtest_bufwrite(s, 0x359, "\x40", 0x1); > + qtest_bufwrite(s, 0x368, "\x01", 0x1); > + qtest_bufwrite(s, 0x369, "\x40", 0x1); > + qtest_bufwrite(s, 0x378, "\x01", 0x1); > + qtest_bufwrite(s, 0x379, "\x40", 0x1); > + qtest_bufwrite(s, 0x388, "\x01", 0x1); > + qtest_bufwrite(s, 0x389, "\x40", 0x1); > + qtest_bufwrite(s, 0x398, "\x01", 0x1); > + qtest_bufwrite(s, 0x399, "\x40", 0x1); 
> + qtest_bufwrite(s, 0x3a8, "\x01", 0x1); > + qtest_bufwrite(s, 0x3a9, "\x40", 0x1); > + qtest_bufwrite(s, 0x3b8, "\x01", 0x1); > + qtest_bufwrite(s, 0x3b9, "\x40", 0x1); > + qtest_bufwrite(s, 0x3c8, "\x01", 0x1); > + qtest_bufwrite(s, 0x3c9, "\x40", 0x1); > + qtest_bufwrite(s, 0x3d8, "\x01", 0x1); > + qtest_bufwrite(s, 0x3d9, "\x40", 0x1); > + qtest_bufwrite(s, 0x3e8, "\x01", 0x1); > + qtest_bufwrite(s, 0x3e9, "\x40", 0x1); > + qtest_bufwrite(s, 0x3f8, "\x01", 0x1); > + qtest_bufwrite(s, 0x3f9, "\x40", 0x1); > + qtest_bufwrite(s, 0xd, "\x10", 0x1); > + qtest_bufwrite(s, 0x20000600, "\x00", 0x1); > + qtest_quit(s); > +} > + > +int main(int argc, char **argv) > +{ > + const char *arch = qtest_get_arch(); > + > + g_test_init(&argc, &argv, NULL); > + > + if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) { > + qtest_add_func("fuzz/test_oss_35799_eth_setup_ip4_fragmentation", > + test_oss_35799_eth_setup_ip4_fragmentation); > + } > + > + return g_test_run(); > +} > diff --git a/MAINTAINERS b/MAINTAINERS > index cb8f3ea2c2e..43e5050ad96 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -2001,6 +2001,7 @@ S: Maintained > F: hw/net/vmxnet* > F: hw/scsi/vmw_pvscsi* > F: tests/qtest/vmxnet3-test.c > +F: tests/qtest/fuzz-vmxnet3-test.c > > Rocker > M: Jiri Pirko > diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build > index b03e8541700..42add92e9d4 100644 > --- a/tests/qtest/meson.build > +++ b/tests/qtest/meson.build > @@ -66,6 +66,7 @@ > (config_all_devices.has_key('CONFIG_TPM_TIS_ISA') ? ['tpm-tis-swtpm-test'] : []) + \ > (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) + \ > (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) + \ > + (config_all_devices.has_key('CONFIG_VMXNET3_PCI') ? ['fuzz-vmxnet3-test'] : []) + \ > (config_all_devices.has_key('CONFIG_ESP_PCI') ?
['am53c974-test'] : []) + \ > qtests_pci + \ > ['fdc-test', > -- > 2.31.1 > -- Mauro Matteo Cascella Red Hat Product Security PGP-Key ID: BB3410B0 From MAILER-DAEMON Tue Jul 06 05:09:32 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m0h5A-0006i6-Ak for mharc-qemu-stable@gnu.org; Tue, 06 Jul 2021 05:09:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59164) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m0h58-0006hL-Hw for qemu-stable@nongnu.org; Tue, 06 Jul 2021 05:09:30 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:55475) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m0h56-0007SN-Lp for qemu-stable@nongnu.org; Tue, 06 Jul 2021 05:09:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625562567; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fjy30UfYw3/PLJcAqUGcL2rr+4JuDgRykdAA3M2k8vs=; b=GI8pnMCmLSTmGKHSTvzXRfPC4RvFEdAyNY1z843TiA7/WNNy+gwTxEiIBnO3eZNz50A9Ry /JmKsCJGpnTKl05YtLqzjgcFjM3xseirE1GJfdrduE2QKVKtFGeJ8LGhqGgTu3cOYId0UV /5XEUlo7Jv3jXJL/C05W8VT72Iji2so= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-539-TT5jqhg1MxOwNmNtiBvm6A-1; Tue, 06 Jul 2021 05:09:24 -0400 X-MC-Unique: TT5jqhg1MxOwNmNtiBvm6A-1 Received: by mail-wm1-f69.google.com with SMTP id 13-20020a1c010d0000b02901eca51685daso871204wmb.3 for ; Tue, 06 Jul 2021 02:09:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language
:content-transfer-encoding; bh=fjy30UfYw3/PLJcAqUGcL2rr+4JuDgRykdAA3M2k8vs=; b=V6mHN4MnOVEC+P26syQocX7f707f2mKoVQROZJQ1Hg28sZ4oosoTLyhdizuznGHsfQ 7FsODiz2P+mGW7vyHKk1a8+DQaF2aamgDkGdJJKBSh93r91pfVs6TTuyF202UWZYu9Wh yG9MmWI0aXAhYUcwm3ZU4pqjU1phRU5jDX/RErutLUIPhw1mriXkVnjg/E+hNM6q9ZmF UY6kOdgxG01bmSs2RM8syNZFUhjy9bHTTjKip3gBWU5HAA0oid4IWkjvMLpSc0jBtli4 HvP/VqSl2MWuLnnBvPW8Wth+Y5LWHBOVuBPxDFnHJtIPENm7QDG9RcwW9oHavDzZT8IO xgaw== X-Gm-Message-State: AOAM530ETR/NXjOVNMtKMr2jfg3rY1nMOvmAted63u4uWI45m8UBlI8d s0hASJrBhMj8qTqWyOkXx1Z/KxG6igjOhoU5EnAGbaeLjfUxgZrT5486Dq/qBcoV3szmHfJT/X7 ARC8hHAmNuzkp/eJFTF9RjdnQtQ4bvv/TeRcKy8A0oXam2VgvIrqV7/82Tbo0tCgOjg== X-Received: by 2002:a05:600c:4fd0:: with SMTP id o16mr3586265wmq.179.1625562563420; Tue, 06 Jul 2021 02:09:23 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwaNgSwRW/NXH3Rb0D7da601vrNnO9cstsFbLH3KRuLxFfcqJMh/MSUYph/KHp9MPtRGPQFHA== X-Received: by 2002:a05:600c:4fd0:: with SMTP id o16mr3586236wmq.179.1625562563239; Tue, 06 Jul 2021 02:09:23 -0700 (PDT) Received: from [192.168.1.36] (93.red-83-35-24.dynamicip.rima-tde.net. 
[83.35.24.93]) by smtp.gmail.com with ESMTPSA id r13sm5558844wrt.38.2021.07.06.02.09.22 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 06 Jul 2021 02:09:22 -0700 (PDT) Subject: Re: [PATCH] hw/net: Discard overly fragmented packets To: Mauro Matteo Cascella Cc: Andrew Melnychenko , Dmitry Fleytman , Jason Wang , Li Qiang , QEMU Developers , Prasad J Pandit , Alexander Bulekov , Thomas Huth , qemu-stable@nongnu.org References: <20210705084011.814175-1-philmd@redhat.com> From: =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= Message-ID: <79893227-6b00-c5f8-f0f2-4c7d554403a3@redhat.com> Date: Tue, 6 Jul 2021 11:09:22 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=philmd@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=216.205.24.124; envelope-from=philmd@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.442, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Jul 2021 09:09:30 -0000 Hi Mauro, On 7/6/21 11:00 AM, Mauro Matteo Cascella wrote: > Hello Philippe, > > I think you don't need root privileges to craft such a highly > fragmented packet from within the guest (tools like hping3 or nmap > come to mind). Right? 
If so, we may consider allocating a CVE for this > bug. If not, this is not CVE worthy - root does not need an assertion > failure to cause damage to the system. Thanks for worrying about CVE. I have no clue, so I'll defer that question to Andrew, Dmitry and Jason. Regards, Phil. > On Mon, Jul 5, 2021 at 10:40 AM Philippe Mathieu-Daudé > wrote: >> >> Our infrastructure can handle fragmented packets up to >> NET_MAX_FRAG_SG_LIST (64) pieces. This hard limit has >> been proven enough in production for years. If it is >> reached, it is likely an evil crafted packet. Discard it. >> >> Include the qtest reproducer provided by Alexander Bulekov: >> >> $ make check-qtest-i386 >> ... >> Running test qtest-i386/fuzz-vmxnet3-test >> qemu-system-i386: net/eth.c:334: void eth_setup_ip4_fragmentation(const void *, size_t, void *, size_t, size_t, size_t, _Bool): >> Assertion `frag_offset % IP_FRAG_UNIT_SIZE == 0' failed. >> >> Cc: qemu-stable@nongnu.org >> Reported-by: OSS-Fuzz (Issue 35799) >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/460 >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> hw/net/net_tx_pkt.c | 8 ++ >> tests/qtest/fuzz-vmxnet3-test.c | 195 ++++++++++++++++++++++++++++++++ >> MAINTAINERS | 1 + >> tests/qtest/meson.build | 1 + >> 4 files changed, 205 insertions(+) >> create mode 100644 tests/qtest/fuzz-vmxnet3-test.c >> >> diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c >> index 1f9aa59eca2..77e9729a7ba 100644 >> --- a/hw/net/net_tx_pkt.c >> +++ b/hw/net/net_tx_pkt.c >> @@ -590,6 +590,14 @@ static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt *pkt, >> fragment_len = net_tx_pkt_fetch_fragment(pkt, &src_idx, &src_offset, >> fragment, &dst_idx); >> >> + if (dst_idx == NET_MAX_FRAG_SG_LIST && fragment_len > 0) { >> + /* >> + * The packet is too fragmented for our infrastructure >> + * (not enough iovec), don't even try to send. 
>> + */ >> + return false; >> + } >> + >> more_frags = (fragment_offset + fragment_len < pkt->payload_len); >> >> eth_setup_ip4_fragmentation(l2_iov_base, l2_iov_len, l3_iov_base, From MAILER-DAEMON Tue Jul 06 11:32:56 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m0n4A-00041k-T3 for mharc-qemu-stable@gnu.org; Tue, 06 Jul 2021 11:32:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42418) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m0n46-0003y5-1w for qemu-stable@nongnu.org; Tue, 06 Jul 2021 11:32:50 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:22089) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m0n40-0000hf-Tv for qemu-stable@nongnu.org; Tue, 06 Jul 2021 11:32:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625585563; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=np8xZzxDnAX3VrRbNHw3zcygaf5maF4bMo4i/36g2QA=; b=RTEp3KGUJjthngDbpMV/D8S96Je3cKJQvgPHDUrstcRr4pAoqbJzk3JdV/EfwB71yK0zGy hhknEMeVGQ7Hbcp0qQBRSD+HwOj+3Qlh49wwgZtuXWRz60V1oZT/asbf4fUgZgHEDKHXIN Wk9vHxDVoSV4qp0mu1p74lMGuwT9zLc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-149-kddMEV0wPAKx-bVOkjh-yw-1; Tue, 06 Jul 2021 11:32:41 -0400 X-MC-Unique: kddMEV0wPAKx-bVOkjh-yw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3B7DC19200C2; Tue, 6 Jul 2021 15:32:40 +0000 (UTC) Received: from blackfin.pond.sub.org (ovpn-112-169.ams2.redhat.com [10.36.112.169]) by 
smtp.corp.redhat.com (Postfix) with ESMTPS id 0C93560583; Tue, 6 Jul 2021 15:32:39 +0000 (UTC) Received: by blackfin.pond.sub.org (Postfix, from userid 1000) id 724B21132B52; Tue, 6 Jul 2021 17:32:38 +0200 (CEST) From: Markus Armbruster To: Vladimir Sementsov-Ogievskiy Cc: Kevin Wolf , qemu-devel@nongnu.org, armbru@redhat.com, qemu-stable@nongnu.org Subject: Re: [PATCH 0/2] monitor: Shutdown fixes References: <20210212172028.288825-1-kwolf@redhat.com> <5249a722-453c-1bf7-fa0e-4210405546d6@virtuozzo.com> Date: Tue, 06 Jul 2021 17:32:38 +0200 In-Reply-To: <5249a722-453c-1bf7-fa0e-4210405546d6@virtuozzo.com> (Vladimir Sementsov-Ogievskiy's message of "Tue, 29 Jun 2021 12:14:47 +0300") Message-ID: <871r8blap5.fsf@dusky.pond.sub.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=armbru@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain Received-SPF: pass client-ip=216.205.24.124; envelope-from=armbru@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.442, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Jul 2021 15:32:50 -0000 Vladimir Sementsov-Ogievskiy writes: > 12.02.2021 20:20, Kevin Wolf wrote: >> This fixes the bug(s) that Markus reported here: >> https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg07719.html >> Kevin Wolf (2): >> monitor: Fix 
assertion failure on shutdown >> monitor/qmp: Stop processing requests when shutdown is requested >> monitor/monitor.c | 25 +++++++++++++++---------- >> monitor/qmp.c | 5 +++++ >> 2 files changed, 20 insertions(+), 10 deletions(-) >> > > > Hi! > > Now we faced this bug after rebasing Virtuozzo qemu package onto qemu-kvm-5.2.0-16.module+el8.4.0+10806+b7d97207.src.rpm > > So, I'm backporting these patches. > > This probably should go to stable and/or to further Rhel packages. Cc'ing qemu-stable, so it gets considered. From MAILER-DAEMON Wed Jul 07 10:07:23 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m18Cx-0000CQ-AD for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 10:07:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51112) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m18Cv-00009I-MW for qemu-stable@nongnu.org; Wed, 07 Jul 2021 10:07:21 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:47765) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m18Ct-0000Fm-SY for qemu-stable@nongnu.org; Wed, 07 Jul 2021 10:07:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625666839; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Tx8QqxrbBUhqMjhtl/HmEzuJlcz5ViTm1Q1Mnp4miBc=; b=i9ThPC9O9UtMCwL7z4o/AdUfAc+UEYukm0KGXYN45B1hXdAT1ctaO9TAq+tOqUaVTHtVvq 9TFV90A1XxYI2Vk2NZCN6HSOT/LduAEJht03lOqQaThE1s2v9DOLca+hBrz/bz1N0siyJ1 lXlMvKNzSkcvRhlXTyh2rI8ArQxf5zs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-188-9RpO9ZLLPn-Yh3zxJpQqLA-1; Wed, 07 Jul 2021 10:07:18 -0400 X-MC-Unique: 
9RpO9ZLLPn-Yh3zxJpQqLA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3508E1023F41; Wed, 7 Jul 2021 14:07:17 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-110.ams2.redhat.com [10.36.114.110]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9A4C95D9FC; Wed, 7 Jul 2021 14:07:14 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Alexander Duyck , Juan Quintela , "Dr. David Alan Gilbert" , Peter Xu Subject: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Date: Wed, 7 Jul 2021 16:06:55 +0200 Message-Id: <20210707140655.30982-3-david@redhat.com> In-Reply-To: <20210707140655.30982-1-david@redhat.com> References: <20210707140655.30982-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Wed, 07 Jul 2021 14:07:21 -0000 Postcopy never worked properly with 'free-page-hint=on', as there are at least two issues: 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE and consequently won't release free pages back to the OS once migration finishes. The issue is that for postcopy, we won't do a final bitmap sync while the guest is stopped on the source and virtio_balloon_free_page_hint_notify() will only call virtio_balloon_free_page_done() on the source during PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to the destination. 2) Once the VM touches a page on the destination that has been excluded from migration on the source via qemu_guest_free_page_hint() while postcopy is active, that thread will stall until postcopy finishes and all threads are woken up. (with older Linux kernels that won't retry faults when woken up via userfaultfd, we might actually get a SEGFAULT) The issue is that the source will refuse to migrate any pages that are not marked as dirty in the dirty bmap -- for example, because the page might just have been sent. Consequently, the faulting thread will stall, waiting for the page to be migrated -- which could take quite a while and result in guest OS issues. While we could fix 1), for example, by calling virtio_balloon_free_page_done() via pre_save callbacks of the vmstate, 2) is mostly impossible to fix without additional tracking, such that we can actually identify these hinted pages and handle them accordingly. As it never worked properly, let's disable it via the postcopy notifier on the destination. Trying to set "migrate_set_capability postcopy-ram on" on the destination now results in "virtio-balloon: 'free-page-hint' does not support postcopy Error: Postcopy is not supported". Note 1: We could let qemu_guest_free_page_hint() mark postcopy as broken once actually clearing bits on the source. 
However, it's harder to realize as we can race with users starting postcopy and we cannot produce an expressive error message easily. Note 2: virtio-mem has similar issues, however, access to "unplugged" memory by the guest is very rare and we would have to be very lucky for it to happen during migration. The spec states "The driver SHOULD NOT read from unplugged memory blocks ..." and "The driver MUST NOT write to unplugged memory blocks". virtio-mem will move away from virtio_balloon_free_page_done() soon and handle this case explicitly on the destination. Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Cc: qemu-stable@nongnu.org Cc: Wei Wang Cc: Michael S. Tsirkin Cc: Philippe Mathieu-Daudé Cc: Alexander Duyck Cc: Juan Quintela Cc: "Dr. David Alan Gilbert" Cc: Peter Xu Signed-off-by: David Hildenbrand --- hw/virtio/virtio-balloon.c | 26 ++++++++++++++++++++++++++ include/hw/virtio/virtio-balloon.h | 1 + 2 files changed, 27 insertions(+) diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c index 4b5d9e5e50..d0c9dc677c 100644 --- a/hw/virtio/virtio-balloon.c +++ b/hw/virtio/virtio-balloon.c @@ -30,6 +30,7 @@ #include "trace.h" #include "qemu/error-report.h" #include "migration/misc.h" +#include "migration/postcopy-ram.h" #include "hw/virtio/virtio-bus.h" #include "hw/virtio/virtio-access.h" @@ -692,6 +693,28 @@ virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data) return 0; } + +static int virtio_balloon_postcopy_notify(NotifierWithReturn *n, void *opaque) +{ + VirtIOBalloon *dev = container_of(n, VirtIOBalloon, postcopy_notifier); + PostcopyNotifyData *pnd = opaque; + + /* We register the notifier only with 'free-page-hint=on' for now. 
*/ + g_assert(virtio_has_feature(dev->host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT)); + + /* + * Pages hinted via qemu_guest_free_page_hint() are cleared from the dirty + * bitmap and will not get migrated, especially also not when the postcopy + * destination starts using them and requests migration from the source; the + * faulting thread will stall until postcopy migration finishes and + * all threads are woken up. + */ + error_setg(pnd->errp, + "virtio-balloon: 'free-page-hint' does not support postcopy"); + return -ENOENT; +} + static size_t virtio_balloon_config_size(VirtIOBalloon *s) { uint64_t features = s->host_features; @@ -911,6 +934,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, virtio_balloon_handle_free_page_vq); precopy_add_notifier(&s->free_page_hint_notify); + postcopy_add_notifier(&s->postcopy_notifier); object_ref(OBJECT(s->iothread)); s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread), @@ -935,6 +959,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev) object_unref(OBJECT(s->iothread)); virtio_balloon_free_page_stop(s); precopy_remove_notifier(&s->free_page_hint_notify); + postcopy_remove_notifier(&s->postcopy_notifier); } balloon_stats_destroy_timer(s); qemu_remove_balloon_handler(s); @@ -1008,6 +1033,7 @@ static void virtio_balloon_instance_init(Object *obj) qemu_cond_init(&s->free_page_cond); s->free_page_hint_cmd_id = VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN; s->free_page_hint_notify.notify = virtio_balloon_free_page_hint_notify; + s->postcopy_notifier.notify = virtio_balloon_postcopy_notify; object_property_add(obj, "guest-stats", "guest statistics", balloon_stats_get_all, NULL, NULL, s); diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index 5139cf8ab6..d0d5b793b9 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -65,6 +65,7 @@ struct 
VirtIOBalloon { */ bool block_iothread; NotifierWithReturn free_page_hint_notify; + NotifierWithReturn postcopy_notifier; int64_t stats_last_update; int64_t stats_poll_interval; uint32_t host_features; -- 2.31.1 From MAILER-DAEMON Wed Jul 07 11:03:14 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m194z-0000Nt-SX for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 11:03:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:36512) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m194z-0000LM-9z for qemu-stable@nongnu.org; Wed, 07 Jul 2021 11:03:13 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:55435) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m194x-0006Zp-KM for qemu-stable@nongnu.org; Wed, 07 Jul 2021 11:03:13 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625670191; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tpd63PvLoqwccvwknyW++wOSUAmq/kcYqjMFvtaZ5Qc=; b=iRPAipiOcCt3uQBEN5pRH7oCbUKrUd2gi6xkOjftj9V+ekaO8VR58lHuLjP8naVAhmsSiW uciK2UO+3GiVozkBCdTvYLmd3Fun/nzvlVWQiKsAEqL1aG0ivjN5cGLRhv9z624tXnlRF8 AqCssPCcepu/xvv3tY/N4AtDxvVz+nc= Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-495-K6yP8lo3P5Sjyq8VFmM5Ow-1; Wed, 07 Jul 2021 11:03:07 -0400 X-MC-Unique: K6yP8lo3P5Sjyq8VFmM5Ow-1 Received: by mail-wr1-f70.google.com with SMTP id j2-20020a0560001242b029012c82df4dbbso860850wrx.23 for ; Wed, 07 Jul 2021 08:03:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references 
:mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=tpd63PvLoqwccvwknyW++wOSUAmq/kcYqjMFvtaZ5Qc=; b=WytXpv3UH13F6GhEU/7Euna206gT63XDHhEueRui2WZSZq8gZ44rmEKKWLMYerrHHA Ha/R1t+mdTKzds/eAIMUjXUv0jAr8ltKkrVA7evOJOPVPDj3x8cgtogddeNlKUmtuemt pap8o9ThcNdz4XUR5N2cLo2jd5MLnnjZDFqnVIUcS1TUCg3Z33dGlYGe4K8LkS9mUKW5 v54z4l5RyxezTCf61KrJFmnDCsxMMbwcG/ycp9P/EFjOz28xZXCr7LxwB7DBWNT5FUcr qJYIL8KFVuSJSF70rRNiTKuiCeT9WZvheFSYmMd4miYS9Mlm7EGiZXeCC/rP0nA4acVW 6svQ== X-Gm-Message-State: AOAM530sskb2yK9kB7+tWDJ4G51cr1Uw/r0Vs3fDQ/YmspDLn+vnXWCh vXge8CMdinV4HuHdZwC50gw7LS9fYDJSdezYxG4H9kg3lgm2feawVSMlwUlQwQ6UFDuSjNKhmSX sprwImDYTHA9wJXnc X-Received: by 2002:a7b:ce82:: with SMTP id q2mr81997wmj.60.1625670186575; Wed, 07 Jul 2021 08:03:06 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy2Y+fayBkFoiRPjaelqhejk/9LLCyIsrDCL11/9ePdCyoyK1H2vC6jZTzp89glMt30NxuX7w== X-Received: by 2002:a7b:ce82:: with SMTP id q2mr81973wmj.60.1625670186415; Wed, 07 Jul 2021 08:03:06 -0700 (PDT) Received: from redhat.com ([2.55.150.102]) by smtp.gmail.com with ESMTPSA id b12sm16490808wrx.60.2021.07.07.08.03.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Jul 2021 08:03:05 -0700 (PDT) Date: Wed, 7 Jul 2021 11:03:03 -0400 From: "Michael S. 
Tsirkin" To: qemu-devel@nongnu.org Cc: Peter Maydell , Philippe =?utf-8?Q?Mathieu-Daud=C3=A9?= , qemu-stable@nongnu.org, Alexander Bulekov , Richard Henderson , Marcel Apfelbaum Subject: [PULL 03/13] hw/pci-host/q35: Ignore write of reserved PCIEXBAR LENGTH field Message-ID: <20210707150157.52328-4-mst@redhat.com> References: <20210707150157.52328-1-mst@redhat.com> MIME-Version: 1.0 In-Reply-To: <20210707150157.52328-1-mst@redhat.com> X-Mailer: git-send-email 2.27.0.106.g8ac3dc51b1 X-Mutt-Fcc: =sent Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=mst@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=216.205.24.124; envelope-from=mst@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 15:03:13 -0000 From: Philippe Mathieu-Daudé libFuzzer triggered the following assertion: cat << EOF | qemu-system-i386 -M pc-q35-5.0 \ -nographic -monitor none -serial none \ -qtest stdio -d guest_errors -trace pci\* outl 0xcf8 0xf2000060 outl 0xcfc 0x8400056e EOF pci_cfg_write mch 00:0 @0x60 <- 0x8400056e Aborted (core dumped) This is because guest wrote MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD (reserved value) to the PCIE XBAR register. There is no indication on the datasheet about what occurs when this value is written. 
Simply ignore it on QEMU (and report a guest error):

  pci_cfg_write mch 00:0 @0x60 <- 0x8400056e
  Q35: Reserved PCIEXBAR LENGTH
  pci_cfg_read mch 00:0 @0x0 -> 0x8086
  pci_cfg_read mch 00:0 @0x0 -> 0x29c08086
  ...

Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov
BugLink: https://bugs.launchpad.net/qemu/+bug/1878641
Fixes: df2d8b3ed4 ("q35: Introduce q35 pc based chipset emulator")
Reviewed-by: Richard Henderson
Signed-off-by: Philippe Mathieu-Daudé
Message-Id: <20210526142438.281477-1-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Alexander Bulekov
Reviewed-by: Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin
---
 hw/pci-host/q35.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 2eb729dff5..0f37cf056a 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -29,6 +29,7 @@
  */

 #include "qemu/osdep.h"
+#include "qemu/log.h"
 #include "hw/i386/pc.h"
 #include "hw/pci-host/q35.h"
 #include "hw/qdev-properties.h"
@@ -318,6 +319,8 @@ static void mch_update_pciexbar(MCHPCIState *mch)
         addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK;
         break;
     case MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD:
+        qemu_log_mask(LOG_GUEST_ERROR, "Q35: Reserved PCIEXBAR LENGTH\n");
+        return;
     default:
         abort();
     }
--
MST

From MAILER-DAEMON Wed Jul 07 14:03:03 2021
Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1Bt1-0001Wr-Os for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 14:03:03 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:54668) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1Bt0-0001W1-8l for qemu-stable@nongnu.org; Wed, 07 Jul 2021 14:03:02 -0400
Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:35046) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1Bsx-00039r-7O for qemu-stable@nongnu.org; Wed, 07 Jul 2021 14:03:01 -0400
DKIM-Signature:
v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625680977; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=LI/6/3y/Mt3s6myhT3lK9jiaYna4wIDZkPVfJaQ0UPQ=; b=HTDDLfgZ+kioa0PLcTEQbj4J3BcxbUfp9WeR6r9m468XyOROx8ohXIb9HDz6DQEfmbpVwf xYS16WnUEDE2eV/T1qZv/yZC11fCBf3ndupwVUqjUsPZLBNaZ20xD5NGoKjZYUxdH3+HO6 L3UZomOfw+IuGOkynBR1IqHdXQ6gVb4= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-120-P0TYqx8lP3ia5cXP1sRgNg-1; Wed, 07 Jul 2021 14:02:55 -0400 X-MC-Unique: P0TYqx8lP3ia5cXP1sRgNg-1 Received: by mail-qt1-f199.google.com with SMTP id c17-20020ac87dd10000b0290250fd339409so1769371qte.6 for ; Wed, 07 Jul 2021 11:02:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=LI/6/3y/Mt3s6myhT3lK9jiaYna4wIDZkPVfJaQ0UPQ=; b=V4MS0wecJM2Er7UgTDb5BxEUvtS4/lbrkLra0RiTMkcDlr5T+c0yS/jSTWUSm0eyU3 Dn8Z8tY1t9idewR0x3BRx8Bc31BAcBg7Dh7E5ypSphlBjpoik0tXVROlPs1q3nBJavp7 0O/IROBDqd63oJaf3yDG+6Jrhg75ESIbapbM5VEicmnc2+7iBoYl6FVQDs0J6ZNhmbvJ 4FoGzTA5p1NHmzICu3ZG/vBQDgDCsi5jnkS4l+heN9ntzgrkEGURGOUOxKvMHHyVh0A/ 8eXTDKhJdOXOPx/E8MqDIKJfmQijmjgx+KtJ9yytEtXadZg+5EbORmKAvv+qOK2+mQYg Sfyw== X-Gm-Message-State: AOAM531Zosi9Q8/6oqujtlvI766rQ25U/IoNQUcvRYyil6nFwrBPb6+r RjKiGC8hfdDP+0sT+Ak52ckQkFP+AuLwoIlOMiiLqtT1txXbbTQdIdmxkzAdjhqX4gJP8FUpUOI RML7u1h20hHcgtTl6 X-Received: by 2002:a37:a4b:: with SMTP id 72mr14967074qkk.139.1625680974918; Wed, 07 Jul 2021 11:02:54 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzynuhIWDKoNj6iWvuadi7vNAYjLGQnhdFWKLmuVbkd+xZBtLzvR0vRdV/p5HNtWIwt5b84gA== X-Received: by 2002:a37:a4b:: with SMTP id 72mr14967047qkk.139.1625680974730; Wed, 07 Jul 2021 11:02:54 -0700 (PDT) Received: 
from t490s (bras-base-toroon474qw-grc-65-184-144-111-238.dsl.bell.ca. [184.144.111.238]) by smtp.gmail.com with ESMTPSA id i2sm5478341qko.43.2021.07.07.11.02.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Jul 2021 11:02:54 -0700 (PDT) Date: Wed, 7 Jul 2021 14:02:53 -0400 From: Peter Xu To: David Hildenbrand Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , Philippe =?utf-8?Q?Mathieu-Daud=C3=A9?= , Alexander Duyck , Juan Quintela , "Dr. David Alan Gilbert" Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> MIME-Version: 1.0 In-Reply-To: <20210707140655.30982-3-david@redhat.com> Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Received-SPF: pass client-ip=170.10.133.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 18:03:02 -0000 On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > As it never worked properly, let's disable it via the postcopy notifier on > the destination. 
Trying to set "migrate_set_capability postcopy-ram on"
> on the destination now results in "virtio-balloon: 'free-page-hint' does
> not support postcopy Error: Postcopy is not supported".

Would it be possible to do this in reversed order? Say, dynamically disable free-page-hinting if postcopy capability is set when migration starts? Perhaps it can also be re-enabled automatically when migration completes?

I see postcopy as a "functional" feature and free-page-hint as a "performance" feature; from that pov IMHO it's better to not block function for performance.

Thanks,

--
Peter Xu

From MAILER-DAEMON Wed Jul 07 14:58:04 2021
Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1CkG-0003TM-KV for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 14:58:04 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:37800) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1CkE-0003QM-DJ for qemu-stable@nongnu.org; Wed, 07 Jul 2021 14:58:02 -0400
Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:24809) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1CkB-0002c0-DU for qemu-stable@nongnu.org; Wed, 07 Jul 2021 14:58:01 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625684278;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=jglq8+VyyWzrkiJbZUJo6FJu+PaPASGqs1wk9nC/YZw=;
 b=AQucnC1ip9C77R0A6PDZkae52nyn5eP2VswZIgcnN0jLe+dWkeG9nrYlKUDjPre3Q3XqKe
 Ltn36ChAt+Uodzwnc5d5UIDjxRZ93UCV9bNrq32gPmMt2ViCgrFpEQhlj6VXhrB1W5ona/
 c66AEYr1EYGXD9AE+8tS0EnfXhCSxNU=
Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-220-bL5NVF3YOsSeQ7fwKFKOgQ-1; Wed,
07 Jul 2021 14:57:57 -0400 X-MC-Unique: bL5NVF3YOsSeQ7fwKFKOgQ-1 Received: by mail-wm1-f70.google.com with SMTP id n37-20020a05600c3ba5b02901fe49ba3bd0so1364524wms.1 for ; Wed, 07 Jul 2021 11:57:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:cc:references:from:organization:subject :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=jglq8+VyyWzrkiJbZUJo6FJu+PaPASGqs1wk9nC/YZw=; b=MWsEHnaSXds1EMA+XXxvhf8knB+OgNXkWM4CGpi76JwspHz94vb9AVCyJxz/gjMUlM HQTlbb9moGFF5clywwRrGSLVjWMY+iIcrr1CEXrSIEBiNKSpwpug44pj29wvQsneLxeX MEtCLFi3BjqjobbMh1OoBSbSj69V0Dk43Q22K6YeiS59EiIHyzoApBpLmuCruh1LfziR ayIaBAINFBAvr01QW9M4GwSZGH73egh5hdJaTN7azGT3zZHO07a/RjjNYZh6C8ZtIJ65 yQP/J/ULRL/ouR4hdlHRhhw7QRhpLsR1IFaxXbOjeAWbWGNo8skDxH4l5BF3axdfPZzE PKAA== X-Gm-Message-State: AOAM531adzb56kStcmP3XQmKm/YTd0l8Sg9HqYiAQ8IU6R3fYLoR6Dtz GYRC/4W04nwQL2z614fcEsW17ljDNcy8kotjW9fIwyM/XWM5ACA4qnqKbckjokzB2nNHeoGcp5P SGZeu536qAtoWswVk X-Received: by 2002:a1c:e91a:: with SMTP id q26mr585818wmc.170.1625684276106; Wed, 07 Jul 2021 11:57:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzRKnWmMwVwkXSP6CM6s96NUxglgPjXOkX+58D/frmPWdorm8Deo2b5GyvTaGzg0bbNAnyxaw== X-Received: by 2002:a1c:e91a:: with SMTP id q26mr585797wmc.170.1625684275918; Wed, 07 Jul 2021 11:57:55 -0700 (PDT) Received: from [192.168.3.132] (p4ff23579.dip0.t-ipconnect.de. [79.242.53.121]) by smtp.gmail.com with ESMTPSA id b8sm7781452wmb.20.2021.07.07.11.57.34 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 07 Jul 2021 11:57:55 -0700 (PDT) To: Peter Xu Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= , Alexander Duyck , Juan Quintela , "Dr. 
David Alan Gilbert" References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com> Date: Wed, 7 Jul 2021 20:57:29 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 18:58:02 -0000 On 07.07.21 20:02, Peter Xu wrote: > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: >> As it never worked properly, let's disable it via the postcopy notifier on >> the destination. Trying to set "migrate_set_capability postcopy-ram on" >> on the destination now results in "virtio-balloon: 'free-page-hint' does >> not support postcopy Error: Postcopy is not supported". > > Would it be possible to do this in reversed order? 
Say, dynamically disable
> free-page-hinting if postcopy capability is set when migration starts? Perhaps
> it can also be re-enabled automatically when migration completes?

I remember that this might be quite racy. We would have to make sure that no hinting happens before we enable the capability.

As soon as we have messed with the dirty bitmap (during precopy), postcopy is no longer safe. As noted in the patch, the only runtime alternative is to disable postcopy as soon as we actually do clear a bit. Alternatively, we could ignore any hints if the postcopy capability was enabled.

Whatever we do, we have to make sure that a user cannot trick the system into an inconsistent state, like enabling hinting, starting migration, then enabling the postcopy capability and kicking off postcopy. I did not check if we allow for that, though.

--
Thanks,

David / dhildenb

From MAILER-DAEMON Wed Jul 07 15:05:54 2021
Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1Crq-0007Zf-8g for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 15:05:54 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:40036) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1Cro-0007ZK-Jb for qemu-stable@nongnu.org; Wed, 07 Jul 2021 15:05:52 -0400
Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:46938) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1Crm-0003uP-6P for qemu-stable@nongnu.org; Wed, 07 Jul 2021 15:05:52 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625684749;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=jEbKHqukhfgnoMAGGsgb1K8RcwQIsPpspwyL2vrO/0c=;
b=YCr1y633AhZ2TkcRkdBQc1pOhGZq30ERMmsarNlgp0xYaqOGFCcp/9j+1eX5fhap+30MLF QcjXaDfvtJjjtiNWCUxAIASrQeYqV6/gyKxr487ThpDC3HdOeRm0ZW1A7I19TqOWPDqOxA S17RW1zH/ProeZIwSdBOS0BDuRmgCBU= Received: from mail-ed1-f72.google.com (mail-ed1-f72.google.com [209.85.208.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-44--uEBNTDFNHGd-fng2pWbHA-1; Wed, 07 Jul 2021 15:05:48 -0400 X-MC-Unique: -uEBNTDFNHGd-fng2pWbHA-1 Received: by mail-ed1-f72.google.com with SMTP id w15-20020a05640234cfb02903951279f8f3so1907637edc.11 for ; Wed, 07 Jul 2021 12:05:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=jEbKHqukhfgnoMAGGsgb1K8RcwQIsPpspwyL2vrO/0c=; b=L2YAzbOZ8ny0HyUBllVSosyGVn+Z/o49QKZPBN4LcpR03Gz7t8e6VB/cattuSKR9/W S+L8BA2d+mgJVeZU0cN6zTe4L7kU2dPttb11r8zloB/eDRqLP+/G0kAlIljpJPAfwQOA 5gws3xVxC3LKiV9EIZJ2cyMpuKjVwH7g2HCtI9BApkTewPSbXs/aJRu2WY7trT2X3OC5 HFhTo/XQhi7j7zvxWAM5JN9FE5J5UCQ25baPtLiCS96aMIylheiSK1WRlc+8CmYZNqCc zGUPYHMGCbgPjzhqTbvpA/vQYtWSkRMZ7RzHCP30O6oSpeSsQKfuspYoXHJtVpu/uuQF jKgA== X-Gm-Message-State: AOAM530dEuvOc6dATe1hHSWJcnVpOa0GpYzqY5y1rZ+7h8mR6jdEhyUd 8D0bIeUCKevhKWyfvrassJsW7d2r1POauXI/xs1guxfTRzYDZr7rdBeAbIjItVe+Xsm6T68qM7t VQbWZpWY123wDFs2V X-Received: by 2002:a17:907:7212:: with SMTP id dr18mr26167104ejc.552.1625684746992; Wed, 07 Jul 2021 12:05:46 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx0uScH6FSJb2bP+IViK+7H4Gb+PyCoaPS2Kg8JOrNt6/6wpVhuiRQK8DhFR3sfzWH/WV3aqA== X-Received: by 2002:a17:907:7212:: with SMTP id dr18mr26167052ejc.552.1625684746651; Wed, 07 Jul 2021 12:05:46 -0700 (PDT) Received: from redhat.com ([2.55.150.102]) by smtp.gmail.com with ESMTPSA id j6sm1083016eds.58.2021.07.07.12.05.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Jul 2021 12:05:45 -0700 (PDT) Date: Wed, 7 Jul 2021 15:05:41 -0400 From: "Michael S. 
Tsirkin" To: David Hildenbrand Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , Philippe =?iso-8859-1?Q?Mathieu-Daud=E9?= , Alexander Duyck , Juan Quintela , "Dr. David Alan Gilbert" , Peter Xu Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: <20210707150038-mutt-send-email-mst@kernel.org> References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> MIME-Version: 1.0 In-Reply-To: <20210707140655.30982-3-david@redhat.com> Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=mst@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=mst@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 19:05:52 -0000 On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > Postcopy never worked properly with 'free-page-hint=on', as there are > at least two issues: > > 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE > and consequently won't release free pages back to the OS once > migration finishes. 
>
>    The issue is that for postcopy, we won't do a final bitmap sync while
>    the guest is stopped on the source and
>    virtio_balloon_free_page_hint_notify() will only call
>    virtio_balloon_free_page_done() on the source during
>    PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
>    the destination.
>
> 2) Once the VM touches a page on the destination that has been excluded
>    from migration on the source via qemu_guest_free_page_hint() while
>    postcopy is active, that thread will stall until postcopy finishes
>    and all threads are woken up. (with older Linux kernels that won't
>    retry faults when woken up via userfaultfd, we might actually get a
>    SEGFAULT)
>
>    The issue is that the source will refuse to migrate any pages that
>    are not marked as dirty in the dirty bmap -- for example, because the
>    page might just have been sent. Consequently, the faulting thread will
>    stall, waiting for the page to be migrated -- which could take quite
>    a while and result in guest OS issues.

OK so if source gets a request for a page which is not dirty
it does not respond immediately? Why not just teach it to
respond? It would seem that if destination wants a page we
should just give it to the destination ...

>
> While we could fix 1), for example, by calling
> virtio_balloon_free_page_done() via pre_save callbacks of the
> vmstate, 2) is mostly impossible to fix without additional tracking,
> such that we can actually identify these hinted pages and handle
> them accordingly.
>
> As it never worked properly, let's disable it via the postcopy notifier on
> the destination. Trying to set "migrate_set_capability postcopy-ram on"
> on the destination now results in "virtio-balloon: 'free-page-hint' does
> not support postcopy Error: Postcopy is not supported".
>
> Note 1: We could let qemu_guest_free_page_hint() mark postcopy
>         as broken once actually clearing bits on the source. However, it's
>         harder to realize as we can race with users starting postcopy
>         and we cannot produce an expressive error message easily.

How about the reverse? Ignore qemu_guest_free_page_hint if postcopy
started? Seems better than making it user/guest visible ..

> Note 2: virtio-mem has similar issues, however, access to "unplugged"
>         memory by the guest is very rare and we would have to be very
>         lucky for it to happen during migration. The spec states
>         "The driver SHOULD NOT read from unplugged memory blocks ..."
>         and "The driver MUST NOT write to unplugged memory blocks".
>         virtio-mem will move away from virtio_balloon_free_page_done()
>         soon and handle this case explicitly on the destination.
>
> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")

OK it's not too bad, but I wonder whether the above ideas have been
explored.

> Cc: qemu-stable@nongnu.org
> Cc: Wei Wang
> Cc: Michael S. Tsirkin
> Cc: Philippe Mathieu-Daudé
> Cc: Alexander Duyck
> Cc: Juan Quintela
> Cc: "Dr. David Alan Gilbert"
> Cc: Peter Xu
> Signed-off-by: David Hildenbrand
> ---
>  hw/virtio/virtio-balloon.c         | 26 ++++++++++++++++++++++++++
>  include/hw/virtio/virtio-balloon.h |  1 +
>  2 files changed, 27 insertions(+)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 4b5d9e5e50..d0c9dc677c 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -30,6 +30,7 @@
>  #include "trace.h"
>  #include "qemu/error-report.h"
>  #include "migration/misc.h"
> +#include "migration/postcopy-ram.h"
>
>  #include "hw/virtio/virtio-bus.h"
>  #include "hw/virtio/virtio-access.h"
> @@ -692,6 +693,28 @@ virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
>      return 0;
>  }
>
> +
> +static int virtio_balloon_postcopy_notify(NotifierWithReturn *n, void *opaque)
> +{
> +    VirtIOBalloon *dev = container_of(n, VirtIOBalloon, postcopy_notifier);
> +    PostcopyNotifyData *pnd = opaque;
> +
> +    /* We register the notifier only with 'free-page-hint=on' for now. */
> +    g_assert(virtio_has_feature(dev->host_features,
> +                                VIRTIO_BALLOON_F_FREE_PAGE_HINT));
> +
> +    /*
> +     * Pages hinted via qemu_guest_free_page_hint() are cleared from the dirty
> +     * bitmap and will not get migrated, especially also not when the postcopy
> +     * destination starts using them and requests migration from the source; the
> +     * faulting thread will stall until postcopy migration finishes and
> +     * all threads are woken up.
> +     */
> +    error_setg(pnd->errp,
> +               "virtio-balloon: 'free-page-hint' does not support postcopy");
> +    return -ENOENT;
> +}
> +
>  static size_t virtio_balloon_config_size(VirtIOBalloon *s)
>  {
>      uint64_t features = s->host_features;
> @@ -911,6 +934,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>          s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
>                                             virtio_balloon_handle_free_page_vq);
>          precopy_add_notifier(&s->free_page_hint_notify);
> +        postcopy_add_notifier(&s->postcopy_notifier);
>
>          object_ref(OBJECT(s->iothread));
>          s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
> @@ -935,6 +959,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev)
>          object_unref(OBJECT(s->iothread));
>          virtio_balloon_free_page_stop(s);
>          precopy_remove_notifier(&s->free_page_hint_notify);
> +        postcopy_remove_notifier(&s->postcopy_notifier);
>      }
>      balloon_stats_destroy_timer(s);
>      qemu_remove_balloon_handler(s);
> @@ -1008,6 +1033,7 @@ static void virtio_balloon_instance_init(Object *obj)
>      qemu_cond_init(&s->free_page_cond);
>      s->free_page_hint_cmd_id = VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
>      s->free_page_hint_notify.notify = virtio_balloon_free_page_hint_notify;
> +    s->postcopy_notifier.notify = virtio_balloon_postcopy_notify;
>
>      object_property_add(obj, "guest-stats", "guest statistics",
>                          balloon_stats_get_all, NULL, NULL, s);
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 5139cf8ab6..d0d5b793b9 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -65,6 +65,7 @@ struct VirtIOBalloon {
>       */
>      bool block_iothread;
>      NotifierWithReturn free_page_hint_notify;
> +    NotifierWithReturn postcopy_notifier;
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
>      uint32_t host_features;
> --
> 2.31.1

From MAILER-DAEMON Wed Jul 07 15:07:13 2021
Date: Wed, 7 Jul 2021 15:07:02 -0400
From: "Michael S. Tsirkin"
To: David Hildenbrand
Cc: Peter Xu, qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang, Philippe Mathieu-Daudé, Alexander Duyck, Juan Quintela, "Dr. David Alan Gilbert"
Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT
Message-ID: <20210707150610-mutt-send-email-mst@kernel.org>
References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com>
In-Reply-To: <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com>

On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote:
> On 07.07.21 20:02, Peter Xu wrote:
> > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:
> > > As it never worked properly, let's disable it via the postcopy notifier on
> > > the destination. Trying to set "migrate_set_capability postcopy-ram on"
> > > on the destination now results in "virtio-balloon: 'free-page-hint' does
> > > not support postcopy Error: Postcopy is not supported".
> >
> > Would it be possible to do this in reversed order? Say, dynamically disable
> > free-page-hinting if postcopy capability is set when migration starts? Perhaps
> > it can also be re-enabled automatically when migration completes?
>
> I remember that this might be quite racy. We would have to make sure that no
> hinting happens before we enable the capability.
>
> As soon as we messed with the dirty bitmap (during precopy), postcopy is no
> longer safe. As noted in the patch, the only runtime alternative is to
> disable postcopy as soon as we actually do clear a bit. Alternatively, we
> could ignore any hints if the postcopy capability was enabled.
>
> Whatever we do, we have to make sure that a user cannot trick the system
> into an inconsistent state. Like enabling hinting, starting migration, then
> enabling the postcopy capability and kicking off postcopy. I did not check if
> we allow for that, though.

What bothers me with limitations like this is that we train users about this
lack of orthogonality; it's then very hard to retrain them that a given
feature is safe to use.

> --
> Thanks,
>
> David / dhildenb

From MAILER-DAEMON Wed Jul 07 15:14:10 2021
To: "Michael S. Tsirkin"
Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang, Philippe Mathieu-Daudé, Alexander Duyck, Juan Quintela, "Dr. David Alan Gilbert", Peter Xu
References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <20210707150038-mutt-send-email-mst@kernel.org>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT
Message-ID: <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com>
Date: Wed, 7 Jul 2021 21:14:00 +0200
In-Reply-To: <20210707150038-mutt-send-email-mst@kernel.org>

On 07.07.21 21:05, Michael S. Tsirkin wrote:
> On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:
>> Postcopy never worked properly with 'free-page-hint=on', as there are
>> at least two issues:
>>
>> 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
>>    and consequently won't release free pages back to the OS once
>>    migration finishes.
>>
>>    The issue is that for postcopy, we won't do a final bitmap sync while
>>    the guest is stopped on the source and
>>    virtio_balloon_free_page_hint_notify() will only call
>>    virtio_balloon_free_page_done() on the source during
>>    PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
>>    the destination.
>>
>> 2) Once the VM touches a page on the destination that has been excluded
>>    from migration on the source via qemu_guest_free_page_hint() while
>>    postcopy is active, that thread will stall until postcopy finishes
>>    and all threads are woken up. (with older Linux kernels that won't
>>    retry faults when woken up via userfaultfd, we might actually get a
>>    SEGFAULT)
>>
>>    The issue is that the source will refuse to migrate any pages that
>>    are not marked as dirty in the dirty bmap -- for example, because the
>>    page might just have been sent. Consequently, the faulting thread will
>>    stall, waiting for the page to be migrated -- which could take quite
>>    a while and result in guest OS issues.
>
> OK so if source gets a request for a page which is not dirty
> it does not respond immediately? Why not just teach it to
> respond? It would seem that if destination wants a page we
> should just give it to the destination ...

The source does not know whether a page has already been sent (e.g., via the
background migration thread that moves all data over) or has not been sent
because it was hinted. This is the part where we'd need additional tracking
on the source to actually know that.

We must not send a page twice; otherwise, bad things can happen when placing
pages that have already been migrated. That scenario can easily happen with
ordinary postcopy (the page has already been sent and we're dealing with a
stale request from the destination).

>
>
>>
>> While we could fix 1), for example, by calling
>> virtio_balloon_free_page_done() via pre_save callbacks of the
>> vmstate, 2) is mostly impossible to fix without additional tracking,
>> such that we can actually identify these hinted pages and handle
>> them accordingly.
>>
>> As it never worked properly, let's disable it via the postcopy notifier on
>> the destination. Trying to set "migrate_set_capability postcopy-ram on"
>> on the destination now results in "virtio-balloon: 'free-page-hint' does
>> not support postcopy Error: Postcopy is not supported".
>>
>> Note 1: We could let qemu_guest_free_page_hint() mark postcopy
>>         as broken once actually clearing bits on the source. However, it's
>>         harder to realize as we can race with users starting postcopy
>>         and we cannot produce an expressive error message easily.
>
>
> How about the reverse? Ignore qemu_guest_free_page_hint if postcopy
> started? Seems better than making it user/guest visible ..

Might be an option, but we let the user configure something that does not
work in combination ... essentially ignoring one of the two user settings.
Also not perfect IMHO.

>
>> Note 2: virtio-mem has similar issues, however, access to "unplugged"
>>         memory by the guest is very rare and we would have to be very
>>         lucky for it to happen during migration. The spec states
>>         "The driver SHOULD NOT read from unplugged memory blocks ..."
>>         and "The driver MUST NOT write to unplugged memory blocks".
>>         virtio-mem will move away from virtio_balloon_free_page_done()
>>         soon and handle this case explicitly on the destination.
>>
>> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
>
> OK it's not too bad, but I wonder whether the above ideas have been
> explored.

TBH, it's been broken all along and I'd rather have a simple fix. If
somebody ever cares about this, we could investigate making it work (or
making postcopy overrule free page hinting). But I'm open to suggestions.

--
Thanks,

David / dhildenb

From MAILER-DAEMON Wed Jul 07 15:19:23 2021
Date: Wed, 7 Jul 2021 15:19:11 -0400
From: "Michael S. Tsirkin"
To: David Hildenbrand
Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang, Philippe Mathieu-Daudé, Alexander Duyck, Juan Quintela, "Dr. David Alan Gilbert", Peter Xu
Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT
Message-ID: <20210707151459-mutt-send-email-mst@kernel.org>
References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <20210707150038-mutt-send-email-mst@kernel.org> <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com>
In-Reply-To: <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com>

On Wed, Jul 07, 2021 at 09:14:00PM +0200, David Hildenbrand wrote:
> On 07.07.21 21:05, Michael S. Tsirkin wrote:
> > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:
> > > Postcopy never worked properly with 'free-page-hint=on', as there are
> > > at least two issues:
> > >
> > > 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
> > >    and consequently won't release free pages back to the OS once
> > >    migration finishes.
> > >
> > >    The issue is that for postcopy, we won't do a final bitmap sync while
> > >    the guest is stopped on the source and
> > >    virtio_balloon_free_page_hint_notify() will only call
> > >    virtio_balloon_free_page_done() on the source during
> > >    PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
> > >    the destination.
> > >
> > > 2) Once the VM touches a page on the destination that has been excluded
> > >    from migration on the source via qemu_guest_free_page_hint() while
> > >    postcopy is active, that thread will stall until postcopy finishes
> > >    and all threads are woken up. (with older Linux kernels that won't
> > >    retry faults when woken up via userfaultfd, we might actually get a
> > >    SEGFAULT)
> > >
> > >    The issue is that the source will refuse to migrate any pages that
> > >    are not marked as dirty in the dirty bmap -- for example, because the
> > >    page might just have been sent. Consequently, the faulting thread will
> > >    stall, waiting for the page to be migrated -- which could take quite
> > >    a while and result in guest OS issues.
> >
> > OK so if source gets a request for a page which is not dirty
> > it does not respond immediately? Why not just teach it to
> > respond? It would seem that if destination wants a page we
> > should just give it to the destination ...
>
> The source does not know if a page has already been sent (e.g., via the
> background migration thread that moves all data over) vs. the page has not
> been sent because the page was hinted. This is the part where we'd need
> additional tracking on the source to actually know that.
>
> We must not send a page twice, otherwise bad things can happen when placing
> pages that already have been migrated, because that scenario can easily
> happen with ordinary postcopy (page has already been sent and we're dealing
> with a stale request from the destination).

OK, let me get this straight:

A. source sends page
B. destination requests page
C. destination changes page
D. source sends page
E. destination overwrites page

This is what you are worried about, right?

The fix is to mark the page clean in A, then in D to not send the page
if it's clean?

And the problem with hinting is this:

A. page is marked clean
B. destination requests page
C. destination changes page
D. source sends page <- does not happen, page is clean!
E. destination overwrites page

Did I get it right?

> >
> >
> > >
> > > While we could fix 1), for example, by calling
> > > virtio_balloon_free_page_done() via pre_save callbacks of the
> > > vmstate, 2) is mostly impossible to fix without additional tracking,
> > > such that we can actually identify these hinted pages and handle
> > > them accordingly.
> > >
> > > As it never worked properly, let's disable it via the postcopy notifier on
> > > the destination. Trying to set "migrate_set_capability postcopy-ram on"
> > > on the destination now results in "virtio-balloon: 'free-page-hint' does
> > > not support postcopy Error: Postcopy is not supported".
> > >
> > > Note 1: We could let qemu_guest_free_page_hint() mark postcopy
> > >         as broken once actually clearing bits on the source. However, it's
> > >         harder to realize as we can race with users starting postcopy
> > >         and we cannot produce an expressive error message easily.
> >
> >
> > How about the reverse? Ignore qemu_guest_free_page_hint if postcopy
> > started? Seems better than making it user/guest visible ..
>
> Might be an option, but we let the user configure something that does not
> work in combination ... essentially ignoring one of both user settings. Also
> not perfect IMHO.
> >
> > > Note 2: virtio-mem has similar issues, however, access to "unplugged"
> > >         memory by the guest is very rare and we would have to be very
> > >         lucky for it to happen during migration. The spec states
> > >         "The driver SHOULD NOT read from unplugged memory blocks ..."
> > >         and "The driver MUST NOT write to unplugged memory blocks".
> > >    virtio-mem will move away from virtio_balloon_free_page_done()
> > >    soon and handle this case explicitly on the destination.
> > >
> > > Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> >
> > OK it's not too bad, but I wonder whether the above ideas have been
> > explored.
>
> TBH, it's been broken all along and I'd rather have a simple fix. If
> somebody ever cares about this, we could investigate making it work (or
> making postcopy overrule free page hinting). But I'm open to suggestions.
>
> --
> Thanks,
>
> David / dhildenb

From MAILER-DAEMON Wed Jul 07 15:47:42 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1DWI-0002vI-1R for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 15:47:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48000) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1DWG-0002s9-OF for qemu-stable@nongnu.org; Wed, 07 Jul 2021 15:47:40 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:41459) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1DWE-0001NB-PO for qemu-stable@nongnu.org; Wed, 07 Jul 2021 15:47:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625687257; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vtqTIt/enPKSSIS1E7MC7RymE3BVVBHL9vBwVJz0G9Y=; b=bbecvroMQGSksUSBgs4t8nlp2Hk1uFCdWC7Qrhsx9ZiWkHVlZRcYahtTwK4WpgWsqvEeUm WKf4LB0prj1Y+5No2IalZiq7iu6biG5My8mc12fxFKPOBaGdaKFEzsjsUJEhBviVJkJaxY jcnvbRTKijLKSPSBVPOdRLC8Y1Z3alA= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-416-yoYwycPUNHqZUGxzoZ381Q-1; Wed, 07 Jul 2021
15:47:34 -0400 X-MC-Unique: yoYwycPUNHqZUGxzoZ381Q-1 Received: by mail-wm1-f69.google.com with SMTP id d16-20020a1c73100000b02901f2d21e46efso1399080wmb.6 for ; Wed, 07 Jul 2021 12:47:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:cc:references:from:organization:subject :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=vtqTIt/enPKSSIS1E7MC7RymE3BVVBHL9vBwVJz0G9Y=; b=lZjtZfTuqDcxTpse79nCssfOesVshpOuHsxcLN4WOdJiT9U957rrdBAYOCLCdMHe6c D0MyjE4aDd63nlXHQOUfnC1oATaQC5G1isoSMMwstEcgu9tBpITtFYEsMiIkBIrX0715 j8b2CjDi3abW4pCckHFO8QJdo/THggwIKrC4BNrgfYj1wbJondbg3RqSU0XV0XzC35Fx aZTOjwTSkWUnNbpwAZdDUdMPW86m+pBtis3dCLhiGsI3eB0FAwbCJeL9C0vHPhkhiXNf 8uCC4zEPn1PdLkdsF6y0oKJEdhs5II2BSXP5eYF+OQGAcjvEZrmJYSvRBwxfAt4pga4T 4pJw== X-Gm-Message-State: AOAM533+jNl3Pkl6mAA0hbfLptK/+ZWYp/xiR1oleS9VSMtyoCm9utB5 4MRPqTOzHzlLs9XlH5O1h+d4zEGGLDaFbkZdm+DvJq3dZUddqyIPJRyWqalkp2hqJsye+IPw3/e RC78dIgopb9/WFFBR X-Received: by 2002:a05:600c:1d06:: with SMTP id l6mr25351863wms.111.1625687253142; Wed, 07 Jul 2021 12:47:33 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy6wUrTynArzl8nIJsNf7cBmMKph3bKmO33kWYYzKod69AJHYAgOzUCEkIXR1KS7lGhmIDfzQ== X-Received: by 2002:a05:600c:1d06:: with SMTP id l6mr25351842wms.111.1625687252917; Wed, 07 Jul 2021 12:47:32 -0700 (PDT) Received: from [192.168.3.132] (p4ff23579.dip0.t-ipconnect.de. [79.242.53.121]) by smtp.gmail.com with ESMTPSA id r9sm15598115wmq.25.2021.07.07.12.47.32 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 07 Jul 2021 12:47:32 -0700 (PDT) To: "Michael S. Tsirkin" Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= , Alexander Duyck , Juan Quintela , "Dr. 
David Alan Gilbert" , Peter Xu References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <20210707150038-mutt-send-email-mst@kernel.org> <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com> <20210707151459-mutt-send-email-mst@kernel.org> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: <40a148d7-acad-67ee-ac66-e9ad56a23b44@redhat.com> Date: Wed, 7 Jul 2021 21:47:31 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210707151459-mutt-send-email-mst@kernel.org> Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 19:47:40 -0000 On 07.07.21 21:19, Michael S. Tsirkin wrote: > On Wed, Jul 07, 2021 at 09:14:00PM +0200, David Hildenbrand wrote: >> On 07.07.21 21:05, Michael S. 
Tsirkin wrote:
>>> On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:
>>>> Postcopy never worked properly with 'free-page-hint=on', as there are
>>>> at least two issues:
>>>>
>>>> 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
>>>>    and consequently won't release free pages back to the OS once
>>>>    migration finishes.
>>>>
>>>>    The issue is that for postcopy, we won't do a final bitmap sync while
>>>>    the guest is stopped on the source and
>>>>    virtio_balloon_free_page_hint_notify() will only call
>>>>    virtio_balloon_free_page_done() on the source during
>>>>    PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
>>>>    the destination.
>>>>
>>>> 2) Once the VM touches a page on the destination that has been excluded
>>>>    from migration on the source via qemu_guest_free_page_hint() while
>>>>    postcopy is active, that thread will stall until postcopy finishes
>>>>    and all threads are woken up. (with older Linux kernels that won't
>>>>    retry faults when woken up via userfaultfd, we might actually get a
>>>>    SEGFAULT)
>>>>
>>>>    The issue is that the source will refuse to migrate any pages that
>>>>    are not marked as dirty in the dirty bmap -- for example, because the
>>>>    page might just have been sent. Consequently, the faulting thread will
>>>>    stall, waiting for the page to be migrated -- which could take quite
>>>>    a while and result in guest OS issues.
>>>
>>> OK so if source gets a request for a page which is not dirty
>>> it does not respond immediately? Why not just teach it to
>>> respond? It would seem that if destination wants a page we
>>> should just give it to the destination ...
>>
>> The source does not know if a page has already been sent (e.g., via the
>> background migration thread that moves all data over) vs. the page has not
>> been sent because the page was hinted. This is the part where we'd need
>> additional tracking on the source to actually know that.
>>
>> We must not send a page twice, otherwise bad things can happen when placing
>> pages that already have been migrated, because that scenario can easily
>> happen with ordinary postcopy (page has already been sent and we're dealing
>> with a stale request from the destination).
>
> OK let me get this straight
>
> A. source sends page
> B. destination requests page
> C. destination changes page
> D. source sends page
> E. destination overwrites page
>
> this is what you are worried about right?

IIRC, with recent kernels E. is:

E. placing the page fails with -EEXIST and postcopy migration fails

However, the man page (man ioctl_userfaultfd) doesn't describe what is
actually supposed to happen when double-placing. Could be that it's
"undefined behavior".

I did not try, though.


This is how it works today:

A. source sends page and marks it clean
B. destination requests page
C. destination receives page and places it
D. source ignores request as page is clean

>
> the fix is to mark page clean in A.
> then in D to not send page if it's clean?
>
> And the problem with hinting is this:
>
> A. page is marked clean
> B. destination requests page
> C. destination changes page
> D. source sends page <- does not happen, page is clean!
> E. destination overwrites page

Simplified, it's

A. page is marked clean by hinting code
B. destination requests page
D. source ignores request as page is clean
E. destination stalls until postcopy unregisters uffd


Some thoughts

1. We do have a recv bitmap where we track received pages on the
destination (e.g., ramblock_recv_bitmap_test()); however, AFAIK we only use it
to avoid sending duplicate requests to the hypervisor, and don't check
it when placing pages.

2. Changing the migration behavior unconditionally on the source will break
migration to old QEMU binaries that cannot handle this change.

3. I think the current behavior is in place to make debugging easier.
If only a single instance of a page will ever be migrated from source to
destination, there cannot be silent data corruption. Further, we avoid
unnecessarily migrating pages twice.

Maybe Dave and Peter can spot any flaws in my understanding.

--
Thanks,

David / dhildenb

From MAILER-DAEMON Wed Jul 07 15:58:05 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1DgL-0000vE-Oy for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 15:58:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50126) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1DgJ-0000qE-K1 for qemu-stable@nongnu.org; Wed, 07 Jul 2021 15:58:03 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:26245) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1DgH-0002pc-8T for qemu-stable@nongnu.org; Wed, 07 Jul 2021 15:58:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625687880; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=F6d8885ce8PGtfhBl1ogGiVS5lvofW4mULjgTk0iFNc=; b=YFx1HjrXgOwmyhGuPblOoHV49Yt+vilkejJDBQferASTCeAXz+6p3rbufoGobECNqqSsv1 Q8LXkq4Py3AwYvwAziyc+o3tkQQBz2Z9OMKKyfIfR+isu+iiza8ymVuME51DwEsoXgIYQu wWBwEeBZcTuW7Z84scq3YQX3cxjkOLw= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-467-vz1FX2W-PDe9F9DbbkqrzQ-1; Wed, 07 Jul 2021 15:57:57 -0400 X-MC-Unique: vz1FX2W-PDe9F9DbbkqrzQ-1 Received: by mail-wm1-f70.google.com with SMTP id j141-20020a1c23930000b0290212502cb19aso1441944wmj.0 for ; Wed, 07 Jul 2021 12:57:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=F6d8885ce8PGtfhBl1ogGiVS5lvofW4mULjgTk0iFNc=; b=ofH8geilz+JY3TFX4vg68ml4GtdZgt7DkTN+n7BlfBuyOv0wYpUx+6jYRjtZRfMnFp QJ0BPt0UOZu99tw/YMXVYrFu/XPAvk9DhZoMeGWVmj/tYq5hfMCybBoU1i3G7k7WVyUx ZegtAiWuLfLuB0iXAz4lWJyDLRWCW4d5n5xUhOdoJCl3cso/LHiVo6yQaIJcFpSsq5ZJ i45avP3nJf6gzEj5dw0Y9B1tnqhrwwM1HyRpn3MvNmctaL1km0zLZZHemhHyJ8X8XKCV ftX1hHpXuiDfp6HTJV+IfB8gu26rG+CZqCAabnUpuJi8d41QNmbN8QbAd881M7sqn5CU ClLQ== X-Gm-Message-State: AOAM531wzS8wSsD5fjOaI4vZK23qiRV1jt4L0ECS2xc2fKFaXX3+rmDq yol/2rl04s+dWEVwLJWANKb7ah80P2P8dJ44ZxOsfFjmDinGrz5PmPJ98oVcOQC/vNHS7clIBNk Wi7slL77MqwKorjaA X-Received: by 2002:a05:6000:110:: with SMTP id o16mr12904734wrx.284.1625687876324; Wed, 07 Jul 2021 12:57:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxAXtS/z9nqd9Mt6/W0j5Pcw9oV5sk9wGpdrlH+c8KiFHlhiiBeXLJwYXp4UjIsI/s41fFO+g== X-Received: by 2002:a05:6000:110:: with SMTP id o16mr12904705wrx.284.1625687876089; Wed, 07 Jul 2021 12:57:56 -0700 (PDT) Received: from redhat.com ([2.55.150.102]) by smtp.gmail.com with ESMTPSA id c16sm21319407wru.24.2021.07.07.12.57.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Jul 2021 12:57:55 -0700 (PDT) Date: Wed, 7 Jul 2021 15:57:51 -0400 From: "Michael S. Tsirkin" To: David Hildenbrand Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , Philippe =?iso-8859-1?Q?Mathieu-Daud=E9?= , Alexander Duyck , Juan Quintela , "Dr. 
David Alan Gilbert" , Peter Xu Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: <20210707155413-mutt-send-email-mst@kernel.org> References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <20210707150038-mutt-send-email-mst@kernel.org> <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com> <20210707151459-mutt-send-email-mst@kernel.org> <40a148d7-acad-67ee-ac66-e9ad56a23b44@redhat.com> MIME-Version: 1.0 In-Reply-To: <40a148d7-acad-67ee-ac66-e9ad56a23b44@redhat.com> Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=mst@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Received-SPF: pass client-ip=216.205.24.124; envelope-from=mst@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 19:58:03 -0000 On Wed, Jul 07, 2021 at 09:47:31PM +0200, David Hildenbrand wrote: > On 07.07.21 21:19, Michael S. Tsirkin wrote: > > On Wed, Jul 07, 2021 at 09:14:00PM +0200, David Hildenbrand wrote: > > > On 07.07.21 21:05, Michael S. 
Tsirkin wrote: > > > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > > > > > Postcopy never worked properly with 'free-page-hint=on', as there are > > > > > at least two issues: > > > > > > > > > > 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE > > > > > and consequently won't release free pages back to the OS once > > > > > migration finishes. > > > > > > > > > > The issue is that for postcopy, we won't do a final bitmap sync while > > > > > the guest is stopped on the source and > > > > > virtio_balloon_free_page_hint_notify() will only call > > > > > virtio_balloon_free_page_done() on the source during > > > > > PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to > > > > > the destination. > > > > > > > > > > 2) Once the VM touches a page on the destination that has been excluded > > > > > from migration on the source via qemu_guest_free_page_hint() while > > > > > postcopy is active, that thread will stall until postcopy finishes > > > > > and all threads are woken up. (with older Linux kernels that won't > > > > > retry faults when woken up via userfaultfd, we might actually get a > > > > > SEGFAULT) > > > > > > > > > > The issue is that the source will refuse to migrate any pages that > > > > > are not marked as dirty in the dirty bmap -- for example, because the > > > > > page might just have been sent. Consequently, the faulting thread will > > > > > stall, waiting for the page to be migrated -- which could take quite > > > > > a while and result in guest OS issues. > > > > > > > > OK so if source gets a request for a page which is not dirty > > > > it does not respond immediately? Why not just teach it to > > > > respond? It would seem that if destination wants a page we > > > > should just give it to the destination ... > > > > > > The source does not know if a page has already been sent (e.g., via the > > > background migration thread that moves all data over) vs. 
the page has not
> > > been sent because the page was hinted. This is the part where we'd need
> > > additional tracking on the source to actually know that.
> > >
> > > We must not send a page twice, otherwise bad things can happen when placing
> > > pages that already have been migrated, because that scenario can easily
> > > happen with ordinary postcopy (page has already been sent and we're dealing
> > > with a stale request from the destination).
> >
> > OK let me get this straight
> >
> > A. source sends page
> > B. destination requests page
> > C. destination changes page
> > D. source sends page
> > E. destination overwrites page
> >
> > this is what you are worried about right?
>
> IIRC, with recent kernels E. is:
>
> E. placing the page fails with -EEXIST and postcopy migration fails
>
> However, the man page (man ioctl_userfaultfd) doesn't describe what is
> actually supposed to happen when double-placing. Could be that it's
> "undefined behavior".
>
> I did not try, though.
>
>
> This is how it works today:
>
> A. source sends page and marks it clean
> B. destination requests page
> C. destination receives page and places it
> D. source ignores request as page is clean

If it's actually -EEXIST, then we could just resend the page and teach the
destination to ignore -EEXIST errors, right? That would actually make things
a bit more robust: the destination handles its own consistency instead of
relying on the source.

> >
> > the fix is to mark page clean in A.
> > then in D to not send page if it's clean?
> >
> > And the problem with hinting is this:
> >
> > A. page is marked clean
> > B. destination requests page
> > C. destination changes page
> > D. source sends page <- does not happen, page is clean!
> > E. destination overwrites page
>
> Simplified, it's
>
> A. page is marked clean by hinting code
> B. destination requests page
> D. source ignores request as page is clean
> E. destination stalls until postcopy unregisters uffd
>
>
> Some thoughts
>
> 1.
We do have a a recv bitmap where we track received pages on the > destination (e.g., ramblock_recv_bitmap_test()), however we only use it to > avoid sending duplicate requests to the hypervisor AFAIKs, and don't check > it when placing pages. > > 2. Changing the migration behavior unconditionally on the source will break > migration to old QEMU binaries that cannot handle this change. We can always make this depend on new machine types. > 3. I think the current behavior is in place to make debugging easier. If > only a single instance of a page will ever be migrated from source to > destination, there cannot be silent data corruption. Further, we avoid > migrating unnecessarily pages twice. > Likely does not matter much for performance, it seems unlikely that the race is all that common. > Maybe Dave and Peter can spot any flaws in my understanding. > > -- > Thanks, > > David / dhildenb From MAILER-DAEMON Wed Jul 07 16:08:28 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1DqO-0007Nw-2Y for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 16:08:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:52642) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1DqM-0007Lw-9f for qemu-stable@nongnu.org; Wed, 07 Jul 2021 16:08:26 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:48617) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1DqK-0004HL-3j for qemu-stable@nongnu.org; Wed, 07 Jul 2021 16:08:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625688502; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=WSfC7JM8c4ofRC7dSY4airnMbWYY65nsSS95ROPxoKY=; b=gJaZGOYkAchYAidzNR7U6UaBnV8WSqyK+iG1vPxX9AU7fS5tR97ThFjnRxlxCBIDVA0d4J 
MYsXsl6nppotHnx6GFU4O91hgi+1PFLT+737But2u6eWHYcvY6KfjYpW3OW6Pr62tFPMgO sWha4FxQp+VlBt4s+Tw9k84IOoI/9mk= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-587-RDr3-x7iPXO9To6TXvKyXQ-1; Wed, 07 Jul 2021 16:08:19 -0400 X-MC-Unique: RDr3-x7iPXO9To6TXvKyXQ-1 Received: by mail-qt1-f199.google.com with SMTP id 12-20020ac8570c0000b02902520e309f5dso760298qtw.8 for ; Wed, 07 Jul 2021 13:08:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=WSfC7JM8c4ofRC7dSY4airnMbWYY65nsSS95ROPxoKY=; b=sgJiP3BkuFZtkl6yLW9PCuDYzSiiHUCqRCxa7jT1Q4ufQTWJgWCtFC8tU3TGvJb8nz dVbiF1vtToaHRZ1m9HThZO4sEdmysHFBPCWipp4P3uAG2jnTmnpNrOkQtp2YosQ5qDUm SoObGGwKpHJ40Fbiu6b+FuLrhrsB1TQHZ/r+PK0sIpKJX6dqBbHuH8A1Fi8PZY4fnQof B2soPIBUx1IQ9EggrYLUQcvv51aeIk6d/FO1Y3VDf/0eMrS3gw0M7YISrNwsw/P/btsJ KJXYK1D5w6AN6P/eTbgXwnZfdaL3B+/mylHpn9tIKwP3Yj0SvHW/0Jv3Y/TYf1sy9MH9 xXfQ== X-Gm-Message-State: AOAM53322t11kNh/fmmjZLNmqq+ZFvbpt0N43Pm8F2P5+0hu2TjpcXu5 YWuVTc+8HwVQEmFoiaIZKuJF8OjS5XjPpdU3Sq4eFKDSwb5GgcsQyXAOhGL/YB+vI4gpFMP1Wbq VKQVht/Vyi3h93qZQ X-Received: by 2002:ad4:40c1:: with SMTP id x1mr25824932qvp.33.1625688498975; Wed, 07 Jul 2021 13:08:18 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzswpqEFM3E7ml5P22W+9i7H0i649sWL5seu3ptEfAzQg3aH7iR2211OLrHN2cso9W0k0+nKw== X-Received: by 2002:ad4:40c1:: with SMTP id x1mr25824913qvp.33.1625688498702; Wed, 07 Jul 2021 13:08:18 -0700 (PDT) Received: from t490s (bras-base-toroon474qw-grc-65-184-144-111-238.dsl.bell.ca. 
[184.144.111.238]) by smtp.gmail.com with ESMTPSA id w2sm6342298qkm.65.2021.07.07.13.08.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Jul 2021 13:08:17 -0700 (PDT) Date: Wed, 7 Jul 2021 16:08:16 -0400 From: Peter Xu To: David Hildenbrand Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , Philippe =?utf-8?Q?Mathieu-Daud=C3=A9?= , Alexander Duyck , Juan Quintela , "Dr. David Alan Gilbert" Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com> MIME-Version: 1.0 In-Reply-To: <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com> Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Received-SPF: pass client-ip=216.205.24.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 20:08:26 -0000 On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote: > On 07.07.21 20:02, Peter Xu wrote: > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > > > As it never worked properly, let's disable it via the postcopy notifier on > > > 
the destination. Trying to set "migrate_set_capability postcopy-ram on"
> > > on the destination now results in "virtio-balloon: 'free-page-hint' does
> > > not support postcopy Error: Postcopy is not supported".
> >
> > Would it be possible to do this in reversed order? Say, dynamically disable
> > free-page-hinting if postcopy capability is set when migration starts? Perhaps
> > it can also be re-enabled automatically when migration completes?
>
> I remember that this might be quite racy. We would have to make sure that no
> hinting happens before we enable the capability.
>
> As soon as we messed with the dirty bitmap (during precopy), postcopy is no
> longer safe. As noted in the patch, the only runtime alternative is to
> disable postcopy as soon as we actually do clear a bit. Alternatively, we
> could ignore any hints if the postcopy capability was enabled.

Logically, migration capabilities are applied when the VM starts, and these
capabilities should stay constant during migration (I didn't check whether
there's a hard requirement; it's easy to add one if we want to ensure it), and
in most cases for the lifecycle of the VM.

>
> Whatever we do, we have to make sure that a user cannot trick the system
> into an inconsistent state. Like enabling hinting, starting migration, then
> enabling the postcopy capability and kicking off postcopy. I did not check if
> we allow for that, though.

We could turn free page hinting off when migration starts with postcopy-ram=on,
then re-enable it after migration finishes. That looks very safe to me. And I
don't even worry about a user trying to mess it up, as that only puts their own
VM at risk; that's mostly fine with me.
Thanks, -- Peter Xu From MAILER-DAEMON Wed Jul 07 18:32:58 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1G6E-0000wS-MN for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 18:32:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35412) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1F0L-0001jV-07; Wed, 07 Jul 2021 17:22:49 -0400 Received: from mail-ed1-x532.google.com ([2a00:1450:4864:20::532]:38505) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m1F0J-0005G1-4N; Wed, 07 Jul 2021 17:22:48 -0400 Received: by mail-ed1-x532.google.com with SMTP id x12so5322077eds.5; Wed, 07 Jul 2021 14:22:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=IX98sfedunE/b1mi5uLJCyWdqpbD3Iqov+vChPCcoTw=; b=cycKYlsxsv9ICJY4kiiUWAE+cCI7ENLoE9r/1LQLZ5XH62dod28HG9JqgS+XYIhZF+ b//mKX4i8nU3gCMYredahDVnkB7r/a+cRe9S1f1tKwpvtIBeOJMIOwPtino38So4dsYT p6AgVKFCLnRBmAX3N9KOnHOq0xhJJoyv4HxNAGIuzCOUQKzLne0e6VYTE5UEXnOLTw0R nu/IXrzcuJDM5Al1PRlIpe11mGSWBqLCQFyQRr3SstZeC8XGO7F46WXgcXi6uRmCAKuD 3EMMLhD80bvGsjuMGIwGp/uXkauDwYbJ34cXHgc80WfJhoxdHf8WkTqOS1A7FiN35jxD TQRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=IX98sfedunE/b1mi5uLJCyWdqpbD3Iqov+vChPCcoTw=; b=B9uAq9moT4LhFPrVCBpnXvkLWpIuMfLKr1fOGZcyFho5rFfvAzd9ovTPk15o7ksbgz w0mk9EGoxJQNWWcWDdt14mBgjRcqhhst5IWoiqlg9r+PcxOTxisAtlir4l/jwcYqw8Al 9vGpykB5Ngw+wH9kMywPx2KZMxQvL0FdYKojb9WBS2K2/za9Y39WN6TjjDvWQ+pA5QNd NlXYjoK/+IGnOqZ+N77eZFI76rvAuNMoFMQS0uDLvHtFR0UIKdJ0rKYBvxnHPI+tbp9Y VB5q45TZvd+fHWTbnMIcRuuGTcVHGtvFy5MBVSfiYHJOGOToVw4o0OKmXYerATwFyXS4 951w== X-Gm-Message-State: AOAM531TuvmZf+hwtm9mv8hwHjXOIPDFxjLhg8+k6WYDA8refLJn0BFT 
UJHar7TkKgqWMGQIuzhs8yKBOsYzQOHYEXpdVh0= X-Google-Smtp-Source: ABdhPJxE7rL4yyoVerR8SwvupwlTPObHHBPoiVKFc3WVqpAQHZaP89J4bDjAtB/aJ0Z3bBIKBurK9GPUGu7opV7FA1Q= X-Received: by 2002:a05:6402:22c6:: with SMTP id dm6mr8683740edb.228.1625692964208; Wed, 07 Jul 2021 14:22:44 -0700 (PDT) MIME-Version: 1.0 References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com> In-Reply-To: From: Alexander Duyck Date: Wed, 7 Jul 2021 14:22:32 -0700 Message-ID: Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT To: Peter Xu Cc: David Hildenbrand , qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , =?UTF-8?Q?Philippe_Mathieu=2DDaud=C3=A9?= , Juan Quintela , "Dr. David Alan Gilbert" Content-Type: text/plain; charset="UTF-8" Received-SPF: pass client-ip=2a00:1450:4864:20::532; envelope-from=alexander.duyck@gmail.com; helo=mail-ed1-x532.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Mailman-Approved-At: Wed, 07 Jul 2021 18:32:57 -0400 X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 21:22:49 -0000 On Wed, Jul 7, 2021 at 1:08 PM Peter Xu wrote: > > On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote: > > On 07.07.21 20:02, Peter Xu wrote: > > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > > > > As it never worked properly, let's disable it via the postcopy notifier on > > > > the destination. 
Trying to set "migrate_set_capability postcopy-ram on"
> > > > on the destination now results in "virtio-balloon: 'free-page-hint' does
> > > > not support postcopy Error: Postcopy is not supported".
> > >
> > > Would it be possible to do this in reversed order? Say, dynamically disable
> > > free-page-hinting if postcopy capability is set when migration starts? Perhaps
> > > it can also be re-enabled automatically when migration completes?
> >
> > I remember that this might be quite racy. We would have to make sure that no
> > hinting happens before we enable the capability.
> >
> > As soon as we messed with the dirty bitmap (during precopy), postcopy is no
> > longer safe. As noted in the patch, the only runtime alternative is to
> > disable postcopy as soon as we actually do clear a bit. Alternatively, we
> > could ignore any hints if the postcopy capability was enabled.
>
> Logically, migration capabilities are applied when the VM starts, and these
> capabilities should stay constant during migration (I didn't check whether
> there's a hard requirement; it's easy to add one if we want to ensure it), and
> in most cases for the lifecycle of the VM.

Would it make sense to just add a postcopy value to the PrecopyNotifyData that
you could populate with migration_in_postcopy() in precopy_notify()? Then all
you would need to do is check that value and, if it is set, shut down the page
hinting or don't start it. I suspect flagging unused pages doesn't add much
value in a postcopy environment anyway.

> >
> > Whatever we do, we have to make sure that a user cannot trick the system
> > into an inconsistent state. Like enabling hinting, starting migration, then
> > enabling the postcopy capability and kicking off postcopy. I did not check if
> > we allow for that, though.
>
> We could turn free page hinting off when migration starts with postcopy-ram=on,
> then re-enable it after migration finishes.
That looks very safe to me. And I > don't even worry on user trying to mess it up - as that only put their own VM > at risk; that's mostly fine to me. We wouldn't necessarily even need to really turn it off, just don't start it. I wonder if we couldn't just get away with adding a check to the existing virtio_balloon_free_page_hint_notify to see if we are in the postcopy state there and just shut things down or not start them. From MAILER-DAEMON Wed Jul 07 18:40:48 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1GDn-0003BC-K6 for mharc-qemu-stable@gnu.org; Wed, 07 Jul 2021 18:40:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47532) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1GDl-00038j-EL for qemu-stable@nongnu.org; Wed, 07 Jul 2021 18:40:46 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:22421) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1GDi-00079K-8f for qemu-stable@nongnu.org; Wed, 07 Jul 2021 18:40:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625697640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=c5jjtdzXoJ8ixCUydUJAcneqRApUOr2AVXzvNeBmd9M=; b=Jr1YgHnG/cBpn1P4FnvDH3n3mNZp0TfMiI+NFKVXxDdVe0pTDbaVLk8x+pg05a1UFLOwqP w+9smde9VeBSAsOd7BjF0eeDKVz2WVTEUoerOsIFbP8U4XSXYWgmReBkchnxWgN0sTclMt +72Ftr/rzcO4zBIPXYatiCF100BE0WA= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-262-i_xcgt85MpuQrm95ajGeBA-1; Wed, 07 Jul 2021 18:40:39 -0400 X-MC-Unique: i_xcgt85MpuQrm95ajGeBA-1 Received: by mail-qv1-f71.google.com with SMTP id y35-20020a0cb8a30000b0290270c2da88e8so2661904qvf.13 for ; Wed, 07 Jul 2021 15:40:39 
-0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=c5jjtdzXoJ8ixCUydUJAcneqRApUOr2AVXzvNeBmd9M=; b=TKXBjsxnBlegqQBXytXkN0tjZRf6+E4sqpU5tbjIUqpBmGWvbMXDufQ8FDRa6tcWIW mNyCb8lejjnMDM4r/uQfaHfQ6aoWrsD96th46gx4VnZLmgGLjMoe3tygJpYYOR2t+bBz ZVYWS5nehhziFL3kCm4s5bMHQR3Kq82uCJUKjJAh747uDraA35IHcyoTxfBwTdXo4Yw3 YIPDgRr7coCEtsCiQidAE6SCSgQgwp3j/p+F2HyI8krvMswaEZPM23NcHEjT19fKe93R YbaDgivIGlyzQTmx9XTdYCkyPnO59gyF0+PFQzRB5+H9SCQGaQ1KDKiBGaVunQ2bqQAf +86g== X-Gm-Message-State: AOAM530XpaKSSWQ5ktVo22zCjPBhtLd87NGds2KEeTiCpw1NeSew24Zt GTfqyxRfCtW4nq3NMkUXZ5nxLic6bi7CK5ccMJBcXxWwnLYrMEiMUaIl2K5eVsP8jRl3TrMi7K7 /nxlF8SOZBkgdY6ig X-Received: by 2002:ad4:4774:: with SMTP id d20mr26288578qvx.38.1625697639430; Wed, 07 Jul 2021 15:40:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyMa5EhIBbWpqgvJXEmdx6ucub9keY4wDVCTQFWKA3NBQZZTAD4erEWbQzV5orUdrlnBuiUjQ== X-Received: by 2002:ad4:4774:: with SMTP id d20mr26288561qvx.38.1625697639196; Wed, 07 Jul 2021 15:40:39 -0700 (PDT) Received: from t490s (bras-base-toroon474qw-grc-65-184-144-111-238.dsl.bell.ca. [184.144.111.238]) by smtp.gmail.com with ESMTPSA id g15sm157204qkl.104.2021.07.07.15.40.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Jul 2021 15:40:38 -0700 (PDT) Date: Wed, 7 Jul 2021 18:40:37 -0400 From: Peter Xu To: Alexander Duyck Cc: David Hildenbrand , qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , Philippe =?utf-8?Q?Mathieu-Daud=C3=A9?= , Juan Quintela , "Dr. 
David Alan Gilbert" Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com> MIME-Version: 1.0 In-Reply-To: Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Received-SPF: pass client-ip=170.10.133.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2021 22:40:46 -0000 On Wed, Jul 07, 2021 at 02:22:32PM -0700, Alexander Duyck wrote: > On Wed, Jul 7, 2021 at 1:08 PM Peter Xu wrote: > > > > On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote: > > > On 07.07.21 20:02, Peter Xu wrote: > > > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > > > > > As it never worked properly, let's disable it via the postcopy notifier on > > > > > the destination. Trying to set "migrate_set_capability postcopy-ram on" > > > > > on the destination now results in "virtio-balloon: 'free-page-hint' does > > > > > not support postcopy Error: Postcopy is not supported". > > > > > > > > Would it be possible to do this in reversed order? 
Say, dynamically disable
> > > > free-page-hinting if postcopy capability is set when migration starts? Perhaps
> > > > it can also be re-enabled automatically when migration completes?
> > >
> > > I remember that this might be quite racy. We would have to make sure that no
> > > hinting happens before we enable the capability.
> > >
> > > As soon as we messed with the dirty bitmap (during precopy), postcopy is no
> > > longer safe. As noted in the patch, the only runtime alternative is to
> > > disable postcopy as soon as we actually do clear a bit. Alternatively, we
> > > could ignore any hints if the postcopy capability was enabled.
> >
> > Logically migration capabilities are applied at VM start, and these
> > capabilities should be constant during migration (I didn't check if there's a
> > hard requirement; easy to add that if we want to assure it), and in most cases
> > for the lifecycle of the vm.
>
> Would it make sense to maybe just look at adding a postcopy value to
> the PrecopyNotifyData that you could populate with
> migration_in_postcopy() in precopy_notify()?

Should we check migrate_postcopy_ram() rather than migration_in_postcopy()?
It's the precopy phase that's dropping the dirty bits and can potentially
hang a postcopy vcpu, afaiu.

>
> Then all you would need to do is check for that value and if it is set
> you shut down the page hinting or don't start it since I suspect it
> wouldn't likely add any value anyway since I would think flagging
> unused pages doesn't add much value in a postcopy environment anyway.
>
> > >
> > > Whatever we do, we have to make sure that a user cannot trick the system
> > > into an inconsistent state. Like enabling hinting, starting migration, then
> > > enabling the postcopy capability and kicking off postcopy. I did not check if
> > > we allow for that, though.
> >
> > We could turn free page hinting off when migration starts with postcopy-ram=on,
> > then re-enable it after migration finishes.
That looks very safe to me. And I > > don't even worry on user trying to mess it up - as that only put their own VM > > at risk; that's mostly fine to me. > > We wouldn't necessarily even need to really turn it off, just don't > start it. I wonder if we couldn't just get away with adding a check to > the existing virtio_balloon_free_page_hint_notify to see if we are in > the postcopy state there and just shut things down or not start them. This makes me wonder whether qemu_guest_free_page_hint() should be called at all on destination host when incoming postcopy migration is in progress. Right now the check migration_is_setup_or_active() should return true on destination host, however I am not sure if that's necessary as we don't track dirty at all there. -- Peter Xu From MAILER-DAEMON Thu Jul 08 03:14:55 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1OFK-0000c7-Uh for mharc-qemu-stable@gnu.org; Thu, 08 Jul 2021 03:14:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58664) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1OFK-0000ac-1W for qemu-stable@nongnu.org; Thu, 08 Jul 2021 03:14:54 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:32320) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1OFH-00054K-BZ for qemu-stable@nongnu.org; Thu, 08 Jul 2021 03:14:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625728489; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SvnYOp7N9dSi1lskfdj2KpQOHcR4DLJqCdMPgdf0yp0=; b=hoM92oQ+ipLgTdYiV/y32X/C9rfpjsySrMPyEEpNnpQYRDAh4zwYrfJJBaMzlH6QVW1voL RHIePBJyHo2PXUdGO2hemeMX6zV+pr57//kL44ZQFAZPvLljderAcXeis2DuDWR/Km6jY0 
WIz8V8ZQn7m6uJAD0NeVpQFFkQ2Acbc= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-467-pLNBGpoyNZ2aIDdea6DBtA-1; Thu, 08 Jul 2021 03:14:48 -0400 X-MC-Unique: pLNBGpoyNZ2aIDdea6DBtA-1 Received: by mail-wm1-f70.google.com with SMTP id h4-20020a05600c3504b02902190c4d3d18so244907wmq.8 for ; Thu, 08 Jul 2021 00:14:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:cc:references:from:organization:subject :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=SvnYOp7N9dSi1lskfdj2KpQOHcR4DLJqCdMPgdf0yp0=; b=ib2Jn1ZmCxLgjQ7GxpEpSqNT6Y9LYdXHa0keiUKDb1uZ3tJDnZm1695u++XNALk7gk luf5f5rp/n1TLc0wR8PV+d19iAB2u6sjtZU8jXLTHvVD0xoEVE/1u6mLF6lnnIdO13p9 TP6SxnMRwbAm3K3x8uRuzS6Ru23o3ashmQBTyotLdDA9CwNfdIPAlBBZvyZl67w328wW OMNNVbQlEKG9mZydFJyqnnQS0pFTOejfkpLiTI7BFTEP84+oTRjs9oWSA5qPL2Db24De 10ZOFksum1A7ljWlud2WFUsyqDiiTBLlskCYkMGml6UoxuyIQIgnw62JHmQv+iQWplqG Wtpw== X-Gm-Message-State: AOAM531/XqY/M1wlf4YKYcRMvg31eB8dNp+yNfegtbsNzXuCYG3327ly T0vFiHt7b2A3K7Nu4cwIXShOoF3Yx0xLBARLI164j7lTlASw/kYCOfcbzCmiRYATp7tGXgshO0/ ckI4jzkPOceAcz3WF X-Received: by 2002:a7b:c08f:: with SMTP id r15mr7059836wmh.173.1625728487109; Thu, 08 Jul 2021 00:14:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJya8PaPJmddJICa48OjlqIskCfH9uy9URFxU3vLqxr30T8zxXjQq9hJSb8jwfmqFU0ezy7Gyg== X-Received: by 2002:a7b:c08f:: with SMTP id r15mr7059788wmh.173.1625728486705; Thu, 08 Jul 2021 00:14:46 -0700 (PDT) Received: from [192.168.3.132] (p4ff23cf9.dip0.t-ipconnect.de. [79.242.60.249]) by smtp.gmail.com with ESMTPSA id f18sm1315853wru.53.2021.07.08.00.14.45 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 08 Jul 2021 00:14:46 -0700 (PDT) To: Peter Xu , Alexander Duyck Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . 
Tsirkin" , =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= , Juan Quintela , "Dr. David Alan Gilbert" References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: <06249318-9ef6-e309-308b-53b51d4f6d6d@redhat.com> Date: Thu, 8 Jul 2021 09:14:45 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=216.205.24.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2021 07:14:54 -0000 On 08.07.21 00:40, Peter Xu wrote: > On Wed, Jul 07, 2021 at 02:22:32PM -0700, Alexander Duyck wrote: >> On Wed, Jul 7, 2021 at 1:08 PM Peter Xu wrote: >>> >>> On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote: >>>> On 07.07.21 20:02, Peter Xu wrote: >>>>> On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: >>>>>> As it never worked properly, let's disable it 
via the postcopy notifier on
>>>>>> the destination. Trying to set "migrate_set_capability postcopy-ram on"
>>>>>> on the destination now results in "virtio-balloon: 'free-page-hint' does
>>>>>> not support postcopy Error: Postcopy is not supported".
>>>>>
>>>>> Would it be possible to do this in reversed order? Say, dynamically disable
>>>>> free-page-hinting if postcopy capability is set when migration starts? Perhaps
>>>>> it can also be re-enabled automatically when migration completes?
>>>>
>>>> I remember that this might be quite racy. We would have to make sure that no
>>>> hinting happens before we enable the capability.
>>>>
>>>> As soon as we messed with the dirty bitmap (during precopy), postcopy is no
>>>> longer safe. As noted in the patch, the only runtime alternative is to
>>>> disable postcopy as soon as we actually do clear a bit. Alternatively, we
>>>> could ignore any hints if the postcopy capability was enabled.
>>>
>>> Logically migration capabilities are applied at VM start, and these
>>> capabilities should be constant during migration (I didn't check if there's a
>>> hard requirement; easy to add that if we want to assure it), and in most cases
>>> for the lifecycle of the vm.
>>
>> Would it make sense to maybe just look at adding a postcopy value to
>> the PrecopyNotifyData that you could populate with
>> migration_in_postcopy() in precopy_notify()?
>
> Should we check migrate_postcopy_ram() rather than migration_in_postcopy()?

Right, we care about the source only -- if postcopy could be started.

>
>>
>> Then all you would need to do is check for that value and if it is set
>> you shut down the page hinting or don't start it since I suspect it
>> wouldn't likely add any value anyway since I would think flagging
>> unused pages doesn't add much value in a postcopy environment anyway.
>>

We'd have to never kick it off right from the start as I explained
previously. As soon as you messed with the bitmaps it's problematic.
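
A minimal sketch of that constraint in Python. This is a toy model, not
QEMU code: DirtyBitmap, apply_free_page_hint() and the postcopy_ram flag
are hypothetical names, and it assumes the capability is fixed before
migration starts, so a hint is dropped before it can touch the bitmap.

```python
class DirtyBitmap:
    """Toy stand-in for the source's per-page dirty bitmap (hypothetical)."""

    def __init__(self, num_pages):
        # At migration start every page still has to be sent.
        self.bits = [True] * num_pages


def apply_free_page_hint(bitmap, page, postcopy_ram):
    """Clear the dirty bit for a hinted page unless postcopy may be used.

    Returns True if the hint was applied, False if it was ignored.
    """
    if postcopy_ram:
        # Postcopy could be started later: never touch the bitmap at all.
        return False
    # Precopy only: a free page does not need to be migrated.
    bitmap.bits[page] = False
    return True
```

With the flag set, the bitmap stays untouched for the whole migration, so
there is no point at which a later switch to postcopy sees a hinted page
as clean.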
>>>>
>>>> Whatever we do, we have to make sure that a user cannot trick the system
>>>> into an inconsistent state. Like enabling hinting, starting migration, then
>>>> enabling the postcopy capability and kicking off postcopy. I did not check if
>>>> we allow for that, though.
>>>
>>> We could turn free page hinting off when migration starts with postcopy-ram=on,
>>> then re-enable it after migration finishes. That looks very safe to me. And I
>>> don't even worry on user trying to mess it up - as that only put their own VM
>>> at risk; that's mostly fine to me.
>>
>> We wouldn't necessarily even need to really turn it off, just don't
>> start it. I wonder if we couldn't just get away with adding a check to
>> the existing virtio_balloon_free_page_hint_notify to see if we are in
>> the postcopy state there and just shut things down or not start them.
>
> This makes me wonder whether qemu_guest_free_page_hint() should be called at
> all on destination host when incoming postcopy migration is in progress.

It really shouldn't. If it did currently happen, it would be due to
issue 1) described in the patch description, which will be fixed
independently so that hinting has completed by the time the guest runs
on the destination.

>
> Right now the check migration_is_setup_or_active() should return true on
> destination host, however I am not sure if that's necessary as we don't track
> dirty at all there.

migration_is_setup_or_active(s->state) uses migrate_get_current(), which
gives us the outgoing state (source), not the incoming state (destination).
-- Thanks, David / dhildenb From MAILER-DAEMON Thu Jul 08 03:23:21 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1ONV-0006B2-6B for mharc-qemu-stable@gnu.org; Thu, 08 Jul 2021 03:23:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:60014) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1ONT-0006AX-KB for qemu-stable@nongnu.org; Thu, 08 Jul 2021 03:23:19 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:51177) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1ONQ-0007YO-7s for qemu-stable@nongnu.org; Thu, 08 Jul 2021 03:23:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625728995; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xMOggDAkOdxjQigCxigY5pOcddrmosF6jLQXjagmN6M=; b=d4tUpd58aE/qxTbkZ7w3YU4xOYyF8n/FjUw5ttp7ZohNz7A/ecOzi51dyroWThiOJIduLm 6yB26wIcWutogYtHtnZ8y8irpr8iwGmpHp45qYh4x5/nIUnJWj/SbzMhFDMnRGtXOqd+L0 MWD29s+rdvvtPJCqQcsf2x9B97UYR3Y= Received: from mail-wr1-f69.google.com (mail-wr1-f69.google.com [209.85.221.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-9-5e_F_jtdMByBsAWYCm34Ug-1; Thu, 08 Jul 2021 03:23:12 -0400 X-MC-Unique: 5e_F_jtdMByBsAWYCm34Ug-1 Received: by mail-wr1-f69.google.com with SMTP id i10-20020a5d55ca0000b029013b976502b6so204048wrw.2 for ; Thu, 08 Jul 2021 00:23:11 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:cc:references:from:organization:subject :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=xMOggDAkOdxjQigCxigY5pOcddrmosF6jLQXjagmN6M=; 
b=jQmCJhXvR5JTA+9EyFYgIzCBFleGbA9NjIcGgzdiYzzGyqJhejCrXVRD7kHnOkDVAg +bOXpgA4Wv6A8tLgB7lPjz3B8RdMnTtG1FpxaQmdGCks0EKW1JS/g2O0wzoroDnVKcos vDQ7lcVn72DA9Z75u8KJbY/pqy3wBxVLLZPwgGKzEXSZptlkYsEyJQULaY9hEpj3fR1r 1HXXwMv/aSHQ/CExy0NMoXv/+HwAeZNW1ZKxciDe7/L2cxNEiBO7huWFpEMtGcdypPVu QNLhyh6G18l+Gisnw+MjZd1Cajzc8qbnMtj/S1I6+cdpa/97MpDE2kSa9GAZ3u5TWrQv dKcg== X-Gm-Message-State: AOAM531N8omatTCaurNBCQC14cPGjmLwXQitUz5qttvJsJwdHeI0X8FY RdTE0fxTpN8zT95h8tySP/rBoeBeNcu7m7Lq9UeJTF90DL5MLIQSsdJSX4zk7YarxcWcCsyfcrN XkcBrHQ6c7pqPNrYz X-Received: by 2002:adf:aacb:: with SMTP id i11mr32408503wrc.371.1625728990856; Thu, 08 Jul 2021 00:23:10 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwvrhXeOy7Wg4VDUDxGyeaI3Cv7oRB1mSi97FFAzzLLplQLtKsvBPCCWPvsXmVY00YFEF4msw== X-Received: by 2002:adf:aacb:: with SMTP id i11mr32408477wrc.371.1625728990698; Thu, 08 Jul 2021 00:23:10 -0700 (PDT) Received: from [192.168.3.132] (p4ff23cf9.dip0.t-ipconnect.de. [79.242.60.249]) by smtp.gmail.com with ESMTPSA id y66sm1141932wmy.39.2021.07.08.00.23.09 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 08 Jul 2021 00:23:10 -0700 (PDT) To: Alexander Duyck , Peter Xu Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= , Juan Quintela , "Dr. 
David Alan Gilbert" References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <5f5dd7f3-ce09-53d6-db48-1a333119205d@redhat.com> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: Date: Thu, 8 Jul 2021 09:23:09 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=216.205.24.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2021 07:23:19 -0000 On 07.07.21 23:22, Alexander Duyck wrote: > On Wed, Jul 7, 2021 at 1:08 PM Peter Xu wrote: >> >> On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote: >>> On 07.07.21 20:02, Peter Xu wrote: >>>> On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: >>>>> As it never worked properly, let's disable it via the postcopy notifier on >>>>> the destination. 
Trying to set "migrate_set_capability postcopy-ram on"
>>>>> on the destination now results in "virtio-balloon: 'free-page-hint' does
>>>>> not support postcopy Error: Postcopy is not supported".
>>>>
>>>> Would it be possible to do this in reversed order? Say, dynamically disable
>>>> free-page-hinting if postcopy capability is set when migration starts? Perhaps
>>>> it can also be re-enabled automatically when migration completes?
>>>
>>> I remember that this might be quite racy. We would have to make sure that no
>>> hinting happens before we enable the capability.
>>>
>>> As soon as we messed with the dirty bitmap (during precopy), postcopy is no
>>> longer safe. As noted in the patch, the only runtime alternative is to
>>> disable postcopy as soon as we actually do clear a bit. Alternatively, we
>>> could ignore any hints if the postcopy capability was enabled.
>>
>> Logically migration capabilities are applied at VM start, and these
>> capabilities should be constant during migration (I didn't check if there's a
>> hard requirement; easy to add that if we want to assure it), and in most cases
>> for the lifecycle of the vm.
>
> Would it make sense to maybe just look at adding a postcopy value to
> the PrecopyNotifyData that you could populate with
> migration_in_postcopy() in precopy_notify()?
>
> Then all you would need to do is check for that value and if it is set
> you shut down the page hinting or don't start it since I suspect it
> wouldn't likely add any value anyway since I would think flagging
> unused pages doesn't add much value in a postcopy environment anyway.

I don't think that's true. With free page hinting you reduce the
effective VM size you have to migrate. Any page that has to be migrated
will consume bandwidth.

1. Although postcopy transfers only the currently requested pages, the
background thread will keep pushing pages, making postcopy eventually run longer.
While in postcopy (well, and in precopy) we are faced with a clear performance degradation, so we want to minimize the overall time spent. 2. Usually you let precopy run for a while before switching to postcopy. With free page hinting you might be able to greatly reduce the number of pages you'll have to migrate later in the same amount of time. So there would be value, but at least I am not too interested in making it work in combination perfectly if it results in significant migration code changes; my goal is to not silently break guests when used in combination -- once there is the actual requirement to optimize this setup, we can work on that optimization (as discussed with MST here). So I'll explore going the migrate_postcopy_ram() way to silently (or at least warn) disable free page hinting. Thanks. -- Thanks, David / dhildenb From MAILER-DAEMON Thu Jul 08 04:19:37 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1PFx-0002gu-Ma for mharc-qemu-stable@gnu.org; Thu, 08 Jul 2021 04:19:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41194) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1PFv-0002f8-EB for qemu-stable@nongnu.org; Thu, 08 Jul 2021 04:19:35 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:32444) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1PFs-0008Tw-Gy for qemu-stable@nongnu.org; Thu, 08 Jul 2021 04:19:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625732371; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cH+4Yjf2eIz1YyivKEM3vEEzQyX0DbbxxTqKi0dJ8sw=; b=MEF5/CacrPAZR8zJncvPOQW7iWI+GlbYYUhUg0ZRhsxR1Juz6B6UqLLge+TjHhhSNxJ0wL 
CTmoUjMpFibI2Yh1RjXVDX43Wq/l7PmjNoM2f05Zp5ZV/b3eVSLg0jqVkX1/cxHhLlC1V0 X6Djq4EWWIOz4a4uzjgfNY60GWkLZu0= Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-234-1qe7uvO9NLeQewu-iN8HhQ-1; Thu, 08 Jul 2021 04:19:30 -0400 X-MC-Unique: 1qe7uvO9NLeQewu-iN8HhQ-1 Received: by mail-wm1-f71.google.com with SMTP id n17-20020a05600c4f91b0290209ebf81aabso2062233wmq.2 for ; Thu, 08 Jul 2021 01:19:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:cc:references:from:organization:subject :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=cH+4Yjf2eIz1YyivKEM3vEEzQyX0DbbxxTqKi0dJ8sw=; b=cTu6RxSYIOwAq0AdMofLydj80DqCbEPt3UBS7Wnyk72NVXuYvoGfB1EfbSBtur2nX2 GsE66ygAaszB3oYn4njY4ujUDhFHCTnng1mFgdbrMZ8Wd6bKp2zW3HwMdFMfidJVLKvg 6cU65Rbz/sX3aw13/lFTr1c66ixkk1NXyk9lDMw35QIFspXGh09KrXpJh11cUHICx19R xzDbWKfThX2DR4ylBS+V9OAcbxykwfSmdOp4fxXYxQxC5NwThoPYFlePApfqgwd2RxM9 ioA4uFdfJ1Cvps4F1RSLIgt827TWj22/jbBFy0RDNGjG/S1GeO3NUIsG/GxeMGz6f5eT yBVg== X-Gm-Message-State: AOAM531xmPqdKTcdDRez42NCQlHaYIRtW7FTtFh1lVhNe1kl26t1p9en BOg1SL6bfrUHBNJtUbRVJXIGM/K+95YukJ3AKsTKPLur0l5d31Ts8IANRuIYoQalnJcQ7N24fEZ SWojCzVOP+j7lgkqW X-Received: by 2002:a7b:cc15:: with SMTP id f21mr3832111wmh.5.1625732368856; Thu, 08 Jul 2021 01:19:28 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy4QWnsRfTIdMhm+3L6pXCALpbfKrDCUqW9Yn8b/3TVD3yiFfjnMYDnoRD9Fo4oH4khM+PSpw== X-Received: by 2002:a7b:cc15:: with SMTP id f21mr3832083wmh.5.1625732368526; Thu, 08 Jul 2021 01:19:28 -0700 (PDT) Received: from [192.168.3.132] (p4ff23cf9.dip0.t-ipconnect.de. [79.242.60.249]) by smtp.gmail.com with ESMTPSA id j12sm1028737wrq.83.2021.07.08.01.19.27 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 08 Jul 2021 01:19:28 -0700 (PDT) To: "Michael S. 
Tsirkin" Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= , Alexander Duyck , Juan Quintela , "Dr. David Alan Gilbert" , Peter Xu References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <20210707150038-mutt-send-email-mst@kernel.org> <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com> <20210707151459-mutt-send-email-mst@kernel.org> <40a148d7-acad-67ee-ac66-e9ad56a23b44@redhat.com> <20210707155413-mutt-send-email-mst@kernel.org> From: David Hildenbrand Organization: Red Hat Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: Date: Thu, 8 Jul 2021 10:19:26 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210707155413-mutt-send-email-mst@kernel.org> Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2021 08:19:35 -0000 On 07.07.21 21:57, Michael S. 
Tsirkin wrote:
> On Wed, Jul 07, 2021 at 09:47:31PM +0200, David Hildenbrand wrote:
>> On 07.07.21 21:19, Michael S. Tsirkin wrote:
>>> On Wed, Jul 07, 2021 at 09:14:00PM +0200, David Hildenbrand wrote:
>>>> On 07.07.21 21:05, Michael S. Tsirkin wrote:
>>>>> On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:
>>>>>> Postcopy never worked properly with 'free-page-hint=on', as there are
>>>>>> at least two issues:
>>>>>>
>>>>>> 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
>>>>>>    and consequently won't release free pages back to the OS once
>>>>>>    migration finishes.
>>>>>>
>>>>>>    The issue is that for postcopy, we won't do a final bitmap sync while
>>>>>>    the guest is stopped on the source and
>>>>>>    virtio_balloon_free_page_hint_notify() will only call
>>>>>>    virtio_balloon_free_page_done() on the source during
>>>>>>    PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
>>>>>>    the destination.
>>>>>>
>>>>>> 2) Once the VM touches a page on the destination that has been excluded
>>>>>>    from migration on the source via qemu_guest_free_page_hint() while
>>>>>>    postcopy is active, that thread will stall until postcopy finishes
>>>>>>    and all threads are woken up. (with older Linux kernels that won't
>>>>>>    retry faults when woken up via userfaultfd, we might actually get a
>>>>>>    SEGFAULT)
>>>>>>
>>>>>>    The issue is that the source will refuse to migrate any pages that
>>>>>>    are not marked as dirty in the dirty bmap -- for example, because the
>>>>>>    page might just have been sent. Consequently, the faulting thread will
>>>>>>    stall, waiting for the page to be migrated -- which could take quite
>>>>>>    a while and result in guest OS issues.
>>>>>
>>>>> OK so if source gets a request for a page which is not dirty
>>>>> it does not respond immediately? Why not just teach it to
>>>>> respond? It would seem that if destination wants a page we
>>>>> should just give it to the destination ...
>>>>
>>>> The source does not know if a page has already been sent (e.g., via the
>>>> background migration thread that moves all data over) vs. the page has not
>>>> been sent because the page was hinted. This is the part where we'd need
>>>> additional tracking on the source to actually know that.
>>>>
>>>> We must not send a page twice, otherwise bad things can happen when placing
>>>> pages that already have been migrated, because that scenario can easily
>>>> happen with ordinary postcopy (page has already been sent and we're dealing
>>>> with a stale request from the destination).
>>>
>>> OK let me get this straight
>>>
>>> A. source sends page
>>> B. destination requests page
>>> C. destination changes page
>>> D. source sends page
>>> E. destination overwrites page
>>>
>>> this is what you are worried about right?
>>
>> IIRC E. is with recent kernels:
>>
>> E. placing the page fails with -EEXIST and postcopy migration fails
>>
>> However, the man page (man ioctl_userfaultfd) doesn't describe what is
>> actually supposed to happen when double-placing. Could be that it's
>> "undefined behavior".
>>
>> I did not try, though.
>>
>>
>> This is how it works today:
>>
>> A. source sends page and marks it clean
>> B. destination requests page
>> C. destination receives page and places it
>> D. source ignores request as page is clean
>
> If it's actually -EEXIST then we could just resend it
> and teach destination to ignore -EEXIST errors right?

I think checking the received bitmap would be more robust.

>
> Will actually make things a bit more robust: destination
> handles its own consistency instead of relying on source.

TBH, I don't think having multiple copies of the same page in flight is
either a very good design or robust. In an ideal world, the source would
make sure to send a page only once and the destination would expect to
receive a page only once.
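
The "how it works today" steps quoted above, together with issue 2) from
the patch description, can be sketched as a toy model in Python. This is
not QEMU code: Source, Destination and fault() are hypothetical names, and
the received list merely stands in for the received bitmap.

```python
class Source:
    """Toy migration source that only tracks a dirty bitmap (hypothetical)."""

    def __init__(self, num_pages):
        self.dirty = [True] * num_pages

    def send_if_dirty(self, page):
        """Send a page at most once: sending marks it clean."""
        if not self.dirty[page]:
            return False  # clean: already sent, or hinted as free
        self.dirty[page] = False
        return True

    def free_page_hint(self, page):
        # Hinting clears the dirty bit without ever sending the page.
        self.dirty[page] = False


class Destination:
    """Toy destination with a received bitmap (hypothetical)."""

    def __init__(self, num_pages):
        self.received = [False] * num_pages

    def place(self, page):
        """Place a page; refuse a second copy of the same page."""
        if self.received[page]:
            return False
        self.received[page] = True
        return True


def fault(src, dst, page):
    """The guest touches a page on the destination during postcopy."""
    if dst.received[page]:
        return "present"
    if src.send_if_dirty(page):
        dst.place(page)
        return "placed"
    # The source treats the page as clean and ignores the request.
    return "stalled"
```

In this model the source cannot tell "clean because sent" from "clean
because hinted", which is why a fault on a hinted page ends up stalled,
while the received bitmap is enough for the destination to reject a
duplicate.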
This is currently the case except for free page hinting, where a page
might not be sent as we're relying on the dirty bitmap to also track
what has already been sent.

The destination does handle consistency right now by bailing out if it
receives the page twice (e.g., -EEXIST). In addition, we could consult
the received bitmap to make sure we really only receive stuff once
instead of relying on undocumented userfaultfd behavior.

On the source, we'd ideally have a "sent bitmap", but I'd really like to
avoid introducing new bitmaps because that can't be the ultimate
solution (dirty, clean, received, ...).

I just found the comment that describes the current design:

migration/ram.c:get_queued_page()

"We're sending this page, and since it's postcopy nothing else will
dirty it, and we must make sure it doesn't get sent again even if this
queue request was received after the background search already sent it."

>
>>>
>>> the fix is to mark page clean in A.
>>> then in D to not send page if it's clean?
>>>
>>> And the problem with hinting is this:
>>>
>>> A. page is marked clean
>>> B. destination requests page
>>> C. destination changes page
>>> D. source sends page <- does not happen, page is clean!
>>> E. destination overwrites page
>>
>> Simplified it's
>>
>> A. page is marked clean by hinting code
>> B. destination requests page
>> D. source ignores request as page is clean
>> E. destination stalls until postcopy unregisters uffd
>>
>>
>> Some thoughts
>>
>> 1. We do have a recv bitmap where we track received pages on the
>> destination (e.g., ramblock_recv_bitmap_test()), however we only use it to
>> avoid sending duplicate requests to the hypervisor AFAIK, and don't check
>> it when placing pages.
>>
>> 2. Changing the migration behavior unconditionally on the source will break
>> migration to old QEMU binaries that cannot handle this change.
>
> We can always make this depend on new machine types.

Yes, we could if we want to go down that path.

>
>
>> 3.
I think the current behavior is in place to make debugging easier. If >> only a single instance of a page will ever be migrated from source to >> destination, there cannot be silent data corruption. Further, we avoid >> migrating unnecessarily pages twice. >> > > Likely does not matter much for performance, it seems unlikely that > the race is all that common. I cannot really tell, I'd guess it can happen but shouldn't happen too often. For now I'm going to keep it simple and ignore free page hints while postcopy has been enabled as suggested in the other discussion. That should be the easy fix and if someone wants to optimize this scenarios, we can think about a proper way to make it fly (as I said, nobody seemed to really have cared in the past as it's mostly broken right now). Thanks for the helpful discussion! -- Thanks, David / dhildenb From MAILER-DAEMON Thu Jul 08 05:54:05 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1QjN-0000NV-4t for mharc-qemu-stable@gnu.org; Thu, 08 Jul 2021 05:54:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58516) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1QjL-0000IO-1I for qemu-stable@nongnu.org; Thu, 08 Jul 2021 05:54:03 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:36039) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1QjI-0005I2-LS for qemu-stable@nongnu.org; Thu, 08 Jul 2021 05:54:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625738039; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5c4BkJkeAWRDuOL6pIC49RBdHBvgH77WwaWtOu9mmU0=; b=Nrh00/+WjwaYhCi8Vmfx3mJztwYPy3JArBo2Pn/Dy+Eihvmvlynk3zOVQU3FKdaLbmpt1L 
bTtMislvBfJLatjHVOMP6Scrx2x0jVR6i7Vtf4BYHZrMgE+y6btYi0K7O2ssOffxKV2wyc q1Qk1pFGJBNodtMaTAid7yYLw/XSg2Y= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-45-iZbQ6tUAOBWDI7NIPQvETA-1; Thu, 08 Jul 2021 05:53:58 -0400 X-MC-Unique: iZbQ6tUAOBWDI7NIPQvETA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3853F36319; Thu, 8 Jul 2021 09:53:57 +0000 (UTC) Received: from t480s.redhat.com (ovpn-112-130.ams2.redhat.com [10.36.112.130]) by smtp.corp.redhat.com (Postfix) with ESMTP id E83055D9DD; Thu, 8 Jul 2021 09:53:54 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Alexander Duyck , Juan Quintela , "Dr. 
David Alan Gilbert" , Peter Xu
Subject: [PATCH v2 1/2] virtio-balloon: don't start free page hinting if postcopy is possible
Date: Thu, 8 Jul 2021 11:53:38 +0200
Message-Id: <20210708095339.20274-2-david@redhat.com>
In-Reply-To: <20210708095339.20274-1-david@redhat.com>
References: <20210708095339.20274-1-david@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14
Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=216.205.24.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.439,
 DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-stable@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 08 Jul 2021 09:54:03 -0000

Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:

1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
   and consequently won't release free pages back to the OS once
   migration finishes.

   The issue is that for postcopy, we won't do a final bitmap sync while
   the guest is stopped on the source and
   virtio_balloon_free_page_hint_notify() will only call
   virtio_balloon_free_page_done() on the source during
   PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
   the destination.

2) Once the VM touches a page on the destination that has been excluded
   from migration on the source via qemu_guest_free_page_hint() while
   postcopy is active, that thread will stall until postcopy finishes
   and all threads are woken up. (With older Linux kernels that won't
   retry faults when woken up via userfaultfd, we might actually get a
   SEGFAULT.)

   The issue is that the source will refuse to migrate any pages that
   are not marked as dirty in the dirty bmap -- for example, because the
   page might just have been sent. Consequently, the faulting thread will
   stall, waiting for the page to be migrated -- which could take quite
   a while and result in guest OS issues.

While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].

As it never worked properly, the easy fix is to not start free page
hinting in the precopy notifier if the postcopy migration capability was
enabled. Capabilities cannot be enabled once migration is already
running.

Note 1: in the future we might either adjust migration code on the source
        to track pages that have actually been sent or adjust
        migration code on source and destination to eventually send
        pages multiple times from the source and deal with pages
        that are sent multiple times on the destination.

Note 2: virtio-mem has similar issues, however, access to "unplugged"
        memory by the guest is very rare and we would have to be very
        lucky for it to happen during migration. The spec states
        "The driver SHOULD NOT read from unplugged memory blocks ..."
        and "The driver MUST NOT write to unplugged memory blocks".
        virtio-mem will move away from virtio_balloon_free_page_done()
        soon and handle this case explicitly on the destination.

[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com

Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang
Cc: Michael S. Tsirkin
Cc: Philippe Mathieu-Daudé
Cc: Alexander Duyck
Cc: Juan Quintela
Cc: "Dr. David Alan Gilbert"
Cc: Peter Xu
Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-balloon.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 4b5d9e5e50..ae7867a8db 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -30,6 +30,7 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "migration/misc.h"
+#include "migration/migration.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
 
@@ -662,6 +663,18 @@ virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
         return 0;
     }
 
+    /*
+     * Pages hinted via qemu_guest_free_page_hint() are cleared from the dirty
+     * bitmap and will not get migrated, especially also not when the postcopy
+     * destination starts using them and requests migration from the source; the
+     * faulting thread will stall until postcopy migration finishes and
+     * all threads are woken up. Let's not start free page hinting if postcopy
+     * is possible.
+     */
+    if (migrate_postcopy_ram()) {
+        return 0;
+    }
+
     switch (pnd->reason) {
     case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
         virtio_balloon_free_page_stop(dev);
-- 
2.31.1

From MAILER-DAEMON Thu Jul 08 14:52:06 2021
Received: from list by lists.gnu.org with archive (Exim 4.90_1)
 id 1m1Z81-00089x-7V for mharc-qemu-stable@gnu.org; Thu, 08 Jul 2021 14:52:05 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:45374)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from ) id 1m1Z7y-00088Q-BQ
 for qemu-stable@nongnu.org; Thu, 08 Jul 2021 14:52:02 -0400
Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:36086)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from ) id 1m1Z7v-0007wH-Cc
 for qemu-stable@nongnu.org; Thu, 08 Jul 2021 14:52:02 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1625770318;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=xf6OO79+mMpwvnAccaQtU7okGnYFJxE+HSIVzmXsFEM=;
 b=evcwBWqiFWJzH8KWX7SnORbdgzXjul4n/Ytmh9xKcjsVdD9Co9Uay4mlk880yAMFbt3pSK
 ux4pXt9yjLPLbSXr/FB+TgmFv3jR19Qa7S1KUVcPrj9JKAA1/TilCHcCOI1ZNbzBLDLxJA
 di6p9PBo5nUrEsjhpv5AMLy89QXe1jQ=
Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199])
 (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-593-0ueA6xbfNZGF1pC-bivp1w-1; Thu, 08 Jul 2021 14:51:57 -0400
X-MC-Unique: 0ueA6xbfNZGF1pC-bivp1w-1
Received: by mail-qk1-f199.google.com with SMTP id
 g135-20020a379d8d0000b02903b5097f3998so4546907qke.14
 for ; Thu, 08 Jul 2021 11:51:57 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:references
:mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=xf6OO79+mMpwvnAccaQtU7okGnYFJxE+HSIVzmXsFEM=; b=XAH5E9sMbPNdHA4yVfcooh/G5SphNeQmwS8vKsDTDZ8LWoUeavdjRqFQgXhtiBtMjg igfv7uq6SJBtvhxcZZeRS+c3M3jCwh/v5GfgwOWJeTWs6ad85tBKkeIHa8P/tkOFAEqz TpE9go5Yn327kkcsWvL2GQfx3+73gmOXHparJGOKLj/VOTS822cjxf1N+LyCqM9IO5jg N0mpsKcr7FHdsnJcvGXu6BwPsN2IlE3fugOYaEgvKAdKhkXz7NivZZaAn2Mkrn3Np3YX ovea44IvDnGZfMlnuNVG/ubbkOjv1ntmZnVY/gIRmvH+0Lu6pAcwMQBatceTPYNOhWhY +Ibg== X-Gm-Message-State: AOAM533U+nJo3SFHMh9JUWBhJ0KTd3knj2MpGN5Dh4MBnmNR0phYUCl1 2Q5v1NHJ5NGbd0sXgAOGpgOZfyLlVOuB5/5fTJ8TtfL0/DT/AgpdhWCK/ZubGeXEv8EZ+VyEGiC 30oITJUX7Zif1a26d X-Received: by 2002:ac8:7f07:: with SMTP id f7mr28892548qtk.120.1625770317049; Thu, 08 Jul 2021 11:51:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz90lF9Kp7oJXcaJAG7wv2M1IZka7Z6qSmLsK18d2kTJEWg2Sc/yFi7bzZEmsYUAcrObr2GHg== X-Received: by 2002:ac8:7f07:: with SMTP id f7mr28892532qtk.120.1625770316863; Thu, 08 Jul 2021 11:51:56 -0700 (PDT) Received: from t490s (bras-base-toroon474qw-grc-65-184-144-111-238.dsl.bell.ca. [184.144.111.238]) by smtp.gmail.com with ESMTPSA id 207sm1431914qki.63.2021.07.08.11.51.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Jul 2021 11:51:56 -0700 (PDT) Date: Thu, 8 Jul 2021 14:51:55 -0400 From: Peter Xu To: David Hildenbrand Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , "Michael S . Tsirkin" , Philippe =?utf-8?Q?Mathieu-Daud=C3=A9?= , Alexander Duyck , Juan Quintela , "Dr. 
David Alan Gilbert" Subject: Re: [PATCH v2 1/2] virtio-balloon: don't start free page hinting if postcopy is possible Message-ID: References: <20210708095339.20274-1-david@redhat.com> <20210708095339.20274-2-david@redhat.com> MIME-Version: 1.0 In-Reply-To: <20210708095339.20274-2-david@redhat.com> Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=216.205.24.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.45, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2021 18:52:03 -0000 On Thu, Jul 08, 2021 at 11:53:38AM +0200, David Hildenbrand wrote: > Postcopy never worked properly with 'free-page-hint=on', as there are > at least two issues: > > 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE > and consequently won't release free pages back to the OS once > migration finishes. > > The issue is that for postcopy, we won't do a final bitmap sync while > the guest is stopped on the source and > virtio_balloon_free_page_hint_notify() will only call > virtio_balloon_free_page_done() on the source during > PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to > the destination. 
> > 2) Once the VM touches a page on the destination that has been excluded > from migration on the source via qemu_guest_free_page_hint() while > postcopy is active, that thread will stall until postcopy finishes > and all threads are woken up. (with older Linux kernels that won't > retry faults when woken up via userfaultfd, we might actually get a > SEGFAULT) > > The issue is that the source will refuse to migrate any pages that > are not marked as dirty in the dirty bmap -- for example, because the > page might just have been sent. Consequently, the faulting thread will > stall, waiting for the page to be migrated -- which could take quite > a while and result in guest OS issues. > > While we could fix 1) comparatively easily, 2) is harder to get right and > might require more involved RAM migration changes on source and destination > [1]. > > As it never worked properly, let's not start free page hinting in the > precopy notifier if the postcopy migration capability was enabled to fix > it easily. Capabilities cannot be enabled once migration is already > running. > > Note 1: in the future we might either adjust migration code on the source > to track pages that have actually been sent or adjust > migration code on source and destination to eventually send > pages multiple times from the source and and deal with pages > that are sent multiple times on the destination. > > Note 2: virtio-mem has similar issues, however, access to "unplugged" > memory by the guest is very rare and we would have to be very > lucky for it to happen during migration. The spec states > "The driver SHOULD NOT read from unplugged memory blocks ..." > and "The driver MUST NOT write to unplugged memory blocks". > virtio-mem will move away from virtio_balloon_free_page_done() > soon and handle this case explicitly on the destination. 
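
Note 1 in the quoted commit message mentions, as one possible future direction, tracking on the source which pages have actually been sent. A minimal sketch of that idea follows, assuming a hypothetical per-page `sent` flag kept next to the dirty flag. None of these names exist in QEMU; `struct src_tracking`, `hint_free_page()`, `send_background()` and `handle_request()` are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical source-side state: "dirty" and "sent" are tracked separately. */
struct src_tracking {
    bool dirty[NPAGES];
    bool sent[NPAGES];
};

/* Free page hinting only clears the dirty bit; the page was never sent. */
static void hint_free_page(struct src_tracking *s, size_t page)
{
    s->dirty[page] = false;
}

/* Background migration skips clean or already sent pages. Returns true if sent. */
static bool send_background(struct src_tracking *s, size_t page)
{
    if (!s->dirty[page] || s->sent[page]) {
        return false;
    }
    s->dirty[page] = false;
    s->sent[page] = true;
    return true;
}

/*
 * Postcopy request from the destination: the decision is based on the
 * "sent" flag, so a hinted (clean but unsent) page is still delivered,
 * while a stale request for an already sent page is ignored.
 */
static bool handle_request(struct src_tracking *s, size_t page)
{
    if (s->sent[page]) {
        return false;
    }
    s->dirty[page] = false;
    s->sent[page] = true;
    return true;
}
```

With this split, a request for a hinted page is answered exactly once, and no page is ever sent twice. The cost is one more bitmap on the source, which the thread treats as undesirable.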
> > [1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com > > Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") > Cc: qemu-stable@nongnu.org > Cc: Wei Wang > Cc: Michael S. Tsirkin > Cc: Philippe Mathieu-Daudé > Cc: Alexander Duyck > Cc: Juan Quintela > Cc: "Dr. David Alan Gilbert" > Cc: Peter Xu > Signed-off-by: David Hildenbrand Reviewed-by: Peter Xu Thanks, David. -- Peter Xu From MAILER-DAEMON Thu Jul 08 15:07:57 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1ZNN-0001Kv-Pt for mharc-qemu-stable@gnu.org; Thu, 08 Jul 2021 15:07:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49302) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1ZNM-0001Fw-2J for qemu-stable@nongnu.org; Thu, 08 Jul 2021 15:07:56 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:59188) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1ZNF-0004CN-UU for qemu-stable@nongnu.org; Thu, 08 Jul 2021 15:07:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625771269; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=ak/F/kndEDCjI1BckAzDVn5XSafye85nA4YVY8X6fU0=; b=RIOh4rsVOcba2FXN3ByNMmZmwbtIzxzou2xlNS+oKS0KQEID4uzHDYq2esMuhr2MfdBsJa SnxsmaGEEKpw8Ps8EFqRzS6berdvp97Y43pxB5sw/+3bl1320HC6HRf8sPvUH59X/0MOIJ 2nN4YOflsE9ZAPNGnB42Db3Y2K73Cxw= Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-293-44OYlbgIPRedv1uV_X3khg-1; Thu, 08 Jul 2021 15:07:48 -0400 X-MC-Unique: 44OYlbgIPRedv1uV_X3khg-1 Received: by mail-wm1-f72.google.com with SMTP id a13-20020a7bc1cd0000b02902104c012aa3so3787376wmj.9 for ; Thu, 08 Jul 2021 12:07:48 -0700 
(PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=ak/F/kndEDCjI1BckAzDVn5XSafye85nA4YVY8X6fU0=; b=KVuBHR8H/xeRhbYCbsjwSM/o+KqYSHrSPmT/AdL66RN2yQ4z37L5b3ghUfTLyQQcvq tGFA4LZR7gbgEUf7iALFbx4syTUE20hydBd61unSiMU7bRiVugBPR99ZPvUcNLHFzK8s 9cjIIE3tS2SoQKi0oVFu8BPVXL3pbfR+xCylqxWG6SzesweEibngCxZchnNVOZizwX4S BJfTqQ8iddvVOTCE89C6CWtOxgvcHtLMk6WDycL1T6Y/wVf08zKlegSez4RzYqri74sE 19NgHaw9hvIH3c/9ZP3nKCTRYp0qFvZubXEU6LKGh8sp+XpBItejAxqS2bAB+isgwjXA GmgA== X-Gm-Message-State: AOAM531VkkBo/55PeZ7mFh2rFn/U/8/0+N2gBuKN6zHyyAzsc+YXX2Cw Ts3dcalaN4IXK17N8CparkPoavHZOz7sIg+Q91BjXdOuCJxMb0SmKQ4squHMofvDmpH7VBdfrE5 5YMKVASvJ6aEJLZRH X-Received: by 2002:a5d:46d1:: with SMTP id g17mr31755991wrs.2.1625771267234; Thu, 08 Jul 2021 12:07:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyJzYqcr+x1zEV0LLDwjO1C5xfP+/dC1mI+iFteGgjD7YxM1MjmU1nIdJHV/pLl52e4/76hVQ== X-Received: by 2002:a5d:46d1:: with SMTP id g17mr31755970wrs.2.1625771267057; Thu, 08 Jul 2021 12:07:47 -0700 (PDT) Received: from work-vm (cpc109021-salf6-2-0-cust453.10-2.cable.virginm.net. [82.29.237.198]) by smtp.gmail.com with ESMTPSA id k5sm3022824wmk.11.2021.07.08.12.07.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Jul 2021 12:07:46 -0700 (PDT) Date: Thu, 8 Jul 2021 20:07:44 +0100 From: "Dr. David Alan Gilbert" To: "Michael S. 
Tsirkin" Cc: David Hildenbrand , qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , Philippe =?iso-8859-1?Q?Mathieu-Daud=E9?= , Alexander Duyck , Juan Quintela , Peter Xu Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <20210707150038-mutt-send-email-mst@kernel.org> <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com> <20210707151459-mutt-send-email-mst@kernel.org> <40a148d7-acad-67ee-ac66-e9ad56a23b44@redhat.com> <20210707155413-mutt-send-email-mst@kernel.org> MIME-Version: 1.0 In-Reply-To: <20210707155413-mutt-send-email-mst@kernel.org> User-Agent: Mutt/2.0.7 (2021-05-04) Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dgilbert@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Received-SPF: pass client-ip=216.205.24.124; envelope-from=dgilbert@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.45, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2021 19:07:56 -0000 * Michael S. Tsirkin (mst@redhat.com) wrote: > On Wed, Jul 07, 2021 at 09:47:31PM +0200, David Hildenbrand wrote: > > On 07.07.21 21:19, Michael S. Tsirkin wrote: > > > On Wed, Jul 07, 2021 at 09:14:00PM +0200, David Hildenbrand wrote: > > > > On 07.07.21 21:05, Michael S. 
Tsirkin wrote: > > > > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > > > > > > Postcopy never worked properly with 'free-page-hint=on', as there are > > > > > > at least two issues: > > > > > > > > > > > > 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE > > > > > > and consequently won't release free pages back to the OS once > > > > > > migration finishes. > > > > > > > > > > > > The issue is that for postcopy, we won't do a final bitmap sync while > > > > > > the guest is stopped on the source and > > > > > > virtio_balloon_free_page_hint_notify() will only call > > > > > > virtio_balloon_free_page_done() on the source during > > > > > > PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to > > > > > > the destination. > > > > > > > > > > > > 2) Once the VM touches a page on the destination that has been excluded > > > > > > from migration on the source via qemu_guest_free_page_hint() while > > > > > > postcopy is active, that thread will stall until postcopy finishes > > > > > > and all threads are woken up. (with older Linux kernels that won't > > > > > > retry faults when woken up via userfaultfd, we might actually get a > > > > > > SEGFAULT) > > > > > > > > > > > > The issue is that the source will refuse to migrate any pages that > > > > > > are not marked as dirty in the dirty bmap -- for example, because the > > > > > > page might just have been sent. Consequently, the faulting thread will > > > > > > stall, waiting for the page to be migrated -- which could take quite > > > > > > a while and result in guest OS issues. > > > > > > > > > > OK so if source gets a request for a page which is not dirty > > > > > it does not respond immediately? Why not just teach it to > > > > > respond? It would seem that if destination wants a page we > > > > > should just give it to the destination ... 
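
The question quoted above comes down to the source's single dirty bit meaning two different things. A minimal model of that ambiguity is sketched here, assuming a hypothetical `struct src_model` with one dirty flag per page. This is not QEMU's actual code, only an illustration of the behaviour described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical model of the source: one dirty bit per guest page. */
struct src_model {
    bool dirty[NPAGES];
};

/* All pages start dirty, i.e. they still have to be migrated. */
static void src_init(struct src_model *s)
{
    for (size_t i = 0; i < NPAGES; i++) {
        s->dirty[i] = true;
    }
}

/* Background migration: the page is sent and then marked clean. */
static void src_send_background(struct src_model *s, size_t page)
{
    s->dirty[page] = false;
}

/* Free page hinting: the page is marked clean without being sent. */
static void src_free_page_hint(struct src_model *s, size_t page)
{
    s->dirty[page] = false;
}

/*
 * Postcopy request from the destination. Returns true if the page is sent.
 * A clean page is ignored, because the source assumes it was already sent.
 */
static bool src_handle_request(struct src_model *s, size_t page)
{
    if (!s->dirty[page]) {
        return false;
    }
    s->dirty[page] = false;
    return true;
}
```

In this model a request for a page that went through `src_send_background()` and a request for a page that went through `src_free_page_hint()` look identical to `src_handle_request()`: both are dropped. For the hinted page that means the destination keeps waiting.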
> > > > > > > > The source does not know if a page has already been sent (e.g., via the > > > > background migration thread that moves all data over) vs. the page has not > > > > been send because the page was hinted. This is the part where we'd need > > > > additional tracking on the source to actually know that. > > > > > > > > We must not send a page twice, otherwise bad things can happen when placing > > > > pages that already have been migrated, because that scenario can easily > > > > happen with ordinary postcopy (page has already been sent and we're dealing > > > > with a stale request from the destination). > > > > > > OK let me get this straight > > > > > > A. source sends page > > > B. destination requests page > > > C. destination changes page > > > D. source sends page > > > E. destination overwrites page > > > > > > this is what you are worried about right? > > > > IIRC E. is with recent kernels: > > > > E. placing the page fails with -EEXIST and postcopy migration fails > > > > However, the man page (man ioctl_userfaultfd) doesn't describe what is > > actually supposed to happen when double-placing. Could be that it's > > "undefined behavior". > > > > I did not try, though. > > > > > > This is how it works today: > > > > A. source sends page and marks it clean > > B. destination requests page > > C. destination receives page and places it > > D. source ignores request as page is clean > > If it's actually -EEXIST then we could just resend it > and teach destination to ignore -EEXIST errors right? > > Will actually make things a bit more robust: destination > handles its own consistency instead of relying on source. You have to be careful of a few things; a) If the destination has modified the page, then you must definitely not under any circumstances lose those modifications and replace them by an old version from the source. 
b) With postcopy recovery I think there is a bitmap to track some of this; but you have to be careful since you don't know whether pages that were sent were actually received. Dave > > > > > > > > the fix is to mark page clean in A. > > > then in D to not send page if it's clean? > > > > > > And the problem with hinting is this: > > > > > > A. page is marked clean > > > B. destination requests page > > > C. destination changes page > > > D. source sends page <- does not happen, page is clean! > > > E. destination overwrites page > > > > Simplified it's > > > > A. page is marked clean by hinting code > > B. destination requests page > > D. source ignores request as page is clean > > E. destination stalls until postcopy unregisters uffd > > > > > > Some thoughts > > > > 1. We do have a a recv bitmap where we track received pages on the > > destination (e.g., ramblock_recv_bitmap_test()), however we only use it to > > avoid sending duplicate requests to the hypervisor AFAIKs, and don't check > > it when placing pages. > > > > 2. Changing the migration behavior unconditionally on the source will break > > migration to old QEMU binaries that cannot handle this change. > > We can always make this depend on new machine types. > > > > 3. I think the current behavior is in place to make debugging easier. If > > only a single instance of a page will ever be migrated from source to > > destination, there cannot be silent data corruption. Further, we avoid > > migrating unnecessarily pages twice. > > > > Likely does not matter much for performance, it seems unlikely that > the race is all that common. > > > Maybe Dave and Peter can spot any flaws in my understanding. > > > > -- > > Thanks, > > > > David / dhildenb > -- Dr. 
David Alan Gilbert / dgilbert@redhat.com / Manchester, UK From MAILER-DAEMON Fri Jul 09 07:27:49 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m1ofd-0007LX-FV for mharc-qemu-stable@gnu.org; Fri, 09 Jul 2021 07:27:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40546) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1ofX-0007F7-LP for qemu-stable@nongnu.org; Fri, 09 Jul 2021 07:27:43 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:55336) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m1ofR-0000sJ-Ct for qemu-stable@nongnu.org; Fri, 09 Jul 2021 07:27:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1625830056; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pc3IM995+4b4bTCS1N72o8rPE0tS1fIlAp5j7BTH0A8=; b=EIZ0lL1AwN+y/Dc/AcB3lfA8HwAkECQPCm3i7wO6PwoyYwvJpg+d32D+eLz5/HPPxW4/CN 8jkLb2Gn5+pY0qw4pYFwn2m9rtPRxIVMzIGc19AzZAAG3qZNfNDaTTAmZqFCjDdAXJyfuS KE68EO+dg7FL3A7QmoIHTIOLQjtJpkg= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-441-zawJk8GbMmCPYtowEcHCKA-1; Fri, 09 Jul 2021 07:27:35 -0400 X-MC-Unique: zawJk8GbMmCPYtowEcHCKA-1 Received: by mail-wm1-f70.google.com with SMTP id z127-20020a1c7e850000b02901e46e4d52c0so5100918wmc.6 for ; Fri, 09 Jul 2021 04:27:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=pc3IM995+4b4bTCS1N72o8rPE0tS1fIlAp5j7BTH0A8=; b=LIXtd8CjCAu7SH2SJpQlX9odW6PArymQ48c+A3W5up2vUp46bYAaO3a6diEZ/arcay 
6O8EiGrqNcJxv0Kfsg3rX3cat0EZGfZwy2si3W6yJJwox7VtgE/CjbL3zriOB6KlmA7F /ZCyAKUwqF//1RYYOWpBpoZpTSM7mivFcEW0qc1/kflgLybYcsqXLFqC0SnCOOPyn6PC T0GZAYpg+GNE6nI9wnSXQpz8KxmHRjKHuXUx8DCOS46OZx+BSnJw534p/3jVw7FSNf21 a6GhNjGk2GYG0hDWWtZzQ9srI9PBpoILr8OGamHs7CJ7wOiQU2tgSx9vzekPv4e5Zw10 tDGA== X-Gm-Message-State: AOAM532ODCm9m7+eWvGvwdQq/0TT+i3NR315U4A3A7mO3Nt+7y0bzG5i NvmxuLSmxo3ja+sqdJgRAokBbiYfdsB2sssVCs+OBI9G20bzBT2EGloNoypAU1yTjjmfv9mLSF0 KtpBQoOmXFsxCY2dg X-Received: by 2002:a5d:6984:: with SMTP id g4mr9967689wru.381.1625830054382; Fri, 09 Jul 2021 04:27:34 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz4Wz+/pFbe4yNJ3c54Y4yVCtk5AqSY1NYKiE9Jhpwegj/4apwgQMkzO+xgu7Sm7nwHivA9vw== X-Received: by 2002:a5d:6984:: with SMTP id g4mr9967648wru.381.1625830054008; Fri, 09 Jul 2021 04:27:34 -0700 (PDT) Received: from redhat.com ([2.55.150.102]) by smtp.gmail.com with ESMTPSA id f2sm4956595wrq.69.2021.07.09.04.27.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Jul 2021 04:27:33 -0700 (PDT) Date: Fri, 9 Jul 2021 07:27:28 -0400 From: "Michael S. Tsirkin" To: "Dr. 
David Alan Gilbert" Cc: David Hildenbrand , qemu-devel@nongnu.org, qemu-stable@nongnu.org, Wei Wang , Philippe =?iso-8859-1?Q?Mathieu-Daud=E9?= , Alexander Duyck , Juan Quintela , Peter Xu Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT Message-ID: <20210709072635-mutt-send-email-mst@kernel.org> References: <20210707140655.30982-1-david@redhat.com> <20210707140655.30982-3-david@redhat.com> <20210707150038-mutt-send-email-mst@kernel.org> <0391e06b-5885-8000-3c58-ae20493e3e65@redhat.com> <20210707151459-mutt-send-email-mst@kernel.org> <40a148d7-acad-67ee-ac66-e9ad56a23b44@redhat.com> <20210707155413-mutt-send-email-mst@kernel.org> MIME-Version: 1.0 In-Reply-To: Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=mst@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Received-SPF: pass client-ip=216.205.24.124; envelope-from=mst@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -31 X-Spam_score: -3.2 X-Spam_bar: --- X-Spam_report: (-3.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.45, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Jul 2021 11:27:44 -0000 On Thu, Jul 08, 2021 at 08:07:44PM +0100, Dr. David Alan Gilbert wrote: > * Michael S. Tsirkin (mst@redhat.com) wrote: > > On Wed, Jul 07, 2021 at 09:47:31PM +0200, David Hildenbrand wrote: > > > On 07.07.21 21:19, Michael S. 
Tsirkin wrote: > > > > On Wed, Jul 07, 2021 at 09:14:00PM +0200, David Hildenbrand wrote: > > > > > On 07.07.21 21:05, Michael S. Tsirkin wrote: > > > > > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote: > > > > > > > Postcopy never worked properly with 'free-page-hint=on', as there are > > > > > > > at least two issues: > > > > > > > > > > > > > > 1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE > > > > > > > and consequently won't release free pages back to the OS once > > > > > > > migration finishes. > > > > > > > > > > > > > > The issue is that for postcopy, we won't do a final bitmap sync while > > > > > > > the guest is stopped on the source and > > > > > > > virtio_balloon_free_page_hint_notify() will only call > > > > > > > virtio_balloon_free_page_done() on the source during > > > > > > > PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to > > > > > > > the destination. > > > > > > > > > > > > > > 2) Once the VM touches a page on the destination that has been excluded > > > > > > > from migration on the source via qemu_guest_free_page_hint() while > > > > > > > postcopy is active, that thread will stall until postcopy finishes > > > > > > > and all threads are woken up. (with older Linux kernels that won't > > > > > > > retry faults when woken up via userfaultfd, we might actually get a > > > > > > > SEGFAULT) > > > > > > > > > > > > > > The issue is that the source will refuse to migrate any pages that > > > > > > > are not marked as dirty in the dirty bmap -- for example, because the > > > > > > > page might just have been sent. Consequently, the faulting thread will > > > > > > > stall, waiting for the page to be migrated -- which could take quite > > > > > > > a while and result in guest OS issues. > > > > > > > > > > > > OK so if source gets a request for a page which is not dirty > > > > > > it does not respond immediately? Why not just teach it to > > > > > > respond? 
It would seem that if destination wants a page we > > > > > > should just give it to the destination ... > > > > > > > > > > The source does not know if a page has already been sent (e.g., via the > > > > > background migration thread that moves all data over) vs. the page has not > > > > > been send because the page was hinted. This is the part where we'd need > > > > > additional tracking on the source to actually know that. > > > > > > > > > > We must not send a page twice, otherwise bad things can happen when placing > > > > > pages that already have been migrated, because that scenario can easily > > > > > happen with ordinary postcopy (page has already been sent and we're dealing > > > > > with a stale request from the destination). > > > > > > > > OK let me get this straight > > > > > > > > A. source sends page > > > > B. destination requests page > > > > C. destination changes page > > > > D. source sends page > > > > E. destination overwrites page > > > > > > > > this is what you are worried about right? > > > > > > IIRC E. is with recent kernels: > > > > > > E. placing the page fails with -EEXIST and postcopy migration fails > > > > > > However, the man page (man ioctl_userfaultfd) doesn't describe what is > > > actually supposed to happen when double-placing. Could be that it's > > > "undefined behavior". > > > > > > I did not try, though. > > > > > > > > > This is how it works today: > > > > > > A. source sends page and marks it clean > > > B. destination requests page > > > C. destination receives page and places it > > > D. source ignores request as page is clean > > > > If it's actually -EEXIST then we could just resend it > > and teach destination to ignore -EEXIST errors right? > > > > Will actually make things a bit more robust: destination > > handles its own consistency instead of relying on source. 
> > You have to be careful of a few things; > a) If the destination has modified the page, then you must > definitely not under any circumstances lose those modifications > and replace them by an old version from the source. > > b) With postcopy recovery I think there is a bitmap to track some > of this; but you have to be careful since you don't know whether > pages that were sent were actually received. > > Dave what I am trying to say is that userfaultfd already tracks these things in the kernel for us. Ideally we'd just use that ... > > > > > > > > > > > > the fix is to mark page clean in A. > > > > then in D to not send page if it's clean? > > > > > > > > And the problem with hinting is this: > > > > > > > > A. page is marked clean > > > > B. destination requests page > > > > C. destination changes page > > > > D. source sends page <- does not happen, page is clean! > > > > E. destination overwrites page > > > > > > Simplified it's > > > > > > A. page is marked clean by hinting code > > > B. destination requests page > > > D. source ignores request as page is clean > > > E. destination stalls until postcopy unregisters uffd > > > > > > > > > Some thoughts > > > > > > 1. We do have a a recv bitmap where we track received pages on the > > > destination (e.g., ramblock_recv_bitmap_test()), however we only use it to > > > avoid sending duplicate requests to the hypervisor AFAIKs, and don't check > > > it when placing pages. > > > > > > 2. Changing the migration behavior unconditionally on the source will break > > > migration to old QEMU binaries that cannot handle this change. > > > > We can always make this depend on new machine types. > > > > > > > 3. I think the current behavior is in place to make debugging easier. If > > > only a single instance of a page will ever be migrated from source to > > > destination, there cannot be silent data corruption. Further, we avoid > > > migrating unnecessarily pages twice. 
> > > > > > > Likely does not matter much for performance, it seems unlikely that > > the race is all that common. > > > > > Maybe Dave and Peter can spot any flaws in my understanding. > > > > > > -- > > > Thanks, > > > > > > David / dhildenb > > > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK From MAILER-DAEMON Fri Jul 23 15:58:55 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m71Jv-0001Lg-8S for mharc-qemu-stable@gnu.org; Fri, 23 Jul 2021 15:58:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43174) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m71Jt-0001LC-Ic for qemu-stable@nongnu.org; Fri, 23 Jul 2021 15:58:53 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:27458) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m71Jq-0001PG-KO for qemu-stable@nongnu.org; Fri, 23 Jul 2021 15:58:52 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1627070329; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=M21HEs2+wKWVDBeRSP0S+SbBwVj4R2HaqJi7xPG/Bxc=; b=Fb7H91MKi0Q3rpq2bfB7APYSB91ueotTZ99iESBMYiVjAfccvDiyxJZpGweM92zeX9FT4W M3uTh+qx6V63NCOjap1bcn3ePK4rsAQFdXGi/uWg+JriEx+bcz0y2OEFY9SerShdbaIv8P vEnV7U5ZqokJpW3ni5YqbMspC/bRigo= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-364-cDLRfRmCOwWeusmk0I3knA-1; Fri, 23 Jul 2021 15:58:46 -0400 X-MC-Unique: cDLRfRmCOwWeusmk0I3knA-1 Received: by mail-wm1-f70.google.com with SMTP id o25-20020a05600c5119b0290218757e2783so1165115wms.7 for ; Fri, 23 Jul 2021 12:58:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=M21HEs2+wKWVDBeRSP0S+SbBwVj4R2HaqJi7xPG/Bxc=; b=PdNcI9XUz/YyxSHKSsT4mwqG7a3AjK03SkkVFqKVdoLkj8R1GzxDblTpXHQcXSosHI Rm/UZwtnuHqWgN+ZNsPzZSrvZYcm1+4l9wuBKh6gyriSMxJqsAX3EHexTPo1wBTVHXzG obz6pUETqq25He5dmcDl9B8aVf+avzKBNNlxt59CkOqeF/i6AKLlFa7xJhfkdgMJn3RK 468KJ6L9e2Ez2zhIyTNB2uDcA0wlqlJtBVvDKqRwu0j13cEZ8KoWqjEEDngID1XDEBjn SuD3F7CPpw7tif5BqRxOb3mMV1HEzpYXDZnao0Oo+pkbAksCucBZmew6aDCElXWwbSgz paYg== X-Gm-Message-State: AOAM531JIJ+bBdTS73mcFnXCn3ofqoPORYBOVwDtkBAFMSmv2yN4B7df rhkst78sPpoosYvKsUAi8I6QTVuwvB6c/StXyqd7KmUp12Z+eSbTbkK7S5NB/DdOjtzCDZbxV9h /pJn0Yw4IYitpH5wY X-Received: by 2002:a5d:608c:: with SMTP id w12mr7071479wrt.53.1627070325448; Fri, 23 Jul 2021 12:58:45 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx3T0yCH6T3c7QoLEFmdnbC8cj4lvLKKlLT9H+qCAu4ZsJKY8qcJuX6viE7D2lpjwA3TvMLGg== X-Received: by 2002:a5d:608c:: with SMTP id w12mr7071456wrt.53.1627070325264; Fri, 23 Jul 2021 12:58:45 -0700 (PDT) Received: from x1w.. (122.red-83-42-66.dynamicip.rima-tde.net. 
[83.42.66.122]) by smtp.gmail.com with ESMTPSA id u2sm27627257wmc.42.2021.07.23.12.58.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Jul 2021 12:58:44 -0700 (PDT) From: =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= To: qemu-devel@nongnu.org Cc: qemu-block@nongnu.org, Alex Williamson , Maxim Levitsky , Max Reitz , Fam Zheng , Kevin Wolf , Stefan Hajnoczi , Eric Auger , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-stable@nongnu.org, =?UTF-8?q?Michal=20Pr=C3=ADvozn=C3=ADk?= Subject: [PATCH-for-6.1 v3] block/nvme: Fix VFIO_MAP_DMA failed: No space left on device Date: Fri, 23 Jul 2021 21:58:43 +0200 Message-Id: <20210723195843.1032825-1-philmd@redhat.com> X-Mailer: git-send-email 2.31.1 MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=philmd@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=philmd@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -42 X-Spam_score: -4.3 X-Spam_bar: ---- X-Spam_report: (-4.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.472, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Jul 2021 19:58:53 -0000 When the NVMe block driver was introduced (see commit bdd6a90a9e5, January 2018), Linux VFIO_IOMMU_MAP_DMA ioctl was only returning -ENOMEM in case of error. The driver was correctly handling the error path to recycle its volatile IOVA mappings. 
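As a rough illustration of the error path this commit message is about (recycle the volatile mappings once, then retry), the sketch below models it outside QEMU. Everything in it is hypothetical: `fake_iommu`, `fake_map`, `fake_flush` and `map_with_retry` are names made up for this example, not QEMU or VFIO APIs. It also folds -ENOSPC into -ENOMEM, which is what the remainder of the message proposes.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a container with a limited number of mappings. */
struct fake_iommu {
    int used;           /* mappings currently in use */
    int limit;          /* maximum number of mappings allowed */
    int volatile_used;  /* how many of 'used' can be recycled */
};

/* Pretend to map one buffer; fails with -ENOSPC once the limit is reached. */
static int fake_map(struct fake_iommu *m)
{
    if (m->used >= m->limit) {
        return -ENOSPC;
    }
    m->used++;
    return 0;
}

/* Pretend to recycle every volatile mapping. */
static void fake_flush(struct fake_iommu *m)
{
    m->used -= m->volatile_used;
    m->volatile_used = 0;
}

/* Try to map; on exhaustion, recycle the volatile mappings and retry once. */
static int map_with_retry(struct fake_iommu *m)
{
    bool retry = true;
    int r;

    for (;;) {
        r = fake_map(m);
        if (r == -ENOSPC) {
            /* Treat "mappings exhausted" exactly like the older -ENOMEM. */
            r = -ENOMEM;
        }
        if (r == -ENOMEM && retry) {
            retry = false;
            fake_flush(m);
            continue;
        }
        return r;
    }
}
```

With a container that is full of volatile mappings, the first attempt fails, the flush frees the entries, and the second attempt succeeds. With a limit of zero the caller sees -ENOMEM rather than -ENOSPC.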
To fix CVE-2019-3882, Linux commit 492855939bdb ("vfio/type1: Limit
DMA mappings per container", April 2019) added the -ENOSPC error to
signal the user exhausted the DMA mappings available for a container.

The block driver started to mis-behave:

  qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
  (qemu)
  (qemu) info status
  VM status: paused (io-error)
  (qemu) c
  VFIO_MAP_DMA failed: No space left on device
  (qemu) c
  VFIO_MAP_DMA failed: No space left on device

(The VM is not resumable from here, hence stuck.)

Fix by handling the new -ENOSPC error (when DMA mappings are
exhausted) without any distinction to the current -ENOMEM error,
so we don't change the behavior on old kernels where the CVE-2019-3882
fix is not present.

An easy way to reproduce this bug is to restrict the DMA mapping
limit (65535 by default) when loading the VFIO IOMMU module:

  # modprobe vfio_iommu_type1 dma_entry_limit=666

Cc: qemu-stable@nongnu.org
Cc: Fam Zheng
Cc: Maxim Levitsky
Cc: Alex Williamson
Reported-by: Michal Prívozník
Fixes: bdd6a90a9e5 ("block: Add VFIO based NVMe driver")
Buglink: https://bugs.launchpad.net/qemu/+bug/1863333
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/65
Signed-off-by: Philippe Mathieu-Daudé
---
v3: Reworded (Fam)
v2: KISS checking both errors undistinguishedly (Maxim)
---
 block/nvme.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/block/nvme.c b/block/nvme.c
index 2b5421e7aa6..e8dbbc23177 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -1030,7 +1030,29 @@ try_map:
         r = qemu_vfio_dma_map(s->vfio,
                               qiov->iov[i].iov_base,
                               len, true, &iova);
+        if (r == -ENOSPC) {
+            /*
+             * In addition to the -ENOMEM error, the VFIO_IOMMU_MAP_DMA
+             * ioctl returns -ENOSPC to signal the user exhausted the DMA
+             * mappings available for a container since Linux kernel commit
+             * 492855939bdb ("vfio/type1: Limit DMA mappings per container",
+             * April 2019, see CVE-2019-3882).
+             *
+             * This block driver already handles this error path by checking
+             * for the -ENOMEM error, so we directly replace -ENOSPC by
+             * -ENOMEM. Beside, -ENOSPC has a specific meaning for blockdev
+             * coroutines: it triggers BLOCKDEV_ON_ERROR_ENOSPC and
+             * BLOCK_ERROR_ACTION_STOP which stops the VM, asking the operator
+             * to add more storage to the blockdev. Not something we can do
+             * easily with an IOMMU :)
+             */
+            r = -ENOMEM;
+        }
         if (r == -ENOMEM && retry) {
+            /*
+             * We exhausted the DMA mappings available for our container:
+             * recycle the volatile IOVA mappings.
+             */
             retry = false;
             trace_nvme_dma_flush_queue_wait(s);
             if (s->dma_map_count) {
-- 
2.31.1

From MAILER-DAEMON Mon Jul 26 04:49:33 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m7wIn-0005Rf-2T for mharc-qemu-stable@gnu.org; Mon, 26 Jul 2021 04:49:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48428) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m7wIm-0005Q2-5B for qemu-stable@nongnu.org; Mon, 26 Jul 2021 04:49:32 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:22241) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m7wIh-00049V-Vp for qemu-stable@nongnu.org; Mon, 26 Jul 2021 04:49:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1627289367; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=yoxV/YojjeWROgxEimn+F1ZrfwSc/gYTuhH+Zg9d3ig=; b=WvRYf4d4xrNtpjkAjrzUN04t2HWrCYtTBhxJq7Y2jiEIevKGs1n+wKA3KSzXIIxrv7eclL geS5GY70Goix8gloQxZajaT5lnRH8JDz6QIZSbbULaFltHRM755fV3TikLZ7A5B+0IR/8S FmfTKPUq5HF0WsSzDwmHSdACY8xjalw= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
us-mta-433-1Jq-3gPYN7q99YTNibkVuw-1; Mon, 26 Jul 2021 04:49:23 -0400 X-MC-Unique: 1Jq-3gPYN7q99YTNibkVuw-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6250E101C8A5; Mon, 26 Jul 2021 08:49:22 +0000 (UTC) Received: from localhost (ovpn-113-151.ams2.redhat.com [10.36.113.151]) by smtp.corp.redhat.com (Postfix) with ESMTP id ACE581ABD1; Mon, 26 Jul 2021 08:49:14 +0000 (UTC) Date: Mon, 26 Jul 2021 09:49:13 +0100 From: Stefan Hajnoczi To: Philippe =?iso-8859-1?Q?Mathieu-Daud=E9?= Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, Alex Williamson , Maxim Levitsky , Max Reitz , Fam Zheng , Kevin Wolf , Eric Auger , qemu-stable@nongnu.org, Michal =?iso-8859-1?B?UHLtdm96bu1r?= Subject: Re: [PATCH-for-6.1 v3] block/nvme: Fix VFIO_MAP_DMA failed: No space left on device Message-ID: References: <20210723195843.1032825-1-philmd@redhat.com> MIME-Version: 1.0 In-Reply-To: <20210723195843.1032825-1-philmd@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=stefanha@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="YtOrPNLJlAQ/xGDu" Content-Disposition: inline Received-SPF: pass client-ip=216.205.24.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.719, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: 
qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Jul 2021 08:49:32 -0000 --YtOrPNLJlAQ/xGDu Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jul 23, 2021 at 09:58:43PM +0200, Philippe Mathieu-Daud=E9 wrote: > When the NVMe block driver was introduced (see commit bdd6a90a9e5, > January 2018), Linux VFIO_IOMMU_MAP_DMA ioctl was only returning > -ENOMEM in case of error. The driver was correctly handling the > error path to recycle its volatile IOVA mappings. >=20 > To fix CVE-2019-3882, Linux commit 492855939bdb ("vfio/type1: Limit > DMA mappings per container", April 2019) added the -ENOSPC error to > signal the user exhausted the DMA mappings available for a container. >=20 > The block driver started to mis-behave: >=20 > qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device > (qemu) > (qemu) info status > VM status: paused (io-error) > (qemu) c > VFIO_MAP_DMA failed: No space left on device > (qemu) c > VFIO_MAP_DMA failed: No space left on device >=20 > (The VM is not resumable from here, hence stuck.) >=20 > Fix by handling the new -ENOSPC error (when DMA mappings are > exhausted) without any distinction to the current -ENOMEM error, > so we don't change the behavior on old kernels where the CVE-2019-3882 > fix is not present. 
>=20 > An easy way to reproduce this bug is to restrict the DMA mapping > limit (65535 by default) when loading the VFIO IOMMU module: >=20 > # modprobe vfio_iommu_type1 dma_entry_limit=3D666 >=20 > Cc: qemu-stable@nongnu.org > Cc: Fam Zheng > Cc: Maxim Levitsky > Cc: Alex Williamson > Reported-by: Michal Pr=EDvozn=EDk > Fixes: bdd6a90a9e5 ("block: Add VFIO based NVMe driver") > Buglink: https://bugs.launchpad.net/qemu/+bug/1863333 > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/65 > Signed-off-by: Philippe Mathieu-Daud=E9 > --- > v3: Reworded (Fam) > v2: KISS checking both errors undistinguishedly (Maxim) > --- > block/nvme.c | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) Thanks, applied to my block tree: https://gitlab.com/stefanha/qemu/commits/block Stefan --YtOrPNLJlAQ/xGDu Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQEzBAEBCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmD+dwkACgkQnKSrs4Gr c8g03QgApI73N3wCWlkH9mWP6B6de0uyWuJKPwKZlmnSM1hqFWYp6sYwa2eAMxO+ R/CRjoZSFjkfXUq7wa7XdbkD59KgcgVWrASapbCfFu2PrsfrCvVc8XrPaLhUyVf9 d35upa/BeTSZY5hQXH66xEqPJK844yCSGSNGg1WFcxtRQgKjOwt+z0aZGZvS4nsI 4K8YNIfxKAA9C1LGBHD4RhFrqKzDH17WVXPdF4DvetOoilpuoMaoIzLtKzu7pi6n Zg2ZclTayLnPTQwl0tgv/ZaRktIlj4mtRSma/79xDEeziK6hG8Y+rWxPdWeA2wqe BCJtBzD7zazi+gRiZn8vSQ4gFP9IBw== =SLJz -----END PGP SIGNATURE----- --YtOrPNLJlAQ/xGDu-- From MAILER-DAEMON Mon Jul 26 04:53:34 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m7wMg-0001H2-Qa for mharc-qemu-stable@gnu.org; Mon, 26 Jul 2021 04:53:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49500) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m7wMc-0001F3-77 for qemu-stable@nongnu.org; Mon, 26 Jul 2021 04:53:30 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:25253) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
1m7wMX-0007FT-4E for qemu-stable@nongnu.org; Mon, 26 Jul 2021 04:53:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1627289603; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=q6FUDQQ9xK3KK6L6MTUkl3BH3l9IJBdU9X4OvhZCfqE=; b=JImu/9dAE4rifjDLKVruFRe9A8i1HLfiLx4ZGsjbEti086i4ExhQ8aRq5Hg8rqpnZFbwFs a24aNUtHDsbH6GVmG7iohrtLjMMh1eL/kRp/TAA1Vw/wRFTemLzAvI0oYr3oBqNIS/eKSw PfsyKW0z+sPUON5wsd9z+VGMm81do/E= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-592--gyLSU0xP2SJICAGTXK0OA-1; Mon, 26 Jul 2021 04:53:21 -0400 X-MC-Unique: -gyLSU0xP2SJICAGTXK0OA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 68C75CC621; Mon, 26 Jul 2021 08:53:20 +0000 (UTC) Received: from localhost (ovpn-113-151.ams2.redhat.com [10.36.113.151]) by smtp.corp.redhat.com (Postfix) with ESMTP id DFC4F19D9B; Mon, 26 Jul 2021 08:53:12 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org, Peter Maydell Cc: Max Reitz , Stefan Hajnoczi , Fam Zheng , qemu-block@nongnu.org, Kevin Wolf , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-stable@nongnu.org, Maxim Levitsky , Alex Williamson , =?UTF-8?q?Michal=20Pr=C3=ADvozn=C3=ADk?= Subject: [PULL for-6.1 1/1] block/nvme: Fix VFIO_MAP_DMA failed: No space left on device Date: Mon, 26 Jul 2021 09:53:06 +0100 Message-Id: <20210726085306.729309-2-stefanha@redhat.com> In-Reply-To: <20210726085306.729309-1-stefanha@redhat.com> References: <20210726085306.729309-1-stefanha@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 
Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=stefanha@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: base64 Received-SPF: pass client-ip=170.10.133.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -34 X-Spam_score: -3.5 X-Spam_bar: --- X-Spam_report: (-3.5 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.719, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Jul 2021 08:53:30 -0000 RnJvbTogUGhpbGlwcGUgTWF0aGlldS1EYXVkw6kgPHBoaWxtZEByZWRoYXQuY29tPgoKV2hlbiB0 aGUgTlZNZSBibG9jayBkcml2ZXIgd2FzIGludHJvZHVjZWQgKHNlZSBjb21taXQgYmRkNmE5MGE5 ZTUsCkphbnVhcnkgMjAxOCksIExpbnV4IFZGSU9fSU9NTVVfTUFQX0RNQSBpb2N0bCB3YXMgb25s eSByZXR1cm5pbmcKLUVOT01FTSBpbiBjYXNlIG9mIGVycm9yLiBUaGUgZHJpdmVyIHdhcyBjb3Jy ZWN0bHkgaGFuZGxpbmcgdGhlCmVycm9yIHBhdGggdG8gcmVjeWNsZSBpdHMgdm9sYXRpbGUgSU9W QSBtYXBwaW5ncy4KClRvIGZpeCBDVkUtMjAxOS0zODgyLCBMaW51eCBjb21taXQgNDkyODU1OTM5 YmRiICgidmZpby90eXBlMTogTGltaXQKRE1BIG1hcHBpbmdzIHBlciBjb250YWluZXIiLCBBcHJp bCAyMDE5KSBhZGRlZCB0aGUgLUVOT1NQQyBlcnJvciB0bwpzaWduYWwgdGhlIHVzZXIgZXhoYXVz dGVkIHRoZSBETUEgbWFwcGluZ3MgYXZhaWxhYmxlIGZvciBhIGNvbnRhaW5lci4KClRoZSBibG9j ayBkcml2ZXIgc3RhcnRlZCB0byBtaXMtYmVoYXZlOgoKICBxZW11LXN5c3RlbS14ODZfNjQ6IFZG SU9fTUFQX0RNQSBmYWlsZWQ6IE5vIHNwYWNlIGxlZnQgb24gZGV2aWNlCiAgKHFlbXUpCiAgKHFl bXUpIGluZm8gc3RhdHVzCiAgVk0gc3RhdHVzOiBwYXVzZWQgKGlvLWVycm9yKQogIChxZW11KSBj CiAgVkZJT19NQVBfRE1BIGZhaWxlZDogTm8gc3BhY2UgbGVmdCBvbiBkZXZpY2UKICAocWVtdSkg 
YwogIFZGSU9fTUFQX0RNQSBmYWlsZWQ6IE5vIHNwYWNlIGxlZnQgb24gZGV2aWNlCgooVGhlIFZN IGlzIG5vdCByZXN1bWFibGUgZnJvbSBoZXJlLCBoZW5jZSBzdHVjay4pCgpGaXggYnkgaGFuZGxp bmcgdGhlIG5ldyAtRU5PU1BDIGVycm9yICh3aGVuIERNQSBtYXBwaW5ncyBhcmUKZXhoYXVzdGVk KSB3aXRob3V0IGFueSBkaXN0aW5jdGlvbiB0byB0aGUgY3VycmVudCAtRU5PTUVNIGVycm9yLApz byB3ZSBkb24ndCBjaGFuZ2UgdGhlIGJlaGF2aW9yIG9uIG9sZCBrZXJuZWxzIHdoZXJlIHRoZSBD VkUtMjAxOS0zODgyCmZpeCBpcyBub3QgcHJlc2VudC4KCkFuIGVhc3kgd2F5IHRvIHJlcHJvZHVj ZSB0aGlzIGJ1ZyBpcyB0byByZXN0cmljdCB0aGUgRE1BIG1hcHBpbmcKbGltaXQgKDY1NTM1IGJ5 IGRlZmF1bHQpIHdoZW4gbG9hZGluZyB0aGUgVkZJTyBJT01NVSBtb2R1bGU6CgogICMgbW9kcHJv YmUgdmZpb19pb21tdV90eXBlMSBkbWFfZW50cnlfbGltaXQ9NjY2CgpDYzogcWVtdS1zdGFibGVA bm9uZ251Lm9yZwpDYzogRmFtIFpoZW5nIDxmYW1AZXVwaG9uLm5ldD4KQ2M6IE1heGltIExldml0 c2t5IDxtbGV2aXRza0ByZWRoYXQuY29tPgpDYzogQWxleCBXaWxsaWFtc29uIDxhbGV4LndpbGxp YW1zb25AcmVkaGF0LmNvbT4KUmVwb3J0ZWQtYnk6IE1pY2hhbCBQcsOtdm96bsOtayA8bXByaXZv em5AcmVkaGF0LmNvbT4KU2lnbmVkLW9mZi1ieTogUGhpbGlwcGUgTWF0aGlldS1EYXVkw6kgPHBo aWxtZEByZWRoYXQuY29tPgpNZXNzYWdlLWlkOiAyMDIxMDcyMzE5NTg0My4xMDMyODI1LTEtcGhp bG1kQHJlZGhhdC5jb20KRml4ZXM6IGJkZDZhOTBhOWU1ICgiYmxvY2s6IEFkZCBWRklPIGJhc2Vk IE5WTWUgZHJpdmVyIikKQnVnbGluazogaHR0cHM6Ly9idWdzLmxhdW5jaHBhZC5uZXQvcWVtdS8r YnVnLzE4NjMzMzMKUmVzb2x2ZXM6IGh0dHBzOi8vZ2l0bGFiLmNvbS9xZW11LXByb2plY3QvcWVt dS8tL2lzc3Vlcy82NQpTaWduZWQtb2ZmLWJ5OiBQaGlsaXBwZSBNYXRoaWV1LURhdWTDqSA8cGhp bG1kQHJlZGhhdC5jb20+ClNpZ25lZC1vZmYtYnk6IFN0ZWZhbiBIYWpub2N6aSA8c3RlZmFuaGFA cmVkaGF0LmNvbT4KLS0tCiBibG9jay9udm1lLmMgfCAyMiArKysrKysrKysrKysrKysrKysrKysr CiAxIGZpbGUgY2hhbmdlZCwgMjIgaW5zZXJ0aW9ucygrKQoKZGlmZiAtLWdpdCBhL2Jsb2NrL252 bWUuYyBiL2Jsb2NrL252bWUuYwppbmRleCAyYjU0MjFlN2FhLi5lOGRiYmMyMzE3IDEwMDY0NAot LS0gYS9ibG9jay9udm1lLmMKKysrIGIvYmxvY2svbnZtZS5jCkBAIC0xMDMwLDcgKzEwMzAsMjkg QEAgdHJ5X21hcDoKICAgICAgICAgciA9IHFlbXVfdmZpb19kbWFfbWFwKHMtPnZmaW8sCiAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICBxaW92LT5pb3ZbaV0uaW92X2Jhc2UsCiAgICAgICAg 
ICAgICAgICAgICAgICAgICAgICAgICBsZW4sIHRydWUsICZpb3ZhKTsKKyAgICAgICAgaWYgKHIg PT0gLUVOT1NQQykgeworICAgICAgICAgICAgLyoKKyAgICAgICAgICAgICAqIEluIGFkZGl0aW9u IHRvIHRoZSAtRU5PTUVNIGVycm9yLCB0aGUgVkZJT19JT01NVV9NQVBfRE1BCisgICAgICAgICAg ICAgKiBpb2N0bCByZXR1cm5zIC1FTk9TUEMgdG8gc2lnbmFsIHRoZSB1c2VyIGV4aGF1c3RlZCB0 aGUgRE1BCisgICAgICAgICAgICAgKiBtYXBwaW5ncyBhdmFpbGFibGUgZm9yIGEgY29udGFpbmVy IHNpbmNlIExpbnV4IGtlcm5lbCBjb21taXQKKyAgICAgICAgICAgICAqIDQ5Mjg1NTkzOWJkYiAo InZmaW8vdHlwZTE6IExpbWl0IERNQSBtYXBwaW5ncyBwZXIgY29udGFpbmVyIiwKKyAgICAgICAg ICAgICAqIEFwcmlsIDIwMTksIHNlZSBDVkUtMjAxOS0zODgyKS4KKyAgICAgICAgICAgICAqCisg ICAgICAgICAgICAgKiBUaGlzIGJsb2NrIGRyaXZlciBhbHJlYWR5IGhhbmRsZXMgdGhpcyBlcnJv ciBwYXRoIGJ5IGNoZWNraW5nCisgICAgICAgICAgICAgKiBmb3IgdGhlIC1FTk9NRU0gZXJyb3Is IHNvIHdlIGRpcmVjdGx5IHJlcGxhY2UgLUVOT1NQQyBieQorICAgICAgICAgICAgICogLUVOT01F TS4gQmVzaWRlLCAtRU5PU1BDIGhhcyBhIHNwZWNpZmljIG1lYW5pbmcgZm9yIGJsb2NrZGV2Cisg ICAgICAgICAgICAgKiBjb3JvdXRpbmVzOiBpdCB0cmlnZ2VycyBCTE9DS0RFVl9PTl9FUlJPUl9F Tk9TUEMgYW5kCisgICAgICAgICAgICAgKiBCTE9DS19FUlJPUl9BQ1RJT05fU1RPUCB3aGljaCBz dG9wcyB0aGUgVk0sIGFza2luZyB0aGUgb3BlcmF0b3IKKyAgICAgICAgICAgICAqIHRvIGFkZCBt b3JlIHN0b3JhZ2UgdG8gdGhlIGJsb2NrZGV2LiBOb3Qgc29tZXRoaW5nIHdlIGNhbiBkbworICAg ICAgICAgICAgICogZWFzaWx5IHdpdGggYW4gSU9NTVUgOikKKyAgICAgICAgICAgICAqLworICAg ICAgICAgICAgciA9IC1FTk9NRU07CisgICAgICAgIH0KICAgICAgICAgaWYgKHIgPT0gLUVOT01F TSAmJiByZXRyeSkgeworICAgICAgICAgICAgLyoKKyAgICAgICAgICAgICAqIFdlIGV4aGF1c3Rl ZCB0aGUgRE1BIG1hcHBpbmdzIGF2YWlsYWJsZSBmb3Igb3VyIGNvbnRhaW5lcjoKKyAgICAgICAg ICAgICAqIHJlY3ljbGUgdGhlIHZvbGF0aWxlIElPVkEgbWFwcGluZ3MuCisgICAgICAgICAgICAg Ki8KICAgICAgICAgICAgIHJldHJ5ID0gZmFsc2U7CiAgICAgICAgICAgICB0cmFjZV9udm1lX2Rt YV9mbHVzaF9xdWV1ZV93YWl0KHMpOwogICAgICAgICAgICAgaWYgKHMtPmRtYV9tYXBfY291bnQp IHsKLS0gCjIuMzEuMQoK From MAILER-DAEMON Wed Jul 28 14:17:48 2021 Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m8o7o-0002Xa-HY for mharc-qemu-stable@gnu.org; Wed, 28 Jul 2021 14:17:48 -0400 Received: from 
eggs.gnu.org ([2001:470:142:3::10]:51254) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m8o7m-0002QR-Jc; Wed, 28 Jul 2021 14:17:46 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:44979) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m8o7k-0005Jn-Hr; Wed, 28 Jul 2021 14:17:46 -0400 Received: by mail-wr1-x42f.google.com with SMTP id z4so3650129wrv.11; Wed, 28 Jul 2021 11:17:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YwYKVHNgecu5jiIKoXnoWtjk/FNw2rOw7861LGYxMNs=; b=fy9J5xznGE0VLXw+HWBexvcNxEB0SuThh+YUPyuqh/X+TgrMej8Tz3M63H4ONwFB1u JYzurTtXN3lCT+/MPRvpbNjGnKccz3TbFZIpCkfDow4OM9goCMQxAoHvLXRZk7ZTH1bH LcVpTjbsBY6oxeAoq6BBWa0pjlKZR7Nn1TWQj4qiO+wqJL/UXmJZLr1sE/h/DdDgaXgG 0k2QersmTjfrYvfCDrqNPZDH1iozBuPQj+0m9YXl1bCV+za12sd9+4CaxcEb7d61aR+z tzQpnnddtwNQh5W/627bZiViwgyrwzJD4IvIPh0GEO4UEp3zzgq0SmpUPxPQhKYsaBK+ hvMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references:mime-version:content-transfer-encoding; bh=YwYKVHNgecu5jiIKoXnoWtjk/FNw2rOw7861LGYxMNs=; b=HnVIOfUOgKrrmrNJfvXWmPztVKMVpyeGv0jw3/q6V2NmHw2Udf5TXwTSrp04n3G+is 94GILoILwhdhEvIlPZ7OuhC59l8W8NrUTd5AQ2qmS2f3KngPHcmiu7QBTw2HixvefEen CME3GhcM1kNuT2asIw2ab8m4lMMbxlRg1xSZ9zjTLzQkeeNpCyr5CrB+oqns83vFQNff ZAUtzRwJWVtmmJAqCKqmkO5sazcga54d2KTRh0p1va7uzOhKm+Lg/EJRiBb7TU/cPuBq b+ZB2mf2LdPmv/bZ77X3s/iBZ/sfXZS29dUuUZOpNQaJj/2H1fbmR0QYV2GvnGbrXx8A 2pUw== X-Gm-Message-State: AOAM532HJSGgPYxPqU9mnPGG2MyWvL5MpakfHIe5Shp7pM1AP/fEkbwG 9c8LLmAKSzmdt+hrBv47coEaXMvjLoTDCw== X-Google-Smtp-Source: ABdhPJyZYmkpkyVUCqBdcQVKEoSji5VcnZDFm4a411qC3uQzfpzGwoJj7JuoSLL2P0xKIbUluKhIlg== X-Received: by 2002:adf:f1c6:: with SMTP id 
z6mr674996wro.207.1627496261720; Wed, 28 Jul 2021 11:17:41 -0700 (PDT) Received: from x1w.. (122.red-83-42-66.dynamicip.rima-tde.net. [83.42.66.122]) by smtp.gmail.com with ESMTPSA id v5sm614199wrd.74.2021.07.28.11.17.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Jul 2021 11:17:41 -0700 (PDT) Sender: =?UTF-8?Q?Philippe_Mathieu=2DDaud=C3=A9?= From: =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= To: qemu-devel@nongnu.org Cc: Bin Meng , qemu-block@nongnu.org, qemu-arm@nongnu.org, Alexander Bulekov , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-stable@nongnu.org Subject: [PATCH-for-6.1 2/3] hw/sd/sdcard: Fix assertion accessing out-of-range addresses with CMD30 Date: Wed, 28 Jul 2021 20:17:27 +0200 Message-Id: <20210728181728.2012952-3-f4bug@amsat.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210728181728.2012952-1-f4bug@amsat.org> References: <20210728181728.2012952-1-f4bug@amsat.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=philippe.mathieu.daude@gmail.com; helo=mail-wr1-x42f.google.com X-Spam_score_int: -14 X-Spam_score: -1.5 X-Spam_bar: - X-Spam_report: (-1.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FORGED_FROMDOMAIN=0.248, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-stable@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Jul 2021 18:17:47 -0000 OSS-Fuzz found sending illegal addresses when querying the write protection bits triggers the assertion added in commit 84816fb63e5 ("hw/sd/sdcard: Assert if accessing an illegal group"): qemu-fuzz-i386-target-generic-fuzz-sdhci-v3: ../hw/sd/sd.c:824: uint32_t 
sd_wpbits(SDState *, uint64_t): Assertion `wpnum < sd->wpgrps_size' failed.
    #3 0x7f62a8b22c91 in __assert_fail
    #4 0x5569adcec405 in sd_wpbits hw/sd/sd.c:824:9
    #5 0x5569adce5f6d in sd_normal_command hw/sd/sd.c:1389:38
    #6 0x5569adce3870 in sd_do_command hw/sd/sd.c:1737:17
    #7 0x5569adcf1566 in sdbus_do_command hw/sd/core.c:100:16
    #8 0x5569adcfc192 in sdhci_send_command hw/sd/sdhci.c:337:12
    #9 0x5569adcfa3a3 in sdhci_write hw/sd/sdhci.c:1186:9
    #10 0x5569adfb3447 in memory_region_write_accessor softmmu/memory.c:492:5

It is legal for the CMD30 to query for out-of-range addresses.
Such invalid addresses are simply ignored in the response (write
protection bits set to 0).

Note, we had an off-by-one in the wpgrps_size check since commit
a1bb27b1e98. Since we have a total of 'wpgrps_size' bits, the latest
valid group bit is 'wpgrps_size - 1'.

Since we now check the group bit is in range, remove the assertion.

Include the qtest reproducer provided by Alexander Bulekov:

  $ make check-qtest-i386
  ...
  Running test qtest-i386/fuzz-sdcard-test
  qemu-system-i386: ../hw/sd/sd.c:824: sd_wpbits: Assertion `wpnum < sd->wpgrps_size' failed.
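The behaviour described above (out-of-range groups are ignored and report 0 instead of tripping an assertion) can be sketched in isolation as follows. This is only a standalone model, not the QEMU code: `GROUP_SIZE`, `wp_bits` and its parameters are hypothetical names chosen for the example, and the loop uses the plain `wpnum < ngroups` bound as an assumption of this illustration, whereas the patch as posted bounds the loop with `sd->wpgrps_size - 1`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical group size for this sketch only; not QEMU's WPGROUP_SIZE. */
#define GROUP_SIZE 4096u

/*
 * Build a 32-bit mask of write-protect bits for up to 32 groups, starting
 * at the group that contains 'addr'. Groups beyond 'ngroups' are skipped,
 * so their bits stay 0 rather than triggering an assertion.
 */
static uint32_t wp_bits(const bool *wp_groups, size_t ngroups, uint64_t addr)
{
    uint32_t ret = 0;
    uint64_t wpnum = addr / GROUP_SIZE;

    for (int i = 0; i < 32 && wpnum < ngroups; i++, wpnum++) {
        if (wp_groups[wpnum]) {
            ret |= UINT32_C(1) << i;
        }
    }
    return ret;
}
```

A query that starts inside the card reports only the groups that exist, and a query that starts past the last group returns 0.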
Cc: qemu-stable@nongnu.org Reported-by: OSS-Fuzz (Issue 29225) Resolves: https://gitlab.com/qemu-project/qemu/-/issues/495 Signed-off-by: Philippe Mathieu-Daudé --- hw/sd/sd.c | 4 ++-- tests/qtest/fuzz-sdcard-test.c | 36 ++++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+), 2 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 707dcc12a14..273af75c1be 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -820,8 +820,8 @@ static uint32_t sd_wpbits(SDState *sd, uint64_t addr) wpnum = sd_addr_to_wpnum(addr); - for (i = 0; i < 32; i++, wpnum++, addr += WPGROUP_SIZE) { - assert(wpnum < sd->wpgrps_size); + for (i = 0; i < 32 && wpnum < sd->wpgrps_size - 1; + i++, wpnum++, addr += WPGROUP_SIZE) { if (addr >= sd->size) { /* * If the addresses of the last groups are outside the valid range, diff --git a/tests/qtest/fuzz-sdcard-test.c b/tests/qtest/fuzz-sdcard-test.c index 96602eac7e5..ae14305344a 100644 --- a/tests/qtest/fuzz-sdcard-test.c +++ b/tests/qtest/fuzz-sdcard-test.c @@ -52,6 +52,41 @@ static void oss_fuzz_29225(void) qtest_quit(s); } +/* + * https://gitlab.com/qemu-project/qemu/-/issues/495 + * Used to trigger: + * Assertion `wpnum < sd->wpgrps_size' failed. 
+ */
+static void oss_fuzz_36217(void)
+{
+    QTestState *s;
+
+    s = qtest_init(" -display none -m 32 -nodefaults -nographic"
+                   " -device sdhci-pci,sd-spec-version=3 "
+                   "-device sd-card,drive=d0 "
+                   "-drive if=none,index=0,file=null-co://,format=raw,id=d0");
+
+    qtest_outl(s, 0xcf8, 0x80001010);
+    qtest_outl(s, 0xcfc, 0xe0000000);
+    qtest_outl(s, 0xcf8, 0x80001004);
+    qtest_outw(s, 0xcfc, 0x02);
+    qtest_bufwrite(s, 0xe000002c, "\x05", 0x1);
+    qtest_bufwrite(s, 0xe000000f, "\x37", 0x1);
+    qtest_bufwrite(s, 0xe000000a, "\x01", 0x1);
+    qtest_bufwrite(s, 0xe000000f, "\x29", 0x1);
+    qtest_bufwrite(s, 0xe000000f, "\x02", 0x1);
+    qtest_bufwrite(s, 0xe000000f, "\x03", 0x1);
+    qtest_bufwrite(s, 0xe0000005, "\x01", 0x1);
+    qtest_bufwrite(s, 0xe000000f, "\x06", 0x1);
+    qtest_bufwrite(s, 0xe000000c, "\x05", 0x1);
+    qtest_bufwrite(s, 0xe000000e, "\x20", 0x1);
+    qtest_bufwrite(s, 0xe000000f, "\x08", 0x1);
+    qtest_bufwrite(s, 0xe000000b, "\x3d", 0x1);
+    qtest_bufwrite(s, 0xe000000f, "\x1e", 0x1);
+
+    qtest_quit(s);
+}
+
 int main(int argc, char **argv)
 {
     const char *arch = qtest_get_arch();
@@ -60,6 +95,7 @@ int main(int argc, char **argv)
 
     if (strcmp(arch, "i386") == 0) {
         qtest_add_func("fuzz/sdcard/oss_fuzz_29225", oss_fuzz_29225);
+        qtest_add_func("fuzz/sdcard/oss_fuzz_36217", oss_fuzz_36217);
     }
 
     return g_test_run();
-- 
2.31.1

From MAILER-DAEMON Thu Jul 29 08:49:08 2021
Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m95TI-0007Y5-G6 for mharc-qemu-stable@gnu.org; Thu, 29 Jul 2021 08:49:08 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:39756) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m95TG-0007SN-Tl for qemu-stable@nongnu.org; Thu, 29 Jul 2021 08:49:06 -0400
Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:21439) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m95TD-00009V-GC for qemu-stable@nongnu.org; Thu, 29 Jul
        2021 08:49:06 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1627562942;
        h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
        to:to:cc:cc:mime-version:mime-version:content-type:content-type:
        content-transfer-encoding:content-transfer-encoding:
        in-reply-to:in-reply-to:references:references;
        bh=yC4/zBDY8QMNvM8ZxjVJNwEtxk0cvUbhU5dWR+Ub3pc=;
        b=NCNqnwxruNtz2yFAOc+/QqKHQXXCRmkzCYrisvtsAeyLj6BMxGHItUWUjBfUZj9yPXCi/u
        srKk9XdOJPho8suOFMa8erUIuYHE98hY+zZD0GckH9ygZ8OaxhZNy0IzIz3GF+azyNNrY8
        VudWKVlgbbTPNhSUeXuJPgE+4GMOWr8=
Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com [209.85.221.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-423-OQxv1CwQNpuV7gNqDjCxZg-1; Thu, 29 Jul 2021 08:49:01 -0400
X-MC-Unique: OQxv1CwQNpuV7gNqDjCxZg-1
Received: by mail-wr1-f71.google.com with SMTP id u26-20020adfb21a0000b029013e2b4a9d1eso2207379wra.4 for ; Thu, 29 Jul 2021 05:49:00 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
        h=x-gm-message-state:subject:to:cc:references:from:message-id:date
        :user-agent:mime-version:in-reply-to:content-language
        :content-transfer-encoding;
        bh=yC4/zBDY8QMNvM8ZxjVJNwEtxk0cvUbhU5dWR+Ub3pc=;
        b=LhR9FxKvURYj/AB1s9DY3X3PwLhO9pIw92nEN5AMfLZ9LqXltlzZEZeQQpU6O8/ueT
        Wket/6NC6hxiIYJZYjjzbXghsDNJPeXmTtpk4xgbWBF69MQcWc3FjeER1jsx8kCWuuAv
        99Jbsa9c97zMJ8NxDh5pxk7tLJr7N4gvf8YP0sGaqUf9yPJkfbR5V/h9Woem2C3aekP8
        42H0tVeMiKFINXc/6/fZOz9qsGvwXl6XUFp5q2hJ1qaZf2SwlhsA7wRWtiM0o9rxcfQT
        PiWuSENLUKcHT9da5Bj80rceFcyK16JAPt++BnXc7S9Q0sHPASqMLtyPIhxIVNlg46tN
        m08w==
X-Gm-Message-State: AOAM533o7MHysLab9UO5ylx0PEnFmUjt3fzF4jV2BC4VIYEGvIZTJV45
        IYi55gtpYlJ263S8SguSPSQVte5btxC16pGEYDIRd/2mzoHx997RujAPWgGKkX+wL2LOicqTWg6
        ErHYw/au9/3j8h0rmGuXVyBiEJVQIa9kfp9JOOamF/wZMfAs7ABEB6CR1lrfK6TT81Q==
X-Received: by 2002:a5d:5905:: with SMTP id v5mr4671111wrd.205.1627562939694; Thu, 29 Jul 2021 05:48:59 -0700 (PDT)
X-Google-Smtp-Source:
        ABdhPJxi+Yu2T1LWXwsPIiLDIE657QqnNXgo3hLXh7gWuj6N5VIusYRJwF0LFvAaiYrMGor+wt90Ag==
X-Received: by 2002:a5d:5905:: with SMTP id v5mr4671090wrd.205.1627562939525; Thu, 29 Jul 2021 05:48:59 -0700 (PDT)
Received: from [192.168.1.36] (122.red-83-42-66.dynamicip.rima-tde.net. [83.42.66.122]) by smtp.gmail.com with ESMTPSA id w13sm4182490wru.72.2021.07.29.05.48.58 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 29 Jul 2021 05:48:59 -0700 (PDT)
Subject: Re: [PATCH v2] hw/net/can: sja1000 fix buff2frame_bas and buff2frame_pel when dlc is out of std CAN 8 bytes
To: Pavel Pisa , qemu-devel@nongnu.org, Paolo Bonzini , Jason Wang , Qiang Ning
Cc: Vikram Garhwal , Jan Charvat , Jin-Yang , qemu-stable@nongnu.org
References: <20210729123327.14650-1-pisa@cmp.felk.cvut.cz>
From: Philippe Mathieu-Daudé
Message-ID: <5e493064-30d8-17b5-7760-bdf143ddf9a7@redhat.com>
Date: Thu, 29 Jul 2021 14:48:58 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0
MIME-Version: 1.0
In-Reply-To: <20210729123327.14650-1-pisa@cmp.felk.cvut.cz>
Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=philmd@redhat.com
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=216.205.24.124; envelope-from=philmd@redhat.com; helo=us-smtp-delivery-124.mimecast.com
X-Spam_score_int: -35
X-Spam_score: -3.6
X-Spam_bar: ---
X-Spam_report: (-3.6 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.717, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.125, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-stable@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 29 Jul 2021 12:49:07 -0000

"hw/net/can: sja1000 fix buff2frame* when dlc is out of std CAN 8 bytes"

On 7/29/21 2:33 PM, Pavel Pisa wrote:
> Problem reported by openEuler fuzz-sig group.
>
> The buff2frame_bas function (hw\net\can\can_sja1000.c)
> infoleak(qemu5.x~qemu6.x) or stack-overflow(qemu 4.x).

Cc: qemu-stable@nongnu.org

> Reported-by: Qiang Ning
> Signed-off-by: Pavel Pisa

Reviewed-by: Philippe Mathieu-Daudé

> ---
>  hw/net/can/can_sja1000.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/hw/net/can/can_sja1000.c b/hw/net/can/can_sja1000.c
> index 42d2f99dfb..34eea684ce 100644
> --- a/hw/net/can/can_sja1000.c
> +++ b/hw/net/can/can_sja1000.c
> @@ -275,6 +275,10 @@ static void buff2frame_pel(const uint8_t *buff, qemu_can_frame *frame)
>      }
>      frame->can_dlc = buff[0] & 0x0f;
>
> +    if (frame->can_dlc > 8) {
> +        frame->can_dlc = 8;
> +    }
> +
>      if (buff[0] & 0x80) { /* Extended */
>          frame->can_id |= QEMU_CAN_EFF_FLAG;
>          frame->can_id |= buff[1] << 21; /* ID.28~ID.21 */
> @@ -311,6 +315,10 @@ static void buff2frame_bas(const uint8_t *buff, qemu_can_frame *frame)
>      }
>      frame->can_dlc = buff[1] & 0x0f;
>
> +    if (frame->can_dlc > 8) {
> +        frame->can_dlc = 8;
> +    }
> +
>      for (i = 0; i < frame->can_dlc; i++) {
>          frame->data[i] = buff[2 + i];
>      }
>

From MAILER-DAEMON Thu Jul 29 09:41:10 2021
Received: from list by lists.gnu.org with archive (Exim 4.90_1) id 1m96He-0007Si-LO for mharc-qemu-stable@gnu.org; Thu, 29 Jul 2021 09:41:10 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:36768) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m95FU-0004K0-RU; Thu, 29 Jul 2021 08:34:52 -0400
Received: from relay.felk.cvut.cz ([2001:718:2:1611:0:1:0:70]:30812) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m95FS-0007fC-6V; Thu, 29 Jul 2021 08:34:52 -0400
Received: from cmp.felk.cvut.cz (haar.felk.cvut.cz [147.32.84.19]) by
        relay.felk.cvut.cz (8.15.2/8.15.2) with ESMTP id 16TCXu5e088617; Thu, 29 Jul 2021 14:33:56 +0200 (CEST) (envelope-from pisa@cmp.felk.cvut.cz)
Received: from haar.felk.cvut.cz (localhost [127.0.0.1]) by cmp.felk.cvut.cz (8.14.0/8.12.3/SuSE Linux 0.6) with ESMTP id 16TCXtqh006610; Thu, 29 Jul 2021 14:33:55 +0200
Received: (from pisa@localhost) by haar.felk.cvut.cz (8.14.0/8.13.7/Submit) id 16TCXt6C006609; Thu, 29 Jul 2021 14:33:55 +0200
From: Pavel Pisa
To: qemu-devel@nongnu.org, Paolo Bonzini , Jason Wang , Qiang Ning , Philippe Mathieu-Daudé
Cc: Vikram Garhwal , Jan Charvat , Jin-Yang , qemu-stable@nongnu.org, Pavel Pisa
Subject: [PATCH v2] hw/net/can: sja1000 fix buff2frame_bas and buff2frame_pel when dlc is out of std CAN 8 bytes
Date: Thu, 29 Jul 2021 14:33:27 +0200
Message-Id: <20210729123327.14650-1-pisa@cmp.felk.cvut.cz>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-FELK-MailScanner-Information:
X-MailScanner-ID: 16TCXu5e088617
X-FELK-MailScanner: Found to be clean
X-FELK-MailScanner-SpamCheck: not spam, SpamAssassin (not cached, score=-0.098, required 6, BAYES_00 -0.50, KHOP_HELO_FCRDNS 0.40, SPF_HELO_NONE 0.00, SPF_NONE 0.00)
X-FELK-MailScanner-From: pisa@cmp.felk.cvut.cz
X-FELK-MailScanner-Watermark: 1628166837.85924@BdDccHH9XTy5OQS5MPtL/A
Received-SPF: none client-ip=2001:718:2:1611:0:1:0:70; envelope-from=pisa@cmp.felk.cvut.cz; helo=relay.felk.cvut.cz
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Mailman-Approved-At: Thu, 29 Jul 2021 09:41:08 -0400
X-BeenThere: qemu-stable@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 29 Jul 2021 12:34:53 -0000

Problem reported by openEuler fuzz-sig group.

The buff2frame_bas function (hw\net\can\can_sja1000.c)
infoleak(qemu5.x~qemu6.x) or stack-overflow(qemu 4.x).

Reported-by: Qiang Ning
Signed-off-by: Pavel Pisa
---
 hw/net/can/can_sja1000.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/net/can/can_sja1000.c b/hw/net/can/can_sja1000.c
index 42d2f99dfb..34eea684ce 100644
--- a/hw/net/can/can_sja1000.c
+++ b/hw/net/can/can_sja1000.c
@@ -275,6 +275,10 @@ static void buff2frame_pel(const uint8_t *buff, qemu_can_frame *frame)
     }
     frame->can_dlc = buff[0] & 0x0f;
 
+    if (frame->can_dlc > 8) {
+        frame->can_dlc = 8;
+    }
+
     if (buff[0] & 0x80) { /* Extended */
         frame->can_id |= QEMU_CAN_EFF_FLAG;
         frame->can_id |= buff[1] << 21; /* ID.28~ID.21 */
@@ -311,6 +315,10 @@ static void buff2frame_bas(const uint8_t *buff, qemu_can_frame *frame)
     }
     frame->can_dlc = buff[1] & 0x0f;
 
+    if (frame->can_dlc > 8) {
+        frame->can_dlc = 8;
+    }
+
     for (i = 0; i < frame->can_dlc; i++) {
         frame->data[i] = buff[2 + i];
     }
-- 
2.20.1
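
Editorial note: the DLC field is masked with 0x0f, so it can encode values 0..15, while a classic CAN frame carries at most 8 data bytes. The following is a minimal standalone sketch of the clamp the patch adds, not the QEMU code itself. The names sketch_can_frame, sketch_clamp_dlc, sketch_buff2frame and SKETCH_CAN_MAX_DLEN are hypothetical and stand in for QEMU's qemu_can_frame and the buff2frame_bas logic; the buffer layout (DLC in the low nibble of buff[1], payload from buff[2]) is taken from the buff2frame_bas hunk.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CAN_MAX_DLEN 8   /* classic CAN payload limit, in bytes */

/* Hypothetical, simplified stand-in for QEMU's qemu_can_frame. */
typedef struct {
    uint8_t can_dlc;                     /* number of valid payload bytes */
    uint8_t data[SKETCH_CAN_MAX_DLEN];   /* payload storage */
} sketch_can_frame;

/* Clamp a raw 4-bit DLC field (0..15) to the 8 bytes a classic frame holds. */
static uint8_t sketch_clamp_dlc(uint8_t raw)
{
    uint8_t dlc = raw & 0x0f;
    return dlc > SKETCH_CAN_MAX_DLEN ? SKETCH_CAN_MAX_DLEN : dlc;
}

/*
 * Decode a BasicCAN-style buffer: buff[1] carries the DLC in its low
 * nibble and the payload starts at buff[2]. The caller is assumed to
 * provide at least 2 + SKETCH_CAN_MAX_DLEN readable bytes.
 */
static void sketch_buff2frame(const uint8_t *buff, sketch_can_frame *frame)
{
    memset(frame, 0, sizeof(*frame));
    frame->can_dlc = sketch_clamp_dlc(buff[1]);

    /* After the clamp, the copy loop can never run past data[7]. */
    assert(frame->can_dlc <= SKETCH_CAN_MAX_DLEN);
    for (uint8_t i = 0; i < frame->can_dlc; i++) {
        frame->data[i] = buff[2 + i];
    }
}
```

Without the clamp, a guest-supplied DLC of 9..15 would make the copy loop index past the 8-byte data array, which is the out-of-bounds access the commit message describes.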