qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: intermittent hang, s390x host, bios-tables-test test, TPM


From: Daniel P . Berrangé
Subject: Re: intermittent hang, s390x host, bios-tables-test test, TPM
Date: Wed, 11 Jan 2023 09:05:44 +0000
User-agent: Mutt/2.2.9 (2022-11-12)

On Tue, Jan 10, 2023 at 05:02:58PM -0500, Stefan Berger wrote:
> 
> 
> On 1/10/23 14:47, Stefan Berger wrote:
> > 
> > 
> > On 1/10/23 14:27, Daniel P. Berrangé wrote:
> > > On Tue, Jan 10, 2023 at 01:50:26PM -0500, Stefan Berger wrote:
> > > > 
> > > > 
> > > > On 1/6/23 10:16, Stefan Berger wrote:
> > > > > This here seems to be the root cause. An unknown control channel
> > > > > command was received from the TPM emulator backend by the control 
> > > > > channel thread and we end up in g_assert_not_reached().
> > > > > 
> > > > > https://github.com/qemu/qemu/blob/master/tests/qtest/tpm-emu.c#L189
> > > > > 
> > > > > 
> > > > > 
> > > > >           ret = qio_channel_read(ioc, (char *)&cmd, sizeof(cmd), 
> > > > > NULL);
> > > > >           if (ret <= 0) {
> > > > >               break;
> > > > >           }
> > > > > 
> > > > >           cmd = be32_to_cpu(cmd);
> > > > >           switch (cmd) {
> > > > >    [...]
> > > > >           default:
> > > > >               g_debug("unimplemented %u", cmd);
> > > > >               g_assert_not_reached();                
> > > > > <------------------
> > > > >           }
> > > > > 
> > > > > I will run this test case in an endless loop on an x86_64 host and 
> > > > > see what we get there ...
> > > > 
> > > > I could not recreate the issue running the  test on a ppc64 and x86_64
> > > > host. There we like >100k test runs on ppc64 and >40k on x86_64. Also
> > > > simulating the reception of an unsupported command did not lead to a
> > > > hang like shown here.
> > > 
> > > Assuming your ppc64 host is running an little endian OS, and
> > > we're only seeing the test failure on s390x, then it points towards
> > > the problem being an endianness issue in the TPM code. Something
> > > missing a byteswap somewhere along the way ?
> > 
> > Yes, my ppc64 machine is also little endian. If the issue  was not an 
> > intermittent but a permanent
> > failure I would look for something like that. I would think it's more some 
> > sort of initialization
> > issue, like a value on the stack that occasionally set to an undesirable 
> > value -- maybe even in a
> > dependency.
> 
> I found I still had access to an s390x machine. ~2700 loops on this test case
> so far but nothing... it would be good to be able to recreate the issue and
> apply the fix but we'll have to do it without testing then I guess.
> 
> Does this look about right? From my tests with injecting an error it at least
> seems to do what it is intended to do.
> 
> diff --git a/tests/qtest/tpm-emu.c b/tests/qtest/tpm-emu.c
> index 2994d1cf42..dbc308a572 100644
> --- a/tests/qtest/tpm-emu.c
> +++ b/tests/qtest/tpm-emu.c
> @@ -36,11 +36,19 @@ void tpm_emu_test_wait_cond(TPMTestState *s)
>      g_mutex_unlock(&s->data_mutex);
>  }
> 
> +static void tpm_emu_close_data_ioc(void *ioc)
> +{
> +    g_debug("CLOSE DATA IOC");
> +    qio_channel_close(ioc, NULL);
> +}
> +
>  static void *tpm_emu_tpm_thread(void *data)
>  {
>      TPMTestState *s = data;
>      QIOChannel *ioc = s->tpm_ioc;
> 
> +    qtest_add_abrt_handler(tpm_emu_close_data_ioc, ioc);
> +
>      s->tpm_msg = g_new(struct tpm_hdr, 1);
>      while (true) {
>          int minhlen = sizeof(s->tpm_msg->tag) + sizeof(s->tpm_msg->len);
> @@ -77,12 +85,19 @@ static void *tpm_emu_tpm_thread(void *data)
>                            &error_abort);
>      }
> 
> +    qtest_remove_abrt_handler(ioc);
>      g_free(s->tpm_msg);
>      s->tpm_msg = NULL;
>      object_unref(OBJECT(s->tpm_ioc));
>      return NULL;
>  }
> 
> +static void tpm_emu_close_ctrl_ioc(void *ioc)
> +{
> +    g_debug("CLOSE CTRL IOC");
> +    qio_channel_close(ioc, NULL);
> +}
> +
>  void *tpm_emu_ctrl_thread(void *data)
>  {
>      TPMTestState *s = data;
> @@ -119,6 +134,8 @@ void *tpm_emu_ctrl_thread(void *data)
>          s->emu_tpm_thread = g_thread_new(NULL, tpm_emu_tpm_thread, s);
>      }
> 
> +    qtest_add_abrt_handler(tpm_emu_close_ctrl_ioc, ioc);

I'd suggest this be done before starting tpm_emu_tpm_thread,
immediately after the "ioc" is created. 

> +
>      while (true) {
>          uint32_t cmd;
>          ssize_t ret;
> @@ -129,6 +146,9 @@ void *tpm_emu_ctrl_thread(void *data)
>          }
> 
>          cmd = be32_to_cpu(cmd);
> +        //g_debug("cmd=%u", cmd);
> +        //if (cmd == 14)
> +        //    cmd = 100;
>          switch (cmd) {
>          case CMD_GET_CAPABILITY: {
>              ptm_cap cap = cpu_to_be64(0x3fff);
> @@ -190,6 +210,8 @@ void *tpm_emu_ctrl_thread(void *data)
>          }
>      }
> 
> +    qtest_remove_abrt_handler(ioc);
> +
>      object_unref(OBJECT(ioc));
>      object_unref(OBJECT(lioc));
>      return NULL;
> 
> > 
> >     Stefan
> > 
> > > 
> > > 
> > > With regards,
> > > Daniel
> > 
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




reply via email to

[Prev in Thread] Current Thread [Next in Thread]