From: Stefan Berger
Subject: Re: [Qemu-devel] [PATCH v3 00/13] tpm: Extend TPM with state migration support (not 2.11)
Date: Wed, 27 Dec 2017 10:00:08 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 12/22/2017 11:13 AM, Marc-André Lureau wrote:
> Hi
>
> On Fri, Dec 22, 2017 at 4:59 PM, Stefan Berger
> <address@hidden> wrote:
>> On 12/22/2017 07:49 AM, Marc-André Lureau wrote:
>>> Hi
>>>
>>> On Fri, Nov 10, 2017 at 3:11 PM, Stefan Berger
>>> <address@hidden> wrote:
>>>> This set of patches implements support for migrating the state of the
>>>> external 'swtpm' TPM emulator as well as that of the emulated device
>>>> interfaces. I have primarily tested this with the TIS and TPM 1.2 so
>>>> far, but it also seems to work with TPM 2.
>>>>
>>>> The TIS is simplified first by reducing the number of buffers and read
>>>> and write offsets into these buffers. Following the state machine of
>>>> the TIS, a single buffer and r/w offset is enough for all localities
>>>> since only one locality can ever be active.
>>>>
>>>> This series applies on top of my tpm-next branch.
>>>>
>>>> One of the challenges addressed by this set of patches is the fact
>>>> that the TPM emulator may be processing a command while the state
>>>> serialization of the devices is supposed to happen. A necessary first
>>>> step has been implemented here that ensures that a response has been
>>>> received from the external emulator and that the bottom half function,
>>>> which delivers the response and adjusts device registers (TIS or CRB),
>>>> has been executed before the device's state is serialized.
>>>>
>>>> A subsequent extension may need to address the live migration loop and
>>>> delay the serialization of devices until the response from the
>>>> external TPM has been received, though the likelihood that someone
>>>> executes a long-lasting TPM command while this is occurring is
>>>> certainly low.
>>>>
>>>>    Stefan
>>>>
>>>> Stefan Berger (13):
>>>>    tpm_tis: convert uint32_t to size_t
>>>>    tpm_tis: limit size of buffer from backend
>>>>    tpm_tis: remove TPMSizeBuffer usage
>>>>    tpm_tis: move buffers from localities into common location
>>>>    tpm_tis: merge read and write buffer into single buffer
>>>>    tpm_tis: move r/w_offsets to TPMState
>>>>    tpm_tis: merge r/w_offset into rw_offset
>>>>    tpm: Implement tpm_sized_buffer_reset
>>> ok for the above (you could queue/pull-req this now)
>>>
>>>>    tpm: Introduce condition to notify waiters of completed command
>>>>    tpm: Introduce condition in TPM backend for notification
>>>>    tpm: implement tpm_backend_wait_cmd_completed
>>>>    tpm: extend TPM emulator with state migration support
>>>>    tpm_tis: extend TPM TIS with state migration support
>>> Much of the complexity of this migration series comes with the
>>> handling & synchronization of the IO thread.
>>>
>>> I think having a separate thread doesn't bring much to the TPM code
>>> today. It is a workaround for the chardev API being mostly
>>> synchronous. Am I wrong? (yes, passthrough doesn't use chardev, but
>>> it should probably use qio or chardev internally)
>>>
>>> Other kinds of devices using a separate process suffer from the same
>>> problem. Just grep for qemu_chr_fe_write and look at the preceding
>>> comments; it's a common flaw in the qemu code base. Code uses the
>>> synchronous API and sometimes uses non-blocking writes
>>> (hw/usb/redirect.c seems to do quite a bit better!)
>>>
>>> I would like to get rid of the separate thread in TPM before we add
>>> migration support. We should try to improve the chardev API to make
>>> it easier to do non-blocking IO. This should considerably simplify
>>> the above patches and benefit the rest of qemu (instead of having
>>> everyone doing their own thing with separate threads or other kinds
>>> of continuation state).
>>>
>>> What do you think?

>> I am wondering whether it will help us to get rid of the
>> conditions/notifications, like patches 9-11 of this series. I doubt
>> 12 and 13 will change. At some point the device state is supposed to
>> be written out, and in the case of the TPM we have to wait for the
>> response to have come back into the backend. We won't start listening
>> on the file descriptor for an outstanding response, so I guess we
>> will still wait for a notification in that case as well. So I am not
>> sure which parts are going to be simpler...
> Why would qemu need to wait for completion of the emulator? Couldn't
> qemu & emulator save the current state, including in-flight commands?
> That's apparently what usbredir does.

The thread sends the commands to the external emulator and waits for the reception of the response. Once the response is received, its delivery is handed off to the bottom half. Using the bottom half avoids synchronization primitives, since we know that no other thread is in the device emulator while it runs (except for our thread sending to and receiving from the emulator).
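To illustrate the handoff just described, here is a minimal sketch; the helper names (tpm_send_command, tpm_receive_response, tpm_deliver_response) and the struct layout are made up for this example, only the QEMUBH API from "qemu/main-loop.h" is real:

#include "qemu/osdep.h"
#include "qemu/main-loop.h"

typedef struct TPMBackendSketch {
    QEMUBH *bh;              /* created with qemu_bh_new() at init time */
    /* ... command/response buffers, socket to swtpm, ... */
} TPMBackendSketch;

static void tpm_send_command(TPMBackendSketch *s);     /* hypothetical */
static void tpm_receive_response(TPMBackendSketch *s); /* hypothetical */
static void tpm_deliver_response(TPMBackendSketch *s); /* hypothetical */

/* Runs in the main loop thread; no other thread is in the device
 * model (TIS/CRB) while a BH runs, so no locking is needed here. */
static void tpm_response_bh(void *opaque)
{
    TPMBackendSketch *s = opaque;

    tpm_deliver_response(s);  /* update registers, raise the IRQ */
}

/* The dedicated TPM thread: blocking I/O with the external emulator. */
static void *tpm_worker(void *opaque)
{
    TPMBackendSketch *s = opaque;

    tpm_send_command(s);      /* blocking write to swtpm */
    tpm_receive_response(s);  /* blocking read of the response */
    qemu_bh_schedule(s->bh);  /* hand delivery off to the main loop */
    return NULL;
}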

This series of patches introduces a tpm_busy flag that the device emulation thread sets once it hands off a command to the thread for processing; it does this in tpm_backend_deliver_request(). The worker thread resets that flag once it has scheduled the bottom half to run for delivering the response to the device emulator (TIS, CRB, etc.); this in turn happens in tpm_backend_worker_thread().
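The same flag in sketch form (again with approximated names; the mutex and condition variable correspond to what patches 9-11 introduce, but the exact fields in the series may differ):

#include "qemu/osdep.h"
#include "qemu/main-loop.h"
#include "qemu/thread.h"

typedef struct TPMBackendSketch {
    QemuMutex state_lock;
    QemuCond cmd_complete;   /* signaled when tpm_busy goes false */
    bool tpm_busy;           /* a command is in flight to swtpm */
    QEMUBH *bh;
} TPMBackendSketch;

/* Device model side: mark the backend busy, then queue the command
 * for the worker thread. */
static void tpm_backend_deliver_request(TPMBackendSketch *s)
{
    qemu_mutex_lock(&s->state_lock);
    s->tpm_busy = true;
    qemu_mutex_unlock(&s->state_lock);
    /* ... enqueue the request for the worker thread ... */
}

/* Worker thread side: once the response has arrived, schedule the BH
 * and clear the busy flag, waking any waiter. */
static void tpm_backend_worker_thread(TPMBackendSketch *s)
{
    /* ... blocking send to swtpm, blocking receive of the response ... */
    qemu_bh_schedule(s->bh);

    qemu_mutex_lock(&s->state_lock);
    s->tpm_busy = false;
    qemu_cond_signal(&s->cmd_complete);
    qemu_mutex_unlock(&s->state_lock);
}

/* Block until no command is outstanding. */
static void tpm_backend_wait_cmd_completed(TPMBackendSketch *s)
{
    qemu_mutex_lock(&s->state_lock);
    while (s->tpm_busy) {
        qemu_cond_wait(&s->cmd_complete, &s->state_lock);
    }
    qemu_mutex_unlock(&s->state_lock);
}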

For the migration we have the following cases once the device (frontend or backend) is supposed to write out its state. [Once the state of the device is supposed to be written out, we know that the OS is paused, that no further TIS registers will be written to after that, and that no thread is busy emulating writes to TIS registers; the TIS is just an example here for 'device emulation'.]

- No TPM command is currently being processed: the tpm_busy flag is false and we can write out the state immediately. The .pre_save function does nothing in this case.

- A TPM command is currently being processed: we have to wait for the response to come back and for the thread to set the tpm_busy flag to false. This is what tpm_backend_wait_cmd_completed() is for. Once that has happened, the bottom half will have been scheduled by the thread, but it will not run at this point, so we have to do what the bottom half is supposed to do. All of this happens in the .pre_save function: we call tpm_backend_wait_cmd_completed(), which returns once the response has been received, then check a state variable that indicates whether the front-end is executing a command and run the bottom half's function. (Improvement: tpm_backend_wait_cmd_completed() could also return true in case we had to wait for the response to come back; then we could skip the check on the state variable and just run the bottom half's function because of the preceding wait.) A rough sketch of this flow follows below.
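Sketching the .pre_save flow for the two cases above (the linked commits below are authoritative; cmd_executing and tpm_tis_response_bh stand in for the front-end's actual state flag and BH function):

typedef struct TPMStateSketch {
    TPMBackendSketch *be_driver;  /* backend from the sketch above */
    bool cmd_executing;           /* front-end is executing a command */
} TPMStateSketch;

static void tpm_tis_response_bh(TPMStateSketch *s);  /* hypothetical */

static int tpm_tis_pre_save(void *opaque)
{
    TPMStateSketch *s = opaque;

    /* Case 1: tpm_busy is already false and this returns immediately.
     * Case 2: blocks until the worker thread has received the response
     * and cleared tpm_busy. */
    tpm_backend_wait_cmd_completed(s->be_driver);

    /* The BH has been scheduled but cannot run before the state is
     * written out, so do its work here. */
    if (s->cmd_executing) {
        tpm_tis_response_bh(s);  /* the function the BH would run */
    }
    return 0;
}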

Your refactoring has actually helped a lot in simplifying this and pushed a lot of the code into the common tpm_backend_* functions. The code to suspend/resume looks rather simple at this point. It mostly comes down to refactoring the function the bottom half is supposed to run so that it can be called from the .pre_save function as well.

TIS:   https://github.com/stefanberger/qemu-tpm/commit/3b0689619be9a1f3fbbea2aa2f98802512f96099
CRB:   https://github.com/stefanberger/qemu-tpm/commit/62db4ace06e77c8e5f8274e8cb9240e5f8d663ab
SPAPR: https://github.com/stefanberger/qemu-tpm/commit/1f31cd8b7f690f750405d911923f624b5f856a04

I may now modify tpm_backend_wait_cmd_completed() so that it can be used as 'bool run_bh_function = tpm_backend_wait_cmd_completed(s)'. This should simplify things further and make the three devices' suspend code look even more similar.
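In sketch form (same illustrative names as above, now with the wait function reporting whether it actually had to wait), each device's suspend path would then collapse to the same two lines:

static bool tpm_backend_wait_cmd_completed(TPMBackendSketch *s)
{
    bool waited = false;

    qemu_mutex_lock(&s->state_lock);
    while (s->tpm_busy) {
        waited = true;
        qemu_cond_wait(&s->cmd_complete, &s->state_lock);
    }
    qemu_mutex_unlock(&s->state_lock);
    return waited;   /* true if a response had to be waited for */
}

static int tpm_tis_pre_save(void *opaque)
{
    TPMStateSketch *s = opaque;
    bool run_bh_function = tpm_backend_wait_cmd_completed(s->be_driver);

    if (run_bh_function) {
        tpm_tis_response_bh(s);  /* run the pending BH work inline */
    }
    return 0;
}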

Regards,
  Stefan



