qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit j


From: John Snow
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH v5 00/21] blockjobs: add explicit job management
Date: Tue, 17 Apr 2018 12:56:16 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0


On 04/17/2018 09:44 AM, Markus Armbruster wrote:
> John Snow <address@hidden> writes:
> 
>> This series seeks to address two distinct but closely related issues
>> concerning the job management API.
>>
>> (1) For jobs that complete when a monitor is not attached and receiving
>>     events or notifications, there's no way to discern the job's final
>>     return code. Jobs must remain in the query list until dismissed
>>     for reliable management.
>>
>> (2) Jobs that change the block graph structure at an indeterminate point
>>     after the job starts compete with the management layer that relies
>>     on that graph structure to issue meaningful commands.
>>
>>     This structure should change only at the behest of the management
>>     API, and not asynchronously at unknown points in time. Before a job
>>     issues such changes, it must rely on explicit and synchronous
>>     confirmation from the management API.
>>
>> These changes are implemented by formalizing a State Transition Machine
>> for the BlockJob subsystem.
>>
>> Job States:
>>
>> UNDEFINED       Default state. Internal state only.
>> CREATED         Job has been created
>> RUNNING         Job has been started and is running
>> PAUSED          Job is not ready and has been paused
>> READY           Job is ready and is running
>> STANDBY         Job is ready and is paused
>>
>> WAITING         Job is waiting on peers in transaction
>> PENDING         Job is waiting on ACK from QMP
>> ABORTING        Job is aborting or has been cancelled
>> CONCLUDED       Job has finished and has a retcode available
>> NULL            Job is being dismantled. Internal state only.
>>
>> Job Verbs:
>>

Backporting your quote up here:

> For each job verb and job state: what's the new job state?
>

That's not always 1:1, though I tried to address it in the commit messages.


>> CANCEL          Instructs a running job to terminate with error,
>>                 (Except when that job is READY, which produces no error.)

CANCEL will take a job to either NULL... (this is the early abort
pathway, prior to the job being fully realized.)

...or to ABORTING (from CREATED once it has fully realized the job, or
from RUNNING, READY, WAITING, or PENDING.)

>> PAUSE           Request a job to pause.

issued to RUNNING or READY, transitions to PAUSED or STANDBY respectively.

>> RESUME          Request a job to resume from a pause.

issued to PAUSED or STANDBY, transitions to RUNNING or READY respectively.

>> SET-SPEED       Change the speed limiting parameter of a job.

No run state change.

>> COMPLETE        Ask a READY job to finish and exit.
>>

Issued to a READY job, transitions to WAITING.

>> FINALIZE        Ask a PENDING job to perform its graph finalization.

Issued to a PENDING job, transitions to CONCLUDED.

>> DISMISS         Finish cleaning up an empty job.
> 

Issued to a CONCLUDED job, transitions to NULL.


>> And here's my stab at a diagram:
>>
>>                  +---------+
>>                  |UNDEFINED|
>>                  +--+------+
>>                     |
>>                  +--v----+
>>        +---------+CREATED+-----------------+
>>        |         +--+----+                 |
>>        |            |                      |
>>        |         +--+----+     +------+    |
>>        +---------+RUNNING<----->PAUSED|    |
>>        |         +--+-+--+     +------+    |
>>        |            | |                    |
>>        |            | +------------------+ |
>>        |            |                    | |
>>        |         +--v--+       +-------+ | |
>>        +---------+READY<------->STANDBY| | |
>>        |         +--+--+       +-------+ | |
>>        |            |                    | |
>>        |         +--v----+               | |
>>        +---------+WAITING<---------------+ |
>>        |         +--+----+                 |
>>        |            |                      |
>>        |         +--v----+                 |
>>        +---------+PENDING|                 |
>>        |         +--+----+                 |
>>        |            |                      |
>>     +--v-----+   +--v------+               |
>>     |ABORTING+--->CONCLUDED|               |
>>     +--------+   +--+------+               |
>>                     |                      |
>>                  +--v-+                    |
>>                  |NULL<--------------------+
>>                  +----+
> 
> Is this diagram missing a few arrowheads?  E.g. on the edge between
> RUNNING and WAITING.
> 

Apparently yes. :\

(Secretly fixed up in my reply.)

> Might push the limits of ASCII art, but here goes anyway: can we label
> the arrows with job verbs?
> 

Can you recommend a tool to help me do that? I've been using asciiflow
infinity (http://asciiflow.com) and it's not very good, but I don't have
anything better.

> Can you briefly explain how this state machine addresses (1) and (2)?
> 

(1) The CONCLUDED state allows jobs to persist in the job query list
after they would have disappeared in 2.11-era QEMU. This lets us query
for completion codes and to dismiss the job at our own leisure.

(2) The PENDING state allows jobs to wait in a nearly-completed state,
pending authorization from the QMP client to make graph changes.
Otherwise, the job has to asynchronously perform this cleanup and the
exact point in time is unknowable to the QMP client. By making a PENDING
state and a finalize callback (.prepare), we can make this portion of a
job's task synchronous.



"John, you added more than two states..."

Yup, this was to help simplify the existing state machine, believe it or
not. I modeled all jobs as transactions to eliminate different cleanup
routing and added two new interim states;

- WAITING
- ABORTING

to help make assertions about the valid transitions jobs can make. The
ABORTING state helps make it clear when a job is allowed to fail (and
emit QMP events related to such).

The WAITING state is simply advisory to help a client know that a job is
"finished" but cannot yet receive further instruction because of peers
in a transaction. This helps me to add nice QMP errors for any verbs
issued to such jobs. "Sorry pal, this job is waiting and can't hear you
right now!"

This kept the code cleaner than adding a bunch of very fragile boolean
error-checking pathways in dozens of helper functions to help avoid
illegal instructions on jobs not prepared to receive those instructions.

So these two new states don't help accomplish (1) or (2) strictly, but
they do facilitate the code additions that _do_ a lot less ugly.

--js




reply via email to

[Prev in Thread] Current Thread [Next in Thread]