Re: [Qemu-block] [PATCH v5 10/21] blockjobs: add NULL state
From: John Snow
Subject: Re: [Qemu-block] [PATCH v5 10/21] blockjobs: add NULL state
Date: Mon, 12 Mar 2018 12:23:23 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 03/12/2018 12:07 PM, Kevin Wolf wrote:
> Am 12.03.2018 um 16:41 hat John Snow geschrieben:
>> On 03/12/2018 11:28 AM, Kevin Wolf wrote:
>>> Am 10.03.2018 um 09:27 hat John Snow geschrieben:
>>>> Add a new state that specifically demarcates when we begin to permanently
>>>> demolish a job after it has performed all work. This makes the transition
>>>> explicit in the STM table and highlights conditions under which a job may
>>>> be demolished.
>>>>
>>>> Alongside this state, add a new helper command "block_job_decommission",
>>>> which transitions to the NULL state and puts down our implicit reference.
>>>> This separates instances in the code for "block_job_unref" which merely
>>>> undo a matching "block_job_ref" with instances intended to initiate the
>>>> full destruction of the object.
>>>>
>>>> This decommission action also sets a number of fields to make sure that
>>>> block internals or external users that are holding a reference to a job
>>>> to see when it "finishes" are convinced that the job object is "done."
>>>> This is necessary, for instance, to do a block_job_cancel_sync on a
>>>> created object which will not make any progress.
>>>>
>>>> Now, all jobs must go through block_job_decommission prior to being
>>>> freed, giving us start-to-finish state machine coverage for jobs.
>>>>
>>>>
>>>> Transitions:
>>>> Created -> Null: Early failure event before the job is started
>>>> Concluded -> Null: Standard transition.
>>>>
>>>> Verbs:
>>>> None. This should not ever be visible to the monitor.
>>>>
>>>>              +---------+
>>>>              |UNDEFINED|
>>>>              +--+------+
>>>>                 |
>>>>              +--v----+
>>>>    +---------+CREATED+------------------+
>>>>    |         +--+----+                  |
>>>>    |            |                       |
>>>>    |         +--v----+     +------+     |
>>>>    +---------+RUNNING<----->PAUSED|     |
>>>>    |         +--+-+--+     +------+     |
>>>>    |            | |                     |
>>>>    |            | +------------------+  |
>>>>    |            |                    |  |
>>>>    |         +--v--+       +-------+ |  |
>>>>    +---------+READY<------->STANDBY| |  |
>>>>    |         +--+--+       +-------+ |  |
>>>>    |            |                    |  |
>>>> +--v-----+   +--v------+             |  |
>>>> |ABORTING+--->CONCLUDED<-------------+  |
>>>> +--------+   +--+------+                |
>>>>                 |                       |
>>>>              +--v-+                     |
>>>>              |NULL<---------------------+
>>>>              +----+
>>>>
>>>> Signed-off-by: John Snow <address@hidden>
>>>
>>>> +static void block_job_decommission(BlockJob *job)
>>>> +{
>>>> +    assert(job);
>>>> +    job->completed = true;
>>>> +    job->busy = false;
>>>> +    job->paused = false;
>>>> +    job->deferred_to_main_loop = true;
>>>
>>> Why do we set all of these fields now? I don't see the use of it, and
>>> overwriting fields here potentially makes debugging harder.
>>>
>>> Especially for deferred_to_main_loop I might expect an assert() that it
>>> already is true, but shouldn't setting it always be done while actually
>>> deferring to the main loop?
>>>
>>> Can we turn all of these assignments into asserts or are there some that
>>> actually aren't already guaranteed, but that we want anyway?
>>>
>>>> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
>>>> +    block_job_unref(job);
>>>> +}
>>>
>>> Kevin
>>>
>>
>> Gonna be real honest; we probably only need to set maybe one field
>> (job->completed = true) but it was late and I started hitting things
>> with big hammers.
>>
>> The problem is that if jobs do not look "done" to functions like
>> finish_sync, they will loop forever trying to make progress on a job
>> that doesn't do anything.
>>
>> I set a bunch of fields here more as a semantic statement than a
>> necessity, to be really really honest. ("Well, the job definitely has
>> these properties if it made it here, so let's update these fields to be
>> correct and the rest of the code will hopefully Do The Right Thing.")
>
> So essentially, we want this to be assert(), but currently that breaks
> for some reasons and we can't figure out why before the freeze?
>

Nah, I knew exactly why it broke.

> I guess that's fair enough, but then it would be good to use the freeze
> period to find the offenders and actually turn it into assertions.
>
> Kevin
>

I appear to be horridly confused, and you haven't seen the intermediate
mess that caused my confusion. A veritable maelstrom of confusion. Mr
Babbage would not be able to rightly comprehend, &c.

Let's give this another shot.

I added that code at a time when my local branch was not calling
block_job_completed, because I declared in v4's STM that a pre-created
job "shall not pass go, and shall not collect $200" -- that CREATED jobs
should either go to RUNNING or NULL.
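
For reference, the v5 graph quoted above can be written out as a plain
lookup table. This is only a sketch of the diagram as drawn in the
commit message: the enum, the table and transition_allowed() are names
made up for illustration and are not the ones used in blockjob.c.

```c
#include <stdbool.h>

/* Hypothetical model of the diagram; not QEMU's actual status enum. */
typedef enum {
    ST_UNDEFINED, ST_CREATED, ST_RUNNING, ST_PAUSED, ST_READY,
    ST_STANDBY, ST_ABORTING, ST_CONCLUDED, ST_NULL, ST_COUNT
} JobState;

/* allowed[from][to] is true when the diagram has an edge from -> to. */
static const bool allowed[ST_COUNT][ST_COUNT] = {
    [ST_UNDEFINED] = { [ST_CREATED] = true },
    [ST_CREATED]   = { [ST_RUNNING] = true, [ST_ABORTING] = true,
                       [ST_NULL] = true },
    [ST_RUNNING]   = { [ST_PAUSED] = true, [ST_READY] = true,
                       [ST_ABORTING] = true, [ST_CONCLUDED] = true },
    [ST_PAUSED]    = { [ST_RUNNING] = true },
    [ST_READY]     = { [ST_STANDBY] = true, [ST_ABORTING] = true,
                       [ST_CONCLUDED] = true },
    [ST_STANDBY]   = { [ST_READY] = true },
    [ST_ABORTING]  = { [ST_CONCLUDED] = true },
    [ST_CONCLUDED] = { [ST_NULL] = true },
    /* ST_NULL has no outgoing edges: the job is about to be freed. */
};

static bool transition_allowed(JobState from, JobState to)
{
    return allowed[from][to];
}
```

In this model only CREATED and CONCLUDED can reach NULL, which matches
the two transitions listed in the commit message.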

The discovery here is that directly decommissioning a created job
actually breaks finish_sync because it polls on the completed boolean,
which nothing ever sets. So, under the reasoning I gave you in my last
reply:

"I'm simply setting these booleans based on the facts of the state
machine at this point: we ARE completed, we AREN'T busy, we AREN'T
paused, and we have technically now deferred back to the main loop (we
never entered it.)"

this was enough for finish_sync to get out of the way, but there were
other problems with the approach -- in releases before 2.12, a CREATED
job has called the abort/commit callbacks, so I went back to the 2.11
style and added the appropriate transition into the graph. I forgot to
check if this code was still necessary at that point.

So actually, as of right now, these lines in decommission are useless;
but they shouldn't be assertions ... they might fail in a few cases. For
instance, "deferred to main loop" won't be set when we cancel a CREATED job.
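
To make the finish_sync problem concrete, here is a toy model of it. It
is a sketch under stated assumptions, not the code in blockjob.c:
ModelJob, model_finish_sync() and model_decommission() are hypothetical
names, and the poll loop is bounded by max_polls only so that the model
terminates where the real wait would spin.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the four job fields discussed above. */
typedef struct {
    bool completed;
    bool busy;
    bool paused;
    bool deferred_to_main_loop;
} ModelJob;

/*
 * Models a finish_sync-style wait: keep polling until the job reports
 * completion. Returns true if completion was observed within max_polls
 * iterations, false if the wait would still be spinning.
 */
static bool model_finish_sync(const ModelJob *job, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (job->completed) {
            return true;
        }
        /* The real loop would drive the event loop here (assumption). */
    }
    return false;
}

/* Mirrors the field assignments from the patch hunk quoted above. */
static void model_decommission(ModelJob *job)
{
    job->completed = true;
    job->busy = false;
    job->paused = false;
    job->deferred_to_main_loop = true;
}
```

With a zero-initialized ModelJob (a job that was created but never
started), model_finish_sync() gives up without seeing completion; after
model_decommission() it returns on the first poll. Of the four
assignments, only completed matters to the wait in this model.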