Re: [Qemu-devel] [PATCH v2 09/11] blockjob: add block_job_start


From: John Snow
Subject: Re: [Qemu-devel] [PATCH v2 09/11] blockjob: add block_job_start
Date: Mon, 17 Oct 2016 14:00:21 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0



On 10/06/2016 06:44 PM, John Snow wrote:


On 10/05/2016 11:17 AM, Kevin Wolf wrote:
On 01.10.2016 at 00:00, John Snow wrote:
Instead of automatically starting jobs at creation time via backup_start
et al, we'd like to return a job object pointer that can be started
manually at a later point in time.

For now, add the block_job_start mechanism and start the jobs
automatically as we have been doing, with job-by-job conversions coming
in later patches.

Of note: cancellation of unstarted jobs will perform all the normal
cleanup as if the job had started, particularly abort and clean. The
only difference is that we will not emit any events, because the job
never actually started.

Signed-off-by: John Snow <address@hidden>

Should we make sure that jobs are only added to the block_jobs list once
they are started? It doesn't sound like a good idea to make a job
without a coroutine user-accessible.

Kevin


That would certainly help limit exposure to a potentially dangerous
object, but some operations on these unstarted jobs, like cancellation,
are still perfectly reasonable. Even things like "set speed" make sense
on a job that hasn't started yet.

I'd rather just verify that an accessible job cannot be operated on
unsafely via the public interface, even though that's more work.
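
To make that checking concrete, here is a tiny, self-contained model of the
kind of guard I have in mind. This is not the QEMU code (the "co" field and
the model_* names are only stand-ins), but it shows why a public entry point
that looks for the coroutine first is harmless on an unstarted job:

#include <stdbool.h>
#include <stdio.h>

/* Stand-in for a block job; "co" stays NULL until the job is started. */
typedef struct ModelJob {
    void *co;
    bool cancelled;
} ModelJob;

/* A public entry point only pokes the coroutine if one actually exists. */
static void model_job_enter(ModelJob *job)
{
    if (!job->co) {
        return;   /* unstarted job: nothing to wake up, so do nothing */
    }
    printf("entering job coroutine\n");
}

int main(void)
{
    ModelJob job = { .co = NULL, .cancelled = false };
    model_job_enter(&job);   /* harmless no-op before start */
    return 0;
}

Most of the individual items below reduce to some variant of this pattern:
either the call does not need a coroutine at all, or it checks for one
(directly or indirectly) before doing anything.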

So here's the list:

-block_job_next: Harmless.

-block_job_get: Harmless.

-block_job_set_speed: Depends on the set_speed implementation, but
backup_set_speed makes no assumptions, and backup is the only job I am
attempting to convert in this series.

-block_job_cancel: Edited to specifically support pre-started jobs in
this patch.

-block_job_complete: Edited to check for a coroutine specifically, but
even without that check, a job cannot become ready without running first.

-block_job_query: Harmless* (*I could probably expose a 'started'
variable so that libvirt doesn't get confused as to why there are jobs
that exist but are neither busy nor paused.)

-block_job_pause: Harmless**

-block_job_user_pause: Harmless**

-block_job_user_paused: Harmless, if meaningless.

-block_job_resume: **We will attempt to call block_job_enter, but since
that is conditional on job->co, we do nothing. Harmless, if misleading,
in that you can pause/resume to your heart's content.

-block_job_user_resume: ^ http://i.imgur.com/2zYxrIe.png ^

-block_job_cancel_sync: Safe, due to edits to block_job_cancel. The
polling loop WILL complete as normal, because all jobs will finish
through block_job_completed one way or another.

-block_job_cancel_sync_all: As safe as the above.

-block_job_complete_sync: Safe; complete will return an error for
unstarted jobs.

-block_job_iostatus_reset: Harmless, I think -- backup does not
implement this method. (Huh, *no* jobs implement iostatus_reset at the
moment...)

-block_job_txn_new: Doesn't operate on jobs.

-block_job_txn_unref: Also doesn't operate on jobs.

-block_job_get_aio_context: Harmless.

-block_job_txn_add_job: Safe and intended! There is trickery here,
though: once a job is introduced into a transaction, it is opened up to
the private interface. This adds the following functions to the considerations:

-block_job_completed: Safe, does not assume a coroutine anywhere.

-block_job_completed_single: Specifically updated to be aware of whether
or not we are pre-started. This is the "real" completion mechanism for
BlockJobs that gets run for transactional OR individual jobs.

-block_job_completed_txn_abort: ***BUG***! The problem with the patch as
I've sent it is that cancel calls completed (under the premise that
nobody else would ever get the chance to), but we call both cancel AND
completed from this function, which will cause us to call completed
twice on pre-started jobs. I will fix this for the next version.


Actually, I was mistaken. The way I wrote it is fine, and it will work like this:

qmp_transaction encounters an error during prepare and calls the abort method for each action.

- The abort method calls block_job_cancel on the not-yet-started job,
- Which in turn calls block_job_completed with ret = -ECANCELED

which then either calls:
- block_job_completed_single with a failure mode (which calls .abort and .clean) if we are not in a transaction, or
- block_job_completed_txn_abort

txn_abort is going to:

1) Set txn->aborting to true, and
2) Cancel every other job in the transaction using block_job_cancel_sync.

These jobs will also come to txn_abort, but since txn->aborting is true, the call becomes a NOP.

Then, finally, we will call block_job_completed_single on every job in the transaction, which will perform our proper cleanup by calling .abort and .clean for each job.

No duplication, I was mistaken...!
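
To convince myself there is no double completion, here is a simplified,
standalone model of the control flow above. It is not the upstream
block_job_completed_txn_abort() (cancel and completion are collapsed into
tiny stand-ins), but it demonstrates how the txn->aborting flag turns the
recursive cancellations into NOPs, so each job is cleaned up exactly once:

#include <stdbool.h>
#include <stdio.h>

#define N_JOBS 3

typedef struct ModelTxn {
    bool aborting;
} ModelTxn;

typedef struct ModelJob {
    ModelTxn *txn;
    bool completed;   /* "completed_single already ran" */
    int id;
} ModelJob;

static ModelJob jobs[N_JOBS];

/* Stand-in for block_job_completed_single(): .abort and .clean run once. */
static void completed_single(ModelJob *job)
{
    job->completed = true;
    printf("job %d: abort + clean\n", job->id);
}

static void completed_txn_abort(ModelJob *job);

/* Cancelling an unstarted job funnels straight into the completion path. */
static void cancel(ModelJob *job)
{
    completed_txn_abort(job);
}

static void completed_txn_abort(ModelJob *job)
{
    ModelTxn *txn = job->txn;

    if (txn->aborting) {
        return;   /* already aborting: the recursive calls become NOPs */
    }
    txn->aborting = true;

    /* Cancel every other job in the transaction. */
    for (int i = 0; i < N_JOBS; i++) {
        if (&jobs[i] != job) {
            cancel(&jobs[i]);
        }
    }
    /* Then complete every job in the transaction exactly once. */
    for (int i = 0; i < N_JOBS; i++) {
        if (!jobs[i].completed) {
            completed_single(&jobs[i]);
        }
    }
}

int main(void)
{
    ModelTxn txn = { .aborting = false };

    for (int i = 0; i < N_JOBS; i++) {
        jobs[i] = (ModelJob){ .txn = &txn, .id = i };
    }
    /* qmp_transaction's abort path cancels each not-yet-started job;
     * only the first cancel does any real work, the rest are NOPs. */
    for (int i = 0; i < N_JOBS; i++) {
        cancel(&jobs[i]);
    }
    return 0;
}

Running this prints exactly one "abort + clean" line per job, which is the
"no duplication" property claimed above.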

-block_job_completed_txn_success: Should never be called; if it IS, the
presence of an unstarted job in the transaction will cause an early
return. And even if I am utterly wrong and every job in the transaction
completes successfully (somehow?), calling block_job_completed_single is
perfectly safe by design.
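
The early return mentioned in the last item can be modeled the same way.
Again, this is only a standalone sketch, not the upstream
block_job_completed_txn_success(); the "completed" flag stands in for the
real per-job bookkeeping:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct ModelJob {
    bool completed;
} ModelJob;

/* The success path only finalizes once every job in the txn has completed. */
static bool txn_success_may_finalize(ModelJob **txn_jobs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!txn_jobs[i]->completed) {
            return false;   /* early return: someone has not finished yet */
        }
    }
    return true;
}

int main(void)
{
    ModelJob started   = { .completed = true };
    ModelJob unstarted = { .completed = false };  /* never ran, never completed */
    ModelJob *txn[] = { &started, &unstarted };

    printf("finalize? %s\n", txn_success_may_finalize(txn, 2) ? "yes" : "no");
    return 0;
}

An unstarted job can never have completed, so its mere presence in the
transaction keeps the success path from running.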


Everything else is either internal to block job instances or the
blockjob core.


There may be:
(A) Bugs in my code/thinking, or
(B) Improvements we can make to the readability,

but I believe that this is (apart from the bug mentioned above) not dangerous.

--js



