Re: [Qemu-block] [Qemu-devel] [PATCH v2 04/11] blockjobs: Always use blo

qemu-block

From:	John Snow
Subject:	Re: [Qemu-block] [Qemu-devel] [PATCH v2 04/11] blockjobs: Always use block_job_get_aio_context
Date:	Thu, 6 Oct 2016 16:22:33 -0400
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0



On 10/05/2016 10:02 AM, Kevin Wolf wrote:
> On 01.10.2016 at 00:00, John Snow wrote:
>> There are a few places where we're fishing it out for ourselves.
>> Let's not do that and instead use the helper.
>>
>> Signed-off-by: John Snow <address@hidden>
>
> That change makes a difference when the block job is running its
> completion part after block_job_defer_to_main_loop(). The commit message
> could be more explicit about whether this is intentional or whether this
> case is expected to happen at all.
>
> I suspect that if it can happen, this is a bug fix. Please check and
> update the commit message accordingly.

Because I'm bad at being concise, I wrote a TLDR at the bottom.
Otherwise, enjoy this wall of text.

> Kevin


Intentional under the premise of:

(1) Acquiring the context for which a job is not actually running under
is likely incorrect (or at the very least misleading), and

(2) If using the main thread context for any would-be callers is
incorrect, this is a problem with the job lifetime that needs to be
corrected anyway.

In general, if we are acquiring the context to secure exclusive access
to the BlockJob state itself, using the getter here is perfectly safe.
If we are acquiring context for other reasons, we need to consider more
carefully.
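To make the premise concrete, here is a toy model of what the getter is meant to do. It is only a sketch: the types, the `deferred_to_main_loop` flag, and every `toy_` name are stand-ins I made up for illustration, not the real QEMU definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the real types; only what the sketch needs. */
typedef struct ToyAioContext {
    const char *name;
    int lock_depth;                 /* >0 while "acquired" */
} ToyAioContext;

typedef struct ToyBlockJob {
    ToyAioContext *blk_ctx;         /* context of the job's BlockBackend */
    bool deferred_to_main_loop;     /* assumed flag: completion moved to main */
    int pause_count;
} ToyBlockJob;

static ToyAioContext toy_main_ctx = { "main", 0 };

/* Return the context the job is actually running under right now. */
static ToyAioContext *toy_job_get_aio_context(ToyBlockJob *job)
{
    assert(job != NULL);
    return job->deferred_to_main_loop ? &toy_main_ctx : job->blk_ctx;
}

static void toy_ctx_acquire(ToyAioContext *ctx)
{
    ctx->lock_depth++;
}

static void toy_ctx_release(ToyAioContext *ctx)
{
    assert(ctx->lock_depth > 0);
    ctx->lock_depth--;
}

/* Touch job state only while holding the context returned by the getter. */
static void toy_job_pause(ToyBlockJob *job)
{
    ToyAioContext *ctx = toy_job_get_aio_context(job);

    toy_ctx_acquire(ctx);
    job->pause_count++;
    toy_ctx_release(ctx);
}
```

The point is just that callers which want exclusive access to the job state ask the job which context guards it, instead of reaching through to the BlockBackend themselves.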



The callers are:

(A) bdrv_drain_all (block/io)

Obtains context for the sake of pause/resume. Pauses all jobs before
draining all BDSes. For starters, pausing a job that has deferred to
main has no effect (and neither does resuming). This usage appears
slightly erroneous, though, in that if we are not running from the main
thread, we are definitely not securing exclusive rights to the block
job. We could, in theory, race on reads/writes to the pause count field.
This would be a bugfix.


(B) find_block_job (all monitor context)

        Acquires context as a courtesy for its callers:
        - qmp_block_job_set_speed
        - qmp_block_job_cancel
        - qmp_block_job_pause
        - qmp_block_job_resume
        - qmp_block_job_complete

In an "already deferred to main" sense... in general, if the job has
already deferred to main we don't need to acquire the block's context to
get safe access to the job, because we're already running in the main
context. Further, none of these functions actually have any meaning for
a job in such a state.

        - set_speed: Sets speed parameters, harmless either way.

        - cancel: Will set the cancelled boolean, reset iostatus, then
          attempt to enter the job. Since a job that enters the main
          loop remains busy, the enter is a NOP. The BlockBackend AIO
          context here is therefore extraneous, and the getter is safe.

        - pause: Only increments a counter, and will have no effect.

        - resume: Decrements a counter. Attempts to enter(), but as
          stated above this is a NOP.

        - complete: Calls .complete(), for which the only implementation
          is mirror_complete. Uh, this actually seems messy. Looks like
          there's nothing to prevent us from calling this after we've
          already told it to complete once. This could be a legitimate
          bug that this patch does nothing in particular to address. If
          complete() is shored up such that it can be called precisely
          once, this becomes safe.
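For the .complete() problem, one way to shore it up might look like the sketch below. The `complete_requested` flag and all the `Once`/`once_` names are hypothetical; this only illustrates the "call it at most once" guard, not what the real fix would have to look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct OnceJob OnceJob;

typedef struct OnceJobDriver {
    void (*complete)(OnceJob *job);  /* optional driver callback */
} OnceJobDriver;

struct OnceJob {
    const OnceJobDriver *driver;
    bool complete_requested;         /* hypothetical guard flag */
    int complete_calls;              /* bookkeeping for the example only */
};

/*
 * Ask the job to complete. Returns 0 if the request was passed on to the
 * driver, -1 if the driver has no .complete() or completion was already
 * requested earlier.
 */
static int once_job_complete(OnceJob *job)
{
    assert(job != NULL && job->driver != NULL);

    if (job->driver->complete == NULL || job->complete_requested) {
        return -1;
    }
    job->complete_requested = true;
    job->driver->complete(job);
    return 0;
}

/* Stand-in for a driver callback such as the mirror one. */
static void once_mirror_complete(OnceJob *job)
{
    job->complete_calls++;
}

static const OnceJobDriver once_mirror_driver = { once_mirror_complete };
```

With a guard like this, a second request is refused before it ever reaches the driver, so the driver callback does not need to be re-entrant.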


(C) qmp_query_block_jobs (monitor context)

        Just a getter. Using get_context is safe in either state.

(D) run_block_job (qemu-img)

Never called when the context is in the main loop anyway. Effectively
no change here.

So, with the exception of .complete, I think this is a safe change as it
stands... However... Paolo wants to complicate my life and get rid of
this getter for his own fiendish purposes. He suggests pushing down
context acquisition into blockjob.c directly for any QMP callers:


- qmp_block_job_set_speed -> block_job_set_speed
- qmp_block_job_cancel -> block_job_cancel
- qmp_block_job_pause -> block_job_user_pause
- qmp_block_job_resume -> block_job_user_resume
- qmp_block_job_complete -> block_job_complete
- qmp_query_block_jobs -> block_job_query

Most of these have only one caller in the QMP layer:

block_job_set_speed
block_job_user_pause
block_job_user_resume
block_job_query

These can easily just take the context they need, removing external uses
of job->blk for purposes of acquiring the context.


block_job_cancel and block_job_complete are different.

block_job_cancel is called in many places, but we can just add a similar
block_job_user_cancel if we wanted a version which takes care to acquire
context and one that does not. (Or we could just acquire the context
regardless, but Paolo warned me ominously that recursive locks are EVIL.
He sounded serious.)

block_job_complete has no direct callers outside of QMP, but it is also
used as a callback by block_job_complete_sync, used in qemu-img for
run_block_job. I can probably rewrite qemu-img to avoid this usage.
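Roughly, the cancel/user_cancel split could be sketched like this. Everything here is a toy: the `Locked` types and `locked_` functions are invented for the example, and the lock is deliberately non-recursive so that a double acquire trips an assertion instead of silently nesting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive context lock. */
typedef struct LockedCtx {
    bool held;
} LockedCtx;

typedef struct LockedJob {
    LockedCtx *ctx;
    bool cancelled;
} LockedJob;

static void locked_ctx_acquire(LockedCtx *ctx)
{
    assert(!ctx->held);              /* recursive acquisition is a bug */
    ctx->held = true;
}

static void locked_ctx_release(LockedCtx *ctx)
{
    assert(ctx->held);
    ctx->held = false;
}

/* Internal variant: the caller must already hold the job's context. */
static void locked_job_cancel(LockedJob *job)
{
    assert(job->ctx->held);
    job->cancelled = true;
}

/* Monitor-facing variant: takes the context itself, then delegates. */
static void locked_job_user_cancel(LockedJob *job)
{
    locked_ctx_acquire(job->ctx);
    locked_job_cancel(job);
    locked_ctx_release(job->ctx);
}
```

Internal callers that already hold the context keep using the plain variant; only the QMP path goes through the wrapper, so nobody needs a recursive lock.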




TLDR:

- This change should be perfectly safe, but Paolo wants to get rid of
  this usage anyway.

- At least 5/6 uses of external context grabbing can be internalized easily.

- qemu-img's run_block_job needs to be refactored a bit, though I don't
  have an idea for that yet, but as you pointed out it needs to be done
  for the public/private split anyway.

- block_job_complete needs to be touched up no matter what we do.

- The aio_context getter can probably be removed entirely as per Paolo's
  wishes, but I'll have to change bdrv_drain_all a bit. A
  block_job_pause_all and block_job_resume_all would work, though that's a
  bit special purpose. I could craft up a block_job_apply_all for the
  purpose instead. (e.g. block_job_apply_all(block_job_pause))
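The apply_all idea from the last point, as a toy sketch. The job list, the context counter, and all `list_`/`List` names are assumptions for illustration; the real helper would iterate whatever list blockjob.c keeps and take the real AioContext.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ListCtx {
    int lock_depth;                  /* stand-in for an acquired context */
} ListCtx;

typedef struct ListJob {
    ListCtx *ctx;
    int pause_count;
    struct ListJob *next;            /* toy singly linked list of jobs */
} ListJob;

static void list_job_pause(ListJob *job)
{
    job->pause_count++;
}

static void list_job_resume(ListJob *job)
{
    assert(job->pause_count > 0);
    job->pause_count--;
}

/* Run fn on every job, holding that job's context around the call. */
static void list_job_apply_all(ListJob *head, void (*fn)(ListJob *job))
{
    for (ListJob *job = head; job != NULL; job = job->next) {
        job->ctx->lock_depth++;      /* "acquire" */
        fn(job);
        job->ctx->lock_depth--;      /* "release" */
    }
}
```

bdrv_drain_all would then call something like list_job_apply_all(jobs, list_job_pause) before draining and the resume counterpart afterwards, without ever touching the context itself.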


I think that answers everyone's questions...
--js
