qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-6.0? 1/3] job: Add job_wait_unpaused() for block-job-comp


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH for-6.0? 1/3] job: Add job_wait_unpaused() for block-job-complete
Date: Fri, 9 Apr 2021 13:07:08 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.9.0

09.04.2021 12:51, Max Reitz wrote:
On 08.04.21 19:26, Vladimir Sementsov-Ogievskiy wrote:
08.04.2021 20:04, John Snow wrote:
On 4/8/21 12:58 PM, Vladimir Sementsov-Ogievskiy wrote:
job-complete command is async. Can we instead just add a boolean like 
job->completion_requested, and set it if job-complete called in STANDBY state, 
and on job_resume job_complete will be called automatically if this boolean is 
true?

job_complete has a synchronous setup, though -- we lose out on a lot of 
synchronous error checking in that circumstance.

yes, that's a problem..


I was not able to audit it to determine that it'd be safe to attempt that setup 
during a drained section -- I imagine it won't work and will fail, though.

So I thought we'd have to signal completion and run the setup *later*, but what do we do 
if we get an error then? Does the entire job fail? Do we emit some new event? 
("BLOCK_JOB_COMPLETION_FAILED" ?) Is it recoverable?


Isn't it possible even now, that after successful job-complete job still fails 
and we report BLOCK_JOB_COMPLETED with error?

And actually, how much benefit user get from the fact that job-complete may 
fail?

We can make job-complete a simple always-success boolean flag setter like 
job-pause.

I wanted to say the following:

  But job-pause does always succeed, in contrast to block-job-complete.

  block-job-complete is more akin to job-finalize, which too is a
  synchronous operation.

But when I wrote that last sentence, I asked myself whether what 
mirror_complete() does isn’t actually a remnant of what we had to do when we 
didn’t have job-finalize yet.  Shouldn’t that all be in mirror_exit_common()?  
What’s the advantage of opening the backing chain or putting blockers on the 
to-replace node in block-job-complete? Aren’t that all graph-changing 
operation, basically, i.e. stuff that should be done in job-finalize?

If we move everything to mirror_exit_common(), all that remains to do is 
basically set some should_complete flag (could even be part of the Job struct), 
and then the whole problem disappears.

Thoughts?


Sounds good.. ButI want to understand first one simple thing: can job fail even 
after block-job-complete succeeded?

As I understand current users think that it can't. And block-job-complete is documented 
as "This command completes an active background block operation synchronously". 
So it's assumed that if block-job-complete succeeded we are totally done.

But maybe, it's wrong? Can mirror_prepare fail after mirror_complete success? 
And user must check job status after job is finalized? Or check error in 
BLOCK_JOB_COMPLETED event?



--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]