[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Jobs 2.0 QAPI [RFC]
From: |
John Snow |
Subject: |
Re: [Qemu-devel] Jobs 2.0 QAPI [RFC] |
Date: |
Fri, 18 Dec 2015 16:24:47 -0500 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 |
On 12/18/2015 01:07 PM, Eric Blake wrote:
> On 12/16/2015 05:50 PM, John Snow wrote:
>> In working through a prototype to enable multiple block jobs. A few
>> problem spots in our API compatibility become apparent.
>>
>
>> ==== qmp_query_block_jobs ====
>>
>
>> Let's consider using a new command. I'm not sure if this is strictly
>> possible with current QAPI, but Eric will yell at me if it isn't --
>> forgive the wiggly pseudo-specification.
>>
>> query-jobs
>>
>> Will return a list of all jobs.
>>
>> query-jobs sys="block"
>>
>> Will return a list of all block jobs. This will be the only valid
>> subsystem to start with -- we don't have non-block jobs yet and it may
>> be some time before we do.
>>
>> Each subsystem will have a sys= enumeration, and the jobs for that
>> particular subsystem can subclass the default return type. The QAPI
>> commands can be defined to accept a union of the subclasses so we don't
>> have to specify in advance which type each command accepts, but we can
>> allow the responsible subsystem to interpret it on demand dynamically.
>>
>> The base type can be something along the lines of:
>>
>> { 'struct': 'JobInfo',
>> 'data': {
>> 'type': JobTypeEnum,
>> 'sys': JobSysEnum,
>> 'id': str,
>> 'busy': bool,
>> 'paused': bool,
>> 'ready': bool
>> }
>> }
>>
>> And the BlockJobInfo can inherit/extend, looking like this:
>>
>> { 'struct': 'BlockJobInfo',
>> 'base': JobInfo
>> 'data': {
>> 'len': int,
>> 'offset': int,
>> 'speed': 'int',
>> 'io-status': BlockDeviceIoStatus,
>> 'devices': <see below>,
>> }
>> }
>
> Done in QAPI via flat unions. All the common fields, including the
> discriminator 'sys', are present in the base type, and then each
> subclass of a job adds additional parameters to the QMP type by
> specifying their subtype as a branch of the union. So should be doable.
>
>>
>> 'devices' will now report a more nuanced view of this Job's associated
>> nodes in our graph, as in the following:
>>
>> { "devices": [
>> { "name": "drive0",
>> "nodes": [
>> { "id": "#node123",
>> "permissions": "+RW-X"
>> },
>> { "id": "#node456",
>> "permissions": "+R"
>> }
>> ]
>> }
>> }
>>
>> This lets us show exactly what nodes we are touching, what BlockBackends
>> they belong to, and what permissions are involved. Unambiguously. You
>> can use your own imagination for a permissions string that will make
>> sense -- This depends on Jeff Cody's work primarily.
>
> We'd probably want it to look more JSON-y, maybe something like:
>
> { "id": "#node456",
> "permissions": [ { "name":"read", "mode":"require" },
> { "name":"write", "mode":"allow" },
> { "name":"meta", "mode":"unspecified" } ] }
>
> but I agree that how it is represented does not affect the proposal here
> of merely representing a list of all associated BDS affected by this
> job, broken out by permissions per BDS (as some jobs will affect
> multiple BDS, and not all of them with the same permissions).
>
And as not shown here, some may affect multiple devices, so this should
give the full granularity.
The more JSON-y permissions is fine, also. It will likely evolve to
match whatever Jeff Cody needs for his fine-grained op blocker idea.
>>
>> The new command gives a chance to make a clean break and any users of
>> this new command won't be confused about the information they receive.
>>
>> The legacy qmp-query-block-jobs command can continue to operate
>> providing a best-effort approximation of the data it finds. The legacy
>> command will *abort* and return error if it detects any job that was
>> launched by a new-style command, e.g. if it detects there is more than
>> one job attached to any node under a single BlockBackend -- clearly the
>> user has been using new features -- and should abort announcing this
>> command is deprecated.
>
> Makes sense. If an old client only uses the old commands, things will
> still work; while a new client, once converted to use the new commands,
> should do the conversion completely and not rely on the old view.
>
Yes. All or nothing. I prefer this approach to an incomplete report. I
shouldn't have used "best-effort" above; I should have said:
"The legacy command will report back jobs if it has sufficient
resolution to do so." e.g. no conflicting device or BDS id results to
the query.
I don't very much like the idea of a query command that can give you
incomplete data, especially to an unwitting querier.
>>
>> If the graph looks reasonably plain, it can report back the information
>> it always has, except that it will now also report the "id" field in
>> addition to be perfectly unambiguous about how to issue commands to this
>> job, like I outlined in the first paragraph of this section.
>
> Old clients won't care about the new information, but it doesn't hurt to
> supply it, especially if it lets us share code with the new query command.
>
That's the thought. The wrapper will investigate the list for
suitability and terminate if it finds duplicate BB/BDS IDs.
Though, I'll need to fill in the legacy device field anyway, so I could
just strip out the "devices" graph
>>
>>
>> ==== block-job-cancel ====
>>
>> This command can remain as a legacy command. Provided the "device"
>> provided has precisely one job attached, we can allow this command to
>> cancel it. If more complicated configurations are found, abort and
>> encourage the user to use a new API.
>
> Again, won't affect old client usage patterns, and new clients should
> make the lock-step conversion to using only the new interface. Sounds
> reasonable.
>
>>
>> We can refuse to augment block-job-cancel; forcing users who want to
>> specify IDs to cancel to use the new API.
>>
>> We can add a new "job-cancel" command which removes the block specificity.
>
> Adding a new job-cancel sounds right.
>
>>
>> We can introduce sub-classed arguments as needed for e.g. BlockJob
>> cancellations:
>>
>> { 'command': 'job-cancel',
>> 'data': { 'id': str,
>> '*force': bool,
>> 'options': JobCancelOpts
>> }
>> }
>>
>> The idea is that JobCancelOpts would be a subclassable type that can be
>> passed directly to the subsystem's implementation of the cancel
>> operation for further interpretation, so different subsystems can get
>> relevant arguments as needed. We don't currently have any such need for
>> BlockJobCancelOpts, but it can't hurt to be prepared for it.
>
> To be subclassable, we need a flat union, and the discriminator must be
> non-optional within that union. Either 'options' is that flat union
> (and we have a layer of {} nesting in QMP}, or we directly make the
> 'data' of job-cancel be the flat union (no need for an "options":{}
> nesting in QMP); the latter still depends on some of my pending patches,
> but we'll get there in plenty of time.
>
Ah, shame. It would have been nice to delay interpretation of the union
pending the result of the Job lookup.
If the discriminator must always be present, though, then so be it.
I am fine with either approach; the approach outlined above was a hope
to delay the evaluation of the union to allow for type resolution based
on the ID. If that's a non-starter, I'm fine with the flat union:
{ 'id': str,
'sys': str,
'*force': bool,
<subclass options>
}
where if the 'sys' of the job you find is not the sys you supplied, then
it errors out with "Wrong subsystem ID provided for job '#job123',
expected 'blk', given 'abc'" or so.
>>
>> The type can be interpreted through the lens of whichever type the job
>> we've identified expects to see. (Block Jobs expect to see
>> BlockJobCancelOpts, etc.)
>>
>>
>> ==== block-job-pause, block-job-resume, block-job-complete ====
>>
>> Same story as above. Introduce job-pause, job-resume and job-complete
>> with subclassible job structures we can use for the block subsystem.
>>
>> Legacy commands continue to operate only as long as the "device"
>> argument is unambiguous to decipher.
>
> Seems reasonable. As you surmised, I'm willing to help come up with the
> proper QAPI representation, if that becomes a sticking point in the design.
>
I'll give it a college try and send some actual patches for x-job-query,
x-job-cancel, etc.
For the flat union support on the top level as hinted for job-cancel,
what branch of yours should I develop on top of?
>>
>>
>> ==== block-job-set-speed ====
>>
>> This one is interesting since it definitely does apply only to Block Jobs.
>>
>> We can augment it and limp it along, allowing e.g.
>>
>> { 'data':
>> { '*device': 'str',
>> '*id': 'str',
>> 'speed': 'int'
>> }
>> }
>>
>> Where we follow the "One of ID or Device but not both nor either"
>> pattern that we've applied in the past.
>
> Or even "at least one of ID or Device, and if you pass both, they must
> both resolve to the same object". But "exactly one" works - old clients
> will pass "device", and it will always resolve (because they never used
> the new API to confuse things); new clients will pass only "id" and it
> will be the right thing.
>
Less work and never potentially ambiguous.
>>
>> Or, we can say "If there isn't one job per device, reject the command
>> and use the new API"
>>
>> where the new API is:
>>
>> job-set-option :
>> { 'id': str,
>> 'options': {
>> '*etc': ...,
>> '*etc2': ...,
>> }
>> }
>
> That has a bit more flexibility, especially if we add new options for
> other commands, more than just block-job speed.
>
>>
>> Similar to the other commands, the idea would be to take the subclassed
>> options structure that belongs to that Job's Subclass (e.g.
>> BlockJobOptions) but allow any field inside to be absent.
>>
>> Then we could have commands like this:
>>
>> job-set-option \
>> arguments={ 'id': '#job789',
>> 'options': {
>> 'speed': 10000
>> }
>> }
>>
>> And so on. These would remain a flexible way of setting any various
>> post-launch options a job would need, and the relevant job-subsystem can
>> be responsible for rejecting certain options, etc.
>
> Keeping type-safety via a flat union may require it to look more like:
>
> job-set-option \
> arguments={ 'id': '#job789',
> 'sys': 'block',
> 'speed': 10000
> }
>
> where the use of the 'sys' discriminator tells what other fields are
> being supplied, and we can avoid the "options":{} nesting. Then we'd
> need a sanity check in the code that the 'sys' requested by
> job-set-option matches the sys that owns '#job789', and fail if they differ.
>
Yes, I was assuming again we could resolve the job first and then
interpret the data, but if it's easiest to always require the
discriminator (instead of having to /look/ for it dynamically), then
your outline is perfectly acceptable to me.
>>
>> I believe that covers all the existing jobs interface in the QMP, and
>> that should be sufficiently vague and open-ended enough to behave well
>> with a generic jobs system if we roll one out in the future.
>>
>> Eric, Markus: Please inform me of all the myriad ways my fever dreams
>> for QAPI are wrong. :~)
>
> At this stage, your concepts seem reasonable, even if the concrete way
> of representing a subclass in qapi is not quite spelled out.
>
Thanks!
I'll get rolling on some x-versions to mix alongside my multi-block-job
patches.
--js