qemu-devel



From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [PATCH qemu v7 06/14] spapr_iommu: Introduce "enabled" state for TCE table
Date: Wed, 27 May 2015 01:00:15 +1000
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 05/27/2015 12:24 AM, Paolo Bonzini wrote:


On 26/05/2015 16:17, Alexey Kardashevskiy wrote:
On 05/27/2015 12:03 AM, Paolo Bonzini wrote:


On 26/05/2015 16:00, Alexey Kardashevskiy wrote:
On 05/26/2015 11:48 PM, Paolo Bonzini wrote:


On 26/05/2015 15:42, Alexey Kardashevskiy wrote:


The next patch of this patchset changes:
spapr_tce_table_do_enable()
       memory_region_init_iommu(&iommu)
       memory_region_add_subregion(&root, &iommu)

spapr_tce_table_disable()
       memory_region_del_subregion(&root, &iommu)
       object_unref(&iommu)

These spapr_tce_xxx are called by request from the guest. &root is a
container and exists as long as sPAPRTCETable exists.

Where do I get a leaking child property here?

When you unref iommu and not unparent it.  The next
memory_region_init_iommu creates a second child property, and the first
is gone.

But when do I get this child property? In memory_region_add_subregion()?
And memory_region_del_subregion() does not do the opposite thing
(unparent)?

In memory_region_init_iommu.

Ah. So I need at least s/object_unref/object_unparent/ in my current
code, right?

Yes, and then you hit the situation documented in docs/memory.txt.

Oh. ok.
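
The leak discussed above can be pictured with a toy model. This is only a sketch: ToyParent, ToyChild and every toy_* function are hypothetical stand-ins invented for illustration, not QEMU's Object or MemoryRegion API. It assumes, as the exchange suggests, that initializing the region with an owner adds a child property on the parent, and that only unparenting removes that property again.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PROPS 16

/* Toy stand-ins; these are NOT QEMU's Object or MemoryRegion types. */
typedef struct ToyChild {
    int refcount;
} ToyChild;

typedef struct ToyParent {
    ToyChild *props[TOY_MAX_PROPS]; /* child properties held by the parent */
    size_t nprops;
} ToyParent;

/* Models init-with-owner: the parent gains a child property that holds
 * the only reference to the child. */
static void toy_init_child(ToyParent *p, ToyChild *c)
{
    assert(p->nprops < TOY_MAX_PROPS);
    c->refcount = 1;
    p->props[p->nprops++] = c;
}

/* Drops a reference but leaves the parent's property in place. */
static void toy_unref(ToyChild *c)
{
    c->refcount--;
}

/* Removes the parent's property, which also drops its reference. */
static void toy_unparent(ToyParent *p, ToyChild *c)
{
    for (size_t i = 0; i < p->nprops; i++) {
        if (p->props[i] == c) {
            p->props[i] = p->props[--p->nprops];
            c->refcount--;
            return;
        }
    }
}

/* Enable/disable the window `rounds` times, disabling with unref.
 * Returns how many child properties are left on the parent. */
size_t toy_cycle_unref(int rounds)
{
    ToyParent root = { .nprops = 0 };
    ToyChild iommu = { 0 };

    for (int i = 0; i < rounds; i++) {
        toy_init_child(&root, &iommu);
        toy_unref(&iommu);
    }
    return root.nprops;
}

/* Same cycle, but disabling with unparent. */
size_t toy_cycle_unparent(int rounds)
{
    ToyParent root = { .nprops = 0 };
    ToyChild iommu = { 0 };

    for (int i = 0; i < rounds; i++) {
        toy_init_child(&root, &iommu);
        toy_unparent(&root, &iommu);
    }
    return root.nprops;
}
```

In this model every unref-based cycle leaves one more stale property on the parent, while the unparent-based cycle leaves none.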


Why do you need different regions?  Why can't you have always the same
IOMMU regions, and either:

Their size may change.

That's not a problem, there's memory_region_set_size for that.


It was not there when I started doing this DDW :) If so, I can keep the existing structure and just set size to zero instead of memory_region_del_subregion().
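
A toy model of the "set size to zero" idea, for illustration only. ToyRegion and the toy_* helpers are hypothetical and do not call into QEMU; the sketch merely assumes that a region of size zero claims no bus address, which is why resizing could stand in for removing the subregion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a window on the bus; not QEMU's MemoryRegion. */
typedef struct ToyRegion {
    uint64_t base; /* bus address where the window starts */
    uint64_t size; /* zero means the window is effectively disabled */
} ToyRegion;

/* Resize the window in place; the object itself stays alive. */
void toy_region_set_size(ToyRegion *r, uint64_t size)
{
    assert(r != NULL);
    r->size = size;
}

/* True if the window would handle an access at `addr`. */
bool toy_region_claims(const ToyRegion *r, uint64_t addr)
{
    assert(r != NULL);
    return addr >= r->base && addr - r->base < r->size;
}
```

With this shape the region object is created once and never torn down, so the parenting question does not come up at all.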


These are dynamic DMA windows; the guest may remove all of them
and create new ones at any time. Each region is backed by a separate
TCE table with a different page size.

Okay.

1) create/destroy an alias to that region

How does this change things compared to iommus in regard to parenting?

Aliases do not have the same restriction.  But this doesn't help your
case if you have separate TCE tables etc.

I need windows to appear and disappear on a bus dynamically, that's it. The actual sPAPRTCETable objects always exist. Aliases will do the job as far as I can tell.

2) change the behavior of the translation function, while keeping a
single region?

Have one sPAPRTCETable object with 0, 1 or 2 (and potentially more)
actual TCE tables? I can do that too but I thought subregions are just
natural for that.

They may be.  You may need more than one though.

I fail to see when :)


What guest actions trigger the change?  Is it a hypercall?  If so, what
hypercall is it so I can look at the documentation?

It is a bunch of RTAS calls which are highly classified in PAPR spec :)

Linux guests do this:
1. load a driver
2. driver calls set_dma_mask()
3. if mask < 64, usual old-style &dma_iommu_ops is used; exit
4. platform code calls enable_ddw()
5. enable_ddw() looks at PHB "ddw-applicable"
6. enable_ddw() calls ibm,query-pe-dma-window (returns page mask supported)
7. enable_ddw() calls ibm,create-pe-dma-window to create the actual window with a specific size (the entire guest RAM in the case of Linux, but it might differ for other OSes) and learns its bus address (RTAS returns it, the guest does not choose it)
8. enable_ddw() calls H_PUT_TCE in a loop to map all guest RAM pages onto the bus and does set_dma_ops(dev, &dma_direct_ops), so H_PUT_TCE is not called again until the guest reboots.

If any step in 5..8 fails, then &dma_iommu_ops is used.
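
The fallback rule in the steps above can be sketched as a small decision function. This is not the kernel's code: ToyDdwResult, ToyDmaOps and toy_select_dma_ops() are hypothetical names, and the outcome of each RTAS or hypercall step is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which DMA ops the guest ends up with (names are illustrative only). */
typedef enum {
    TOY_DMA_IOMMU_OPS,  /* old-style path, H_PUT_TCE per mapping */
    TOY_DMA_DIRECT_OPS  /* all guest RAM premapped through the new window */
} ToyDmaOps;

/* Outcome of steps 5..8, each reduced to success or failure. */
typedef struct ToyDdwResult {
    bool ddw_applicable; /* step 5: PHB advertises "ddw-applicable" */
    bool query_ok;       /* step 6: ibm,query-pe-dma-window succeeded */
    bool create_ok;      /* step 7: ibm,create-pe-dma-window succeeded */
    bool map_all_ok;     /* step 8: H_PUT_TCE loop mapped all guest RAM */
} ToyDdwResult;

ToyDmaOps toy_select_dma_ops(unsigned mask_bits, const ToyDdwResult *r)
{
    assert(r != NULL);

    /* Step 3: a mask narrower than 64 bits never tries DDW. */
    if (mask_bits < 64) {
        return TOY_DMA_IOMMU_OPS;
    }
    /* Any failure in steps 5..8 falls back to the IOMMU ops. */
    if (!r->ddw_applicable || !r->query_ok ||
        !r->create_ok || !r->map_all_ok) {
        return TOY_DMA_IOMMU_OPS;
    }
    return TOY_DMA_DIRECT_OPS;
}
```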

The pseries platform expects the default DMA window (4K pages, <=2GB) to exist. There is also an extra ibm,remove-pe-dma-window call which removes any window (including the default one), so that a following ibm,create-pe-dma-window creates a new window at zero offset on the bus (as big as the guest RAM and with a page size bigger than 4K).

Aaaaand there is an extension - ibm,reset-pe-dma-window which should delete all windows and create the default one (kernels before v3.10 or so used to do this). The machine reset should do the same thing.
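
The window life cycle described above (default window, remove, create, reset) can be modelled roughly as follows. Everything here is an assumption made for illustration: the ToyPhb and ToyWindow types, the two-slot limit, the rule that slot 0 stands for the window at bus offset zero, and the 1 GiB default size (the text only says <=2GB). It is not the QEMU or PAPR implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_WINDOWS 2 /* assumed limit, for illustration only */

typedef struct ToyWindow {
    bool enabled;
    uint32_t page_shift; /* 12 means 4K pages */
    uint64_t size;       /* window size in bytes */
} ToyWindow;

typedef struct ToyPhb {
    ToyWindow win[TOY_MAX_WINDOWS]; /* slot 0 models bus offset zero */
} ToyPhb;

/* Models ibm,reset-pe-dma-window and machine reset:
 * drop every window, then recreate the default one. */
void toy_phb_reset(ToyPhb *phb)
{
    assert(phb != NULL);
    memset(phb, 0, sizeof(*phb));
    phb->win[0].enabled = true;
    phb->win[0].page_shift = 12;      /* 4K pages */
    phb->win[0].size = 1ULL << 30;    /* assumed 1 GiB, within <=2GB */
}

/* Models ibm,create-pe-dma-window: the platform, not the guest,
 * picks the slot. Returns the slot index, or -1 if none is free. */
int toy_window_create(ToyPhb *phb, uint32_t page_shift, uint64_t size)
{
    assert(phb != NULL);
    for (int i = 0; i < TOY_MAX_WINDOWS; i++) {
        if (!phb->win[i].enabled) {
            phb->win[i].enabled = true;
            phb->win[i].page_shift = page_shift;
            phb->win[i].size = size;
            return i;
        }
    }
    return -1;
}

/* Models ibm,remove-pe-dma-window: any window, including the
 * default one, may be removed. Returns 0 on success, -1 otherwise. */
int toy_window_remove(ToyPhb *phb, int idx)
{
    assert(phb != NULL);
    if (idx < 0 || idx >= TOY_MAX_WINDOWS || !phb->win[idx].enabled) {
        return -1;
    }
    phb->win[idx].enabled = false;
    return 0;
}
```

In this model the window slots exist for the lifetime of the PHB and are only enabled or disabled, which is the shape being debated in the rest of this thread.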



I even wanted to create sPAPRTCETables dynamically but
this would break migration (because we cannot start QEMU with an
additional sPAPRTCETable if it exists in the source which is not always
the case).

Creating sPAPRTCETables dynamically would be a fix as well.  You _can_
unparent the sPAPRTCETable whenever you want.  But it's not necessarily
the right solution.

Why does it break migration?  There is only one migration handler for
all htabs, I think.  Or is this a different thing than the htabs?


sPAPRTCETable stores the actual table and if I want it to migrate, the destination QEMU must have the object created-and-vmstate_register'ated. But the table (and class) may be absent or present on the source side so I need to start the destination with or without -device sPAPRTCETable, and if I need to create this object, I need to make it a child of a PHB and last time I checked - there is no command line interface for linking children.



The sPAPRTCETable would be created in its parent device's post_load handler.

Ok. I'll redo this thing again and try using less QOM objects...

Wait, I haven't understood the problem yet.

Oookay :)

But I started thinking that always having 2 sPAPRTCETable objects (some of which may be "disabled") is not better than a single sPAPRTCETable with multiple TCE tables...


--
Alexey


