qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 2/2] nvme: allow cmb and pmr to be enabled on same device


From: Andrzej Jakowski
Subject: Re: [PATCH v4 2/2] nvme: allow cmb and pmr to be enabled on same device
Date: Tue, 7 Jul 2020 21:44:39 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0

On 7/6/20 12:15 AM, Klaus Jensen wrote:
> On Jul  2 16:33, Andrzej Jakowski wrote:
>> On 7/2/20 10:51 AM, Klaus Jensen wrote:
>>> On Jul  2 08:07, Andrzej Jakowski wrote:
>>>> On 7/2/20 3:31 AM, Klaus Jensen wrote:
>>>>> Aight, an update here. This only happens when QEMU is run with a virtual
>>>>> IOMMU. Otherwise, the kernel is happy.
>>>>>
>>>>> With the vIOMMU, qemu also craps out a bit:
>>>>>
>>>>> qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error 
>>>>> (iova=0xfd200000, level=0x2, slpte=0x0, write=0)
>>>>> qemu-system-x86_64: vtd_iommu_translate: detected translation failure 
>>>>> (dev=03:00:00, iova=0xfd200000)
>>>>>
>>>>> So I think we are back in QEMU land for the bug.
>>>>
>>>> Can you share command line for that?
>>>>
>>>>
>>>
>>> qemu-system-x86_64 \
>>>   -nodefaults \
>>>   -display none \
>>>   -device intel-iommu,pt,intremap=on,device-iotlb=on \
>>>   -machine type=q35,accel=kvm,kernel_irqchip=split \
>>>   -cpu host \
>>>   -smp 4 \
>>>   -m 8G \
>>>   -nic user,model=virtio-net-pci,hostfwd=tcp::2222-:22 \
>>>   -device virtio-rng-pci \
>>>   -drive 
>>> id=boot,file=/home/kbj/work/src/vmctl/state/pmr/boot.qcow2,format=qcow2,if=virtio,discard=on,detect-zeroes=unmap
>>>  \
>>>   -device pcie-root-port,id=pcie_root_port1,chassis=1,slot=0 \
>>>   -device x3130-upstream,id=pcie_upstream1,bus=pcie_root_port1 \
>>>   -device 
>>> xio3130-downstream,id=pcie_downstream1,bus=pcie_upstream1,chassis=1,slot=1 \
>>>   -drive 
>>> id=nvme0n1,file=/home/kbj/work/src/vmctl/state/pmr/nvme0n1.img,format=raw,if=none,discard=on,detect-zeroes=unmap
>>>  \
>>>   -object memory-backend-file,id=pmr,share=on,mem-path=pmr.bin,size=1M \
>>>   -device 
>>> nvme,id=nvme0,serial=deadbeef,bus=pcie_downstream1,drive=nvme0n1,msix_qsize=1,pmrdev=pmr,cmb_size_mb=2
>>>  \
>>>   -pidfile /home/kbj/work/src/vmctl/run/pmr/pidfile \
>>>   -kernel /home/kbj/work/src/kernel/linux/arch/x86_64/boot/bzImage \
>>>   -append root=/dev/vda1 console=ttyS0,115200 audit=0 nokaslr \
>>>   -virtfs 
>>> local,path=/home/kbj/work/src/kernel/linux,security_model=none,readonly,mount_tag=modules
>>>  \
>>>   -serial mon:stdio \
>>>   -trace pci_nvme*
>>>
>>>
>>
>> I focused on reproduction and it looks to me that my patch doesn't 
>> necessarily introduce regression. I run it w/ and w/o patch in both cases
>> getting error while registering. Here is kernel guest log:
>>
>> [   87.606482] nvme nvme0: pci function 0000:00:04.0
>> [   87.635577] dev=0000000095b0a83b bar=2 size=134217728 offset=0
>> [   87.636593] nvme nvme0: failed to register the CMB ret=-95
>> [   87.643262] nvme nvme0: 12/0/0 default/read/poll queues
>>
>> Any thoughts?
>>
> 
> Hmm, that's not what I am seeing.
> 
> With kernel v5.8-rc4, I'm not seeing any issues with CMB with and
> without IOMMU on QEMU master. With your patches, my kernel (v5.8-rc4)
> pukes both with and without iommu.
> 
> BUT! This doesn't mean that your patch is bad, it looks more like an
> issue in the kernel. I still think the BAR configuration looks sane, but
> I am not expert on this.
> 
> To satisify my curiosity I tried mending your patch to put the CMB on
> offset 0 and move the MSI-X vector table and PBA to BAR 0 (like I
> suggested back in the day). That works. With and without IOMMU. So, I
> think it is an issue with the Linux kernel not being too happy about the
> CMB being at an offset. It doesn't directly look like an issue in the
> nvme driver since the issue shows up far lower in the memory subsystem,
> but it would be nice to have the linux nvme gang at least acknowledge
> the issue.
> 

I have managed to reproduce that problem and played with patch to see
when the problem occurs vs not. 
When I put MSIX back to BAR2 (no PMR at all) and CMB left at BAR4 but 
starting at offset 0 I was still able to reproduce issue.
So then I've played with memory region API and interesting observed that
problem occurs when region overlaying is used via:

memory_region_init(&n->bar4, OBJECT(n), "nvme-bar4",  bar_size);$
$  
if (n->params.cmb_size_mb) {$
    memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,$
                          "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));$
$  
    memory_region_add_subregion_overlap(&n->bar4, cmb_offset, &n->ctrl_mem, 1);$
}$

on the other hand when cmb memory region is initialized w/o region
overlaying that is:

memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,$
                      "nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));

I get no reproduction.

Also observed qemu complaing about failed translation:
qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error 
(iova=0xfe400000, level=0x2, slpte=0x0, write=0)
qemu-system-x86_64: vtd_iommu_translate: detected translation failure 
(dev=03:00:00, iova=0xfe400000)

Not sure how we want to proceed. Any suggestions?



reply via email to

[Prev in Thread] Current Thread [Next in Thread]