Re: [Qemu-devel] [RFC 1/1] pci-dma-api-v2


From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC 1/1] pci-dma-api-v2
Date: Sun, 30 Nov 2008 16:50:24 -0600
User-agent: Thunderbird 2.0.0.17 (X11/20080925)

Andrea Arcangeli wrote:
On Fri, Nov 28, 2008 at 07:50:01PM +0100, Andrea Arcangeli wrote:

This patch is too large.  It should be split up.

Signed-off-by: Andrea Arcangeli <address@hidden>

Index: block_int.h
===================================================================
--- block_int.h (revision 5818)
+++ block_int.h (working copy)
@@ -55,6 +55,8 @@
         int64_t sector_num, const uint8_t *buf, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque);
     void (*bdrv_aio_cancel)(BlockDriverAIOCB *acb);
+    BlockDriverAIOIOV *bdrv_aio_readv;
+    BlockDriverAIOIOV *bdrv_aio_writev;
     int aiocb_size;

Please maintain consistency with the rest of the struct by not introducing typedefs for these function pointers.
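
Something along these lines would match the existing members, i.e. spell the prototypes out inline rather than going through a BlockDriverAIOIOV typedef (rough sketch, signature copied from your bdrv_aio_readv below):

    BlockDriverAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
        int64_t sector_num, struct iovec *iov, int iovcnt, size_t len,
        BlockDriverCompletionFunc *cb, void *opaque);
    BlockDriverAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
        int64_t sector_num, struct iovec *iov, int iovcnt, size_t len,
        BlockDriverCompletionFunc *cb, void *opaque);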

Index: exec.c
===================================================================
--- exec.c      (revision 5818)
+++ exec.c      (working copy)
@@ -2807,7 +2807,7 @@
 /* physical memory access (slow version, mainly for debug) */
 #if defined(CONFIG_USER_ONLY)
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
-                            int len, int is_write)
+                            size_t len, int is_write)
 {
     int l, flags;
     target_ulong page;
@@ -2848,7 +2848,7 @@
#else
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
-                            int len, int is_write)
+                            size_t len, int is_write)
 {
     int l, io_index;
     uint8_t *ptr;
@@ -2938,6 +2938,111 @@
     }
 }

This API change should be a separate patch.  It's orthogonal to this one.

+uint8_t *cpu_physical_memory_can_dma(target_phys_addr_t addr,
+                                    size_t len, int is_write,
+                                    int alignment)

Your patch is tab-damaged. This API seems flawed. It's duplicating most of what is in cpu_physical_memory_rw.

I think you need something more sophisticated that can split up a scatter/gather list into a set of partial bounce buffers and partial direct copies. Just checking the vector once is a bit hackish.
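
Roughly: walk the vector one element at a time and decide per element whether it can be mapped directly or has to go through a (partial) bounce buffer, instead of a single yes/no check over the whole thing. A sketch of the shape I mean, not working code -- add_direct_segment/add_bounce_segment are made-up helper names and I'm assuming QEMUPciDmaSg carries addr/len:

    for (i = 0; i < iovcnt; i++) {
        uint8_t *ptr;

        ptr = cpu_physical_memory_can_dma(sg[i].addr, sg[i].len,
                                          is_write, alignment);
        if (ptr) {
            /* this element maps directly into the host iovec */
            add_direct_segment(param, ptr, sg[i].len);
        } else {
            /* fall back to a partial bounce buffer for this element only */
            add_bounce_segment(param, sg[i].addr, sg[i].len);
        }
    }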

+           /* qemu doesn't execute guest code directly, but kvm does
+              therefore flush instruction caches */
+           if (kvm_enabled())
+               flush_icache_range((unsigned long)ptr,
+                                  ((unsigned long)ptr)+l);

This is incorrect for upstream QEMU and just as wrong for kvm-userspace; it's a nasty hack for ia64. flush_icache_range() is dyngen-generated, and ia64 support is not in upstream QEMU.

@@ -135,6 +149,9 @@
         /* add synchronous IO emulation layer */
         bdrv->bdrv_read = bdrv_read_em;
         bdrv->bdrv_write = bdrv_write_em;
+        bdrv->bdrv_aio_readv = bdrv_aio_readv_em; /* FIXME */
+        bdrv->bdrv_aio_writev = bdrv_aio_writev_em; /* FIXME */
+        bdrv->bdrv_aio_cancel = bdrv_aio_cancel_em; /* FIXME */

What should be fixed?  Are these emulated functions wrong?

There's a lack of symmetry here. We should have a bdrv_readv and a bdrv_aio_readv, and bdrv_read and bdrv_aio_read should disappear. We can maintain wrappers that present a compatible interface to older code, but just adding a new API alongside the old one is wrong.
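
bdrv_aio_read would then become a trivial wrapper around bdrv_aio_readv, something like (untested sketch, assuming 512-byte sectors as elsewhere in the block layer):

    BlockDriverAIOCB *bdrv_aio_read(BlockDriverState *bs, int64_t sector_num,
                                    uint8_t *buf, int nb_sectors,
                                    BlockDriverCompletionFunc *cb, void *opaque)
    {
        struct iovec iov = {
            .iov_base = buf,
            .iov_len  = nb_sectors * 512,
        };

        return bdrv_aio_readv(bs, sector_num, &iov, 1, iov.iov_len, cb, opaque);
    }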

+BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num,
+                                struct iovec *iov, int iovcnt, size_t len,
+                                BlockDriverCompletionFunc *cb, void *opaque)
+{

struct iovec does not exist on Windows. I also don't think you've got the abstraction right. Reading and writing may require actions to happen. I don't think you can just introduce something as simple as an iovec here. I think we need something more active.
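
At a minimum that means a qemu-owned vector type rather than struct iovec, but more to the point I'd want per-element hooks so the bus/device can map, copy or complete each segment as it is consumed, rather than handing over raw pointers up front. Very rough sketch, all names made up:

    typedef struct QEMUDmaVector QEMUDmaVector;

    struct QEMUDmaVector {
        struct {
            void *base;          /* host pointer, NULL until mapped */
            size_t len;
        } *seg;
        int nseg;
        /* called as each segment is about to be used / is finished with */
        void (*map)(QEMUDmaVector *v, int i, int is_write);
        void (*unmap)(QEMUDmaVector *v, int i, int is_write);
    };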

Index: hw/ide.c
===================================================================
--- hw/ide.c    (revision 5818)
+++ hw/ide.c    (working copy)
@@ -31,6 +31,7 @@
 #include "qemu-timer.h"
 #include "sysemu.h"
 #include "ppc_mac.h"
+#include "pci_dma.h"

All of the changes to IDE/SCSI should be separate patches. That will help isolate failures when bisecting.

diff --git a/qemu/hw/pci_dma.c b/qemu/hw/pci_dma.c
new file mode 100644
index 0000000..48762a8
--- /dev/null
+++ b/qemu/hw/pci_dma.c
@@ -0,0 +1,366 @@
+/*
+ * QEMU PCI DMA operations
+ *
+ * Copyright (c) 2008 Red Hat, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <assert.h>
+#include "pci_dma.h"
+
+#define MAX_DMA_BOUNCE_BUFFER (1024*1024)
+//#define DEBUG_BOUNCE
+//#define MAX_DMA_BOUNCE_BUFFER (512)
+//#define DISABLE_IOVEC_CACHE
+
+typedef struct QEMUPciDmaSgParam {
+    QEMUPciDmaSgSubmit *pci_dma_sg_submit;
+    QEMUPciDmaSgComplete *pci_dma_sg_complete;
+    void *pci_dma_sg_opaque;
+    int dma_to_memory;
+    int alignment;
+    uint8_t *bounce;
+    QEMUPciDmaSg *sg;
+    int iovcnt;
+    int restart_iovcnt;
+    size_t restart_offset;
+    int curr_restart_iovcnt;
+    size_t curr_restart_offset;
+    size_t curr_len;
+#ifndef DISABLE_IOVEC_CACHE
+    struct QEMUPciDmaSgParam *next;
+#endif
+    struct iovec iov;
+} QEMUPciDmaSgParam;

What is the IOVEC_CACHE for and why do we need it? Either way, it should be using sys-queue.h.
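
With sys-queue.h that's just (sketch, rest of the struct unchanged):

    #include "sys-queue.h"

    typedef struct QEMUPciDmaSgParam {
        /* ... existing fields ... */
        LIST_ENTRY(QEMUPciDmaSgParam) link;   /* replaces the open-coded *next */
    } QEMUPciDmaSgParam;

    static LIST_HEAD(, QEMUPciDmaSgParam) sg_cache =
        LIST_HEAD_INITIALIZER(sg_cache);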

I think you missed the mark here. This API needs to go through the PCIBus and be pluggable at that level. There can be a default implementation but it may differ for each PCI bus.
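
Something along the lines of a per-bus hook, with today's behaviour as the default implementation. Completely untested sketch, names made up; PCIBus lives in pci.c so this would either go there or need accessors:

    struct PCIBus {
        /* ... existing fields ... */
        /* translate/map a bus address for DMA; NULL hooks mean the
           identity mapping we have today */
        uint8_t *(*dma_map)(PCIBus *bus, target_phys_addr_t addr,
                            size_t len, int is_write);
        void (*dma_unmap)(PCIBus *bus, uint8_t *ptr, size_t len,
                          int is_write);
    };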

Please read the previous threads about a DMA API. All of this has been discussed at length.

Regards,

Anthony Liguori




