[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 08/10] dp8393x: don't force 32-bit register access

From: Mark Cave-Ayland
Subject: Re: [PATCH v2 08/10] dp8393x: don't force 32-bit register access
Date: Sat, 3 Jul 2021 13:04:30 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote:

On 7/3/21 8:21 AM, Mark Cave-Ayland wrote:
On 02/07/2021 05:36, Finn Thain wrote:

On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
all accesses to the registers were 32-bit

No, that assumption was not made there. Just take a look at my commits in
Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident,
it probably just reflects my inadequate knowledge of QEMU internals.

but this is actually not the case. The access size is determined by
the CPU instruction used and not the number of physical address lines.

I think that's an over-simplification (in the context of commit

Let me try and clarify this a bit more: there are 2 different changes
incorporated into 3fe9a838ec. The first (as you mention below and also
detailed in the commit messge), is related to handling writes to the
upper 16-bits of a word from the device and ensuring that 32-bit
accesses are handled correctly. This part seems correct to me based upon
reading the datasheet and the commit message:

@@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width,
int offset,
                          uint16_t val)
      if (s->big_endian) {
-        s->data[offset * width + width - 1] = cpu_to_be16(val);
+        if (width == 2) {
+            s->data[offset * 2] = 0;
+            s->data[offset * 2 + 1] = cpu_to_be16(val);
+        } else {
+            s->data[offset] = cpu_to_be16(val);
+        }
      } else {
-        s->data[offset * width] = cpu_to_le16(val);
+        if (width == 2) {
+            s->data[offset * 2] = cpu_to_le16(val);
+            s->data[offset * 2 + 1] = 0;
+        } else {
+            s->data[offset] = cpu_to_le16(val);
+        }

The second change incorporated into 3fe9a838ec (and the one this patch
fixes) is a similar change made to the MMIO *register* accesses:

@@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr
addr, unsigned int size)

      DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);

-    return val;
+    return s->big_endian ? val << 16 : val;


@@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr
addr, uint64_t data,
      dp8393xState *s = opaque;
      int reg = addr >> s->it_shift;
+    uint32_t val = s->big_endian ? data >> 16 : data;

This is not correct since the QEMU memory API handles any access size
and endian conversion before the MMIO access reaches the device. It is
this change which breaks the 32-bit accesses used by MacOS to read/write
the dp8393x registers because it applies an additional endian swap on
top of that already done by the memory API.

The big_endian workaround applied to the register read/writes was
actually caused by forcing the access size to 32-bit when the guest OS
was using a 16-bit access. Since the registers are 16-bit then we can
simply set .impl.min_access to 2 and then the memory API will
automatically do the right thing for both 16-bit accesses used by
Linux and 32-bit accesses used by the MacOS toolbox ROM.

Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses
so we end using kludge to hide bugs" pattern. Can you provide a QTest
(ideally) or a "-trace memory_region_ops_\*" log of your firmware
accessing the dp8393x please?

The DP83932 chip is highly configurable, so I'm not sure that the
behaviour of any given firmware would resolve the question.

Indeed, I don't think that will help much here. Phil, if you would still
like me to send you some traces after reading the explanation above then
do let me know.

I read it and still would have few traces. We can hand-write them too.

I'd like to add qtests for some read/write,16/32(A)==B.

Sigh. I was just about to attempt some traces when I realised looking at the patch that I made a mistake, and that in order for the memory API to automatically handle the 4 byte accesses as 2 x 2 byte accesses then both .impl.min_access_size and .impl.max_access_size need to be set to 2 :( The remainder of the patch is the same but dp8393x_ops now looks like this:

static const MemoryRegionOps dp8393x_ops = {
    .read = dp8393x_read,
    .write = dp8393x_write,
    .impl.min_access_size = 2,
    .impl.max_access_size = 2,
    .endianness = DEVICE_NATIVE_ENDIAN,

I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking seems fine after a quick test in each OS. The slight curiosity is that the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses, but since MacOS only uses the bottom 16-bit of the result and ignores the top 16-bits then despite there being 2 accesses, everything still works as expected (see below).


dp8393x_read reg=0x28 [SR] val=0x0004 size=2
memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4 size 2
dp8393x_read reg=0x28 [SR] val=0x0004 size=2
memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4 size 2
memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value 0x40004 
size 4


memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value 0x248fe8 
size 4
memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value 0x24 size 
dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2
memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value 0x8fe8 
size 2
dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2

If you're happy with this, I'll resubmit a revised version of the patch but with an updated commit message: the Fixes: tag is still the same, but the updated fix is to ensure that .impl.min_access_size and .impl.max_access_size match the real-life 16-bit register size.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]