[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v4 04/10] block/pflash_cfi02: Implement interele
From: |
Stephen Checkoway |
Subject: |
Re: [Qemu-devel] [PATCH v4 04/10] block/pflash_cfi02: Implement intereleaved flash devices |
Date: |
Mon, 24 Jun 2019 12:36:05 -0700 |
> On Jun 24, 2019, at 12:05, Philippe Mathieu-Daudé <address@hidden> wrote:
>
>> On 6/22/19 2:25 PM, Philippe Mathieu-Daudé wrote:
>> Hi Stephen,
>>
>> This series haven't fall through the cracks, however it is taking me
>> longer than expected to review it.
>>
>>> On 4/26/19 6:26 PM, Stephen Checkoway wrote:
>>> It's common for multiple narrow flash chips to be hooked up in parallel
>>> to support wider buses. For example, four 8-bit wide flash chips (x8)
>>> may be combined in parallel to produce a 32-bit wide device. Similarly,
>>> two 16-bit wide chips (x16) may be combined.
>>>
>>> This commit introduces `device-width` and `max-device-width` properties,
>>> similar to pflash_cfi01, with the following meanings:
>>> - `width`: The width of the logical, qemu device (same as before);
>>> - `device-width`: The width of an individual flash chip, defaulting to
>>> `width`; and
>>> - `max-device-width`: The maximum width of an individual flash chip,
>>> defaulting to `device-width`.
>>>
>>> Nothing needs to change to support reading such interleaved devices but
>>> commands (e.g., erase and programming) must be sent to all devices at
>>> the same time or else the various chips will be in different states.
>>
>> After some thoughts on this, I'd rather we model how hardware manage
>> interleaved devices: do it at the bus level, and instanciate N devices
>> in an interleaved config.
>> I believe that would drastically reduce this device complexity, and we
>> would match the real internal state machine.
>> Also this could be reused by other parallel devices used in a such config.
>>
>>> For example, a 4-byte wide logical device can be composed of four x8/x16
>>> devices in x8 mode. That is, each device supports both x8 or x16 and
>>> they're being used in the byte, rather than word, mode. This
>>> configuration would have `width=4`, `device-width=1`, and
>>> `max-device-width=2`.
>>
>>
>> I'm thinking of this draft:
>>
>> FlashDevice # x8
>> MemoryRegionOps
>> .valid.max_access_size = 1
>>
>> FlashDevice # x16
>> MemoryRegionOps
>> .valid.min_access_size = 2
>> .valid.max_access_size = 2
>>
>> FlashDevice # x8/x16
>> MemoryRegionOps
>> .valid.min_access_size = 1
>> .valid.max_access_size = 2
>>
>> We might use .impl.min_access_size = 2 and consider all NOR flash using
>> 16-bit words internally.
>> .impl.max_access_size = 2 is implicit.
>>
>> So for you example we'd instanciate one:
>>
>> InterleaverDevice
>> Property
>> .bus_width = 4 # 4-byte wide logical device, `width=4`
>> .device_width = 1 # `device-width=1`
>> MemoryRegionOps
>> .valid.max_access_size = .bus_width # 4, set at realize()
>> .impl.max_access_size = .device_width # 1, set at realize()
>>
>> Then instanciate 4 pflash devices, and link them to the interleaver
>> using object_property_set_link().
>>
>> typedef struct {
>> SysBusDevice parent_obj;
>> MemoryRegion iomem;
>> char *name;
>> /*
>> * On a 64-bit wide bus we can have at most
>> * 8 devices in 8-bit access mode.
>> */
>> MemoryRegion device[8];
>> unsigned device_count;
>> unsigned device_index_mask;
>> /* Properties */
>> unsigned bus_width;
>> unsigned device_width;
>> } InterleaverDeviceState;
>>
>> static Property interleaver_properties[] = {
>> DEFINE_PROP_LINK("device[0]", InterleaverDeviceState,
>> device[0],
>> TYPE_MEMORY_REGION, MemoryRegion *),
>> ...
>> DEFINE_PROP_LINK("device[7]", InterleaverDeviceState,
>> device[7],
>> TYPE_MEMORY_REGION, MemoryRegion *),
>> DEFINE_PROP_END_OF_LIST(),
>> };
>>
>> Then previous to call InterleaverDevice.realize():
>>
>> In the board realize():
>>
>>
>> for (i = 0; i < interleaved_devices; i++) {
>> pflash[i] = create_pflash(...);
>> ...
>> }
>>
>> ild = ... create InterleaverDevice ...
>> for (i = 0; i < interleaved_devices; i++) {
>> char *propname = g_strdup_printf("device[%u]", i);
>>
>>
>> object_property_set_link(OBJECT(&ild->device[i]),
>> OBJECT(pflash[i]),
>> propname, &err);
>> ...
>> }
>>
>> Finally,
>>
>> static void interleaved_realize(DeviceState *dev, Error **errp)
>> {
>> InterleaverDeviceState *s = INTERLEAVER_DEVICE(opaque);
>>
>> s->device_count = s->bus_width / s->device_width;
>> s->device_index_mask = ~(s->device_count - 1);
>> ...
>> }
>>
>> static void interleaved_write(void *opaque, hwaddr offset,
>> uint64_t value, unsigned size)
>> {
>> InterleaverDeviceState *s = INTERLEAVER_DEVICE(opaque);
>> MemoryRegion *mr;
>>
>> /*
>> * Since we set .impl.max_access_size = device_width,
>> * access_with_adjusted_size() always call this with
>> * size = device_width.
>> *
>> * Adjust the address (offset).
>> */
>> offset >>= size;
>> /* Access the N interleaved device */
>> mr = s->device[offset & s->device_index_mask];
>> memory_region_dispatch_write(mr, offset, &value, size,
>> MEMTXATTRS_UNSPECIFIED);
>> }
>>
>> I'll try a PoC.
>
> So I have a PoC, but then realize I can not use the same flash dump...
>
> I need to deinterleave is. This is easily fixed with few lines of
> Python, then if I want to store/share the dump (aka 'backend storage') I
> have to re-interleave it.
>
> I wonder if it would be possible/easy to add a 'interleave' option to
> blockdev to be able to have 2 pflash devices sharing the same backend...
> Is it worthwhile? Kevin/Markus/Max any thought on this?
Hi Phil,
Sorry for the delay, I’ve been traveling.
I considered something like this approach and I think it could work.
Ultimately, I opted not to go that route for a few reasons:
- duplicated CFI tables and other state waste (a small amount of) memory when
the flash chips are the same (the usual case in my limited experience)
- it adds an extra layer of read/write calls plus recombining from/splitting
into the component parts
- duplicated timers firing to walk the programming/erasing state machine
forward for each chip
- the firmware or data stored in the chips is likely already interleaved
necessitating either splitting it up before use or adding functionality to a
lower layer to split it (as you’ve suggested here)
None of the above seem like a big deal separately or together but I didn’t find
the advantages of this approach to be sufficiently compelling to justify it.
Namely, it allows using a heterogeneous set of flash chips to implement a
logical device.
Nevertheless, if that’s the route you think is best, I have no objections.
Cheers,
Steve
>
> Thanks,
>
> Phil.
>
>>> In addition to commands being sent to all devices, guest firmware
>>> expects the status and CFI queries to be replicated for each device.
>>> (The one exception to the response replication is that each device gets
>>> to report its own status bit DQ7 while programming because its value
>>> depends on the value being programmed which will usually differ for each
>>> device.)
>>>
>>> Testing is limited to 16-bit wide devices due to the current inability
>>> to override the properties set by `pflash_cfi02_register`, but multiple
>>> configurations are tested.
>>>
>>> Stop using global_qtest. Instead, package the qtest variable inside the
>>> FlashConfig structure.
>>>
>>> Signed-off-by: Stephen Checkoway <address@hidden>
>>> Acked-by: Thomas Huth <address@hidden>
>>> ---
>>> hw/block/pflash_cfi02.c | 270 +++++++++++++++------
>>> tests/pflash-cfi02-test.c | 476 ++++++++++++++++++++++++++++++--------
>>> 2 files changed, 576 insertions(+), 170 deletions(-)