|
From: | Xiao Guangrong |
Subject: | Re: [Qemu-devel] [PATCH v3 27/32] nvdimm: support DSM_CMD_IMPLEMENTED function |
Date: | Wed, 14 Oct 2015 22:50:40 +0800 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 |
On 10/14/2015 05:40 PM, Stefan Hajnoczi wrote:
On Sun, Oct 11, 2015 at 11:52:59AM +0800, Xiao Guangrong wrote:static void dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size) { + NVDIMMState *state = opaque; + MemoryRegion *dsm_ram_mr; + dsm_in *in; + dsm_out *out; + uint32_t revision, function, handle; + if (val != NOTIFY_VALUE) { fprintf(stderr, "BUG: unexepected notify value 0x%" PRIx64, val); } + + dsm_ram_mr = memory_region_find(&state->mr, state->page_size, + state->page_size).mr; + memory_region_unref(dsm_ram_mr); + in = memory_region_get_ram_ptr(dsm_ram_mr);This looks suspicious. Shouldn't the memory_region_unref(dsm_ram_mr) happen after we're done using it?
This region is keep-alive during QEMU's running, it is okay. The same style is applied to other codes, for example: line 208 in hw/s390x/sclp.c.
+ out = (dsm_out *)in; + + revision = in->arg1; + function = in->arg2; + handle = in->handle; + le32_to_cpus(&revision); + le32_to_cpus(&function); + le32_to_cpus(&handle); + + nvdebug("UUID " UUID_FMT ".\n", in->arg0[0], in->arg0[1], in->arg0[2], + in->arg0[3], in->arg0[4], in->arg0[5], in->arg0[6], + in->arg0[7], in->arg0[8], in->arg0[9], in->arg0[10], + in->arg0[11], in->arg0[12], in->arg0[13], in->arg0[14], + in->arg0[15]); + nvdebug("Revision %#x Function %#x Handler %#x.\n", revision, function, + handle); + + if (revision != DSM_REVISION) { + nvdebug("Revision %#x is not supported, expect %#x.\n", + revision, DSM_REVISION); + goto exit; + } + + if (!handle) { + if (!dsm_is_root_uuid(in->arg0)) {Please don't dereference 'in' or pass it to other functions. Avoid race conditions with guest vcpus by coping in the entire dsm_in struct. This is like a system call - the kernel cannot trust userspace memory and must copy in before accessing data. The same rules apply.
It's little different for QEMU: - the memory address is always valid to QEMU, it's not always true for Kernel due to context-switch - we have checked the header before use it's data, for example, when we get data from GET_NAMESPACE_DATA, we have got the @offset and @length from the memory, then copy memory based on these values, that means the userspace has no chance to cause buffer overflow by increasing these values at runtime. The scenario for our case is simple but Kernel is difficult to do check_all_before_use as many paths may be involved. - guest changes some data is okay, the worst case is that the label data is corrupted. This is caused by guest itself. Kernel also supports this kind of behaviour, e,g. network TX zero copy, the userspace page is being transferred while userspace can still access it. - it's 4K size on x86, full copy wastes CPU time too much.
[Prev in Thread] | Current Thread | [Next in Thread] |