qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] exec: fetch the alignment of Linux devdax pmem character dev


From: Liu, Jingqi
Subject: Re: [PATCH] exec: fetch the alignment of Linux devdax pmem character device nodes
Date: Wed, 8 Apr 2020 10:25:27 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0

On 4/8/2020 2:28 AM, Joao Martins wrote:
On 4/7/20 5:55 PM, Dan Williams wrote:
On Tue, Apr 7, 2020 at 4:01 AM Joao Martins <address@hidden> wrote:
On 4/1/20 4:13 AM, Jingqi Liu wrote:
If the backend file is devdax pmem character device, the alignment
specified by the option 'align=NUM' in the '-object memory-backend-file'
needs to match the alignment requirement of the devdax pmem character device.

This patch fetches the devdax pmem file 'align', so that we can compare
it with the NUM of 'align=NUM'.
The NUM needs to be larger than or equal to the devdax pmem file 'align'.

It also fixes the problem that mmap() returns failure in qemu_ram_mmap()
when the NUM of 'align=NUM' is less than the devdax pmem file 'align'.

Cc: Dan Williams <address@hidden>
Signed-off-by: Jingqi Liu <address@hidden>
---
  exec.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
  1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index de9d949902..8221abffec 100644
--- a/exec.c
+++ b/exec.c
@@ -1736,6 +1736,42 @@ static int64_t get_file_size(int fd)
      return size;
  }

+static int64_t get_file_align(int fd)
+{
+    int64_t align = -1;
+#if defined(__linux__)
+    struct stat st;
+
+    if (fstat(fd, &st) < 0) {
+        return -errno;
+    }
+
+    /* Special handling for devdax character devices */
+    if (S_ISCHR(st.st_mode)) {
+        g_autofree char *subsystem_path = NULL;
+        g_autofree char *subsystem = NULL;
+
+        subsystem_path = g_strdup_printf("/sys/dev/char/%d:%d/subsystem",
+                                         major(st.st_rdev), minor(st.st_rdev));
+        subsystem = g_file_read_link(subsystem_path, NULL);
+
+        if (subsystem && g_str_has_suffix(subsystem, "/dax")) {
+            g_autofree char *align_path = NULL;
+            g_autofree char *align_str = NULL;
+
+            align_path = g_strdup_printf("/sys/dev/char/%d:%d/device/align",
+                                    major(st.st_rdev), minor(st.st_rdev));
+
Perhaps, you meant instead:

         /sys/dev/char/%d:%d/align

Hmm, are you sure that's working?
It is, except that I made the slight mistake of testing with a bunch of wip
patches on top which one of them actually adds the 'align' to child dax device.

Argh, my apologies - and thanks for noticing.

I expect the alignment to be found
in the region device:

/sys/class/dax:
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus1/region1/dax1.1/dax1.0
$(readlink -f /sys/dev/char/253\:263)/../align
$(readlink -f /sys/dev/char/253\:263)/device/align


/sys/bus/dax:
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus1/region1/dax1.0/dax1.0
$(readlink -f /sys/dev/char/253\:265)/../align
$(readlink -f /sys/dev/char/253\:265)/device/align <-- No such file

The use of the /sys/dev/char/%d:%d/device is only supported by the
deprecated /sys/class/dax.

Hi Dan,

Thanks for your comments.

Seems it is a mistake.

It should be: $(readlink -f /sys/dev/char/253\:263)/../../align

I don't have the deprecated dax class enabled as could you tell, so the second
case is what I was testing. Except it wasn't a namespace/nvdimm but rather an
hmem device-dax.

'../align' though covers only one case? What about hmem which '../align' returns
ENOENT; perhaps using '../dax_region/align' instead which is common to both?
Albeit that wouldn't address the sub-division devices (that I mention above)

Seems that you mean to use $(readlink -f /sys/dev/char/253\:263)/../../dax_region/align.

Right ?

Thanks,

Jingqi

The current /sys/bus/dax device-model can
be a drop in replacement as long as software is not written to the
/sys/class sysfs layout, i.e. it uses ../ instead of device/ to walk
to the region properties.

/nods



reply via email to

[Prev in Thread] Current Thread [Next in Thread]