qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file


From: Gabriel L. Somlo
Subject: Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file
Date: Mon, 20 Jul 2015 17:19:36 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

New working version of fw_cfg sysfs module enclosed at the end of this
mail, featuring:

        - probing for the appropriate fw_cfg port/address for the
          architecture we're on. It's either that, or preprocessor
          #ifdef voodoo to try only the right access method matching
          the architecture we're compiling for.

                - the module compiles and runs fine now on both
                  x86_64 and armv7hl+lpae, not tested on ppc or sun.

        - Reworked initialization to prevent leaks when bailing out
          due to errors (thanks Matt for pointing out one of those !)

On Wed, Jul 15, 2015 at 01:00:45PM +0100, Matt Fleming wrote:
> On Mon, 2015-07-13 at 16:09 -0400, Gabriel L. Somlo wrote:
> >    Speaking of the kernel: My default plan is to subscribe to
> >    address@hidden and submit it there (this is after
> >    all baby's first kernel module :) but if there's a better route for
> >    pushing it upstream, please feel free to suggest it to me.
> 
> Congrats on your first kernel patch! Cc'ing kernelnewbies is OK but this
> patches warrants a wider audience. Cc Greg Kroah-Hartman
> (address@hidden) since he polices things related to sysfs,
> the x86 maintainers (address@hidden), and address@hidden
> The closest thing we have to a Linux system firmware list is probably
> address@hidden
> 
> By all means, mention that it's your first kernel patch when submitting
> this.
> 
> Oh, you're also going to want to add documentation under
> Documentation/ABI/testing/sysfs-firmware-fw_cfg as part of your patch
> series.

Also thanks for the good advice. I'm out on PTO next week, so planning
to submit this to the kernel after I get back -- wouldn't want to not
be around when I start getting feedback from the kernel list... :)

On Wed, Jul 15, 2015 at 01:15:08PM +0200, Laszlo Ersek wrote:
> On 07/15/15 13:06, Matt Fleming wrote:
> > On Tue, 2015-07-14 at 13:00 -0400, Gabriel L. Somlo wrote:
> >>
> >> That being said, I did reimplement systemd's escape method in cca. 30
> >> lines of C, so that shouldn't be too onerous.
> > 
> > I really don't see a reason to use systemd's naming scheme instead of
> > the one already provided by the kernel.
> > 
> >> Besides, Laszlo said he liked a real folder hierarchy, and I do too,
> >> so I'm still pondering how much doing that would complicate the module
> >> init function. I'm somewhat worried about what the added complexity
> >> will mean in terms of upstream acceptance in the linux kernel :)
> > 
> > Yeah, going that route will complicate the patch and you're going to get
> > asked "Umm.. why?" by Greg Kroah-Hartman (definitely Cc Greg when
> > sending this to the kernel mailing lists btw).
> > 
> > But if you can provide sound technical arguments for the added
> > complexity I'm sure the kernel folks will be happy. Laszlo's argument
> > makes sense to me, i.e. wanting to use standard utilities to know
> > whether a blob is available.
> > 
> > Instead of rolling all this kobject-creation logic into your driver I'd
> > suggest writing a patch to lib/kobject.c., since the problem sounds like
> > something that should be solved with the kobject API.
> 
> Haha, this feature just got extended by six months. :)
> 
> > The question is, how simple can you make the code ;-)
> 
> The first question is how strong Gabriel's nerves are, but (from past
> experience) they *are* strong. :)

Sir, you are too kind... :)

Speaking of which, I think I hit the first snag, and it's of the
design/bikeshedding kind:

The code to build nested ksets (represending sub-sub-directories of
/sys/firmware/fw_cfg/...) and cleaning them up on exit doesn't promise
to be *too* horrible or bulky, but as I was getting ready to start
writing it, I realized that, in theory, nothing is to stop the fw_cfg
device from having files named e.g.

        "etc/foo"

and

        "etc/foo/bar"

That doesn't happen right now on the qemu side, but it could in
theory, and I have no idea how I'd deal with the file/directory
duality of "foo" in this situation, short of refusing to load the
module, or leaving out one fw_cfg file or the other. Unless there's
a way around this, I think it's safer to either stick with the default
's/\//!/g' scheme of the kobject library, or implement something like
systemd's string escaping scheme that's been suggested earlier in this
thread...

Or maybe leaving out the occasional poorly-named file is still better
than giving up the ability to run find on the fw_cfg sysfs folder ?

Anyway, here goes version 0.2...

Thanks much,
--Gabriel


/*
 * drivers/firmware/fw_cfg.c
 *
 * Expose entries from QEMU's firmware configuration (fw_cfg) device in
 * sysfs (read-only, under "/sys/firmware/fw_cfg/...").
 *
 * Copyright 2015 Gabriel L. Somlo <address@hidden>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License v2 as published
 * by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
 * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA
 */

#include <linux/module.h>
#include <linux/capability.h>
#include <linux/slab.h>
#include <linux/io.h>
#include <linux/ioport.h>
#include <linux/ctype.h>

/* selector values for "well-known" fw_cfg entries */
#define FW_CFG_SIGNATURE  0x00
#define FW_CFG_FILE_DIR   0x19

/* size in bytes of fw_cfg signature */
#define FW_CFG_SIG_SIZE 4

/* fw_cfg "file name" is up to 56 characters (including terminating nul) */
#define FW_CFG_MAX_FILE_PATH 56

/* fw_cfg file directory entry type */
struct fw_cfg_file {
        uint32_t size;
        uint16_t select;
        uint16_t reserved;
        char name[FW_CFG_MAX_FILE_PATH];
};

/* fw_cfg device i/o access options type */
struct fw_cfg_access {
        phys_addr_t start;
        uint8_t size;
        uint8_t ctrl_offset;
        uint8_t data_offset;
        bool is_mmio;
        const char *name;
};

/* fw_cfg device i/o access available options for known architectures */
static struct fw_cfg_access fw_cfg_modes[] = {
        { 0x510,       2, 0, 1, false, "fw_cfg on i386, sun4u" },
        { 0x9020000,  10, 8, 0,  true, "fw_cfg on arm" },
        { 0xd00000510, 3, 0, 2,  true, "fw_cfg on sun4m" },
        { 0xf0000510,  3, 0, 2,  true, "fw_cfg on ppc/mac" },
        { }
};

/* fw_cfg device i/o currently selected option set */
static struct fw_cfg_access *fw_cfg_mode;

/* fw_cfg device i/o register addresses */
static void __iomem *fw_cfg_dev_base;
static void __iomem *fw_cfg_dev_ctrl;
static void __iomem *fw_cfg_dev_data;

/* atomic access to fw_cfg device (potentially slow i/o, so using mutex) */
static DEFINE_MUTEX(fw_cfg_dev_lock);

/* type for fw_cfg "directory scan" visitor/callback function */
typedef int (*fw_cfg_file_callback)(const struct fw_cfg_file *f);

/* run a given callback on each fw_cfg directory entry */
static int fw_cfg_scan_dir(fw_cfg_file_callback callback)
{
        int ret = 0;
        uint32_t count, i;
        struct fw_cfg_file f;

        mutex_lock(&fw_cfg_dev_lock);
        iowrite16(FW_CFG_FILE_DIR, fw_cfg_dev_ctrl);
        ioread8_rep(fw_cfg_dev_data, &count, sizeof(count));
        for (i = 0; i < be32_to_cpu(count); i++) {
                ioread8_rep(fw_cfg_dev_data, &f, sizeof(f));
                ret = callback(&f);
                if (ret)
                        break;
        }
        mutex_unlock(&fw_cfg_dev_lock);
        return ret;
}

/* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
static inline void fw_cfg_read_blob(uint16_t select,
                                     void *buf, loff_t pos, size_t count)
{
        mutex_lock(&fw_cfg_dev_lock);
        iowrite16(select, fw_cfg_dev_ctrl);
        while (pos-- > 0)
                ioread8(fw_cfg_dev_data);
        ioread8_rep(fw_cfg_dev_data, buf, count);
        mutex_unlock(&fw_cfg_dev_lock);
}

/* clean up fw_cfg device i/o setup */
static void fw_cfg_io_cleanup(void)
{
        if (fw_cfg_mode->is_mmio) {
                iounmap(fw_cfg_dev_base);
                release_mem_region(fw_cfg_mode->start, fw_cfg_mode->size);
        } else {
                ioport_unmap(fw_cfg_dev_base);
                release_region(fw_cfg_mode->start, fw_cfg_mode->size);
        }
}

/* probe and map fw_cfg device */
static int __init fw_cfg_io_probe(void)
{
        char sig[FW_CFG_SIG_SIZE];

        for (fw_cfg_mode = &fw_cfg_modes[0];
             fw_cfg_mode->start; fw_cfg_mode++) {

                phys_addr_t start = fw_cfg_mode->start;
                uint8_t size = fw_cfg_mode->size;

                /* reserve and map mmio or ioport region */
                if (fw_cfg_mode->is_mmio) {
                        if (!request_mem_region(start, size, fw_cfg_mode->name))
                                continue;
                        fw_cfg_dev_base = ioremap(start, size);
                        if (!fw_cfg_dev_base) {
                                release_mem_region(start, size);
                                continue;
                        }
                } else {
                        if (!request_region(start, size, fw_cfg_mode->name))
                                continue;
                        fw_cfg_dev_base = ioport_map(start, size);
                        if (!fw_cfg_dev_base) {
                                release_region(start, size);
                                continue;
                        }
                }

                /* set control and data register addresses */
                fw_cfg_dev_ctrl = fw_cfg_dev_base + fw_cfg_mode->ctrl_offset;
                fw_cfg_dev_data = fw_cfg_dev_base + fw_cfg_mode->data_offset;

                /* verify fw_cfg device signature */
                fw_cfg_read_blob(FW_CFG_SIGNATURE, sig, 0, FW_CFG_SIG_SIZE);
                if (memcmp(sig, "QEMU", FW_CFG_SIG_SIZE) == 0)
                        /* success, we're done */
                        return 0;

                /* clean up before probing next access mode */
                fw_cfg_io_cleanup();
        }

        return -ENODEV;
}

/* fw_cfg_sysfs_entry type */
struct fw_cfg_sysfs_entry {
        struct kobject kobj;
        struct fw_cfg_file f;
        struct list_head list;
};

/* get fw_cfg_sysfs_entry from kobject member */
static inline struct fw_cfg_sysfs_entry *to_entry(struct kobject *kobj)
{
        return container_of(kobj, struct fw_cfg_sysfs_entry, kobj);
}

/* fw_cfg_sysfs_attribute type */
struct fw_cfg_sysfs_attribute {
        struct attribute attr;
        ssize_t (*show)(struct fw_cfg_sysfs_entry *entry, char *buf);
};

/* get fw_cfg_sysfs_attribute from attribute member */
static inline struct fw_cfg_sysfs_attribute *to_attr(struct attribute *attr)
{
        return container_of(attr, struct fw_cfg_sysfs_attribute, attr);
}

/* global cache of fw_cfg_sysfs_entry objects */
static LIST_HEAD(fw_cfg_entry_cache);

/* kobjects removed lazily by kernel, mutual exclusion needed */
static DEFINE_SPINLOCK(fw_cfg_cache_lock);

static inline void fw_cfg_sysfs_cache_enlist(struct fw_cfg_sysfs_entry *entry)
{
        spin_lock(&fw_cfg_cache_lock);
        list_add_tail(&entry->list, &fw_cfg_entry_cache);
        spin_unlock(&fw_cfg_cache_lock);
}

static inline void fw_cfg_sysfs_cache_delist(struct fw_cfg_sysfs_entry *entry)
{
        spin_lock(&fw_cfg_cache_lock);
        list_del(&entry->list);
        spin_unlock(&fw_cfg_cache_lock);
}

static void fw_cfg_sysfs_cache_cleanup(void)
{
        struct fw_cfg_sysfs_entry *entry, *next;

        list_for_each_entry_safe(entry, next, &fw_cfg_entry_cache, list) {
                /* will end up invoking fw_cfg_sysfs_cache_delist()
                 * via each object's release() method (i.e. destructor) */
                kobject_put(&entry->kobj);
        }
}

/* default_attrs: per-entry attributes and show methods */

#define FW_CFG_SYSFS_ATTR(_attr) \
struct fw_cfg_sysfs_attribute fw_cfg_sysfs_attr_##_attr = { \
        .attr = { .name = __stringify(_attr), .mode = 0400 }, \
        .show = fw_cfg_sysfs_show_##_attr, \
}

static ssize_t fw_cfg_sysfs_show_size(struct fw_cfg_sysfs_entry *e, char *buf)
{
        return sprintf(buf, "%d\n", e->f.size);
}

static ssize_t fw_cfg_sysfs_show_select(struct fw_cfg_sysfs_entry *e, char *buf)
{
        return sprintf(buf, "%d\n", e->f.select);
}

static ssize_t fw_cfg_sysfs_show_name(struct fw_cfg_sysfs_entry *e, char *buf)
{
        return sprintf(buf, "%s\n", e->f.name);
}

static FW_CFG_SYSFS_ATTR(size);
static FW_CFG_SYSFS_ATTR(select);
static FW_CFG_SYSFS_ATTR(name);

static struct attribute *fw_cfg_sysfs_entry_attrs[] = {
        &fw_cfg_sysfs_attr_size.attr,
        &fw_cfg_sysfs_attr_select.attr,
        &fw_cfg_sysfs_attr_name.attr,
        NULL,
};

/* sysfs_ops: find fw_cfg_[entry, attribute] and call appropriate show method */
static ssize_t fw_cfg_sysfs_attr_show(struct kobject *kobj, struct attribute *a,
                                      char *buf)
{
        struct fw_cfg_sysfs_entry *entry = to_entry(kobj);
        struct fw_cfg_sysfs_attribute *attr = to_attr(a);

        if (!capable(CAP_SYS_ADMIN))
                return -EACCES;

        return attr->show(entry, buf);
}

static const struct sysfs_ops fw_cfg_sysfs_attr_ops = {
        .show = fw_cfg_sysfs_attr_show,
};

/* release: destructor, to be called via kobject_put() */
static void fw_cfg_sysfs_release_entry(struct kobject *kobj)
{
        struct fw_cfg_sysfs_entry *entry = to_entry(kobj);

        fw_cfg_sysfs_cache_delist(entry);
        kfree(entry);
}

/* kobj_type: ties together all properties required to register an entry */
static struct kobj_type fw_cfg_sysfs_entry_ktype = {
        .default_attrs = fw_cfg_sysfs_entry_attrs,
        .sysfs_ops = &fw_cfg_sysfs_attr_ops,
        .release = fw_cfg_sysfs_release_entry,
};

/* raw-read method and attribute */
static ssize_t fw_cfg_sysfs_read_raw(struct file *filp, struct kobject *kobj,
                                     struct bin_attribute *bin_attr,
                                     char *buf, loff_t pos, size_t count)
{
        struct fw_cfg_sysfs_entry *entry = to_entry(kobj);

        if (!capable(CAP_SYS_ADMIN))
                return -EACCES;

        if (pos > entry->f.size)
                return -EINVAL;

        if (count > entry->f.size - pos)
                count = entry->f.size - pos;

        fw_cfg_read_blob(entry->f.select, buf, pos, count);
        return count;
}

static struct bin_attribute fw_cfg_sysfs_attr_raw = {
        .attr = { .name = "raw", .mode = 0400 },
        .read = fw_cfg_sysfs_read_raw,
};

/* object set to represent fw_cfg sysfs subfolder */
static struct kset *fw_cfg_kset;

static int __init fw_cfg_register_file(const struct fw_cfg_file *f)
{
        int err;
        struct fw_cfg_sysfs_entry *entry;

        /* allocate new entry */
        entry = kzalloc(sizeof(*entry), GFP_KERNEL);
        if (!entry)
                return -ENOMEM;

        /* set file entry information */
        entry->f.size = be32_to_cpu(f->size);
        entry->f.select = be16_to_cpu(f->select);
        strcpy(entry->f.name, f->name);

        //FIXME: we could walk the '/' separated tokens of the file
        //       name, creating kset subdirectories of fw_cfg_kset,
        //       then use the last one as the parent for the kobj
        //       being created below:
        /* register sysfs entry for this file */
        entry->kobj.kset = fw_cfg_kset;
        /* NOTE: any '/' char in filename automatically turned into '!' */
        err = kobject_init_and_add(&entry->kobj, &fw_cfg_sysfs_entry_ktype,
                                   NULL, "%s", entry->f.name);
        if (err)
                goto err_register;

        /* add raw binary content access */
        err = sysfs_create_bin_file(&entry->kobj, &fw_cfg_sysfs_attr_raw);
        if (err)
                goto err_add_raw;

        /* success, add entry to global cache */
        fw_cfg_sysfs_cache_enlist(entry);
        return 0;

err_add_raw:
        kobject_del(&entry->kobj);
err_register:
        kfree(entry);
        return err;
}

static int __init fw_cfg_sysfs_init(void)
{
        int err;

        err = fw_cfg_io_probe();
        if (err)
                return err;

        /* create fw_cfg folder in sysfs */
        fw_cfg_kset = kset_create_and_add("fw_cfg", NULL, firmware_kobj);
        if (!fw_cfg_kset) {
                err = -ENOMEM;
                goto err_sysfs;
        }

        /* process fw_cfg file directory entry, registering each file */
        err = fw_cfg_scan_dir(fw_cfg_register_file);
        if (err)
                goto err_scan;

        /* success */
        pr_debug("fw_cfg: loaded.\n");
        return 0;

err_scan:
        fw_cfg_sysfs_cache_cleanup();
        //FIXME: we could recursively (depth-first) walk the kset
        //       hierarchy, unregistering from the leaf end toward
        //       the root, once the actual leaves (cache entries)
        //       are gone via fw_cfg_sysfs_cache_cleanup() above.
        kset_unregister(fw_cfg_kset);
err_sysfs:
        fw_cfg_io_cleanup();
        return err;
}

static void __exit fw_cfg_sysfs_exit(void)
{
        pr_debug("fw_cfg: unloading.\n");
        fw_cfg_sysfs_cache_cleanup();
        kset_unregister(fw_cfg_kset);   //FIXME: depth-first recursion again
        fw_cfg_io_cleanup();
}

module_init(fw_cfg_sysfs_init);
module_exit(fw_cfg_sysfs_exit);
MODULE_AUTHOR("Gabriel L. Somlo <address@hidden>");
MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
MODULE_LICENSE("GPL");



reply via email to

[Prev in Thread] Current Thread [Next in Thread]