Re: [PATCH] Added a JFFS2 pickle

poke-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] Added a JFFS2 pickle

From:	Jose E. Marchesi
Subject:	Re: [PATCH] Added a JFFS2 pickle
Date:	Thu, 04 Mar 2021 11:21:36 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Hi Matt.

Thanks for this contribution, looks great! :)

I have got some comments.  Please see below.

> For anyone unfamiliar, JFFS2 is a logging file system that's used with
> flash chips. I come across JFFS2 file systems frequently when looking
> at firmware image files.  Hopefully someone else finds this useful!
>
>
> Along with the JFFS2 parsing code, I included a function
> (jffs2_print_fs_tree), that provides a quick way to view the contents
> of a JFFS2 file system. It prints them out like the "tree" command.

In other pickles we have used the convention of:

1) using "dump" instead of "print" in functions that are not used in
   struct pretty-printers.

2) To put the dump functions in a separated FOO-dump.pk pickle.

What do you think?

> +/* Multibyte constants are cast as uint<x> to account for variable 
> endianness */
> +var JFFS2_MAGIC = 0x1985 as uint<16>;
> +var JFFS2_MAGIC_OLD = 0x1984 as uint<16>;
> +var JFFS2_MAGIC_REV = 0x8519 as uint<16>;
> +
> +var JFFS2_NODE_ACCURATE = 0x2000 as uint<16>;

A more concise alternative there is to use the UH suffix:

var JFFS2_MAGIC = 0x1985UH;
var JFFS2_MAGIC_OLD = 0x1984UH;
var JFFS2_MAGIC_REV = 0x8519UH;

Same for the other variables in the file  using casts to uint<16>.

But I am not sure I understand the "variable endianness" part?  What do
you mean exactly?

> +var JFFS2_NODETYPES = [
> +    JFFS2_NODETYPE_DIRENT,
> +    JFFS2_NODETYPE_INODE,
> +    JFFS2_NODETYPE_CLEANMARKER,
> +    JFFS2_NODETYPE_PADDING,
> +    JFFS2_NODETYPE_SUMMARY,
> +    JFFS2_NODETYPE_XATTR,
> +    JFFS2_NODETYPE_XREF
> +];

In pickles distributed with poke we use lower-case identifiers for
variables in general, and upper-case identifiers for variables holding
constants.

So the above would be:

var jffs2_nodetypes = [
    JFFS2_NODETYPE_DIRENT,
    JFFS2_NODETYPE_INODE,
    JFFS2_NODETYPE_CLEANMARKER,
    JFFS2_NODETYPE_PADDING,
    JFFS2_NODETYPE_SUMMARY,
    JFFS2_NODETYPE_XATTR,
    JFFS2_NODETYPE_XREF
];

Same for the other array variables defined below in the file.

> +/*
> + * Similar to a linux dirent - represents an entry within
> + * a directory. Could be a regular file, dir, device, link,
> + * etc...
> + */
> +type JFFS2_Dirent = struct
> +{

In pickles distributed with poke we use the following style:

type JFFS2_Dirent =
  struct
  {
    [...]
  };

> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD, 
> JFFS2_MAGIC_REV];
> +    uint<16> node_type : node_type in JFFS2_NODETYPES;
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +    uint<32> pino; // Parent inode number
> +    uint<32> version;
> +    uint<32> ino; // inode number
> +    uint<32> mctime; // time of the last modification in secs
> +    offset<uint<8>, B> nsize; // length of the file name
> +    uint<8> dir_type : dir_type in JFFS2_DIRENT_TYPES;
> +    uint<8>[2] unused;

You may want to use an anonymous field there, to make displaying values
of this struct less verbose:

uint<8>[2];

> +    uint<32> node_crc; // crc of data from node magic to unused
> +    uint<32> name_crc; // crc of name[nsize]
> +    char[0] name; // file name
> +
> +    method get_name = string:
> +    {
> +        return catos(char[nsize] @ name'offset);
> +    }

Hmm, why not just using this to define the `name' field:

  char[nsize] name;

Then

a) You won't need to define a method `get_name' and
b) The struct type will be "agnostic" i.e. it will work as well mapped
   and not mapped.

Also, as it is now the size of a JFFS2_Dirent wont include the space
occupied by the name string!  (I see you use the total_len in JFFS2_Node
below, but I think it would be better for each struct type to be as
autonomous as possible.)

> +};
> +
> +/*
> + * Holds file data and metadata
> + * A fresh FS will have (page_size % file_size) inodes for each
> + * file, each one representing 1 page of data, and each one with an 
> incremented
> + * version value. Each time a change is made to a file (either its data or 
> metedata),
> + * a new inode is created with offset set to the location in the file the
> + * data was changed and version incremented by 1. Obselete inodes are 
> eventually
> + * garbage collected
> + */
> +type JFFS2_Inode = struct
> +{
> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD, 
> JFFS2_MAGIC_REV];
> +    uint<16> node_type : node_type in JFFS2_NODETYPES;
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +    uint<32> ino; // inode number
> +    uint<32> version; // log version
> +    uint<32> mode; // stat
> +    uint<16> uid; // user id
> +    uint<16> gid; // group id
> +    uint<32> isize; // file size
> +    uint<32> atime; // last access time
> +    uint<32> mtime; // last modification time
> +    uint<32> ctime; // last time of status change
> +    offset<uint<32>, B> offset; // where to begin write
> +    offset<uint<32>, B> csize;  // compressed data size
> +    offset<uint<32>, B> dsize; // uncompressed data size
> +    uint<8> compr : compr in JFFS2_INODE_COMP_TYPES;
> +    uint<8> usercompr : usercompr in JFFS2_INODE_COMP_TYPES;
> +    uint<16> flags;
> +    uint<32> data_crc; // crc of data[csize]
> +    uint<32> node_crc; // crc from jffs2 to magic to flags
> +    byte[0] data;
> +
> +    method get_data = byte[]:
> +    {
> +        return byte[csize] @ data'offset;
> +    }
> +};

Here the situation may be different, because the data described by the
inode is not really part of the inode.

But in that case, why having `data' defined in the inode at all?

> +/*
> + * Holds a single extended attribute name/value pair
> + *
> + * Associated to a file (ino) via an XRef node.
> + * This allows XAttr nodes to be reused by multiple files
> + */
> +type JFFS2_XAttr = struct
> +{
> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD, 
> JFFS2_MAGIC_REV];
> +    uint<16> node_type : node_type in JFFS2_NODETYPES;
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +    uint<32> xid;
> +    uint<32> version;
> +    uint<8> xprefix;
> +    offset<uint<8>, B> name_len;
> +    offset<uint<16>, B> value_len;
> +    uint<32> data_crc;
> +    uint<32> node_crc;
> +    byte[0] data;
> +
> +    method get_name = string:
> +    {
> +        return catos(char[name_len] @ data'offset);
> +    }
> +
> +    method get_value = byte[]:
> +    {
> +        return byte[value_len] @ data'offset + name_len;
> +    }
> +};

Why not having two fields `name' and `data' in the struct definition?

  char[name_len] name;
  byte[0] data;

If extended attributes don't always have names, you could make it an
optional field.

> +/*
> + * Represets in a dirent in a summary node
> + */
> +type JFFS2_Sum_Dirent = struct
> +{
> +    uint<16> node_type : node_type in JFFS2_NODETYPES;
> +    uint<32> totlen; /* record length */
> +    offset<uint<32>, B> offset; /* ofset on jeb */
> +    uint<32> pino; /* parent inode */
> +    uint<32> version; /* dirent version */
> +    uint<32> ino; /* == zero for unlink */
> +    offset<uint<8>, B> nsize; /* dirent name size */
> +    uint<8> dir_type; /* dirent type */
> +    char[0] name; /* dirent name */
> +
> +    byte[0] @ OFFSET + nsize;
> +
> +    method get_name = string:
> +    {
> +        return catos(char[nsize] @ name'offset);
> +    }
> +};

Same comment than above regarding `name'.  Why not having it defined
directly as `char[nsize] name' ?

> +/*
> + * JFFS2 Summary node record
> + * Exists within a summary node, but represents a JFFS2
> + * in the surrounding erase block
> + */
> +type JFFS2_Sum_Rec = struct
> +{
> +    byte[0] record;
> +    var node_type = uint<16> @ record'offset;
> +
> +    union {
> +        JFFS2_Sum_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
> +        JFFS2_Sum_Inode inode : node_type == JFFS2_NODETYPE_INODE;
> +        JFFS2_Sum_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
> +        JFFS2_Sum_XRef xref : node_type == JFFS2_NODETYPE_XREF;
> +        JFFS2_Sum_Unk unknown;
> +    } data;
> +
> +    method get_node_type = uint<16>:
> +    {
> +        return node_type;
> +    }
> +};

This is an interesting idiom you are introducing here.  Never saw it
before :)

You are basically using a trick to access the first 16-bits of the
struct types below, that share a common header.

Problem with this approach is that the type is not agnostic, i.e. it
will only work if you map a value o the struct, not if you construct one
in memory.

I can think on two strategies for turning JFFS2_Sum_Rec into an agnostic
type:

a) Rewrite the method to access the fields in the union instead of doing
   a mapping:

   method get_node_type = uint<16>:
   {
     try return dirent.node_type; catch if E_elem {}
     try return inode.node_type; catch if E_elem {}
     ...
     return unknown.node_type;
   }

   Note how I am relying on the field name instead of the value of
   node_type.  This is so the union tag logic doesn't get replicated in
   the method.  The only thing the method assumes is that `unknown' is
   the last alternative.

b) Use a pinned struct:

   type JFFS2_Sum_Rec =
     pinned struct
     {
       uint<16> node_type;

       union {
           JFFS2_Sum_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
           JFFS2_Sum_Inode inode : node_type == JFFS2_NODETYPE_INODE;
           JFFS2_Sum_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
           JFFS2_Sum_XRef xref : node_type == JFFS2_NODETYPE_XREF;
           JFFS2_Sum_Unk unknown;
       } data;
      };

    or, alternatively, a label in `data':

     type JFFS2_Sum_Rec =
       struct
       {
         uint<16> node_type;

         union {
             JFFS2_Sum_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
             JFFS2_Sum_Inode inode : node_type == JFFS2_NODETYPE_INODE;
             JFFS2_Sum_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
             JFFS2_Sum_XRef xref : node_type == JFFS2_NODETYPE_XREF;
             JFFS2_Sum_Unk unknown;
         } data @ 0#B;
        };

    This way you don't need a method at all :)

> +/*
> + * An Erase Block Summary (EBS) provides a "summary" of a JFFS2 erase
> + * block. If a jffs2 fs has summaries they will be at the end of
> + * each erase block, and will contain a summary record for each JFFS2
> + * node (dirents, inodes, xattrs, and xrefs) in the erase block.
> + * They were added to make mounting JFFS2 file system faster: instead
> + * of parsing through all the data in an erase block, you could jump
> + * to the end and parse the summary node
> + */
> +type JFFS2_Summary = struct
> +{
> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD, 
> JFFS2_MAGIC_REV];
> +    uint<16> node_type : node_type in JFFS2_NODETYPES;
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +    uint<32> sum_num;
> +    uint<32> cln_mkr;
> +    uint<32> padded;
> +    uint<32> sum_crc;
> +    uint<32> node_crc;
> +    uint<32>[0] records;
> +
> +    method get_records = JFFS2_Sum_Rec[]:
> +    {
> +        return JFFS2_Sum_Rec[sum_num] @ records'offset;
> +    }
> +
> +};

The size of JFFS2_Summary won't include the records...  are the records
conceptually part of the summary?

> +/*
> + * This is essentially a union of all the node types that allows
> + * you to get the node_type, total_len, and header_crc without
> + * knowing what the node is
> + */
> +type JFFS2_Node = struct
> +{
> +    byte[0] node;
> +
> +    var node_type = uint<16> @ node'offset + 16#b;
> +    var total_len = offset<uint<32>, B> @ node'offset + 32#b;
> +    var header_crc = uint<32> @ node'offset + 64#b;
> +
> +    union {
> +        JFFS2_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
> +        JFFS2_Inode inode : node_type == JFFS2_NODETYPE_INODE;
> +        JFFS2_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
> +        JFFS2_XRef xref : node_type == JFFS2_NODETYPE_XREF;
> +        JFFS2_Summary summary : node_type == JFFS2_NODETYPE_SUMMARY;
> +        JFFS2_Unk_Node padding : node_type == JFFS2_NODETYPE_PADDING;
> +        JFFS2_Unk_Node clean_marker : node_type == 
> JFFS2_NODETYPE_CLEANMARKER;
> +        JFFS2_Unk_Node unknown;
> +    } data;

Here you have the same situation than in JFFS2_Sum_Rec: all the
alternatives in `data' share some prefix fields: magic, node_type,
total_len and header_crc.

For this particular case I suggest the solution b) above: abstract the
magic, node_type, total_len and header_crc into a JFFS2_Node_Header
type, and then:

type JFFS2_Node =
  pinned struct
  {
    JFFS2_Node_Header header;
    union
    {
      JFFS2_Dirent dirent;
      JFFS2_Inode inode;
      ...
      JFFS2_Unk_Node unknown;
    } data;
  };

Then, for example, JFSF2_Dirent:

type JFFS2_Dirent =
  struct
  {
    JFFS2_Node_Header header : header.magic == JFFS2_NODETYPE_DIRENT;
    [...]
  };

Note how it is now JFFS2_Dirent who knows its applicable magic values,
not the union in JFFS2_Node.

This approach also allows the user to poke node headers by
themselves... which may be useful if she is experimenting adding a new
node type to JFFS2 for example.

> +    byte[0] @ total_len + alignto(total_len, JFFS2_ALIGNMENT);
> +
> +    method get_node_type = uint<16>:
> +    {
> +        return node_type;
> +    }
> +
> +    method get_total_len = offset<uint<32>, B>:
> +    {
> +        return total_len;
> +    }
> +
> +    method get_header_crc = uint<32>:
> +    {
> +        return header_crc;
> +    }
> +};

Using the approach above these methods are no longer needed.  The user
can just access node.header.node_type, etc.

> +/*
> + * JFFS2 is a logging file system that exists on flash as
> + * a circular buffer of node structures. When changes are made to
> + * a file, new nodes are added to the end of the buffer
> + * with incremented "version" fields. Old nodes are garbage collected.
> + */
> +type JFFS2_FS = struct
> +{
> +    JFFS2_Node[] nodes;
> +
> +    method get_nodes_by_type = (uint<16> node_type) JFFS2_Node[]:
> +    {
> +        var res_nodes = JFFS2_Node[]();
> +        for (node in nodes) {
> +            if (node.get_node_type() == node_type) {
> +                res_nodes += [node];
> +            }
> +        }

That loop can be simplified:

for (node in nodes where node.get_node_type() == node_type)
  res_nodes += [node];

> +
> +        return res_nodes;
> +    }
> +};
> +
> +/* --- Nothing below here is part of the JFFS2 spec. Just utility types and 
> funcitons --- */

I will review the rest of the file once we settle the points above.

Thanks! :)
[Prev in Thread]
Current Thread
[Next in Thread]
[PATCH] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/04
- Re: [PATCH] Added a JFFS2 pickle, Jose E. Marchesi <=
  - Re: [PATCH] Added a JFFS2 pickle, Jose E. Marchesi, 2021/03/04
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/04
  - Re: [Patch] Added a JFFS2 pickle, Jose E. Marchesi, 2021/03/05
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/07
  - Re: [Patch] Added a JFFS2 pickle, Jose E. Marchesi, 2021/03/08
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/08
  - Re: [Patch] Added a JFFS2 pickle, Jose E. Marchesi, 2021/03/08
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/08
  - Re: [Patch] Added a JFFS2 pickle, Jose E. Marchesi, 2021/03/08
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/08
Prev by Date: [PATCH] Added a JFFS2 pickle
Next by Date: [Bug Compiler/26695] Abortion on construction of struct with an array field with non-const boundary
Previous by thread: [PATCH] Added a JFFS2 pickle
Next by thread: Re: [PATCH] Added a JFFS2 pickle
Index(es):
- Date
- Thread