[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] Added a JFFS2 pickle
From: |
Jose E. Marchesi |
Subject: |
Re: [PATCH] Added a JFFS2 pickle |
Date: |
Thu, 04 Mar 2021 11:21:36 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) |
Hi Matt.
Thanks for this contribution, looks great! :)
I have got some comments. Please see below.
> For anyone unfamiliar, JFFS2 is a logging file system that's used with
> flash chips. I come across JFFS2 file systems frequently when looking
> at firmware image files. Hopefully someone else finds this useful!
>
>
> Along with the JFFS2 parsing code, I included a function
> (jffs2_print_fs_tree), that provides a quick way to view the contents
> of a JFFS2 file system. It prints them out like the "tree" command.
In other pickles we have used the convention of:
1) using "dump" instead of "print" in functions that are not used in
struct pretty-printers.
2) To put the dump functions in a separated FOO-dump.pk pickle.
What do you think?
> +/* Multibyte constants are cast as uint<x> to account for variable
> endianness */
> +var JFFS2_MAGIC = 0x1985 as uint<16>;
> +var JFFS2_MAGIC_OLD = 0x1984 as uint<16>;
> +var JFFS2_MAGIC_REV = 0x8519 as uint<16>;
> +
> +var JFFS2_NODE_ACCURATE = 0x2000 as uint<16>;
A more concise alternative there is to use the UH suffix:
var JFFS2_MAGIC = 0x1985UH;
var JFFS2_MAGIC_OLD = 0x1984UH;
var JFFS2_MAGIC_REV = 0x8519UH;
Same for the other variables in the file using casts to uint<16>.
But I am not sure I understand the "variable endianness" part? What do
you mean exactly?
> +var JFFS2_NODETYPES = [
> + JFFS2_NODETYPE_DIRENT,
> + JFFS2_NODETYPE_INODE,
> + JFFS2_NODETYPE_CLEANMARKER,
> + JFFS2_NODETYPE_PADDING,
> + JFFS2_NODETYPE_SUMMARY,
> + JFFS2_NODETYPE_XATTR,
> + JFFS2_NODETYPE_XREF
> +];
In pickles distributed with poke we use lower-case identifiers for
variables in general, and upper-case identifiers for variables holding
constants.
So the above would be:
var jffs2_nodetypes = [
JFFS2_NODETYPE_DIRENT,
JFFS2_NODETYPE_INODE,
JFFS2_NODETYPE_CLEANMARKER,
JFFS2_NODETYPE_PADDING,
JFFS2_NODETYPE_SUMMARY,
JFFS2_NODETYPE_XATTR,
JFFS2_NODETYPE_XREF
];
Same for the other array variables defined below in the file.
> +/*
> + * Similar to a linux dirent - represents an entry within
> + * a directory. Could be a regular file, dir, device, link,
> + * etc...
> + */
> +type JFFS2_Dirent = struct
> +{
In pickles distributed with poke we use the following style:
type JFFS2_Dirent =
struct
{
[...]
};
> + uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD,
> JFFS2_MAGIC_REV];
> + uint<16> node_type : node_type in JFFS2_NODETYPES;
> + offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> + uint<32> node_header_crc;
> + uint<32> pino; // Parent inode number
> + uint<32> version;
> + uint<32> ino; // inode number
> + uint<32> mctime; // time of the last modification in secs
> + offset<uint<8>, B> nsize; // length of the file name
> + uint<8> dir_type : dir_type in JFFS2_DIRENT_TYPES;
> + uint<8>[2] unused;
You may want to use an anonymous field there, to make displaying values
of this struct less verbose:
uint<8>[2];
> + uint<32> node_crc; // crc of data from node magic to unused
> + uint<32> name_crc; // crc of name[nsize]
> + char[0] name; // file name
> +
> + method get_name = string:
> + {
> + return catos(char[nsize] @ name'offset);
> + }
Hmm, why not just using this to define the `name' field:
char[nsize] name;
Then
a) You won't need to define a method `get_name' and
b) The struct type will be "agnostic" i.e. it will work as well mapped
and not mapped.
Also, as it is now the size of a JFFS2_Dirent wont include the space
occupied by the name string! (I see you use the total_len in JFFS2_Node
below, but I think it would be better for each struct type to be as
autonomous as possible.)
> +};
> +
> +/*
> + * Holds file data and metadata
> + * A fresh FS will have (page_size % file_size) inodes for each
> + * file, each one representing 1 page of data, and each one with an
> incremented
> + * version value. Each time a change is made to a file (either its data or
> metedata),
> + * a new inode is created with offset set to the location in the file the
> + * data was changed and version incremented by 1. Obselete inodes are
> eventually
> + * garbage collected
> + */
> +type JFFS2_Inode = struct
> +{
> + uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD,
> JFFS2_MAGIC_REV];
> + uint<16> node_type : node_type in JFFS2_NODETYPES;
> + offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> + uint<32> node_header_crc;
> + uint<32> ino; // inode number
> + uint<32> version; // log version
> + uint<32> mode; // stat
> + uint<16> uid; // user id
> + uint<16> gid; // group id
> + uint<32> isize; // file size
> + uint<32> atime; // last access time
> + uint<32> mtime; // last modification time
> + uint<32> ctime; // last time of status change
> + offset<uint<32>, B> offset; // where to begin write
> + offset<uint<32>, B> csize; // compressed data size
> + offset<uint<32>, B> dsize; // uncompressed data size
> + uint<8> compr : compr in JFFS2_INODE_COMP_TYPES;
> + uint<8> usercompr : usercompr in JFFS2_INODE_COMP_TYPES;
> + uint<16> flags;
> + uint<32> data_crc; // crc of data[csize]
> + uint<32> node_crc; // crc from jffs2 to magic to flags
> + byte[0] data;
> +
> + method get_data = byte[]:
> + {
> + return byte[csize] @ data'offset;
> + }
> +};
Here the situation may be different, because the data described by the
inode is not really part of the inode.
But in that case, why having `data' defined in the inode at all?
> +/*
> + * Holds a single extended attribute name/value pair
> + *
> + * Associated to a file (ino) via an XRef node.
> + * This allows XAttr nodes to be reused by multiple files
> + */
> +type JFFS2_XAttr = struct
> +{
> + uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD,
> JFFS2_MAGIC_REV];
> + uint<16> node_type : node_type in JFFS2_NODETYPES;
> + offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> + uint<32> node_header_crc;
> + uint<32> xid;
> + uint<32> version;
> + uint<8> xprefix;
> + offset<uint<8>, B> name_len;
> + offset<uint<16>, B> value_len;
> + uint<32> data_crc;
> + uint<32> node_crc;
> + byte[0] data;
> +
> + method get_name = string:
> + {
> + return catos(char[name_len] @ data'offset);
> + }
> +
> + method get_value = byte[]:
> + {
> + return byte[value_len] @ data'offset + name_len;
> + }
> +};
Why not having two fields `name' and `data' in the struct definition?
char[name_len] name;
byte[0] data;
If extended attributes don't always have names, you could make it an
optional field.
> +/*
> + * Represets in a dirent in a summary node
> + */
> +type JFFS2_Sum_Dirent = struct
> +{
> + uint<16> node_type : node_type in JFFS2_NODETYPES;
> + uint<32> totlen; /* record length */
> + offset<uint<32>, B> offset; /* ofset on jeb */
> + uint<32> pino; /* parent inode */
> + uint<32> version; /* dirent version */
> + uint<32> ino; /* == zero for unlink */
> + offset<uint<8>, B> nsize; /* dirent name size */
> + uint<8> dir_type; /* dirent type */
> + char[0] name; /* dirent name */
> +
> + byte[0] @ OFFSET + nsize;
> +
> + method get_name = string:
> + {
> + return catos(char[nsize] @ name'offset);
> + }
> +};
Same comment than above regarding `name'. Why not having it defined
directly as `char[nsize] name' ?
> +/*
> + * JFFS2 Summary node record
> + * Exists within a summary node, but represents a JFFS2
> + * in the surrounding erase block
> + */
> +type JFFS2_Sum_Rec = struct
> +{
> + byte[0] record;
> + var node_type = uint<16> @ record'offset;
> +
> + union {
> + JFFS2_Sum_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
> + JFFS2_Sum_Inode inode : node_type == JFFS2_NODETYPE_INODE;
> + JFFS2_Sum_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
> + JFFS2_Sum_XRef xref : node_type == JFFS2_NODETYPE_XREF;
> + JFFS2_Sum_Unk unknown;
> + } data;
> +
> + method get_node_type = uint<16>:
> + {
> + return node_type;
> + }
> +};
This is an interesting idiom you are introducing here. Never saw it
before :)
You are basically using a trick to access the first 16-bits of the
struct types below, that share a common header.
Problem with this approach is that the type is not agnostic, i.e. it
will only work if you map a value o the struct, not if you construct one
in memory.
I can think on two strategies for turning JFFS2_Sum_Rec into an agnostic
type:
a) Rewrite the method to access the fields in the union instead of doing
a mapping:
method get_node_type = uint<16>:
{
try return dirent.node_type; catch if E_elem {}
try return inode.node_type; catch if E_elem {}
...
return unknown.node_type;
}
Note how I am relying on the field name instead of the value of
node_type. This is so the union tag logic doesn't get replicated in
the method. The only thing the method assumes is that `unknown' is
the last alternative.
b) Use a pinned struct:
type JFFS2_Sum_Rec =
pinned struct
{
uint<16> node_type;
union {
JFFS2_Sum_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
JFFS2_Sum_Inode inode : node_type == JFFS2_NODETYPE_INODE;
JFFS2_Sum_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
JFFS2_Sum_XRef xref : node_type == JFFS2_NODETYPE_XREF;
JFFS2_Sum_Unk unknown;
} data;
};
or, alternatively, a label in `data':
type JFFS2_Sum_Rec =
struct
{
uint<16> node_type;
union {
JFFS2_Sum_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
JFFS2_Sum_Inode inode : node_type == JFFS2_NODETYPE_INODE;
JFFS2_Sum_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
JFFS2_Sum_XRef xref : node_type == JFFS2_NODETYPE_XREF;
JFFS2_Sum_Unk unknown;
} data @ 0#B;
};
This way you don't need a method at all :)
> +/*
> + * An Erase Block Summary (EBS) provides a "summary" of a JFFS2 erase
> + * block. If a jffs2 fs has summaries they will be at the end of
> + * each erase block, and will contain a summary record for each JFFS2
> + * node (dirents, inodes, xattrs, and xrefs) in the erase block.
> + * They were added to make mounting JFFS2 file system faster: instead
> + * of parsing through all the data in an erase block, you could jump
> + * to the end and parse the summary node
> + */
> +type JFFS2_Summary = struct
> +{
> + uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD,
> JFFS2_MAGIC_REV];
> + uint<16> node_type : node_type in JFFS2_NODETYPES;
> + offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> + uint<32> node_header_crc;
> + uint<32> sum_num;
> + uint<32> cln_mkr;
> + uint<32> padded;
> + uint<32> sum_crc;
> + uint<32> node_crc;
> + uint<32>[0] records;
> +
> + method get_records = JFFS2_Sum_Rec[]:
> + {
> + return JFFS2_Sum_Rec[sum_num] @ records'offset;
> + }
> +
> +};
The size of JFFS2_Summary won't include the records... are the records
conceptually part of the summary?
> +/*
> + * This is essentially a union of all the node types that allows
> + * you to get the node_type, total_len, and header_crc without
> + * knowing what the node is
> + */
> +type JFFS2_Node = struct
> +{
> + byte[0] node;
> +
> + var node_type = uint<16> @ node'offset + 16#b;
> + var total_len = offset<uint<32>, B> @ node'offset + 32#b;
> + var header_crc = uint<32> @ node'offset + 64#b;
> +
> + union {
> + JFFS2_Dirent dirent : node_type == JFFS2_NODETYPE_DIRENT;
> + JFFS2_Inode inode : node_type == JFFS2_NODETYPE_INODE;
> + JFFS2_XAttr xattr : node_type == JFFS2_NODETYPE_XATTR;
> + JFFS2_XRef xref : node_type == JFFS2_NODETYPE_XREF;
> + JFFS2_Summary summary : node_type == JFFS2_NODETYPE_SUMMARY;
> + JFFS2_Unk_Node padding : node_type == JFFS2_NODETYPE_PADDING;
> + JFFS2_Unk_Node clean_marker : node_type ==
> JFFS2_NODETYPE_CLEANMARKER;
> + JFFS2_Unk_Node unknown;
> + } data;
Here you have the same situation than in JFFS2_Sum_Rec: all the
alternatives in `data' share some prefix fields: magic, node_type,
total_len and header_crc.
For this particular case I suggest the solution b) above: abstract the
magic, node_type, total_len and header_crc into a JFFS2_Node_Header
type, and then:
type JFFS2_Node =
pinned struct
{
JFFS2_Node_Header header;
union
{
JFFS2_Dirent dirent;
JFFS2_Inode inode;
...
JFFS2_Unk_Node unknown;
} data;
};
Then, for example, JFSF2_Dirent:
type JFFS2_Dirent =
struct
{
JFFS2_Node_Header header : header.magic == JFFS2_NODETYPE_DIRENT;
[...]
};
Note how it is now JFFS2_Dirent who knows its applicable magic values,
not the union in JFFS2_Node.
This approach also allows the user to poke node headers by
themselves... which may be useful if she is experimenting adding a new
node type to JFFS2 for example.
> + byte[0] @ total_len + alignto(total_len, JFFS2_ALIGNMENT);
> +
> + method get_node_type = uint<16>:
> + {
> + return node_type;
> + }
> +
> + method get_total_len = offset<uint<32>, B>:
> + {
> + return total_len;
> + }
> +
> + method get_header_crc = uint<32>:
> + {
> + return header_crc;
> + }
> +};
Using the approach above these methods are no longer needed. The user
can just access node.header.node_type, etc.
> +/*
> + * JFFS2 is a logging file system that exists on flash as
> + * a circular buffer of node structures. When changes are made to
> + * a file, new nodes are added to the end of the buffer
> + * with incremented "version" fields. Old nodes are garbage collected.
> + */
> +type JFFS2_FS = struct
> +{
> + JFFS2_Node[] nodes;
> +
> + method get_nodes_by_type = (uint<16> node_type) JFFS2_Node[]:
> + {
> + var res_nodes = JFFS2_Node[]();
> + for (node in nodes) {
> + if (node.get_node_type() == node_type) {
> + res_nodes += [node];
> + }
> + }
That loop can be simplified:
for (node in nodes where node.get_node_type() == node_type)
res_nodes += [node];
> +
> + return res_nodes;
> + }
> +};
> +
> +/* --- Nothing below here is part of the JFFS2 spec. Just utility types and
> funcitons --- */
I will review the rest of the file once we settle the points above.
Thanks! :)
- [PATCH] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/04
- Re: [PATCH] Added a JFFS2 pickle,
Jose E. Marchesi <=
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/04
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/07
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/08
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/08
- Re: [Patch] Added a JFFS2 pickle, Matt Ihlenfield, 2021/03/08