poke-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Patch] Added a JFFS2 pickle


From: Jose E. Marchesi
Subject: Re: [Patch] Added a JFFS2 pickle
Date: Fri, 05 Mar 2021 09:55:52 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Hi Matt.

Thanks for the updated patch.

> Thanks for reviewing this! A v2 of the patch is at the end.
>
> ------- Review comments: -------
>
>> In other pickles we have used the convention of:
>>
>> 1.  using "dump" instead of "print" in functions that are not used in
>>     struct pretty-printers.
>>
>> 2.  To put the dump functions in a separated FOO-dump.pk pickle.
>>
>>     What do you think?
>
> My original intention was for jffs2_print_fs_tree to be a method 
> (print_fs_tree)
> on JFFS2_FS, but I didn't want to stuff all the JFFS2_Entry and 
> JFFS2_Entry_Table
> stuff in to that struct, so I kept them separate. I think I'd prefer to keep
> the function in jffs2.pk and rename it to jffs2_dump_fs_tree, which I think is
> what 1. is suggesting?

Yes that will work.
I promise we will support passing arguments to struct types soon.

>> But I am not sure I understand the "variable endianness" part? What do
>> you mean exactly?
>
> I just meant that a JFFS2 file system could be little endian or big endian.
> The original JFFS2 paper says that it's host-endian, but mkfs.jffs2 and 
> sumtool
> both let you specify an endianness when creating jffs2 file systems.
>
> Ill remove the `as uint<16>` stuff from v2 of the patch though. I realized it 
> was
> something that I had done early on when I was confused about how poke
> handled endianness.

Ok :)

>> In pickles distributed with poke we use lower-case identifiers for
>> variables in general, and upper-case identifiers for variables holding
>> constants.
>>
>> So the above would be:
>>
>> var jffs2_nodetypes = [
>> JFFS2_NODETYPE_DIRENT,
>> JFFS2_NODETYPE_INODE,
>> JFFS2_NODETYPE_CLEANMARKER,
>> JFFS2_NODETYPE_PADDING,
>> JFFS2_NODETYPE_SUMMARY,
>> JFFS2_NODETYPE_XATTR,
>> JFFS2_NODETYPE_XREF
>> ];
>
> These arrays are just there to make my field constraints a little cleaner. A 
> user
> probably shouldn't change them, so I was thinking of them as constants. But I 
> can
> still make the change if you'd prefer it.

I was thinking on the scenario of a new node type being added to the
spec.

>> Here the situation may be different, because the data described by the
>> inode is not really part of the inode.
>>
>> But in that case, why having `data' defined in the inode at all?
>
> I would say the data is part of the inode for jffs2, but the data is
> potentially large, so mapping all the data in all the inodes takes
> some time. I think I could make the JFFS2_Inode struct more
> independent by adding this field after the data field:
>
> byte[0] data_end @ total_len;
>
> I could do something similar with the JFFS2_Summary and JFFS2_XAttr structs 
> too.

Yes... I would have an anonymous

  byte[0] @ total_len;

in the struct types, and a method `get_data_offset'.

>> This is an interesting idiom you are introducing here. Never saw it
>> before :)
>
> My new idiom seems to be the result of not reading the manual well enough :)

Still, it is fun :D

>> The method `get_node_type' may not be necessary at all, since the usual
>> way in poke to check whether a given union value is of some alternative
>> foo is to:
>>
>>   try [.. do something with value.FOO ..]; catch if E_elem {...}
>
> I decided to go with a get_node_type function because it makes
> checking for node types in jffs2_create_entry_tables and
> JFFS2_FS.get_nodes_by_type easier. A user could still do the
> try...catch method if they wanted to.

Yeah, if you see the other thread, we are discussing on how to add some
syntax sugar to make it easier for asking an union value about its type.

> +/*
> + * Holds file data and metadata
> + * A fresh FS will have (file_size / page_size) inodes for each
> + * file, each one representing 1 page of data, and each one with an 
> incremented
> + * version value. Each time a change is made to a file (either its data or 
> metedata),
> + * a new inode is created with offset set to the location in the file the
> + * data was changed and version incremented by 1. Obselete inodes are 
> eventually
> + * garbage collected
> + */
> +type JFFS2_Inode =
> +struct
> +{
> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD];
> +    uint<16> node_type = JFFS2_NODETYPE_INODE;
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +    uint<32> ino; // inode number
> +    uint<32> version; // log version
> +    uint<32> mode; // stat
> +    uint<16> uid; // user id
> +    uint<16> gid; // group id
> +    uint<32> isize; // file size
> +    uint<32> atime; // last access time
> +    uint<32> mtime; // last modification time
> +    uint<32> ctime; // last time of status change
> +    offset<uint<32>, B> offset; // where to begin write
> +    offset<uint<32>, B> csize;  // compressed data size
> +    offset<uint<32>, B> dsize; // uncompressed data size
> +    uint<8> compr : compr in JFFS2_INODE_COMP_TYPES;
> +    uint<8> usercompr : usercompr in JFFS2_INODE_COMP_TYPES;
> +    uint<16> flags;
> +    uint<32> data_crc; // crc of data[csize]
> +    uint<32> node_crc; // crc from jffs2 to magic to flags
> +    byte[0] data; // This could be large, don't want to map it unless needed
> +    byte[0] data_end @ total_len;
> +
> +    // padding
> +    byte[0] @ total_len + alignto(total_len, JFFS2_ALIGNMENT);
> +
> +    method get_data = byte[]:
> +    {
> +        return byte[csize] @ data'offset;
> +    }
> +};

I think in situations like this it is better to have a method to return
the offset of data, `get_data_offset', than a method that maps an array
of bytes.

The reason is that `data' is a payload in this case, i.e. it can contain
all sort of things, unrestricted.  Therefore, the user will most
probably want to map stuff there: maybe an array of bytes yes, but also
maybe an JPEG file.

If you give the user an array of bytes and she is interested in mapping
something else in that area, she will just use get_data'offset and
get_data'size as coordinates for mapping, and the array of bytes will be
mapped for nothing.

> +/*
> + * Holds a single extended attribute name/value pair
> + *
> + * Associated to a file (ino) via an XRef node.
> + * This allows XAttr nodes to be reused by multiple files
> + */
> +type JFFS2_XAttr =
> +struct
> +{
> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD];
> +    uint<16> node_type = JFFS2_NODETYPE_XATTR;
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +    uint<32> xid;
> +    uint<32> version;
> +    uint<8> xprefix;
> +    offset<uint<8>, B> name_len;
> +    offset<uint<16>, B> value_len;
> +    uint<32> data_crc;
> +    uint<32> node_crc;
> +    char[name_len] name if name_len != 0#B;
> +    byte[0] data;
> +    byte[0] data_end @ total_len;

Why data_end needs to be a named field?

> +
> +    // padding
> +    byte[0] @ total_len + alignto(total_len, JFFS2_ALIGNMENT);
> +
> +    method get_name = string:
> +    {
> +        if (name_len != 0#B) {
> +            return catos(name);
> +        } else {
> +            return "";
> +        }
> +    }

catos does the right thing with empty arrays of characters:

(poke) var a = char[]()
(poke) a
[]
(poke) catos (a)
""

So I think you can simplify that.

Also, you may want to add set_name methods too, using stoca.  This is
an example from id3v1.pk:

 method set_comment = (string val) void:
 {
   try stoca (val, data.comment, ' ');
   catch if E_elem { stoca (val, data.extended.comment, ' '); }
 }

> +    method get_data = byte[]:
> +    {
> +        return byte[value_len] @ data'offset;
> +    }
> +};

> +/*
> + * JFFS2 Summary node record
> + * Exists within a summary node, but represents a JFFS2
> + * in the surrounding erase block
> + */
> +type JFFS2_Sum_Rec =
> +union {
> +    JFFS2_Sum_Dirent dirent;
> +    JFFS2_Sum_Inode inode;
> +    JFFS2_Sum_XAttr xattr;
> +    JFFS2_Sum_XRef xref;
> +    JFFS2_Sum_Unk unknown;
> +
> +     method get_node_type = uint<16>:
> +     {
> +       try return dirent.node_type; catch if E_elem {}
> +       try return inode.node_type; catch if E_elem {}
> +       try return xattr.node_type; catch if E_elem {}
> +       try return xref.node_type; catch if E_elem {}
> +
> +       return unknown.node_type;
> +     }
> +};

Nice! :)

> +/*
> + * An Erase Block Summary (EBS) provides a "summary" of a JFFS2 erase
> + * block. If a jffs2 fs has summaries they will be at the end of
> + * each erase block, and will contain a summary record for each JFFS2
> + * node (dirents, inodes, xattrs, and xrefs) in the erase block.
> + * They were added to make mounting JFFS2 file system faster: instead
> + * of parsing through all the data in an erase block, you could jump
> + * to the end and parse the summary node
> + */
> +type JFFS2_Summary =
> +struct
> +{
> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD];
> +    uint<16> node_type = JFFS2_NODETYPE_SUMMARY;
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +    uint<32> sum_num;
> +    uint<32> cln_mkr;
> +    uint<32> padded;
> +    uint<32> sum_crc;
> +    uint<32> node_crc;
> +    byte[0] records; // This will be large. Don't want to map it unless 
> needed
> +    byte[0] records_end @ total_len;

Why having records and records_end?
I would just have a get_records_offset method.

> +
> +    // padding
> +    byte[0] @ total_len + alignto(total_len, JFFS2_ALIGNMENT);
> +
> +    method get_records = JFFS2_Sum_Rec[]:
> +    {
> +        return JFFS2_Sum_Rec[sum_num] @ records'offset;
> +    }
> +
> +};
> +
> +/*
> + * Used to for nodes with no extra data, or for unknown nodes.
> + *
> + * If a JFFS2 implementation encounters a node_type it isn't familiar
> + * with, it checks the FEATURE bits in the node_type to see how to handle
> + * it:
> + *
> + */
> +type JFFS2_Unk_Node =
> +struct
> +{
> +    uint<16> magic : magic in [JFFS2_MAGIC, JFFS2_MAGIC_OLD];
> +    uint<16> node_type; // could be anything
> +    offset<uint<32>, B> total_len : total_len >= JFFS2_HEADER_SIZE;
> +    uint<32> node_header_crc;
> +
> +    byte[0] data;
> +    byte[0] data_end @ total_len;

Ditto here for `data'.

> +    // padding
> +    byte[0] @ total_len + alignto(total_len, JFFS2_ALIGNMENT);
> +
> +    method get_data = byte[]:
> +    {
> +        return byte[data_end'offset - data'offset] @ data'offset;
> +    }
> +};
> +




reply via email to

[Prev in Thread] Current Thread [Next in Thread]