[Libcdio-devel] Re: can files in UDF be fragmented?

From: Thomas Schmitt
Subject: [Libcdio-devel] Re: can files in UDF be fragmented?
Date: Fri, 22 Oct 2010 09:18:13 +0200


Shaya Potter:
> > > I'm just wondering if a file in a UDF file system can be fragmented?
Rocky Bernstein:
> > the code should be following the ECMA-167 specifications,
> > I don't see the guarantees, you are looking for,

Actually the contrary of the desired rule is announced.

ECMA-167 4/8.8 says:
"A file shall be described by a File Entry (see 4/14.9) or by an Extended
 File Entry (4/14.17), which shall specify the attributes of the file and
 the location of the file's recorded data. The data of a file shall be
 recorded in either of the following:
 - An ordered sequence of extents of logical blocks (see short_ad (4/14.14.1),
   long_ad (4/14.14.2) and ext_ad (4/14.14.3). The extents may be recorded
   or unrecorded, and allocated or unallocated. The extents, if specified
   as long_ad (4/14.14.2) or ext_ad (4/14.14.3), may be located on different
   partitions which may be on different volumes."

(This quote illustrates why i still procrastinate the endeavor to understand
 UDF enough to produce UDF images.)

The Allocation Descriptors in ECMA-167 4/14.14 all have 32-bit length fields.
So i assume that files of 4 GB or more have to be split into multiple extents.
But these extents may well be recorded as consecutive neighbors so that
they form one single block area.
It all depends on the UDF producing program.
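Whether a multi-extent file is actually fragmented could be checked roughly
like this. This is only a minimal sketch: extent_t, BLOCK_SIZE and
extents_are_contiguous() are hypothetical names for illustration, not the
ECMA-167 on-disc layout and not libcdio API. It assumes that all extents are
in the same partition and that only the last extent may end inside a block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 2048u  /* assumed logical block size */

/* Hypothetical, simplified extent record (not the on-disc layout). */
typedef struct {
    uint32_t pos;  /* first logical block of the extent */
    uint32_t len;  /* length of the extent in bytes */
} extent_t;

/* Return 1 if the n extents form one unbroken block area, else 0. */
static int extents_are_contiguous(const extent_t *ext, size_t n)
{
    assert(ext != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++) {
        /* An inner extent that ends inside a block leaves a gap. */
        if (ext[i].len % BLOCK_SIZE != 0)
            return 0;
        /* The next extent must start right after this one ends. */
        if (ext[i].pos + ext[i].len / BLOCK_SIZE != ext[i + 1].pos)
            return 0;
    }
    return 1;
}
```

A file for which this returns 1 can be read like a single-extent file, even
though it is described by several allocation descriptors.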

Another reason for multiple extents might be recording of sparse files
where large ranges of 0-bytes are represented as unallocated extents.

I understand that files on video DVD may not be larger than 1 GB.
So the probability is high that they consist of a single extent.

Shaya Potter:
> then does libcdio work?  it would seem from my reading that every
> udf_read_block() is basically made as an offset to the start of the file.
> i.e.
> 1) it calls offset_to_lba to find start sector and length
> 2) computed max # of blocks
> 3) calls udf_read_sectors() w/ that information

It seems that it is aware of multiple extents but has problems
fulfilling read requests which cross an extent boundary.
Actually it seems to have a bug in its extent limit evaluation.

In lib/udf/udf_file.c i read these two snippets.

  static lba_t
  offset_to_lba(const udf_dirent_t *p_udf_dirent, off_t i_offset,
                /*out*/ lba_t *pi_lba, /*out*/ uint32_t *pi_max_size)
           * The allocation descriptor field is filled with short_ad's.
           * If the offset is beyond the current extent, look for the
           * next extent.
          do {
          } while(i_offset >= icblen);
          lsector = (i_offset / UDF_BLOCKSIZE) + p_icb->pos;

          *pi_max_size = p_icb->len;

Where offset_to_lba is used, the caller possibly warns and truncates the read
job to the size of the found extent (which does not make much sense).

    lba_t i_lba = offset_to_lba(p_udf_dirent, p_udf->i_position, &i_lba,
    if (i_lba != CDIO_INVALID_LBA) {
      uint32_t i_max_blocks = CEILING(i_max_size, UDF_BLOCKSIZE);
      if ( i_max_blocks < count ) {
        printf("Warning: don't know how to handle yet\n" );
        count = i_max_blocks;
      ret = udf_read_sectors(p_udf, buf, i_lba, count);

Reading single sectors should be safe.

The code in offset_to_lba() seems faulty, or at least not coordinated with
its usage in udf_read_block():
If *pi_max_size is intended to give the readable bytes in the found
extent beginning at the current read position, then one should subtract
i_offset from it.
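The proposed subtraction can be illustrated with a self-contained sketch.
It is not the libcdio code: extent_t, BLOCK_SIZE and map_offset() are
hypothetical names, and the extent list is a plain array rather than the
short_ad's of a real File Entry. The point is only that the reported size is
the extent length minus the offset inside that extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 2048u  /* assumed logical block size */

/* Hypothetical, simplified extent record (not the on-disc layout). */
typedef struct {
    uint32_t pos;  /* first logical block of the extent */
    uint32_t len;  /* length of the extent in bytes */
} extent_t;

/* Map a byte offset within the file to the block that holds it.
 * On success return 0 and report that block in *lba and the bytes
 * readable from the offset to the end of the extent in *bytes_left.
 * Return -1 if the offset lies beyond the last extent. */
static int map_offset(const extent_t *ext, size_t n, uint64_t offset,
                      uint32_t *lba, uint32_t *bytes_left)
{
    assert(lba != NULL && bytes_left != NULL);
    for (size_t i = 0; i < n; i++) {
        if (offset < ext[i].len) {
            /* offset is now relative to the start of extent i */
            *lba = ext[i].pos + (uint32_t)(offset / BLOCK_SIZE);
            /* remaining bytes, not the whole extent length */
            *bytes_left = ext[i].len - (uint32_t)offset;
            return 0;
        }
        offset -= ext[i].len;  /* skip this extent */
    }
    return -1;
}
```

With extents of 4096 and 3000 bytes, an offset of 5000 lies 904 bytes into
the second extent, so 2096 bytes remain readable there, not 3000.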

To make it fully able to deal with multiple extents:
In udf_read_block() one would have to repeat the mapping from p_udf->i_position
to i_lba and i_max_size, and to read what is available in the next extent,
... until the warning case does not apply any more.
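Such a loop might look roughly as follows. Again this is a sketch under
assumptions, not a patch for libcdio: mem_file_t, map_position() and
read_file_blocks() are invented names, the medium is replaced by a memory
buffer, the read position is assumed to be block aligned, and only the last
extent is assumed to end inside a block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 2048u  /* assumed logical block size */

typedef struct { uint32_t pos; uint32_t len; } extent_t;  /* hypothetical */

/* Hypothetical file handle; image stands in for the medium. */
typedef struct {
    const uint8_t *image;
    const extent_t *ext;
    size_t n_ext;
    uint64_t position;  /* byte offset in the file, block aligned */
} mem_file_t;

/* Find the block at the current position and the number of blocks
 * left in its extent. Return -1 at end of file. */
static int map_position(const mem_file_t *f, uint32_t *lba, uint32_t *blocks)
{
    uint64_t off = f->position;
    for (size_t i = 0; i < f->n_ext; i++) {
        if (off < f->ext[i].len) {
            uint32_t left = f->ext[i].len - (uint32_t)off;
            *lba = f->ext[i].pos + (uint32_t)(off / BLOCK_SIZE);
            *blocks = (left + BLOCK_SIZE - 1) / BLOCK_SIZE;
            return 0;
        }
        off -= f->ext[i].len;
    }
    return -1;
}

/* Read up to count blocks, continuing in the next extent whenever the
 * current one is exhausted. Return the number of blocks read. */
static size_t read_file_blocks(mem_file_t *f, uint8_t *buf, size_t count)
{
    size_t done = 0;
    assert(f != NULL && buf != NULL);
    while (done < count) {
        uint32_t lba, avail;
        if (map_position(f, &lba, &avail) != 0)
            break;  /* no further extent */
        size_t chunk = count - done;
        if (chunk > avail)
            chunk = avail;  /* rest is fetched in the next round */
        memcpy(buf + done * BLOCK_SIZE,
               f->image + (size_t)lba * BLOCK_SIZE, chunk * BLOCK_SIZE);
        f->position += (uint64_t)chunk * BLOCK_SIZE;
        done += chunk;
    }
    return done;
}
```

In the real library the memcpy() would be a call that reads sectors from the
medium, and error handling would be needed for each partial read.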

> > > just wondering, [...] a DVD
> > > that I see that has a bunch of 0 length files located in
> > > what I'd assume to be the location of a longer file.

I am not sure whether ECMA-167 demands that 0-byte data files have a valid
start LBA. ECMA-119 (aka ISO 9660) does.
But if the file has 0 bytes then it might be that the producer simply decided
to give it some address of a file that has >0 bytes (as usual with ECMA-119).

So for now i do not see a connection to the problem of multi-extent files.

Have a nice day :)

