libcdio-devel

Re: [Libcdio-devel] Re: can files in UDF be fragmented?


From: Rocky Bernstein
Subject: Re: [Libcdio-devel] Re: can files in UDF be fragmented?
Date: Fri, 22 Oct 2010 05:04:30 -0400

Many thanks for digging into the UDF specification and libcdio code to
answer the question.

Thomas - you do have commit rights to libcdio, so if you want to fix the
faulty code, by all means do so. Or if you want to make a patch and send
that to me and/or have me double check, I'd be happy to do that.

Thanks.

On Fri, Oct 22, 2010 at 3:18 AM, Thomas Schmitt <address@hidden> wrote:

> Hi,
>
> Shaya Potter:
> > > > I'm just wondering if a file in a UDF file system can be fragmented?
> Rocky Bernstein:
> > > the code should be following the ECMA-167 specifications,
> > > I don't see the guarantees, you are looking for,
>
> Actually, the specification states the opposite of the desired rule.
>
> ECMA-167 4/8.8 says:
> "A file shall be described by a File Entry (see 4/14.9) or by an Extended
>  File Entry (4/14.17), which shall specify the attributes of the file and
>  the location of the file's recorded data. The data of a file shall be
>  recorded in either of the following:
>  - An ordered sequence of extents of logical blocks (see short_ad
>    (4/14.14.1), long_ad (4/14.14.2) and ext_ad (4/14.14.3). The extents
>    may be recorded or unrecorded, and allocated or unallocated. The
>    extents, if specified as long_ad (4/14.14.2) or ext_ad (4/14.14.3),
>    may be located on different partitions which may be on different
>    volumes."
>
> (This quote illustrates why I still put off the endeavor of understanding
>  UDF well enough to produce UDF images.)
>
> The Allocation Descriptors in ECMA-167 4/14.14 all have 32-bit length
> fields, so I assume that files of 4 GB or more have to be split into
> multiple extents. But these extents may well be recorded as consecutive
> neighbors so that they form one single block area.
> It all depends on the UDF producing program.
>
> Another reason for multiple extents might be the recording of sparse files,
> where large ranges of 0-bytes are represented as unallocated extents.
>
> I understand that files on video DVD may not be larger than 1 GB.
> So the probability is high that they consist of a single extent.
>
>
> Shaya Potter:
> > then does libcdio work?  it would seem from my reading that every
> > udf_read_block() is basically made as an offset to the start of the file.
> >
> > i.e.
> >
> > 1) it calls offset_to_lba to find start sector and length
> > 2) computed max # of blocks
> > 3) calls udf_read_sectors() w/ that information
>
> It seems that it is aware of multiple extents but has problems
> fulfilling read requests which cross an extent boundary.
> Actually it seems to have a bug in its extent limit evaluation.
>
> In lib/udf/udf_file.c I read these two snippets.
>
>  static lba_t
>  offset_to_lba(const udf_dirent_t *p_udf_dirent, off_t i_offset,
>                /*out*/ lba_t *pi_lba, /*out*/ uint32_t *pi_max_size)
>  ...
>          /*
>           * The allocation descriptor field is filled with short_ad's.
>           * If the offset is beyond the current extent, look for the
>           * next extent.
>           */
>          do {
>               ...
>          } while(i_offset >= icblen);
>
>          lsector = (i_offset / UDF_BLOCKSIZE) + p_icb->pos;
>
>          *pi_max_size = p_icb->len;
>
>
> Where offset_to_lba is used, the code eventually warns and truncates the
> read job to the size of the found extent (which does not make much sense).
>
>    lba_t i_lba = offset_to_lba(p_udf_dirent, p_udf->i_position, &i_lba,
>                                &i_max_size);
>    if (i_lba != CDIO_INVALID_LBA) {
>      uint32_t i_max_blocks = CEILING(i_max_size, UDF_BLOCKSIZE);
>      if ( i_max_blocks < count ) {
>        printf("Warning: don't know how to handle yet\n" );
>        count = i_max_blocks;
>      }
>      ret = udf_read_sectors(p_udf, buf, i_lba, count);
>
>
> Reading single sectors should be safe.
>
>
> The code in offset_to_lba() seems faulty, or at least uncoordinated with
> its usage in udf_read_block():
> If *pi_max_size is intended to give the readable bytes in the found
> extent beginning at the current read position, then one should subtract
> i_offset from it.
>
> To make it fully able to deal with multiple extents:
> In udf_read_block() one would have to repeat the mapping from
> p_udf->i_position to i_lba and i_max_size, and read what is available
> in the next extent, ... until the warning case no longer applies.
>
>
> > > > just wondering, [...] a DVD
> > > > that I see that has a bunch of 0 length files located in
> > > > what I'd assume to be the location of a longer file.
>
> I am not sure whether ECMA-167 requires a 0-byte data file to have a valid
> start LBA. ECMA-119 (aka ISO 9660) does.
> But if the file has 0 bytes then it might be that the producer simply
> decided to give it the address of some file that has >0 bytes (as is usual
> with ECMA-119).
>
> So for now I do not see a connection to the problem of multi-extent files.
>
>
> Have a nice day :)
>
> Thomas
>
>
>
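
For illustration, the two changes proposed in the quoted message (report only
the bytes left in the found extent, and repeat the mapping until the request
is satisfied) might look roughly like the sketch below. It is not libcdio
code: extent_t, map_offset(), read_file_blocks(), read_blocks_fn and
mem_read_blocks() are hypothetical names, the block size is assumed to be
2048 bytes, and every extent except possibly the last is assumed to be a
whole number of blocks long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCKSIZE 2048u  /* assumed logical block size */

/* Hypothetical stand-in for a short_ad: start block and length in bytes. */
typedef struct {
    uint32_t pos;  /* first logical block of the extent */
    uint32_t len;  /* extent length in bytes */
} extent_t;

/* Hypothetical block reader: returns 0 on success. */
typedef int (*read_blocks_fn)(void *ctx, unsigned char *buf,
                              uint32_t lba, uint32_t count);

/* Map a byte offset in the file to a logical block. On success, *lba is the
 * block holding that offset and *remaining is the number of bytes from the
 * offset to the end of the same extent. Returns -1 past the last extent. */
static int
map_offset(const extent_t *ext, size_t n_ext, uint64_t offset,
           uint32_t *lba, uint32_t *remaining)
{
    size_t i;

    assert(ext != NULL && lba != NULL && remaining != NULL);
    for (i = 0; i < n_ext; i++) {
        if (offset < ext[i].len) {
            *lba = ext[i].pos + (uint32_t)(offset / BLOCKSIZE);
            /* Subtract the offset inside the extent from its length. */
            *remaining = ext[i].len - (uint32_t)offset;
            return 0;
        }
        offset -= ext[i].len;  /* skip this extent, try the next one */
    }
    return -1;
}

/* Read up to `count` blocks starting at a block-aligned file offset,
 * remapping at every extent boundary. Returns the number of blocks read. */
static uint32_t
read_file_blocks(const extent_t *ext, size_t n_ext, uint64_t offset,
                 uint32_t count, unsigned char *buf,
                 read_blocks_fn read_blocks, void *ctx)
{
    uint32_t done = 0;

    while (done < count) {
        uint32_t lba, remaining, chunk;
        uint64_t avail;

        if (map_offset(ext, n_ext, offset, &lba, &remaining) != 0)
            break;  /* ran past the end of the file */
        /* Blocks still available in this extent, rounded up. */
        avail = ((uint64_t)remaining + BLOCKSIZE - 1) / BLOCKSIZE;
        chunk = (count - done < avail) ? count - done : (uint32_t)avail;
        if (read_blocks(ctx, buf + (size_t)done * BLOCKSIZE, lba, chunk) != 0)
            break;  /* read error: report what was read so far */
        done += chunk;
        offset += (uint64_t)chunk * BLOCKSIZE;
    }
    return done;
}

/* Demo reader: ctx points to an in-memory image of the medium. */
static int
mem_read_blocks(void *ctx, unsigned char *buf, uint32_t lba, uint32_t count)
{
    const unsigned char *image = ctx;

    memcpy(buf, image + (size_t)lba * BLOCKSIZE, (size_t)count * BLOCKSIZE);
    return 0;
}
```

With two extents such as {pos 5, 4096 bytes} and {pos 1, 4096 bytes}, a
three-block read starting at file offset 2048 would take one block from the
first extent and two from the second, instead of being truncated at the
extent boundary.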

