Opened 7 years ago

Last modified 7 years ago

#8473 new enhancement

Concerns about ISO 9660 implementation

Reported by: scdbackup
Owned by: nobody
Priority: normal
Milestone: R1
Component: File Systems/ISO 9660
Version: R1/Development
Keywords:
Cc: scdbackup@…, Jens.Arm@…
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description (last modified by umccullough)

In connection with ticket #8460 I noticed two potential problems in the read-only ISO 9660 filesystem implementation:

It looks like data files of 4 GiB or larger are not supported. Even if there is no intention to extend Haiku to read large data files, it should at least be tested whether it tolerates such files gracefully.

It has not been verified whether the block-to-byte address computations for Directory Entry addresses are safe against byte addresses of 4 GiB or larger. Only the address computations for data file content have been fixed by #8460.

I am not a user of Haiku, but a producer of ISO 9660 filesystems. In that role I would be glad to help improve Haiku's ability to read ISO 9660.


Large data files:

struct iso9660_inode in http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/file_systems/iso9660/iso9660.h only has provisions for memorizing a single data extent:

        uint32          startLBN[2];                    // Logical block # of start of file/directory data
        uint32          dataLen[2];                             // Length of file section in bytes

ECMA-119 (the public specification of ISO 9660) says:

6.5.1 Relation to File Sections

"Each file shall consist of one or more File Sections. Each File Section of a file shall be identified by a record in the same directory. The sequence of the File Sections of a file shall be identified by the order of the corresponding records in the directory. A File Section may be part of more than one file and may occur more than once in the same file. A File Section may be identified by more than one record in the same or a different directory."

So if struct iso9660_inode is indeed the 1:1 representation of a filesystem inode, then it should be in a 1:n relation to ISO 9660 Directory Entries. (The array size [2] does not express a 1:2 relation. It rather mirrors the fact that ISO 9660 Directory Entries bear address and length in both byte orderings.)
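To illustrate the "both byte orderings" remark: ECMA-119 7.3.3 records a 32-bit number as 8 bytes, first least-significant-byte-first, then most-significant-byte-first. The following is only a minimal sketch of how a reader could decode such a field and cross-check the two halves. The function names are hypothetical and not taken from the Haiku source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the little-endian half (bytes 0..3) of an 8-byte both-byte-order field. */
uint32_t both_endian_le32(const uint8_t field[8])
{
    return (uint32_t)field[0]
        | ((uint32_t)field[1] << 8)
        | ((uint32_t)field[2] << 16)
        | ((uint32_t)field[3] << 24);
}

/* Decode the big-endian half (bytes 4..7) of the same field. */
uint32_t both_endian_be32(const uint8_t field[8])
{
    return ((uint32_t)field[4] << 24)
        | ((uint32_t)field[5] << 16)
        | ((uint32_t)field[6] << 8)
        | (uint32_t)field[7];
}

/*
 * Store the decoded number in *value and return true if both halves agree.
 * Return false if they differ, which hints at a damaged or non-conforming record.
 */
bool decode_both_endian32(const uint8_t field[8], uint32_t *value)
{
    assert(field != NULL && value != NULL);

    uint32_t le = both_endian_le32(field);
    uint32_t be = both_endian_be32(field);
    if (le != be)
        return false;
    *value = le;
    return true;
}
```

Whether a reader should reject a record with disagreeing halves or just pick one of them is a design decision. The sketch merely reports the mismatch.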

The more challenging part will be collecting all ISO 9660 Directory Entries which bear identical names in the same directory.

But as said, it should at least be ensured that two ISO 9660 Directory Entries with the same name do not confuse Haiku beyond the scope of that single data file.

Proposal for producing a challenging ISO image:

Create a file of at least 4 GiB in size, plus some small files whose names sort alphabetically lower and higher than the big one's ("aaa", "zzz").

Put them into an ISO 9660 image, e.g. with my program

xorriso -outdev my_image.iso -add aaa file_of_4gb zzz --

or use mkisofs

mkisofs -o my_image.iso -R -iso-level 3 aaa file_of_4gb zzz

(xorriso can operate directly on DVD burners on systems with Linux kernel, FreeBSD, or Solaris, e.g. -outdev /dev/sr0, -outdev /dev/cd0, ...)

Mount the image and check whether all three files behave well. For example: whether file_of_4gb appears only once in a directory listing, whether at least the small files show correct content, ...


Directory entries in high block numbers:

Normally the Directory Entries are stored in a few hundred blocks shortly after block 16, so no large byte addresses occur. But in the case of multi-session, the new directory tree gets written after the data blocks of the previous session.

So if the first session is 4 GiB or larger, then the address computations for ISO 9660 Directory Entries can produce byte addresses which may get spoiled by 32-bit bottlenecks like the one of #8460.

Proposal for producing a challenging ISO image:

Produce an ISO image which contains two data files of exactly 2 GiB each:

xorriso -outdev my_image.iso -add file1 file2 --

(Note that, unlike mkisofs -o, this will refuse to overwrite an existing image.)

Add a session to the image by adding a small marker file "this_is_session_2":

touch this_is_session_2
xorriso -dev my_image.iso -add this_is_session_2 --

(Note that the second run uses option -dev rather than -outdev. This makes it load the tree of the first session and add the new file to it.)

The resulting file "my_image.iso" will bear two sessions. The PVD at block 16 will direct the reader to a root directory above 4 GiB. The image will still fit on a single-layer DVD.

Attachments (4)

file_of_4gb.iso.bz2 (24.6 KB) - added by scdbackup 7 years ago.
ISO 9660 image, uncompressed 4.1 GiB, with "file_of_4gb" having 2 extents
high_root_block.iso.bz2 (25.7 KB) - added by scdbackup 7 years ago.
ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address larger than 32 bit
reloc_dir.iso.bz2 (26.6 KB) - added by scdbackup 7 years ago.
ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address > 32 bit and relocated deep tree
nonreloc_deep.iso.bz2 (27.3 KB) - added by scdbackup 7 years ago.
ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address > 32 bit and unrelocated deep tree ignoring ECMA-119 specs


Change History (12)

comment:1 Changed 7 years ago by scdbackup

Sorry, I misspelled the second mention of #8460 as "8450", which has nothing to do with this ticket.

comment:2 Changed 7 years ago by jahaiku

Cc: Jens.Arm@… added

comment:3 Changed 7 years ago by umccullough

Description: modified (diff)

Fix ticket description per comment:1

comment:4 Changed 7 years ago by diver

Version: R1/alpha3 → R1/Development

comment:5 Changed 7 years ago by korli

scdbackup, would you mind creating such challenging ISO files with highly compressible contents and uploading them to this ticket?


Changed 7 years ago by scdbackup

Attachment: file_of_4gb.iso.bz2 added

ISO 9660 image, uncompressed 4.1 GiB, with "file_of_4gb" having 2 extents

comment:6 Changed 7 years ago by scdbackup

I produced https://dev.haiku-os.org/attachment/ticket/8473/file_of_4gb.iso.bz2 to demonstrate a file with more than one extent:

$ ls -l aaa file_of_4gb zzz
-rw-r--r-- 1 thomas thomas          4 2012-04-16 13:07 aaa
-rw-r--r-- 1 thomas thomas 4297064448 2012-04-16 13:00 file_of_4gb
-rw-r--r-- 1 thomas thomas          4 2012-04-16 13:07 zzz
$ xorriso -outdev file_of_4gb.iso -add aaa file_of_4gb zzz --
$ bzip2 file_of_4gb.iso

The sequence of directory records bears these ISO 9660 file identifiers:

AAA.;1
FILE_OF_4GB.;1
FILE_OF_4GB.;1
ZZZ.;1

The data blocks of these files are:

Report layout: xt , Startlba ,   Blocks , Filesize , ISO image path
File data lba:  0 ,       55 ,  2097151 , 4297064448 , '/file_of_4gb'
File data lba:  1 ,  2097206 ,     1025 , 4297064448 , '/file_of_4gb'
File data lba:  0 ,  2098231 ,        1 ,        4 , '/aaa'
File data lba:  0 ,  2098232 ,        1 ,        4 , '/zzz'

(Rock Ridge names are listed rather than ECMA-119 identifiers.)
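The extent report above can be turned into a small lookup that maps a byte offset within the file to the image block which holds it. This is only a sketch under assumptions: struct extent and file_offset_to_lba are hypothetical names, the block size is taken as 2048, and every extent except possibly the last is assumed to fill its blocks completely. The lookup checks only against block counts, not against the exact file size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISO_BLOCK_SIZE 2048u   /* assumed logical block size */

/* Hypothetical in-memory description of one File Section (extent). */
struct extent {
    uint32_t start_lba;   /* first block of the extent */
    uint32_t blocks;      /* number of blocks in the extent */
};

/*
 * Map a 64-bit byte offset within the file to the image block holding it.
 * Returns 0 on success, -1 if the offset lies beyond the last extent.
 */
int file_offset_to_lba(const struct extent *extents, size_t count,
    uint64_t offset, uint32_t *lba, uint32_t *offset_in_block)
{
    assert(extents != NULL && lba != NULL && offset_in_block != NULL);

    /* Block index relative to the start of the file, kept in 64 bits. */
    uint64_t block = offset / ISO_BLOCK_SIZE;

    for (size_t i = 0; i < count; i++) {
        if (block < extents[i].blocks) {
            *lba = extents[i].start_lba + (uint32_t)block;
            *offset_in_block = (uint32_t)(offset % ISO_BLOCK_SIZE);
            return 0;
        }
        /* Skip this extent and continue the search in the next one. */
        block -= extents[i].blocks;
    }
    return -1;
}
```

With the two extents of "/file_of_4gb" listed above, file offset 4294965248 (the first byte after 2097151 full blocks) maps to block 2097206, the start of the second extent.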

"file_of_4gb" actually has 4098 MiB in order to produce a non-trivial second extent. Each of its 2 KiB blocks bears at its start the number of its megabyte in ASCII decimal digits. So the block content changes only every full MiB. (When I gave each block its own number as content, the file could not be compressed beyond 2.3 MB.)


I have re-read ECMA-119 about the sequence of directory records of large files.

According to 9.3 the records are sorted by their file identifiers. These identifiers are identical for all extents (File Sections) of a file. So the record of extent N + 1 of a file must directly follow the record of extent N.
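Since the records of one file follow each other directly, a reader could collect them in one pass over the directory. The sketch below is an illustration under assumptions, not Haiku code: struct dir_record is a hypothetical, simplified record, and the decision to continue relies on the Multi-Extent bit of the File Flags (ECMA-119 9.1.6, bit 7), with the identifier comparison serving only as a sanity check. The total size is accumulated in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ISO_FLAG_MULTI_EXTENT 0x80u   /* set: not the final record of the file */

/* Hypothetical, simplified view of one directory record. */
struct dir_record {
    const char *identifier;   /* e.g. "FILE_OF_4GB.;1" */
    uint8_t flags;            /* File Flags byte */
    uint32_t start_lba;       /* first block of this File Section */
    uint32_t data_length;     /* bytes in this File Section */
};

/*
 * Starting at records[first], count the consecutive records which belong to
 * the same file and add up their lengths into *total_size.
 * Returns the number of records consumed (at least 1).
 */
size_t collect_file_sections(const struct dir_record *records, size_t count,
    size_t first, uint64_t *total_size)
{
    assert(records != NULL && total_size != NULL && first < count);

    size_t i = first;
    *total_size = 0;
    for (;;) {
        *total_size += records[i].data_length;   /* 64-bit accumulation */
        int more = (records[i].flags & ISO_FLAG_MULTI_EXTENT) != 0;
        i++;
        if (!more || i >= count)
            break;
        /* A differing identifier means the sequence is damaged: stop here. */
        if (strcmp(records[i].identifier, records[first].identifier) != 0)
            break;
    }
    return i - first;
}
```

The caller would then advance its directory cursor by the returned count, so that the file shows up only once in a listing.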

Furthermore, I learned that the correct term is "directory record" rather than "directory entry".

Changed 7 years ago by scdbackup

Attachment: high_root_block.iso.bz2 added

ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address larger than 32 bit

comment:7 Changed 7 years ago by scdbackup

https://dev.haiku-os.org/attachment/ticket/8473/high_root_block.iso.bz2 was produced by:

$ ls -l file1 file2 this_is_session_2
-rw-r--r-- 1 thomas thomas 2147483648 2012-04-16 14:01 file1
-rw-r--r-- 1 thomas thomas 2147483648 2012-04-16 14:02 file2
-rw-r--r-- 1 thomas thomas         18 2012-04-16 14:12 this_is_session_2
$ xorriso -outdev high_root_block.iso -add file1 file2 --
$ xorriso -dev high_root_block.iso -add this_is_session_2 --

It bears two sessions. The PVD in block 16 points to the root directory of the second session in block 2097234 = byte 0x100029000.
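The arithmetic behind that byte address can be sketched as follows. This only illustrates the kind of 32-bit bottleneck meant here. It is not a statement about any particular line of the Haiku source, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widen to 64 bits before multiplying, so that the product cannot wrap. */
uint64_t block_to_byte64(uint32_t block, uint32_t block_size)
{
    assert(block_size > 0);
    return (uint64_t)block * block_size;
}

/*
 * The same computation done in 32 bits. The product wraps modulo 2^32.
 * Shown only to demonstrate the failure mode.
 */
uint32_t block_to_byte32_truncated(uint32_t block, uint32_t block_size)
{
    return block * block_size;
}
```

For block 2097234 and a block size of 2048, the 64-bit variant yields 0x100029000, while the 32-bit variant yields 0x29000, which is an address inside the first session.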

TOC layout   : Idx ,  sbsector ,       Size , Volume Id
ISO session  :   1 ,        32 ,   2097175s , ISOIMAGE
ISO session  :   2 ,   2097216 ,        24s , ISOIMAGE

(There is a PVD in block 48 which points to the root directory of the first session. With Linux-specific mount option sbsector=32 it is possible to mount this first session.)

The data file content is stored at these places:

Report layout: xt , Startlba ,   Blocks , Filesize , ISO image path
File data lba:  0 ,       55 ,  1048576 , 2147483648 , '/file1'
File data lba:  0 ,  1048631 ,  1048576 , 2147483648 , '/file2'
File data lba:  0 ,  2097239 ,        1 ,       18 , '/this_is_session_2'

Changed 7 years ago by scdbackup

Attachment: reloc_dir.iso.bz2 added

ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address > 32 bit and relocated deep tree

Changed 7 years ago by scdbackup

Attachment: nonreloc_deep.iso.bz2 added

ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address > 32 bit and unrelocated deep tree ignoring ECMA-119 specs

comment:8 Changed 7 years ago by scdbackup

high_root_block.iso.bz2 is probably not much of a challenge. It does not exercise descending into sub-directories. So I created two images which put more emphasis on tree traversal. The first is https://dev.haiku-os.org/attachment/ticket/8473/reloc_dir.iso.bz2

With Rock Ridge its tree looks like:

/deep_dir
/deep_dir/1
/deep_dir/1/2
/deep_dir/1/2/3
/deep_dir/1/2/3/4
/deep_dir/1/2/3/4/5
/deep_dir/1/2/3/4/5/6
/deep_dir/1/2/3/4/5/6/7
/deep_dir/1/2/3/4/5/6/7/8
/deep_dir/1/2/3/4/5/6/7/8/9
/deep_dir/1/2/3/4/5/6/7/8/9/10
/deep_dir/1/2/3/4/5/6/7/8/9/10/File_10_1
/deep_dir/1/2/3/4/5/6/7/8/9/10/File_10_2
/deep_dir/1/2/3/4/5/6/7/8/9/File_9_1
/deep_dir/1/2/3/4/5/6/7/8/9/File_9_2
/deep_dir/1/2/3/4/5/6/7/8/File_8_1
/deep_dir/1/2/3/4/5/6/7/8/File_8_2
/deep_dir/1/2/3/4/5/6/7/File_7_1
/deep_dir/1/2/3/4/5/6/7/File_7_2
/file1
/file2
/this_is_session_3

Whereas without Rock Ridge:

/DEEP_DIR
/DEEP_DIR/1
/DEEP_DIR/1/2
/DEEP_DIR/1/2/3
/DEEP_DIR/1/2/3/4
/DEEP_DIR/1/2/3/4/5
/DEEP_DIR/1/2/3/4/5/6
/DEEP_DIR/1/2/3/4/5/6/7
/FILE1.;1
/FILE2.;1
/THIS_IS_SESSION_3.;1
/7
/7/8
/7/8/9
/7/8/9/10
/7/8/9/10/FILE_10_1.;1
/7/8/9/10/FILE_10_2.;1
/7/8/9/FILE_9_1.;1
/7/8/9/FILE_9_2.;1
/7/8/FILE_8_1.;1
/7/8/FILE_8_2.;1
/7/FILE_7_1.;1
/7/FILE_7_2.;1

This difference is called "deep directory relocation" and is due to the prescription of ECMA-119 that no path shall have more than 8 name components.
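A test harness could check the depth limit as worded above by counting the name components of a path. The following is a minimal sketch. The helper names and the constant are hypothetical, and the limit of 8 is taken from the statement in this ticket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ISO_MAX_PATH_COMPONENTS 8   /* limit as stated in this ticket */

/* Count the name components of a '/'-separated path. "/" has none. */
size_t count_path_components(const char *path)
{
    assert(path != NULL);

    size_t count = 0;
    bool in_component = false;
    for (const char *p = path; *p != '\0'; p++) {
        if (*p == '/') {
            in_component = false;
        } else if (!in_component) {
            /* First character of a new component. */
            in_component = true;
            count++;
        }
    }
    return count;
}

/* True if the path has more name components than the limit allows. */
bool path_exceeds_iso_depth(const char *path)
{
    return count_path_components(path) > ISO_MAX_PATH_COMPONENTS;
}
```

By this count "/DEEP_DIR/1/2/3/4/5/6/7" has exactly 8 components and stays within the limit, whereas the Rock Ridge path "/deep_dir/1/2/3/4/5/6/7/8" has 9.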

Deep directory relocation has a bad reputation for being bug-prone. So it is often left out. The question is whether the ISO 9660 reader relies on the assumption that no paths deeper than 8 can occur.

This can be exercised by https://dev.haiku-os.org/attachment/ticket/8473/nonreloc_deep.iso.bz2 which bears as ISO 9660 tree:

/DEEPER_DIR
/DEEPER_DIR/1
/DEEPER_DIR/1/.SOME_NAME;1
/DEEPER_DIR/1/2
/DEEPER_DIR/1/2/3
/DEEPER_DIR/1/2/3/4
/DEEPER_DIR/1/2/3/4/5
/DEEPER_DIR/1/2/3/4/5/6
/DEEPER_DIR/1/2/3/4/5/6/7
/DEEPER_DIR/1/2/3/4/5/6/7/8
/DEEPER_DIR/1/2/3/4/5/6/7/8/9
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/FILE_18_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/FILE_18_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/FILE_17_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/FILE_17_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/FILE_16_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/FILE_16_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/FILE_15_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/FILE_15_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/FILE_14_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/FILE_14_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/FILE_13_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/FILE_13_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/FILE_12_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/FILE_12_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/FILE_11_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/FILE_11_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/FILE_10_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/FILE_10_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/FILE_9_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/9/FILE_9_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/FILE_8_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/8/FILE_8_2.;1
/DEEPER_DIR/1/2/3/4/5/6/7/FILE_7_1.;1
/DEEPER_DIR/1/2/3/4/5/6/7/FILE_7_2.;1
/DEEPER_DIR/1/SOFTLINK.;1
/FILE1.;1
/FILE2.;1
/THIS_IS_SESSION_3.;1