Opened 13 years ago
Last modified 4 years ago
#8473 new enhancement
Concerns about ISO 9660 implementation
Reported by: | scdbackup | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | R1.1 |
Component: | File Systems/ISO 9660 | Version: | R1/Development |
Keywords: | | Cc: | scdbackup@…, Jens.Arm@… |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
On occasion of ticket #8460 I noticed two potential problems in the read-only ISO 9660 filesystem implementation:
It looks like data files of 4 GiB or larger are not supported. Even if there is no intention to extend Haiku to read large data files, it should at least be tested whether it tolerates such files gracefully.
It has not been verified whether the block-to-byte address computations for Directory Entry addresses are safe against byte addresses of 4 GiB or larger. Only the address computations for data file content have been fixed by #8460.
I am not a user of Haiku, but a producer of ISO 9660 filesystems. In that role I would be glad to help with better readability of ISO 9660.
Large data files:
struct iso9660_inode in http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/file_systems/iso9660/iso9660.h only has provisions for memorizing a single data extent:
uint32 startLBN[2]; // Logical block # of start of file/directory data
uint32 dataLen[2]; // Length of file section in bytes
ECMA-119 (the public specification of ISO 9660) says:
6.5.1 Relation to File Sections
"Each file shall consist of one or more File Sections. Each File Section of a file shall be identified by a record in the same directory. The sequence of the File Sections of a file shall be identified by the order of the corresponding records in the directory. A File Section may be part of more than one file and may occur more than once in the same file. A File Section may be identified by more than one record in the same or a different directory."
So if struct iso9660_inode is indeed the 1:1 representation of a filesystem inode, then it should be in a 1:n relation to ISO 9660 Directory Entries.
(The array size [2] is not a 1:2 relation but rather mirrors the fact that ISO 9660 Directory Entries bear address and length in both byte orderings.)
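To illustrate the both-byte-order storage, here is a minimal sketch of decoding such an 8-byte field. It assumes the ECMA-119 layout where the little-endian copy comes first and the big-endian copy second. The function name decode_both_endian_u32 is hypothetical and not taken from the Haiku sources.

```python
import struct


def decode_both_endian_u32(field: bytes) -> int:
    """Decode an 8-byte both-byte-order number.

    Assumed layout: 4 bytes little-endian followed by 4 bytes big-endian.
    """
    if len(field) != 8:
        raise ValueError("a both-byte-order 32-bit field has 8 bytes")
    (little,) = struct.unpack("<I", field[:4])
    (big,) = struct.unpack(">I", field[4:])
    if little != big:
        # A cautious reader may want to notice inconsistent copies.
        raise ValueError("little-endian and big-endian copies differ")
    return little


# Example: start block 55, stored in both byte orders.
raw = struct.pack("<I", 55) + struct.pack(">I", 55)
start_lbn = decode_both_endian_u32(raw)
```

A reader that only looks at one of the two copies would get the same value from well-formed images. Comparing both is merely a consistency check.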
More challenging will be to collect all ISO 9660 Directory Entries which bear identical names in the same directory.
But as said, at least it should be made sure that two ISO 9660 Directory Entries with the same name do not confuse Haiku beyond the scope of that single data file.
Proposal for producing a challenging ISO image:
Create a file with at least 4 GiB of size. Plus some small files with names alphabetically lower and higher than the big one's ("aaa", "zzz").
Put them into an ISO 9660 image e.g. by my program
xorriso -outdev my_image.iso -add aaa file_of_4gb zzz --
or use mkisofs
mkisofs -o my_image.iso -R -iso-level 3 aaa file_of_4gb zzz
(xorriso can operate directly on DVD burners on systems with Linux kernel, FreeBSD, or Solaris, for example with -outdev /dev/sr0 , -outdev /dev/cd0, ...)
Mount the image and check whether all three files behave well. E.g. whether file_of_4gb does not appear twice in a directory listing, whether at least the small files show correct content, ...
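The check of the mounted image could be scripted roughly as follows. This is only a sketch: the function name check_listing, the mount point and the expected sizes are assumptions for illustration, and it looks only at directory listing and reported file sizes, not at file content.

```python
import os
from collections import Counter


def check_listing(mount_point, expected_sizes):
    """Return a list of problem descriptions for the mounted image.

    expected_sizes maps file names to their expected size in bytes.
    """
    problems = []
    names = os.listdir(mount_point)
    # A multi-extent file must not show up more than once.
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"{name} is listed {count} times")
    for name, size in expected_sizes.items():
        if name not in names:
            problems.append(f"{name} is missing")
            continue
        actual = os.path.getsize(os.path.join(mount_point, name))
        if actual != size:
            problems.append(f"{name} has size {actual}, expected {size}")
    return problems


# Hypothetical usage, assuming the image is mounted at /mnt/iso:
# print(check_listing("/mnt/iso", {"aaa": 4, "file_of_4gb": 4297064448, "zzz": 4}))
```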
Directory entries in high block numbers:
Normally the Directory Entries are stored in a few hundred blocks shortly after block 16. So no large byte addresses occur. But in the case of multi-session, the new directory tree gets written after the data blocks of the previous session.
So if the first session is 4 GiB or larger, then the address computations for ISO 9660 Directory Entries can produce byte addresses which may get spoiled by 32-bit bottlenecks like the one of #8460.
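The kind of 32-bit bottleneck meant here can be sketched as follows. The code emulates an unsigned 32-bit multiplication by masking. It assumes a block size of 2048 bytes and is not a copy of any Haiku code. The function names are hypothetical.

```python
BLOCK_SIZE = 2048  # assumed logical block size in bytes


def byte_address_32(block: int) -> int:
    """Byte address as a 32-bit unsigned computation would yield it."""
    return (block * BLOCK_SIZE) & 0xFFFFFFFF


def byte_address_64(block: int) -> int:
    """Byte address computed with sufficient width."""
    return block * BLOCK_SIZE


# First block number whose byte address does not fit into 32 bits.
first_bad_block = 2**32 // BLOCK_SIZE

for block in (first_bad_block - 1, first_bad_block, first_bad_block + 1):
    print(block, hex(byte_address_32(block)), hex(byte_address_64(block)))
```

With 2048-byte blocks, every directory stored at block 2097152 or higher would be read from a wrong position by the truncating variant.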
Proposal for producing a challenging ISO image:
Produce an ISO image which contains two data files of exactly 2 GiB each:
xorriso -outdev my_image.iso -add file1 file2 --
(Note that unlike mkisofs -o, this will refuse to overwrite an existing image.)
Add a session to the image by adding a small marker file "this_is_session_2"
touch this_is_session_2
xorriso -dev my_image.iso -add this_is_session_2 --
(Note that the second run uses option -dev rather than -outdev. By this it loads the tree of the first session and adds the new file to it.)
The resulting file "my_image.iso" will bear two sessions. The PVD at block 16 will direct the reader to a root directory above 4 GiB. The image will still fit on a single-layer DVD.
Attachments (4)
Change History (13)
comment:1 by , 13 years ago
comment:2 by , 13 years ago
Cc: | added |
---|
comment:4 by , 13 years ago
Version: | R1/alpha3 → R1/Development |
---|
comment:5 by , 13 years ago
scdbackup, would you mind creating such challenging ISO files with highly compressible contents and uploading them to this ticket?
by , 13 years ago
Attachment: | file_of_4gb.iso.bz2 added |
---|
ISO 9660 image, uncompressed 4.1 GiB, with "file_of_4gb" having 2 extents
comment:6 by , 13 years ago
I produced https://dev.haiku-os.org/attachment/ticket/8473/file_of_4gb.iso.bz2 to demonstrate a file with more than one extent:
$ ls -l aaa file_of_4gb zzz
-rw-r--r-- 1 thomas thomas 4 2012-04-16 13:07 aaa
-rw-r--r-- 1 thomas thomas 4297064448 2012-04-16 13:00 file_of_4gb
-rw-r--r-- 1 thomas thomas 4 2012-04-16 13:07 zzz
$ xorriso -outdev file_of_4gb.iso -add aaa file_of_4gb zzz --
$ bzip2 file_of_4gb.iso
The sequence of directory records bears these ISO 9660 file identifiers:
AAA.;1
FILE_OF_4GB.;1
FILE_OF_4GB.;1
ZZZ.;1
The data blocks of these files are:
Report layout: xt , Startlba , Blocks , Filesize , ISO image path
File data lba: 0 , 55 , 2097151 , 4297064448 , '/file_of_4gb'
File data lba: 1 , 2097206 , 1025 , 4297064448 , '/file_of_4gb'
File data lba: 0 , 2098231 , 1 , 4 , '/aaa'
File data lba: 0 , 2098232 , 1 , 4 , '/zzz'
(Rock Ridge names are listed rather than ECMA-119 identifiers.)
"file_of_4gb" actually has 4098 MiB in order to produce a non-trivial second extent. Each of its 2 KiB blocks bears at its start the number of its megabyte in ASCII decimal digits. So the block content changes only every full MiB. (When I gave each block its own number as content, the file could not be compressed below 2.3 MB.)
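A file with such content could be generated roughly like this. The sketch assumes that megabyte numbering starts at 0 and that the rest of each block is filled with zero bytes. Neither detail is stated above, and the function names are hypothetical.

```python
BLOCK_SIZE = 2048
BLOCKS_PER_MIB = (1024 * 1024) // BLOCK_SIZE  # 512 blocks per MiB


def make_block(block_index: int) -> bytes:
    """Return one 2 KiB block which starts with the number of its MiB."""
    mib = block_index // BLOCKS_PER_MIB  # assumption: counting starts at 0
    digits = str(mib).encode("ascii")
    # Assumption: the remainder of the block is zero padding.
    return digits + b"\0" * (BLOCK_SIZE - len(digits))


def write_test_file(path: str, size_mib: int) -> None:
    """Write size_mib MiB of highly compressible test data to path."""
    with open(path, "wb") as out:
        for mib in range(size_mib):
            # All blocks within one MiB are identical.
            out.write(make_block(mib * BLOCKS_PER_MIB) * BLOCKS_PER_MIB)


# Hypothetical usage for a 4098 MiB file:
# write_test_file("file_of_4gb", 4098)
```

Because the content changes only once per MiB, bzip2 sees long runs of identical blocks, which is what keeps the compressed image small.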
I have re-read ECMA-119 about the sequence of directory records of large files.
According to 9.3 the records are sorted by their file identifiers. These identifiers are identical for all extents (File Sections) of a file. So the record of extent N + 1 of a file must directly follow the record of extent N.
Furthermore, I learned that the correct term is "directory record" rather than "directory entry".
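Given that adjacency, a reader could collect the File Sections of a file by merging consecutive records with equal identifiers. The following sketch does this for the records of file_of_4gb.iso. The per-section byte lengths are an assumption derived from the block counts reported above (first extent completely filled), and merge_sections is a hypothetical name. A real reader would probably also consult the Multi-Extent bit of the File Flags, which is not modelled here.

```python
from itertools import groupby

BLOCK_SIZE = 2048

# (file identifier, start block, section length in bytes), in directory order.
# Section lengths are derived from the reported block counts (assumption).
records = [
    ("AAA.;1", 2098231, 4),
    ("FILE_OF_4GB.;1", 55, 2097151 * BLOCK_SIZE),
    ("FILE_OF_4GB.;1", 2097206, 4297064448 - 2097151 * BLOCK_SIZE),
    ("ZZZ.;1", 2098232, 4),
]


def merge_sections(records):
    """Merge consecutive records with equal identifier into one file each.

    Returns a list of (identifier, [(start block, length), ...], total size).
    """
    files = []
    for ident, group in groupby(records, key=lambda rec: rec[0]):
        extents = [(start, length) for _, start, length in group]
        total = sum(length for _, length in extents)
        files.append((ident, extents, total))
    return files


files = merge_sections(records)
for ident, extents, total in files:
    print(ident, len(extents), total)
```

This yields three files instead of four directory records, with the two sections of FILE_OF_4GB.;1 adding up to the original file size.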
by , 13 years ago
Attachment: | high_root_block.iso.bz2 added |
---|
ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address larger than 32 bit
comment:7 by , 13 years ago
https://dev.haiku-os.org/attachment/ticket/8473/high_root_block.iso.bz2 was produced by:
$ ls -l file1 file2 this_is_session_2
-rw-r--r-- 1 thomas thomas 2147483648 2012-04-16 14:01 file1
-rw-r--r-- 1 thomas thomas 2147483648 2012-04-16 14:02 file2
-rw-r--r-- 1 thomas thomas 18 2012-04-16 14:12 this_is_session_2
$ xorriso -outdev high_root_block.iso -add file1 file2 --
$ xorriso -dev high_root_block.iso -add this_is_session_2 --
It bears two sessions. The PVD in block 16 points to the root directory of the second session in block 2097234 = byte 0x100029000.
TOC layout : Idx , sbsector , Size , Volume Id
ISO session : 1 , 32 , 2097175s , ISOIMAGE
ISO session : 2 , 2097216 , 24s , ISOIMAGE
(There is a PVD in block 48 which points to the root directory of the first session. With Linux-specific mount option sbsector=32 it is possible to mount this first session.)
The data file content is stored at these places:
Report layout: xt , Startlba , Blocks , Filesize , ISO image path
File data lba: 0 , 55 , 1048576 , 2147483648 , '/file1'
File data lba: 0 , 1048631 , 1048576 , 2147483648 , '/file2'
File data lba: 0 , 2097239 , 1 , 18 , '/this_is_session_2'
by , 13 years ago
Attachment: | reloc_dir.iso.bz2 added |
---|
ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address > 32 bit and relocated deep tree
by , 13 years ago
Attachment: | nonreloc_deep.iso.bz2 added |
---|
ISO 9660 image, uncompressed 4.1 GiB, with root directory byte address > 32 bit and unrelocated deep tree ignoring ECMA-119 specs
comment:8 by , 13 years ago
Probably high_root_block.iso.bz2 is too small a challenge. It does not exercise descending into sub-directories. So I created two images which put more emphasis on tree traversal. The first is https://dev.haiku-os.org/attachment/ticket/8473/reloc_dir.iso.bz2
With Rock Ridge its tree looks like:
/deep_dir
/deep_dir/1
/deep_dir/1/2
/deep_dir/1/2/3
/deep_dir/1/2/3/4
/deep_dir/1/2/3/4/5
/deep_dir/1/2/3/4/5/6
/deep_dir/1/2/3/4/5/6/7
/deep_dir/1/2/3/4/5/6/7/8
/deep_dir/1/2/3/4/5/6/7/8/9
/deep_dir/1/2/3/4/5/6/7/8/9/10
/deep_dir/1/2/3/4/5/6/7/8/9/10/File_10_1
/deep_dir/1/2/3/4/5/6/7/8/9/10/File_10_2
/deep_dir/1/2/3/4/5/6/7/8/9/File_9_1
/deep_dir/1/2/3/4/5/6/7/8/9/File_9_2
/deep_dir/1/2/3/4/5/6/7/8/File_8_1
/deep_dir/1/2/3/4/5/6/7/8/File_8_2
/deep_dir/1/2/3/4/5/6/7/File_7_1
/deep_dir/1/2/3/4/5/6/7/File_7_2
/file1
/file2
/this_is_session_3
Whereas without Rock Ridge:
/DEEP_DIR
/DEEP_DIR/1
/DEEP_DIR/1/2
/DEEP_DIR/1/2/3
/DEEP_DIR/1/2/3/4
/DEEP_DIR/1/2/3/4/5
/DEEP_DIR/1/2/3/4/5/6
/DEEP_DIR/1/2/3/4/5/6/7
/FILE1.;1
/FILE2.;1
/THIS_IS_SESSION_3.;1
/7
/7/8
/7/8/9
/7/8/9/10
/7/8/9/10/FILE_10_1.;1
/7/8/9/10/FILE_10_2.;1
/7/8/9/FILE_9_1.;1
/7/8/9/FILE_9_2.;1
/7/8/FILE_8_1.;1
/7/8/FILE_8_2.;1
/7/FILE_7_1.;1
/7/FILE_7_2.;1
This difference is called "deep directory relocation" and is due to the prescription of ECMA-119 that no path shall have more than 8 name components.
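The depth limit can be illustrated with a small sketch which counts name components, using the rule as worded above (at most 8 components per path). The function names are hypothetical and the sample paths are taken from the tree listings in this comment.

```python
MAX_COMPONENTS = 8  # limit as described above


def component_count(path: str) -> int:
    """Count the name components of an absolute path."""
    return len([part for part in path.split("/") if part])


def too_deep(paths):
    """Return those paths which exceed the component limit."""
    return [p for p in paths if component_count(p) > MAX_COMPONENTS]


rock_ridge_paths = [
    "/deep_dir/1/2/3/4/5/6/7",
    "/deep_dir/1/2/3/4/5/6/7/8/9/10/File_10_1",
]
plain_iso_paths = [
    "/DEEP_DIR/1/2/3/4/5/6/7",
    "/7/8/9/10/FILE_10_1.;1",
]

print(too_deep(rock_ridge_paths))
print(too_deep(plain_iso_paths))
```

Only the Rock Ridge view contains a path beyond the limit. In the plain ISO 9660 view the sub-tree below directory 7 has been moved up to the root, so all paths stay within 8 components.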
Deep directory relocation has a bad reputation for being bug-prone. So it is often left out. The question is whether the ISO 9660 reader relies on the assumption that no paths deeper than 8 can occur.
This can be exercised by https://dev.haiku-os.org/attachment/ticket/8473/nonreloc_deep.iso.bz2 which bears as ISO 9660 tree:
/DEEPER_DIR /DEEPER_DIR/1 /DEEPER_DIR/1/.SOME_NAME;1 /DEEPER_DIR/1/2 /DEEPER_DIR/1/2/3 /DEEPER_DIR/1/2/3/4 /DEEPER_DIR/1/2/3/4/5 /DEEPER_DIR/1/2/3/4/5/6 /DEEPER_DIR/1/2/3/4/5/6/7 /DEEPER_DIR/1/2/3/4/5/6/7/8 /DEEPER_DIR/1/2/3/4/5/6/7/8/9 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/FILE_18_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/FILE_18_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/FILE_17_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/FILE_17_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/FILE_16_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/FILE_16_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/FILE_15_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/FILE_15_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/FILE_14_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/14/FILE_14_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/FILE_13_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/13/FILE_13_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/FILE_12_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/12/FILE_12_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/FILE_11_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/11/FILE_11_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/FILE_10_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/10/FILE_10_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/FILE_9_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/9/FILE_9_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/FILE_8_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/8/FILE_8_2.;1 /DEEPER_DIR/1/2/3/4/5/6/7/FILE_7_1.;1 /DEEPER_DIR/1/2/3/4/5/6/7/FILE_7_2.;1 /DEEPER_DIR/1/SOFTLINK.;1 /FILE1.;1 /FILE2.;1 /THIS_IS_SESSION_3.;1
comment:9 by , 4 years ago
Milestone: | R1 → R1.1 |
---|
Sorry, I misspelled the second mention of #8460 as "8450", which has nothing to do with this ticket.