Opened 10 years ago
Last modified 6 months ago
#11117 new bug
FAT: issues with lowercase 8.3 filenames
Reported by: | MatejHorvat | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | File Systems/FAT | Version: | R1/Development |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description
- Using an NT-based Windows such as Windows XP (this is crucial - see below), create a file with an entirely lowercase but 8.3 filename, such as "example.txt".
- Mount the FAT/FAT32 partition in Haiku for reading and writing.
- Open the file, modify it, save it, and close it.
- Try to open it again; this will fail. Additionally, if you used Pe to save it, Pe will report an error, which is a sign that something is wrong.
- Unmount the partition and mount it again.
- You will see that the file's name has been changed to uppercase (as of hrev47617). I have also seen files get corrupted (on alpha 4.1 at least), but I cannot reliably reproduce that.
My guess is that when recreating the file's directory entry, Haiku ignores the two flags that NT-based Windows uses to indicate lowercase 8.3 filenames. See: https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#VFAT_long_file_names ("If a filename contains only lowercase letters...")
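To make the guess above concrete, here is a minimal sketch of what the write side would have to compute so that a lowercase 8.3 name survives a rewrite of its directory entry. It is not taken from the Haiku sources: ComputeCaseFlags and the two constant names are hypothetical, and the sketch assumes the commonly described NT convention (0x08 for a lowercase base name, 0x10 for a lowercase extension). It only checks length and letter case and ignores the other characters that are invalid in short names.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical constant names; the values follow the NT convention
// for the flags byte of a short directory entry.
static const uint8_t kLowercaseBase = 0x08;       // bit 3
static const uint8_t kLowercaseExtension = 0x10;  // bit 4

// Returns true if every letter in 'part' has the same case.
// Sets 'lower' when at least one letter is lowercase.
static bool UniformCase(const std::string& part, bool& lower)
{
    bool hasLower = false;
    bool hasUpper = false;
    for (unsigned char c : part) {
        if (std::islower(c))
            hasLower = true;
        else if (std::isupper(c))
            hasUpper = true;
    }
    lower = hasLower;
    return !(hasLower && hasUpper);
}

// Returns true and fills 'flags' if 'name' can be stored in a short
// entry alone. Returns false if the name needs LFN entries instead.
bool ComputeCaseFlags(const std::string& name, uint8_t& flags)
{
    assert(!name.empty());
    const std::string::size_type dot = name.rfind('.');
    const std::string base = name.substr(0, dot);
    const std::string ext
        = dot == std::string::npos ? "" : name.substr(dot + 1);

    if (base.empty() || base.size() > 8 || ext.size() > 3)
        return false;
    if (base.find('.') != std::string::npos)
        return false;  // more than one dot cannot be an 8.3 name

    bool lowerBase = false;
    bool lowerExt = false;
    if (!UniformCase(base, lowerBase) || !UniformCase(ext, lowerExt))
        return false;  // mixed case within one part needs LFN entries

    flags = 0;
    if (lowerBase)
        flags |= kLowercaseBase;
    if (lowerExt)
        flags |= kLowercaseExtension;
    return true;
}
```

Under these assumptions, "example.txt" would be written with both flags set, so the name stored in the 8.3 fields stays uppercase on disk while readers that honor the flags still show it in lowercase.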
Attachments (2)
Change History (16)
comment:1 by , 10 years ago
Keywords: | FAT FAT32 lowercase removed |
---|---|
Milestone: | R1/alpha5 → R1 |
by , 10 years ago
Attachment: | 0001-fat-correctly-read-lowercase-8.3-filenames.patch added |
---|
comment:2 by , 10 years ago
patch: | 0 → 1 |
---|
comment:3 by , 10 years ago
comment:5 by , 10 years ago
I don't understand what you mean. I changed it to a uint8 because bits 3 and 4 of the byte at offset 0xC of a directory entry are flags that specify whether the filename and/or its extension are lowercase (you can have a filename that is lowercase but has an uppercase extension, or the opposite). The msdos_to_utf8 function must therefore receive both flags to correctly convert the name.
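As an illustration of the point about the two flags, here is a minimal sketch of the read side. It is not the code from the patch and does not reproduce msdos_to_utf8: ShortNameToString and the constant names are hypothetical, the input is assumed to be a raw 32-byte directory entry, and code-page conversion and the special 0x05 first byte are deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical constant names for the values discussed in this ticket.
static const int kCaseFlagsOffset = 0x0C;         // flags byte in the entry
static const uint8_t kLowercaseBase = 0x08;       // bit 3
static const uint8_t kLowercaseExtension = 0x10;  // bit 4

// Appends one space-padded field, lowercasing it if requested.
static void AppendField(std::string& out, const uint8_t* field, int length,
    bool lower)
{
    // Strip the trailing space padding.
    while (length > 0 && field[length - 1] == ' ')
        length--;
    for (int i = 0; i < length; i++) {
        const unsigned char c = field[i];
        out += lower ? static_cast<char>(std::tolower(c))
            : static_cast<char>(c);
    }
}

// Builds "base.ext" from the 8-byte name and 3-byte extension fields,
// applying each case flag to its own part only.
std::string ShortNameToString(const uint8_t* entry)
{
    assert(entry != nullptr);
    const uint8_t flags = entry[kCaseFlagsOffset];

    std::string name;
    AppendField(name, entry, 8, (flags & kLowercaseBase) != 0);

    std::string extension;
    AppendField(extension, entry + 8, 3,
        (flags & kLowercaseExtension) != 0);

    if (!extension.empty())
        name += "." + extension;
    return name;
}
```

Because the two bits are tested separately, an entry that stores "LOWER   UPP" with only bit 3 set comes out as lower.UPP, which a single boolean could not express.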
comment:7 by , 10 years ago
The variable name "toLower" used for a uint8 is confusing (we would expect a bool with that name). I would either pass two booleans to the function (lowerBase and lowerExtension, for example) or find a better name for the variable (caseFlags, maybe?).
A comment explaining what's going on and how the flags are used may be helpful, as there are a lot of magic numbers here (0xC, 0x10, 0x8, ...). These should be constants.
comment:8 by , 10 years ago
While "toLower" could be changed, eliminating things like "buffer[0xC]" would require changes through the whole program to use structs (packed, and with endian conversion if required by the architecture) for directory entries. I could do that (and I think it should be done), but it would break my patch for #11120, so I humbly suggest merging the changes first and then cleaning everything up.
comment:9 by , 10 years ago
I suggested replacing the magic numbers with constants. So instead of writing buffer[0xc]
you would write:
static const uint8 kWhateverOffset = 0xc;
Then later in the code:
buffer[kWhateverOffset] = x;
This makes it clear what the byte is used for, making the code easier to read. No need to convert to structures (which would not necessarily be a good idea as it could have endianness problems).
comment:10 by , 10 years ago
Would const int (vs. #define) really be appropriate here? This is a mixture of C and C++ code.
Anyway, I have not made much progress with solving the problem, but here's a disk image that can be used to reproduce it (use "diskimage register", then mount it).
If you edit the files, unmount, and mount again, data will almost certainly be lost and the filenames of lower.UPP and UPPER.low will be converted to all capitals (Mixed.cas will remain as it is because it uses LFN entries).
If you use my patch, you will see the correct filenames before editing, but after editing, they will be uppercased regardless.
comment:11 by , 9 years ago
Yes, const uint8/uint16 as per style guide.
Have you made any more progress?
comment:13 by , 6 years ago
patch: | 1 → 0 |
---|
Here's a patch that at least makes it read (and write?) such filenames correctly. However, Pe still seems to trigger an edge case that makes it corrupt files.