Comment 9 for bug 230906

Andrew Sayers (andrew-bugs-launchpad-net) wrote :

A bit of googling and testing shows that the null character is invalid in ext2 filenames, and that it's silently removed from the name when a file is created:

$ echo > ~/foo$'\0'bar
$ ls ~/foobar
/home/andrew/foobar

If these are actually valid characters in the filesystem, it seems inappropriate to call this a "bug" anywhere in Linux. What if my digital camera only supports FAT but I want to create a file "Me & my friends.jpg"?

Reading the open(2) man page, it's not clear to me that the kernel has any concept of rejecting illegal filenames: the null character is silently discarded and '/' is treated as a directory separator. However, mount(8) shows that there's a "uni_xlate" option for vfat which translates Unicode characters that can't be represented in a FAT filesystem.
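For comparison, here is a rough way to try the same null-character test without a shell in between. It's only a sketch: the helper name `create_with_nul` is made up for illustration, and it shows what Python does with such a name rather than what the kernel itself would do. Python refuses a path containing a NUL with a ValueError before making the system call, because the path is handed to open(2) as a C string and so can't carry one.

```python
import os
import tempfile

def create_with_nul(directory):
    """Try to create a file whose name contains a NUL character.

    Hypothetical helper for illustration only.
    """
    name = os.path.join(directory, "foo\0bar")
    try:
        fd = os.open(name, os.O_CREAT | os.O_WRONLY, 0o600)
    except ValueError as exc:
        # Python rejects the embedded NUL before open(2) is ever called.
        return "rejected: %s" % exc
    os.close(fd)
    return "created: %r" % os.listdir(directory)

# Use a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as d:
    result = create_with_nul(d)
print(result)
```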

If you don't mind rousing the kernel gods, this problem could be fixed by adding the specified characters to those that uni_xlate handles, and by adding uni_xlate to NTFS-3G. This has the advantages that it's relatively transparent to the user and doesn't create weird side-effects like filename collisions. However, it needs to be explicitly turned on, and doesn't help with the (admittedly rare) case of people wanting to read funnily-named files in ext2-on-Windows FSs.

Alternatively, bugs could be filed against all of the major graphical filesystem browsers to add a "warn on Windows-illegal characters" mode. One problem with this is that it creates annoying "are you sure?" boxes that people will dismiss without reading, then complain that they were never warned. Another problem is that a line will have to be drawn between programs that support this feature and programs that don't - it'll be a long time before `cp` gets such a feature, for example. Users crossing that line will face significant confusion as programs on the other side do the wrong thing (from that user's point of view).
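To give an idea of how small the check behind such a warning mode would be, here is a minimal sketch. The function names are hypothetical, and the character list is my assumption based on Microsoft's file-naming rules (the punctuation `< > : " / \ | ? *` plus control characters below 32). It ignores other Windows restrictions such as reserved names like CON and trailing dots or spaces.

```python
# Assumed set of characters that Windows refuses in file names.
WINDOWS_ILLEGAL = set('<>:"/\\|?*')

def windows_illegal_chars(name):
    """Return the offending characters in `name`, in order, without repeats."""
    found = []
    for ch in name:
        # Control characters (code points below 32) are also treated as illegal.
        if (ch in WINDOWS_ILLEGAL or ord(ch) < 32) and ch not in found:
            found.append(ch)
    return found

def warning_for(name):
    """Return a warning string for the user, or None if the name looks safe."""
    bad = windows_illegal_chars(name)
    if not bad:
        return None
    shown = " ".join(repr(c) for c in bad)
    return "%r contains characters Windows cannot use: %s" % (name, shown)

print(warning_for("holiday: day 1?.jpg"))
print(warning_for("report.txt"))
```

A file browser could call something like `warning_for` when a file is created or renamed on a FAT or NTFS volume. The hard part is the one described above: deciding which programs do this at all.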