Linus Torvalds Expresses His Hatred For Case-Insensitive File-Systems

73 points by laktak

andrewrk

I think Linus’s greatest strength is his willingness to delegate despite people doing things that don’t live up to his sense of perfectionism.

minimax

I don’t read this as him being angry at kernel contributors, but rather at those inconsiderate fools who wrote those no-good filesystem specs in the first place. I imagine the authors of his pointed-to commits agreeing with him, along the lines of “I know, this is terrible, but what are we supposed to do, that’s just what the spec says.”

This is silly, fake drama. It doesn’t matter, at this point, whether the FAT16 spec was “brain dead” or whatever mean thing he wants to call it: it’s not being revised. The horse has left the barn. The only real question is whether or not Linux can have drivers that can implement the spec. Same deal with (more to the point) NTFS and HFS+ and APFS: these allow for case-sensitive configurations, but in practice are rarely configured that way. I wouldn’t choose to format a volume with these filesystems, but I would like to be able to read and write such a volume without breaking it for other users, thanks.
- k749gtnc9l3w
  
  It is, however, a direct reply to the original author of Bcachefs, a Linux-targeting FS with a case insensitivity option, so in the case of this specific email, it is targeted directly at a kernel contributor for having chosen freely to implement case insensitivity.
  
  Given that the FS already had some users, including some case-insensitivity users, by the time merging it into mainline was discussed, accepting it with insensitivity was the best of the options. But writing messages to LKML as if the original implementation of case insensitivity was a good idea still counts as provocation enough for a rant in reply.
- tonyarkles
  
  HFS+ and APFS: these allow for case-sensitive configurations, but in practice are rarely configured that way.
  
  I don’t know if this is still true but last time I looked into using a case-sensitive HFS+ filesystem as my root on OS X, I came across some pretty dire warnings that there were a bunch of applications that would subtly break, Photoshop and Illustrator being examples that I remember off the top of my head. Which… given that I was doing pretty heavy web development at the time, would have not been a Fun Time.
  - runxiyu
    
    Microsoft OneDrive also refuses to work on systems where /Users is on a case-sensitive APFS volume.
- Cloudef
  
  I’m so glad HFS+ got replaced with APFS. The (non-standard) Unicode normalization in HFS+ was a disaster and made some software just not work, and it caused even more confusion if you tried to share these files with other systems.
  - zie
    
    See, I wish they had adopted ZFS instead of writing APFS. We both agree on retiring HFS+, though!
    
    jitl
    
    They considered it, there was a lot of excitement for ZFS in the Leopard era, this post covers the history in detail: https://ahl.dtrace.org/2016/06/15/apple_and_zfs/
    
    zie
    
    Agreed, I was around during that time and heard all the rumors and hoped it would come true. Of course Ellison ruined it, what else would one expect from Oracle.
    
    classichasclass
    
    In fact, I’d submit that (at least on older OS X) case sensitivity is really what UFS was for, if you really had to have it. My 10.4 file server has HFS+ and UFS partitions: the HFS+ part is where the Mac clients get stuff, and is insensitive as they expect, and the UFS portion is where the Unix clients get stuff, and is case-sensitive as they expect.
  - vimpostor
    
    Not sure why we are still linking to Phoronix dumbing down the original source for broad mainstream consumption, but that whole LKML thread is full of gold, e.g.:
    
    I think this is something that NTFS actually got right. Each filesystem carries with it a 128KiB table that maps each codepoint to its case-insensitive equivalent. So there’s no ambiguity about “which version of the unicode standard are we using”, “Does the user care about Turkish language rules?”, “Is Aachen a German or Danish word?”. The sysadmin specified all that when they created the filesystem, and it doesn’t matter what the Unicode standard changes in the future; if you need to change how the filesystem sorts things, you can update the table.
    
    Without context the thought of having that 128KiB translation table around sounds completely cursed to me, but I agree when you are stuck with a case-insensitive fs, it’s probably the only correct way to deal with this problem.
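A rough sketch of the idea in the quoted mail: 65,536 two-byte entries come to exactly 128 KiB, one entry per code point in the Basic Multilingual Plane. This is not the actual NTFS on-disk format; the table here is built from whatever Unicode data the running Python ships with, and `build_upcase_table`, `fold`, and `names_equal` are hypothetical names chosen for illustration. The point is only that once the table is stored with the volume, name comparison no longer depends on the Unicode version or locale of whoever mounts it.

```python
from array import array

BMP_SIZE = 0x10000  # 65,536 code points in the Basic Multilingual Plane


def build_upcase_table():
    """Return a 65,536-entry table mapping each BMP code point to an uppercase form."""
    table = array("H", range(BMP_SIZE))  # start as the identity mapping
    for cp in range(BMP_SIZE):
        upper = chr(cp).upper()
        # Keep only one-to-one mappings that stay inside the BMP.
        if len(upper) == 1 and ord(upper) < BMP_SIZE:
            table[cp] = ord(upper)
    return table


def fold(name, table):
    """Map every BMP character of a name through the table; leave others alone."""
    return "".join(chr(table[ord(ch)]) if ord(ch) < BMP_SIZE else ch for ch in name)


def names_equal(a, b, table):
    """Compare two names using only the table stored with the (imaginary) volume."""
    return fold(a, table) == fold(b, table)


# Table "frozen" at mkfs time with the default rules.
default_table = build_upcase_table()

# A volume created by an admin who wants Turkish rules: i uppercases to dotted İ.
turkish_table = array("H", default_table)
turkish_table[ord("i")] = ord("\u0130")

print(len(default_table) * default_table.itemsize)        # table size in bytes
print(names_equal("file", "FILE", default_table))
print(names_equal("file", "FILE", turkish_table))
```

The two volumes disagree about whether `file` and `FILE` are the same name, but each one stays consistent with itself regardless of later changes to the Unicode standard.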
    
    And it wouldn’t be LKML if there wasn’t a whole other sidethread about Linus arguing with another person, I thought their blunt phrasing was actually hilarious in that context:
    
    Kent: The subject is CI lookups, and I’ll eat my shoe if you wrote that.
    
    Linus: Start chomping. [proceeds to show code he wrote back in 1997]
    
    intelfx
    
    Not sure why we are still linking to Phoronix dumbing down the original source for broad mainstream consumption
    
    Probably because lobste.rs has rules against “linking into projects’ community spaces” and nobody wants to test the exact position of the line.
    
    wink
    
    I think you are misinterpreting the rule here, linking to ML archives always seemed fine. This is about Github issues where (probably) half of the readers here are already logged in and could click a button to react.
    
    intelfx
    
    I’m very definitely not. There have been moderation actions against exactly this type of links not long ago.
    
    See thread from here: https://lobste.rs/s/uzzevr/linus_torvalds_clearly_lays_out_linux#c_oivv0u (grep for “pushcx”, comment links are still broken)
    
    See modlog entry at: https://lobste.rs/moderations/page/3?moderator=(All)&what[stories]=stories at time “2025-02-21 07:28” (grep for time, modlog links do not exist)
    
    (In hindsight, what was wrong was my remark about “testing the exact position of the line” — per the thread above, the rule is in fact designed to be heavy-handed and have no exceptions, so no line-testing is even needed. Links to LKML are banned per pushcx, full stop.)
    
    kwas
    
    As an alternative, someone with an LWN subscription could find the LKML mails in its archive and link there instead.
    
    laktak
    
    Phoronix helped me find this so IMO it’s only fair to link to him.
    
    nomnp
    
    Given the choice, I’ve always preferred case-sensitive file systems, but thinking about it, are those really better from the user’s perspective? Is it correct to assume that “foo.txt” and “Foo.txt” are different files? If I have two paper files on my desk with the same title but different capitalization, I would assume the content to be the same.
    
    BenjaminRi
    
    With ASCII it’s easy. But you will quickly learn that case insensitive matching in Unicode is not just a can of worms, but of eldritch horrors.
    
    dogacel
    
    One famous issue is the Turkish capital I problem. In English, the lowercase of I is i; in Turkish, however, the lowercase of I is ı and the uppercase of i is İ.
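To make the Turkish I problem concrete, here is a small Python sketch. Python's built-in `str.lower()` is locale-independent, so it always applies the default mapping; `lower_tr` is a hypothetical helper that applies the two Turkish exceptions by hand before falling back to the default rules.

```python
# Turkish exceptions to the default lowercase mapping:
#   I (U+0049) -> ı (U+0131, dotless i)
#   İ (U+0130) -> i (U+0069)
TURKISH_LOWER = str.maketrans({"I": "\u0131", "\u0130": "i"})


def lower_tr(text):
    """Lowercase text using Turkish rules for I and İ (illustrative helper)."""
    return text.translate(TURKISH_LOWER).lower()


print("TITLE".lower())      # default rules: title
print(lower_tr("TITLE"))    # Turkish rules: tıtle
print(len("\u0130".lower()))  # default lowercase of İ is two code points
```

The same input lowercases to two different strings depending on which rules apply, which is exactly the decision a case-insensitive file system has to make for every lookup.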
    
    BenjaminRi
    
    Clearly, the solution is that the file system should do locale-aware case insensitive matching. What could possibly go wrong?
    
    dogacel
    
    I was setting up my Minecraft mods exactly the same way the tutorial does it, and it wasn’t working. Turns out the solution was changing my JVM locale! I only learned why changing the locale worked about 8 years later.
    
    Lilian
    
    I think the most sensible answer is only ASCII should match different letter-case. Every other character is fringe in file names anyways.
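For reference, the ASCII-only rule proposed here is easy to state in code. This is a minimal sketch with a hypothetical `ascii_fold` helper; it folds A-Z to a-z and leaves every other character untouched.

```python
import string

# Translation table covering only the 26 ASCII letters.
ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_fold(name):
    """Fold ASCII letters to lowercase; leave all other characters as they are."""
    return name.translate(ASCII_FOLD)


print(ascii_fold("README.TXT") == ascii_fold("readme.txt"))
print(ascii_fold("\u00c9t\u00e9.txt") == ascii_fold("\u00e9t\u00e9.txt"))
```

Under this rule `README.TXT` and `readme.txt` match, while `Été.txt` and `été.txt` remain two distinct names.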
    
    gerikson
    
    Every other character is fringe in file names anyways.
    
    laughs hollowly in European
    
    runxiyu
    
    Non-ASCII filenames are fringe to people whose primary languages only use ASCII characters. I’m not sure how broadly this applies to people in general. For the record, about 1/5 of the files I have in ~/School have Simplified Chinese codepoints in their filenames.
    
    recursion
    
    Touché!
    
    moltonel
    
    Unless your system explicitly prevents non-ASCII, you’ll have to deal with weird cases. Even an all-ASCII string might up/downcase to non-ASCII codepoints in certain locales. One man’s fringe is another’s everyday.
    
    steveklabnik
    
    I would assume the total opposite. They’re not the same name, why would I assume they’re the same thing?
    
    tomjakubowski
    
    It only gets weirder though when you have Unicode paths. Are А.txt and A.txt the same file? One file name is Cyrillic, the other is Latin; they have different names in some sense but it’s often impossible to visually distinguish them.
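A quick way to see that these two names really are different strings, using only Python's standard `unicodedata` module (the file names are just the ones from the comment above):

```python
import unicodedata

latin = "A.txt"            # U+0041
cyrillic = "\u0410.txt"    # U+0410, renders the same in most fonts

print(unicodedata.name(latin[0]))
print(unicodedata.name(cyrillic[0]))

# Neither compatibility normalization nor case folding unifies the two scripts.
print(unicodedata.normalize("NFKC", cyrillic) == latin)
print(cyrillic.casefold() == latin.casefold())
```

Neither normalization nor case folding treats the two as equal, so a file system would need a separate "confusables" policy to merge them.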
    
    dogacel
    
    It is also hard to visually distinguish letter “l” and “I” (lowercase L vs uppercase i). Do you want to treat them similarly too? Visual distinction is different from actual distinction.
    
    zesterer
    
    Indeed. Fails Leibniz’s definition.
    
    gir
    
    Is it correct to assume that “foo.txt” and “Foo.txt” are different files?
    
    are color.txt and colour.txt the same file?
    
    in languages that capitalize nouns, are Steuern.txt (de:taxes) and steuern.txt (de:to navigate) supposed to be the same?
    
    iamnearlythere
    
    A bit off-topic but here’s what happened to Spotify with that line of thinking: https://engineering.atspotify.com/2013/06/creative-usernames/
    
    matklad
    
    Ouuu, thanks for the link, it is a perfect illustration for the specific problems Linus calls attention to:
    
    using someone else’s Unicode normalization without understanding what exactly it does
    
    not appreciating that Unicode is a living standard that changes over time.
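The general hazard can be reproduced in a few lines. This is not Spotify's code; `canonicalize` is a hypothetical function that lowercases and then applies NFKC, and the input is a name written in modifier letters (U+1D2E and friends). The sketch shows that such a function need not be idempotent: applying it twice gives a different answer than applying it once.

```python
import unicodedata


def canonicalize(name):
    """Hypothetical canonicalization: lowercase first, then NFKC-normalize."""
    return unicodedata.normalize("NFKC", name.lower())


# "BIGBIRD" spelled with MODIFIER LETTER CAPITAL B, I, G, B, I, R, D.
fancy = "\u1d2e\u1d35\u1d33\u1d2e\u1d35\u1d3f\u1d30"

once = canonicalize(fancy)
twice = canonicalize(once)

print(once)
print(twice)
print(once == twice)
```

The modifier letters have no lowercase mapping, so the first pass only normalizes them into plain capitals; the second pass then lowercases those. Any system that assumes one pass yields the final form can end up mapping two different accounts, or two different file names, onto the same key.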
    
    ianloic
    
    How about the files “Interesting.txt” and “interesting.txt”? In languages that have a dotless “i” those are different words, but in languages that don’t they’re the same. Should the filesystem be tracking what language filenames are written in so that it can accurately be case insensitive?
    
    technomancy
    
    Yeah, I think the main problem with this line of thinking is that capitalization is only one example of many, many different situations where two strings can “mean the same thing” but be composed of different codepoints. With capitalization it “seems obvious” that they are the same, but there are a bunch of other situations in other writing systems that are harder to normalize. So saying that filenames should be case-insensitive is really just saying “normalization should happen, but only for characters that Americans use”.
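One example of "same meaning, different code points" that has nothing to do with case is composed versus decomposed accents. A minimal sketch with Python's standard `unicodedata` module (the file names are made up for illustration):

```python
import unicodedata

composed = "caf\u00e9.txt"      # é as a single code point, U+00E9
decomposed = "cafe\u0301.txt"   # e followed by COMBINING ACUTE ACCENT, U+0301

print(composed == decomposed)   # compared code point by code point
print(len(composed), len(decomposed))

# The two only compare equal after both are brought to the same normal form.
same_after_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
print(same_after_nfc)
```

Both strings render identically, yet a file system that compares names as raw code points or bytes sees two different files.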
    
    moltonel
    
    Yes, if we’re considering that A and a are equivalent, then why not A and à, a and а, leet and 1337, etc. You can argue that one is justified but not the other, but there’s no universal consensus. The only sane approach IMHO is to not consider any of them equivalent.
    
    technomancy
    
    I mean Han unification is basically the east asian equivalent of saying that “a” and “α” are equivalent, and we all know how well that went over.
    
    jlarocco
    
    Isn’t that a problem in any case? If I downcase text in a text file, it’s not going to know the language, either, unless I tell it somehow.
    
    ianloic
    
    The filesystem has to leave it the hell alone. There’s no way that it can make reliably correct decisions and if it tries it’ll just get in the way of higher level software that might have more context about the user or the data.
    
    jlarocco
    
    My point is, it’s hardly a unique problem to the file system. If somebody is using a language with those characters then they need the correct locale setup or they’re going to have problems all over the place. If they have the locale setup correctly, then switching case in the FS will be handled the same way it is everywhere else automatically.
    
    Of course the best solution is case sensitive file systems.
    
    ianloic
    
    The problem with the file system is that it’s uniquely poorly suited to this problem. Its user interface is just a set of syscalls for file manipulation. Higher layers of the stack can use everything from environment variables to GUIs to allow the user to configure their language.
    
    andrewrk
    
    Should the kernel make opinionated decisions to provide a user interface making file system files similar to papers on your desk? Or should that be relegated to applications?
    
    intelfx
    
    Or should that be relegated to applications?
    
    That’s exactly what’s being done here. Case-insensitivity as implemented is a toggleable flag with a per-directory granularity.
    
    mort
    
    That’s not the same as relegating the case insensitivity to the application. That’s still the kernel doing case sensitivity, it’s just inconsistent between directories.
    
    intelfx
    
    That’s a pretty uncharitable way to reinterpret my point.
    
    Case sensitivity has to be physically done in kernel even if it’s “relegated to applications”, because the kernel is the entity doing directory lookups.
    
    In practice, desktop applications work with mostly disjoint sets of files (yes, this is true even for user data). Thus, when an application decides it wants the case-insensitive semantics, it flips the case folding switch for the directories it “owns”. Thus, “relegated to applications”.
    
    mort
    
    Applications don’t own (most) directories.
    
    intelfx
    
    Well, most directories will not (and are not supposed to) have case insensitivity enabled. I don’t see how this contradicts anything I’ve written.
    
    mort
    
    It means that, when case insensitivity is configured per-directory, it’s handled by the kernel and configured by the user. It’s hard to view that as being “relegated to applications”.
    
    lonjil
    
    Applications can’t make that decision, since the kernel is the interface to the file system. One application deciding to not have case sensitivity will break as soon as another app decides to make files with the same name modulo case. It has to be handled uniformly across applications, and the only way to do that is in the kernel.
    
    ThinkChaos
    
    macOS does something in between: it doesn’t allow creating a file when one with the same case-normalized name exists, but it will allow referencing the file given a different case.
    Essentially it always normalizes the case at the filesystem level, except it preserves the given case when creating the file.
    
    $ echo wow > afile
    $ cat aFile
    wow
    
    I’m not sure whether it’s the best or worst of both worlds!
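The "case-preserving but case-insensitive" behaviour described above can be modelled as a directory keyed by a folded name that remembers the spelling used at creation. This is only a toy model: `CasePreservingDir` is a hypothetical class, and Python's `str.casefold()` stands in for whatever folding rules a real file system applies.

```python
class CasePreservingDir:
    """Toy directory: lookups ignore case, listings keep the original spelling."""

    def __init__(self):
        self._entries = {}  # folded name -> name as given at creation time

    def create(self, name):
        key = name.casefold()
        if key in self._entries:
            # A name differing only by case counts as already existing.
            raise FileExistsError(name)
        self._entries[key] = name

    def lookup(self, name):
        """Return the stored spelling for any case variant (KeyError if absent)."""
        return self._entries[name.casefold()]

    def listdir(self):
        return sorted(self._entries.values())


d = CasePreservingDir()
d.create("afile")
print(d.lookup("aFile"))   # found, reported with its stored spelling
print(d.listdir())

try:
    d.create("AFILE")
    collided = False
except FileExistsError:
    collided = True
print(collided)
```

Listing shows the name as created, any case variant resolves to it, and creating a second variant is refused.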
    
    valenterry
    
    From experience I can say that this is absolutely horrible and causes tons of issues.
    
    laktak
    
    Yes, some file managers (including Finder, the last time I tried) won’t let you rename afile to aFile. They check first and aFile already “exists”.
    
    kevinc
    
    An experiment on macOS 15.3.2, on an APFS volume of the default kind which is case-insensitive:
    
    touch afile: creates it
    
    Rename in Finder to “aFile”: the GUI shows “aFile”
    
    ls: prints aFile
    
    ls afile: prints afile
    
    ls AFILE: prints AFILE
    
    ls a<Tab>: completes aFile
    
    ls af<Tab>: does not complete
    
    The behavior is not different if I use mv instead of Finder to rename.
    
    The primary problem I’ve experienced with this scheme is that although git is case-sensitive, if I re-case a file git is already tracking, git will not notice it as a rename because the file system still finds the file at the tracked name. Now my working copy is out of sync and it only Works on My Machine. At one point I used a case-sensitive APFS volume for repo storage, but it was more complicated than the workaround: Rename the file in a way that’s more than re-casing, commit that, do it again to the correct name and case, then amend.
    
    Why do we even have that lever
    
    laktak
    
    The behavior is not different if I use mv instead of Finder to rename.
    
    maybe it’s because of gnu coreutils (via brew) but if I mv afile Afile it says:
    
    mv: ‘afile’ and ‘aFile’ are the same file
    
    kevinc
    
    Yeah it must be some such difference. I don’t even get that message from “mv afile afile”.
    
    mort
    
    Yeah, I encounter this all the time and it’s infuriating. I’ll make a source file representing a class, then realise I accidentally made it lower case instead of camel case, and then to rename the file to camel case, I need to do this stupid dance of first renaming it to something I don’t want and then renaming it a second time.
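The two-step dance can at least be wrapped in a helper. A minimal sketch, assuming plain `os.rename` calls and a hypothetical `recase` function; it goes through a temporary name so that tools which treat the old and new spellings as the same file still see a real rename each time. It says nothing about how any particular version control system records the change.

```python
import os
import tempfile
import uuid


def recase(directory, old_name, new_name):
    """Rename a file to a name differing only by case, via a temporary name."""
    if old_name.casefold() != new_name.casefold():
        raise ValueError("recase() is only meant for case-only renames")
    tmp_name = f"{old_name}.{uuid.uuid4().hex}.tmp"  # unlikely to collide
    os.rename(os.path.join(directory, old_name), os.path.join(directory, tmp_name))
    os.rename(os.path.join(directory, tmp_name), os.path.join(directory, new_name))


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    open(os.path.join(workdir, "myclass.py"), "w").close()
    recase(workdir, "myclass.py", "MyClass.py")
    listing = os.listdir(workdir)

print(listing)
```

The helper behaves the same on case-sensitive and case-insensitive volumes, since neither step relies on the two spellings being distinguishable.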
    
    Garbi
    
    Even worse is trying to change case in a filename in Windows and it silently fails, leading to the two-step you describe, if you happen to notice.
    
    laktak
    
    The best thing IMHO is to create a case-sensitive APFS volume and link to it from home. Saved me a lot of issues.
    
    slondr
    
    How is this different from case-insensitivity?
    
    insanitybit
    
    If I chose two different titles, yes, assume that they are different files.
    
    kevinc
    
    Some of the stickiest “things programmers believe about <domain>” come from assuming that a relationship is one-to-one. A letter is not a character; there could be many. A character is not a code point; there could be many. A code point is not a grapheme cluster; it may take many to make one. A grapheme cluster is not a glyph, how many it takes depends on the font. (I sure hope that was accurate.) Which of these things is a string a list of, such that you can derive length or equality from them? Even before we talk about file systems and VCS, programming languages disagree and some take years to make up their minds.
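A small illustration of how the "length" of one visible symbol depends on the unit being counted. The sketch uses a regional-indicator flag (two code points that render as a single flag) and only standard-library calls; counting grapheme clusters would need a third-party library, so it is left out.

```python
flag = "\U0001F1EB\U0001F1F7"  # REGIONAL INDICATOR F + R, shown as one flag

code_points = len(flag)                            # what Python's len() counts
utf8_bytes = len(flag.encode("utf-8"))             # what a byte-oriented API sees
utf16_units = len(flag.encode("utf-16-le")) // 2   # what UTF-16 based languages count

print(code_points, utf8_bytes, utf16_units)
```

One user-perceived character yields three different lengths, and none of them is 1.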
    
    elobdog
    
    At least in the Unix, Linux and BSD world, case-sensitive filesystems are an accepted norm, and trying to deviate from it is basically digging a hole for yourself and the users.
    
    indiebat
    
    I understand that the man is a legend, and he will always be among geeks and nerds of the wonderland that we live in, but why do we dramatize, discuss, politicize, argue about every opinion of his?! It’s repetitive, unnecessary and not in the spirit of thoughtful community (Linux should run across the world, and galaxy and then some with full force of open-source nerds offering tech support to them all) :-)
    
    That being said, designing anything, let alone an important building block like a file system, without case sensitivity opens a Pandora’s box of bugs, and I say that having some experience with authentication and other client-side validations.
    
    bwbuhse
    
    At work, they give engineers Macs, but we have to ssh to Linux boxes to actually do development work because we’ve got old Linux files with names that differ only by case, so the repos can’t be cloned on the Macs. I’d love to actually use my M3 Max cores for building, but instead I get an old workstation with 4C/8T that’s a lot slower (though it’s fine).
    
    Ultimately I’d love to just run a Linux distro of my choice on the Mac but IT isn’t going to allow that any time soon…
    
    ibisum
    
    You can get around this issue entirely on macOS by using Disk Utility.app to create a new volume image with case sensitivity and using that as your work/repo folder, if possible.
    
    If you want to do it at the command-line, maybe as some sort of repo setup requirement, you can do it like this:
    
    $ hdiutil create -size 100m -fs "Case-sensitive APFS" -volname aCaseOfSensitivity aCaseOfSensitivity.dmg
    $ open aCaseOfSensitivity.dmg
    $ touch /Volumes/aCaseOfSensitivity/Yo
    $ touch /Volumes/aCaseOfSensitivity/yo
    $ ls /Volumes/aCaseOfSensitivity/
    Yo yo
    
    Disk Utility.app and hdiutil go hand in hand, so you can do it both ways ..
    
    lina
    
    You don’t need to create a disk image. If you create a new case-sensitive volume within the main system APFS container, it will share space with the main volume and you don’t have to worry about deciding on its size or manually mounting it.
    
    hoistbypetard
    
    Nice! That’s a relatively new trick I’d never heard about. Thank you.
    
    hoistbypetard
    
    For a while, that was a necessary step for building AOSP on a Mac. I’m not sure if that’s still the case or not, since it’s been a long time since I’ve needed to build AOSP on a Mac or elsewhere. I think the ability to create case-sensitive images might be what keeps it from being a larger issue on the Mac. Unfortunately, they’re a little slow when it comes to something like compiling a large project that creates many thousands of small files, though probably not as slow as a VM would be.
    
    bwbuhse
    
    Thank you!! I’ll give that a shot.
    
    tonyarkles
    
    A long time ago I was trying to build some software on OS X and the errors I was getting were completely baffling. On Linux it all built fine. On OS X it wasn’t even trying to compile the same files as it was on Linux. Eventually after doing some diffing I discovered that the Git repo had a file called Makefile and a file called makefile in it… Almost flipped my desk over.
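A check like the following could have flagged that repository before anyone cloned it onto a case-insensitive volume. It is a minimal sketch: `find_case_collisions` is a hypothetical helper, the path list is made up, and `str.casefold()` only approximates what a given file system considers equal.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that become identical once case is ignored."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    # Only groups with more than one spelling are a problem.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


tracked = ["Makefile", "makefile", "README", "src/main.c"]
print(find_case_collisions(tracked))
```

In practice the path list could come from something like the output of `git ls-files`, one path per line.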
    
    kylewlacy
    
    The repo for the Linux kernel today has several file paths that differ only by case too. I’m not sure anyone besides me is interested in trying to cross-compile the Linux kernel from macOS (more of an academic exercise) but it mildly annoyed me…
    
    (haven’t tried with a disk image or APFS volume yet so I’d still like to give it a shot sometime)
    
    zzing
    
    There was an option to format an HFS+ drive with case sensitivity. So if that is still an option, you just have to make another volume (never the system volume).
    
    bwbuhse
    
    Sadly it’s not, at least as far as I’m aware. They (IT) format the whole thing when they give it to us. I’ve been trying to use OCI containers but haven’t gotten it to work yet (nor tried too hard lol)
    
    dgl
    
    You can do it with a volume inside a .dmg, so no need to partition the whole system, just make a filesystem in a file.
    
    bwbuhse
    
    Oh cool, I didn’t think about that. I’ll try it out. Still might have other issues (basically build a Linux system that’s our IPS OS) but sounds worth trying. Thanks!
    
    pm
    
    Perfect example of why I don’t bother getting familiar with osx or windows. Such ridiculous problems are just too frequent. And life is too short to waste time battling them.
    
    bdesham
    
    “Functional audio playback for some, case sensitive filesystems for others!”
    
    pimeys
    
    Yeah. It is really hard to use a system without pipewire. Especially if you care about correct sample rates and defining audio chains with a simple interface.
    
    runxiyu
    
    I’ve always found the name “case-insensitive” to mean “filesystem that considers a to be the same as A” to be strange. Since I instinctively interpret “insensitive” as “this filesystem doesn’t care about cases at all”, so my brain defaults to “oh, so it distinguishes based on their octet sequence alone, right”
    
    recursion
    
    I have to think that terminology through every time too even though I’ve programmed for HFS+!
    
    atmosx
    
    What FS are people using on the Linux desktop these days? I run a Linux server, but it’s on a rolling distro (Gentoo) that has been running since 2015. It has ext4 and ZFS for the 4 x 2TB disks (these are due to an update in size).
    
    intelfx
    
    If you don’t need to extract every last bit of I/O performance, just do btrfs. It works, it has all the features, and it is extremely flexible when it comes to resizing/reshaping the volumes.
    
    pimeys
    
    It is also the default for Fedora. Raid1, with encryption and compression was a few clicks in the installer.
    
    lonjil
    
    If you don’t need every last bit of I/O performance, just use ZFS. It’s very convenient and lets you use snapshots and zfs send to do semi real time backups to the ZFS pool on your server.
    
    pimeys
    
    With one caveat: you cannot follow the latest kernel if you choose ZFS for your desktop. Or it takes time to port the module to the latest kernel.
    
    cosarara
    
    ext4, it will not stop working when you get close to filling the filesystem, it has a good old fsck and it will keep your files.
    
    intelfx
    
    ext4, <…> it has a good old fsck and it will keep your files.
    
    Except when it won’t.
    
    Just a week ago, I was helping a friend rescue his files from an ext4/lvm sandwich that somehow started spitting out nondeterministic garbage. This was caught when he downloaded a huge archive of historical data over many days, which failed to extract, then downloaded it two more times, at which point it turned out all three copies hash to different values and the first one changed its hash in the meantime. (No, there were no memory issues with the box, an LTS kernel was being used, and fsck was perfectly happy all the while.)
    
    So yeah. It will keep your files, as long as you’re an ostrich.
    
    Forty-Bot
    
    XFS. It’s the best filesystem in upstream Linux.
    
    slater
    
    Wondering if there’s a happy medium, e.g. you can have a folder called “Home”, but internally it’s “home”; you can rename it to “hOme”, but you can’t have two folders in the same directory called “Home” and “hOme”, respectively? Or maybe that’s already how things are?
    
    xolve
    
    I agree with Linus here. The concept of a path in Linux is simple: it’s a string with / as the path separator (ending with a null character in C). Adding case-insensitivity invites edge cases which shouldn’t exist. A glaring one I found recently is a mismatch between the case of a file name in the git index and the case of the file on disk on Windows.
    
    When programs assume a case-insensitive file system (like some apps on macOS), I am just horrified. It looks like code ported from DOS or Windows that read and wrote files using inconsistent case. That is a bad implementation, I say, and it should be fixed.
    
    dogacel
    
    What if you are trying to copy files from a case-sensitive file system to an insensitive file system? I think it would be super annoying to do that.
    
    moltonel
    
    Both directions pose problems, plenty of examples in this thread.
    
    In some ways it’s a pointless debate: case-insensitive systems exist, so we have to make them work. But they’re a self-inflicted wound, so one could hope that they get deprecated over time, if not phased out.