Linus Torvalds Expresses His Hatred For Case-Insensitive File-Systems
(www.phoronix.com)
Such insensitivity!
remember when windows could only handle 8 characters and longer names ended in ~1
To be precise, longer names ending with ~1 are a backwards compatible fix for DOS programs introduced after Windows started supporting longer filenames.
I prefer case sensitivity, the filesystem shouldn't do any magic like that. If someone types "file.txt", opening "File.TXT" would be convenient, but also misleading. Ignoring case is what autocompletion/search is for imo.
The best thing is when the OS enforces magic onto the filesystem. NTFS is case sensitive but windoze is not. So expect some real fun times if you use NTFS on other systems.
For real. It's a ton of fun when you have a Linux server presenting an SMB share and you get a folder called MyFolder and one called MYFOLDER. Take a guess about what happens in that situation. I guarantee it's different
Damn straight. I thought bcachefs was a modern filesystem? Why is it case insensitive? Huge red flag.
Isn't bcache the one made by the solo dev who was causing all that drama trying to merge a bunch of crap during a freeze last year?
If so that explains quite a bit lmao
It isn't normally, but it, like e.g. Ext4, allows case insensitivity mostly for the sake of Wine.
But Wine could handle the case insensitivity though? NTFS is case sensitive.
It does, but having case insensitivity in the file system can get you better performance.
Its lead dev is also so full of himself, it's insufferable
I recall a case-insensitivity bug from the early days of Mac OS X.
There are three command-line utilities that are distributed as part of the Perl HTTP library: GET, HEAD, and POST. These are for performing the HTTP operations of those names from the command line.
But there's also a POSIX-standard utility for extracting the first few lines of a text file. It's called head.
I think you see where I'm going with this. HEAD and head are the same name in a case-insensitive filesystem such as the classic Mac filesystem. They are different names on a Unix-style filesystem.
Installing /usr/bin/HEAD from libwww-perl onto a Mac with the classic filesystem overwrote /usr/bin/head and broke various things.
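For anyone who wants to check a package list for this kind of clash before installing onto a case-insensitive volume, here is a minimal sketch. The function name `case_collisions` is made up for illustration, and it uses Python's `str.casefold` as a stand-in for the comparison rule; the real rules of the classic Mac filesystem are not reproduced here.

```python
from collections import defaultdict

def case_collisions(paths):
    """Group paths that become identical when case is ignored.

    Assumption: str.casefold approximates the filesystem's comparison.
    A real case-insensitive filesystem may use different folding rules.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    # Keep only groups where at least two distinct spellings collide.
    return [sorted(group) for group in groups.values() if len(group) > 1]

# Hypothetical install list echoing the HEAD vs head story above.
installed = ["/usr/bin/head", "/usr/bin/HEAD", "/usr/bin/GET", "/usr/bin/POST"]
collisions = case_collisions(installed)
print(collisions)
```

On a case-sensitive filesystem all four paths coexist; the sketch only reports which of them would land on the same name if case were ignored.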
Case insensitive is more intuitive and MUCH safer.
You do not want every Windows user to live in a world where Office.exe, office.exe, Offlce.exe and 0fflce.exe are all different files.
OSs and filesystems aren't built for programmers, they're built for grandmas. Programmers just happen to use them. It's much more sensible to give programmers a harder time fixing bugs and incompatibilities than it is to make the user experience even marginally worse.
I mean, all due respect for the guy, but that is an absolutely terrible opinion and I will die on this hill.
Case insensitive is more intuitive
Are these the same filename?
What about these?
Databases have different case-insensitive collations - these control what letters are equivalent to each other. The fact that there's multiple options should tell you that there's no one-size-fits-all solution to case insensitivity.
This issue is only simple and obvious if you don't know enough about it.
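To make the "multiple options" point concrete, here is a small sketch comparing three plausible matching rules on the same filename pairs. The filenames and the three helper functions are illustrative only, not the examples from the comment above and not what any particular filesystem or database does.

```python
import unicodedata

def same_by_lower(a, b):
    # Simple per-character lowercasing.
    return a.lower() == b.lower()

def same_by_casefold(a, b):
    # Full Unicode case folding, which may change string length.
    return a.casefold() == b.casefold()

def same_by_nfc_casefold(a, b):
    # Normalize composed/decomposed forms first, then case fold.
    def key(s):
        return unicodedata.normalize("NFC", s).casefold()
    return key(a) == key(b)

pairs = [
    ("STRASSE.txt", "stra\u00dfe.txt"),   # SS versus sharp s
    ("Caf\u00e9.txt", "cafe\u0301.txt"),  # composed e-acute versus e + combining accent
]

for a, b in pairs:
    print(repr(a), repr(b),
          same_by_lower(a, b), same_by_casefold(a, b), same_by_nfc_casefold(a, b))
```

The three rules disagree on these pairs, so "ignore case" has to mean one specific rule before anyone can say whether two names match.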
I mean, cases in non-latin alphabets are cases as long as they function like cases, equivalences between alphabets are not cases, they're equivalences between alphabets and a different issue altogether. At least that'd be my starting point for implementation.
But you're misrepresenting my argument. I don't give a crap if it's simple and obvious to implement and it's not my claim that it is. If it's simple and obvious to the user it's still the right call, even if the implementation is complicated and has to deal with edge cases.
My last caveat there would be that nobody claimed that a one-size-fits-all is necessary. Ultimately you're not deciding the case sensitiveness of databases, just of one database, and that's the filesystem's naming rules. The rules are arbitrary and conventional. Short of raw "any character code will always be different from any other character code regardless of how visually similar or grammatically interchangeable the user-facing glyphs may be" any other solution is just as arbitrary as each other. You're always making a decision about it.
My contention is the decision shouldn't be based on what is comfortable or more straightforward to implement, debug or use for the OS developers, it should be what is more usable by the lowest common denominator GUI-only users. And that's case insensitive (but otherwise long and flexible) filenames.
But you're misrepresenting my argument.
Hardly, I'm directly addressing your statement that case insensitive is intuitive to users, grandmas or otherwise - I give examples where it's not intuitive or obvious which filenames match. I didn't mention ease of implementation at all.
The principle of least surprise is an important UX consideration, and your idea of effectively introducing collation and localising which files conflict is just trading one problem for another set of problems and surprises (e.g. copying directories between drives with different settings).
No, it's not. You're substituting a base use case for an edge case and pretending they are on the same order when it comes to UX. They are not. File localization and mixing and matching alphabets in filenames is NOT the same as case sensitivity and using cases (or spaces, if we want to roll this conversation back a couple decades and talk about an actual implementation mess) in filesystems. Security and stability care about edge cases. It's weird that you try to flex by name dropping "principle of least surprise" and then pretend that a problem impacting every single user who types a filename has the same impact as a user mixing and matching alphabets across multiple cases. ESPECIALLY when your example requires making the conscious decision that equivalence of characters across alphabets is the same thing as case sensitivity, which is not a given at all.
Oh, and it's not my idea. Default Windows and Mac FSs are case insensitive, legacy FAT systems are case insensitive. If the issue is standardization across systems, case sensitivity is the odd one out. If you're having issues mixing and matching drives in older supported case-insensitive FSs the blanket fix for that is not having a case sensitive system elsewhere for no particularly good reason. I mean, speaking of minimizing surprise...
Your grandma will never type file names in shell, she'll use Open File dialog, where case sensitivity does not matter.
Hah. Second absolutely deadpan Average Familiarity instance in a Linux forum I have this week.
I mean, no offense to grandma. Plenty of grandmas are computer literate. But the idea of this hypothetical normie Windows user doing anything but double click on an icon (too slowly, with a bit too much pressure on the left mouse button, as if that made a difference, probably having single clicked to select first, just in case) is absurd.
File names are icon names first and foremost. File paths are a UI element to breadcrumb the location of the currently open file manager/explorer window unless proven otherwise.
And that is the right answer and how the whole thing should be designed.
double click on an icon (too slowly, with a bit too much pressure on the left mouse button, as if that made a difference, probably having single clicked to select first, just in case
I do that.
I use KDE.
I am a programmer.
Also, I make directories with the correct capitalisations for the project names before going inside them and running git clone, which makes another directory in small letters.
Also, when I make header files matching class names, I capitalise them same as the class name. That messes up stuff for some others, sometimes. I like it.
OSs and filesystems aren't built for programmers, they're built for grandmas.
You're just flat out and completely wrong.
The entire issue is that grandmas don't type out filepaths.
When you're typing filenames case is easy, because a) you have to press something different, and b) typically terminal monospace fonts look very different in caps and non caps.
But in a GUI where you aren't typing the names out? For a human reading human text caps and non caps are interchangeable. So as the name of an icon case sensitivity is confusing and prone to human error.
I mean, it's that in typing, too, because it's a very easy typo to make and all sorts of mixed case choices can be hard to remember, but it's MORE confusing if you end up with just an icon with a name and the exact same icon with the exact same name just one character is a different case.
OSs don't do anything by themselves, but they come bundled with all sorts of standardized applications built on top of them. If case sensitivity is baked into the filesystem, it's baked into the filesystem. And absolutely no, you can't put it in at the application level. I mean, congratulations for finding the absolute worst of both worlds, but how would that even work? If I tell an app to use a file and there are two of them with different cases how would that play out? You can build it into indexing and search queries and so on when they will display more than one result (and that, by the way, is typically extra EXTRA confusing), but you can't possibly override the case sensitive filesystem.
Now, character byte codes are a different thing, and it's true that the gripe in this particular rant seems to be almost more focused into weird unicode quirks and the case sensitivity thing seems to be mostly a pet peeve he rolls into it, I suspect somewhat facetiously.
But still, that's for the OS, the filesystem and the applications to sort out. It's an edge case to handle and it can be sorted out via arbitrary convention regardless of whether you do case sensitivity for filenames. "Case insensitive means insensitive to other things, too" is not a given at all.
Now, character byte codes are a different thing, and it's true that the gripe in this particular rant seems to be almost more focused into weird unicode quirks and the case sensitivity thing seems to be mostly a pet peeve he rolls into it, I suspect somewhat facetiously.
No, it has nothing to do with "weird Unicode quirks".
It has everything to do with there being a universal standard for representing different characters, and the file system deciding to then apply its own random additional standard on top that arbitrarily decides some things are probably the same as others.
This is just like JavaScript's early choice of == as fuzzy equality. It was done to be helpful, but it was a fuzzy system that doesn't cover enough edge cases to be implemented at that low of a level.
Arbitrary is the word.
Arbitrary means you can implement it however you want. The limits to it are by convention. There is no need to go any further than case insensitive filenames. At all. Rolling case insensitive filenames into the same issue is entirely an attempt to make a case against a pet peeve for unrelated reasons.
You want it to handle the edge cases? Have it handle the edge cases. You want to restrict it to the minimum feature just for alphabet characters? Do that.
But you do NOT give up on the functionality or user experience because of the edge cases. You don't design a user interface (and that's what a OS with a GUI is, ultimately) for consistency or code elegance, you design it for usability. Everything else works around that.
I can feel this conversation slipping towards the black hole that is the argument about the mainstream readiness of Linux and I think we should make a suicide pact to not go there, but man, is it starting to form a narrative and am I finding it hard to avoid it.
There is no need to go any further than case insensitive filenames. At all. Rolling case insensitive filenames into the same issue is entirely an attempt to make a case against a pet peeve for unrelated reasons.
This is literally just the same issue. I cannot see what two issues you are separating this into.
All of this stems from case insensitive file names.
But you do NOT give up on the functionality or user experience because of the edge cases. You don't design a user interface (and that's what a OS with a GUI is, ultimately) for consistency or code elegance, you design it for usability. Everything else works around that.
The OS is not the GUI. Every GUI you see in the OS is an application running on top of the actual OS.
The OS should not arbitrarily decide that some characters are the same as others, it should respect the unified standards for what bytes represent what characters. Unless there is an internationally agreed upon standard for EXACTLY what case insensitive means for every character byte code, then you are building a flawed system that will ruin the user experience when massive bugs and stability issues come up because you didn't actually plan out your system properly to cover edge cases.
You know, as Linus is pointing out given his multi decade history of running Linux.
No, hold on, this is not about the OS.
This is about whether the filesystem in the OS supports case insensitive names.
That determines whether the GUI supports case insensitive names down the line, so the choices made by the filesystem and by the OS support of the filesystem must be done with the usability of the GUI in mind.
So absolutely yes, the OS should decide that some characters are the same as others, not arbitrarily but because the characters are hard to read distinctly by humans and that is the first consideration.
Or hey, we can go back to making all filenames all caps. That works, too and fully solves the problem.
But if someone creates a file called HEAD, should it overwrite a file called head?
That shouldn't matter to the "nontechnical" end-user at all. To the nontechnical user, even the abstraction of "creating a file" has largely gone away. You create a document, and changes you make to it are automatically persisted to storage, either local or cloud.
Only the technical command-line user cares about whether /usr/bin/HEAD and /usr/bin/head are the same path. And only in a specific circumstance, such as the early days of Mac OS X, where the Macintosh and Unix cultures collided, could the bug that I described emerge.
I found this post confusing because on the face of it, it sounds like you agree with me.
I mean, yeah, HEAD and head should overwrite each other.
As you say, only technical command-line users care about the case sensitivity. So no, it shouldn't matter to the nontechnical user. And because the nontechnical user doesn't care about the distinction if something is called "head" in any permutation it shares a name with anything else called "head". And the rules are items within a directory have unique filenames. So "head" and "HEAD" aren't unique.
The issue isn't that the names are case insensitive, the issue is that two applications are using the same name in the same path.
If we're not careful that'll lead to a question about whether consolidating things in the Unix-style directory structure is a bad idea. I normally tend to be neutral on that choice, but you make a case for how the DOS/Windows structure that keeps all binaries, libraries and dependencies under the same directory at the cost of redundancy doesn't have this problem to begin with.
But either way, if two pieces of software happen to choose the same name they will step over each other. The problem there is neither with case sensitivity or case insensitivity. The problem there is going back and forth between the two in a directory structure that doesn't fence optional packages under per-application directories. As you say, this is only possible in a very particular scenario (and not what the post in question is about anyway).
Utterly reasonable opinion. Case insensitive filesystems are just lazy programming.
Case insensitive file systems aren't lazy, they're a programmer putting in a lot of effort to try and be helpful only to realize that their helpful system doesn't actually cover all the edge cases it needs to and thus just adds a whole extra layer of complication and annoyance to the project.
Hmmm. I doubt that, unless they were really bad programmers, downcasing (or upcasing) the file name in the file name accessors took much work, but I'll grant it's more than zero.
I'll retract the "lazy" comment.
That's because you're thinking in your tiny ASCII bubble. Switching case in Unicode is a hugely complex problem.
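A few concrete cases show what goes wrong outside ASCII. This is only a sketch using Python's built-in, locale-independent string methods; a filesystem would have to pick its own rules, and nothing here describes what any specific one does.

```python
# Case conversion can change the length of a name.
sharp_s = "\u00df"                       # German sharp s
print(len(sharp_s), len(sharp_s.upper()))

# Lowercasing capital I with dot above yields two code points:
# a plain "i" followed by a combining dot above.
dotted_capital_i = "\u0130"
lowered = dotted_capital_i.lower()
print([hex(ord(c)) for c in lowered])

# Upper then lower is not a round trip for Turkish dotless i,
# because the default (non-Turkish) mapping sends "I" back to "i".
dotless_i = "\u0131"
round_trip = dotless_i.upper().lower()
print(round_trip == dotless_i)
```

So "just downcase the name in the accessor" quietly assumes one character in, one character out, and one answer for every locale, and none of those hold in general.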
Wait... vfat supports Unicode? The filesystem that craps out if the file path length is longer than a couple hundred characters; that is an extension of a filesystem that couldn't handle file names longer than 8.3 characters; that doesn't have any concept of file permissions, much less ACLs; the one that partitioned filenames in 13 character hunks in directories to support filenames longer than 12 characters... that isn't case sensitive, except in all the wrong ways - this filesystem can handle Unicode?
I greatly doubt that. FAT doesn't even support 8-bit ASCII, does it? 7-bit only. Unless you mean FAT32, which can optionally have UTF-16 support enabled. And it's far easier to manage case changes in UTF-16 than UTF-8, using case mapping as MS does. The API handles all of this for you; it keeps track of what the user calls them, but uses its own internal name for the file. And ne'er the two shall meet, lest there be trouble.
I do think it's sloppy and lazy; it's very easy to avoid doing actual work thinking about the problem and to bang out some hack solution. In the end, far more work is done, but for the wrong reasons.
I don't know what Apple's excuse is, except maybe DNA. Apple ][ were not only case insensitive, they didn't even have lower case characters at all. There was only one case, and maybe those engineers brought that mind set forward with the Lisa, and then the Mac. How it got into Darwin... is Darwin really case insensitive? I'm pretty sure on the company line - at the filesystem level, it is.
Kind of the opposite. It takes more effort to make a filesystem case-insensitive. Binary comparison is the laziest approach. (Note that laziness is a virtue.)
I'm on the fence as to which is better. Putting backwards compatibility aside, there's a perfectly good case to be made for case-insensitivity being more intuitive to the human user.
Apple got into a strange position when marrying Mac OS (case-insensitive) and NeXTSTEP (case-sensitive). It used to be possible to install OS X on case-sensitive HFS+ but it was never very well supported and I think they axed it somewhere down the road.
I can with very high confidence say that for the average computer user, case-insensitive is the only alternative. At least if you don't want IT and computer support around the world to start going postal.
As soon as someone is at least semi comfortable navigating a unix-style terminal and using a terminal based text editor to at least change config files, case-sensitive starts to become better. And often the more you get into programming, the more you get like Linus here and develop a hate.
Good for him. I hate case-sensitivity, and it's what keeps me going back to DOS & Windows. FILE, File, file, and FilE should all be the same thing at all times.
FILE, File, file, and FilE should all be the same thing
If these were truly the same thing, you should have not written them differently.
But you did.
you should have not written them differently.
But you did.
Remember that 99% of the time that's gonna be because of a typo for 99% of users. They won't have File.txt, FILE.TXT and FiLe.tXt, they'll have ReportMay.docx and REportMay.docx or whatever.
And yeah, that includes me. I don't want case-sensitivity for that reason alone. Thanks, but no thanks.
Do you actually have a case sensitive filesystem? Because in reality I don't even notice it when doing normal work. It seems like such a weird thing to be crying about.
I've used Linux, yes. And I'm not "crying" I just find it annoying. Good grief.
I did, because they're different ways of expressing the same meaning. They all mean (apologies for borrowing mathematical notation for linguistic applications) |file|. I don't care what the expression of a thing is, I care about meaning. And as a result, when I save a file and then search to recall it, it should not matter what case it's in - only for the meaning to match. The state of my shift or capslock should be totally immaterial.
when I save a file and then search to recall it, it should not matter what case it’s in
Whatever you use to search can just be case insensitive, which is how most file browsers work on Linux.
Then why should it allow me to save different expressions of the same meaning ever? If it's going to let me search for it case-insensitive, just head the matter off at the pass and save it that way. Either that, or automatically create link files for every case permutation to the same folder as soon as the file exists.
This is really a problem of human vs computer thinking.
F and f are two different characters, encoded differently. Ergo, File and file are different by raw bytes.
Some developers wish to make the interactions for the user more consistent and thus a case-insensitive filesystem is born. The problem is that this is such a low level place to make this decision.
A filesystem, as in the kernel level interactions for files, should be case-sensitive in that every character is a unique series of bits. But there’s nothing stopping a higher level api from helping users out. It would be sensible to have a case-insensitive desktop environment.
The low level functionality should remain intentional though.
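Here is a rough sketch of what that split could look like: the filesystem keeps exact names, and a helper one layer up does the forgiving lookup. The function `find_ignoring_case` is hypothetical, and `str.casefold` is just one possible choice of matching rule.

```python
import os
import tempfile

def find_ignoring_case(directory, name):
    """Return every entry in directory whose name matches when case is ignored.

    The filesystem is asked for exact names only; the fuzzy matching
    happens here, in the higher-level layer.
    """
    wanted = name.casefold()
    return sorted(entry for entry in os.listdir(directory)
                  if entry.casefold() == wanted)

with tempfile.TemporaryDirectory() as tmp:
    # The file is stored under exactly this spelling.
    open(os.path.join(tmp, "ReportMay.docx"), "w").close()
    # The user types a different capitalisation.
    matches = find_ignoring_case(tmp, "reportmay.DOCX")
    print(matches)
```

Returning a list rather than a single path is deliberate: on a case-sensitive filesystem there may be more than one match, and the caller (a file dialog, say) gets to decide how to present that.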
I was looking into this recently and I didn't know this but NTFS is actually designed by competent people and is fully case sensitive.
For backwards compatibility, of course, Microsoft had to make the file APIs case insensitive, but the actual filesystem is case sensitive.
Also, presumably because this is a real turn-off for developers there is actually an option in Windows to sort of make specific directories case sensitive. Wild right?
https://learn.microsoft.com/en-us/windows/wsl/case-sensitivity
Yeah, I think Windows actually handles it quite well, the actual filesystem has no notion of what the filenames are outside of basic "It's UTF-16", it's the OS filesystem layer that handles all the quirks.
Because that's what people seem to dismiss, there's no one standard notion of case folding. It depends on the locale you're using, and that shouldn't be built into the FS itself. The classic one was the German "sharp S", where "SS" should be case folded with "ß", except they changed it in 2024 so now they shouldn't match ("ß" becomes "ẞ" now), good luck updating your FS to support rules like that.
Now your shell? That's easy, you can just warn the user that a "matching" filename already exists and prompt them to change it, and you can vary those warnings based on the locale, and you can push out updates as easily as any other patch.
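A sketch of that shell-level check, with the folding rule passed in so it can vary by locale. Everything here is hypothetical: `conflicting_names` is an invented helper, and `turkish_style_fold` is a deliberately simplified stand-in for Turkish rules, not a complete or authoritative implementation.

```python
def conflicting_names(existing, new_name, fold=str.casefold):
    """Return existing names that the given folding rule treats as equal to new_name."""
    key = fold(new_name)
    return [name for name in existing if name != new_name and fold(name) == key]

def turkish_style_fold(name):
    # Simplified assumption: dotted capital I lowers to "i",
    # plain capital "I" lowers to dotless i.
    return name.replace("\u0130", "i").replace("I", "\u0131").lower()

existing = ["FILE.txt", "notes.md"]

# Default rule: "file.txt" clashes with "FILE.txt", so the shell would warn.
default_conflicts = conflicting_names(existing, "file.txt")
print(default_conflicts)

# Turkish-style rule: "FILE.txt" folds to a name with dotless i, so no warning.
turkish_conflicts = conflicting_names(existing, "file.txt", fold=turkish_style_fold)
print(turkish_conflicts)
```

Because the rule is just a function argument, changing it is an ordinary software update rather than an on-disk format change, which is the point being made above.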
Ah, one more reason for me to despise NTFS.
Why? They did it right...
FILE, File, file, and FilE should all be the same thing at all times.
"Let's point many completely different combinations of characters at the same file"
sentences dreamed up by the utterly deranged /hj /lh
Couldn't agree more. I literally can not think of a single scenario where case sensitive file names would be anything but an annoyance.
somehow you remInded me of the BIG-small archItecture.
BlG-small?
smig ball