this post was submitted on 03 May 2024
15 points (66.0% liked)

I know that there are ten different alternatives. Why don't we simply improve the basic stuff?

all 42 comments
[–] [email protected] 55 points 6 months ago (2 children)

It has nothing to do with bash specifically - other shells like sh, csh, tcsh, zsh, etc. are the same. Whitespace in UNIX is just that way by design. It's been a long while since I used a Windows CLI, but it was that way too (plus all that weirdness about ~1 at the ends of filenames), and so is macOS. So it's not even just UNIX: it's how CLIs tend to work, where whitespace acts as the "delimiter" between arguments sent to a program.

program_name arg1 arg2 arg3 arg4

So if you use whitespace like "cp file 1 file 2", the CLI sends arg1="file", arg2="1", arg3="file", arg4="2", rather than arg1="file 1" and arg2="file 2". These are just the foundational rules of how CLIs work - a computer can't read your mind, and this is how you precisely tell it what you want, within this highly rigid framework to avoid misunderstandings.
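To see the splitting directly, here is a minimal sketch for a POSIX-style shell. count_args is a made-up helper (not a standard command) that just prints how many arguments it received:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

count_args file 1 file 2        # unquoted: four separate arguments
count_args "file 1" "file 2"    # quoted: two arguments, spaces preserved
```

The first call should print 4 and the second 2, which is the same difference cp would see.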

The alternative is to use a GUI, so like see file, drag file, and ofc that has its own set of tradeoffs good and bad.

[–] [email protected] 16 points 6 months ago (1 children)

Yeah, for this reason, lots of full-fledged programming languages actually make you specify the arguments as a list of strings directly, so for example:

Command::new("cp")
    .args(["file 1", "file 2"])
[–] [email protected] 17 points 6 months ago

It's a subset of the standard delimiter problem: if I want to use the delimiter inside of an entry, can I even do that and if so then how?

e.g. in comma-delimited lists you could "escape" the commas individually, or encapsulate each entry inside quotes, or provide each entry by name, etc. - all of which significantly complicates the retrieval process by adding greater complexity to decide on rules determining how it all works (like if by name, then what if the user [stupidly? on purpose?] provides multiple entries with the same name - do subsequent ones overwrite the earlier ones or their contents get appended to the end and if the latter, is any separation provided between them? and on and on it goes):

  • item1,item2,item3
  • "Denver, CO","New York, NY",Miami\, FL
  • "Lastname, Firstname",Lastname\, Firstname
  • item1="Denver, CO", item2="New York, NY"

Common English has issues with this too: is a list like "John, Marsha, Barbie and Ken" 4 entries, or just 3 where the last one is a pairing? (Leading to the Oxford comma discussion :-P It is very important though, bc while individual people may have similar needs like food, pairings may have different constraints: if they drive together, then they need less parking space.)

So this delimiter issue is not even specific to CLIs, nor even computers in general - it is a universal problem with any communication system.
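A rough illustration of the comma case in shell, assuming a POSIX-style shell. naive_field_count is a made-up helper that splits on every comma and ignores quotes, which is exactly the kind of retrieval rule that breaks once the delimiter shows up inside an entry:

```shell
# naive_field_count is a hypothetical helper: it splits its argument on
# every comma, ignoring quotes, and prints how many pieces it got.
# The body runs in a subshell, so the IFS change does not leak out.
naive_field_count() (
    set -f          # disable globbing so the pieces are not expanded as patterns
    IFS=,           # split on commas only
    set -- $1       # deliberately unquoted: this is where the splitting happens
    printf '%s\n' "$#"
)

naive_field_count 'item1,item2,item3'              # 3, as intended
naive_field_count '"Denver, CO","New York, NY"'    # 4, although only 2 entries were meant
```

Getting the second case right needs a parser that understands the quoting rules, which is the extra complexity described above.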

[–] [email protected] 5 points 6 months ago (2 children)

other shells like sh, csh, tcsh, zsh, etc. are the same

Zsh has some important differences in how it handles whitespace and quoting, which affects OP's exact example.

Consider this:

touch a b c 'd e f' 'g h i'
for f in *; do ls -la $f; done

In zsh, this works. In bash, it will give you six errors saying d, e, f, g, h, and i do not exist.

[–] [email protected] 5 points 6 months ago (1 children)
touch a b c 'd e f' 'g h i'
for f in *; do ls -la "$f"; done

fxd

[–] [email protected] 6 points 6 months ago (1 children)

That will work in either zsh or bash, yes. It's a good habit to use quotes, but I am pointing out that quoting and expansion behavior is not the same across all shells.
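For anyone who wants to try it without littering their home directory, here is a sketch that runs both variants in a throwaway directory. It assumes bash or another POSIX-style sh and that mktemp -d is available (common, but not part of POSIX); loop_demo is a made-up name:

```shell
# loop_demo is a hypothetical helper. Its body runs in a subshell, so the
# cd and the temporary files do not affect the calling shell.
loop_demo() (
    tmp=$(mktemp -d) || exit 1
    cd "$tmp" || exit 1
    touch a b c 'd e f' 'g h i'

    unquoted_failures=0
    quoted_failures=0
    for f in *; do
        # Unquoted: bash and sh split "d e f" into three separate names.
        ls -- $f >/dev/null 2>&1 || unquoted_failures=$((unquoted_failures + 1))
        # Quoted: the name is passed through as a single argument.
        ls -- "$f" >/dev/null 2>&1 || quoted_failures=$((quoted_failures + 1))
    done
    echo "unquoted failures: $unquoted_failures, quoted failures: $quoted_failures"

    cd / && rm -rf "$tmp"
)

loop_demo
```

Under bash this counts two failing ls calls for the unquoted form (one per file with spaces, each of which produces three error messages) and none for the quoted form. Under zsh with default options the unquoted form should not fail either.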

[–] [email protected] 6 points 6 months ago

It's the same across all POSIX compliant shells. zsh is not POSIX compliant.

[–] [email protected] 2 points 6 months ago (2 children)

That only affects whitespaces within quotes though. Still, fair point, except I just tried a bunch of stuff in both bash and zsh and touching a file works, echoing a string works, then I stopped so I don't know about the asterisk but we have already veered far away from what OP said: "normal foor (sic) loop with whitespace in file names" - whereas what you had seems significantly more advanced than a "normal" foor (sic:-P) loop.

Notably, macOS uses zsh right out of the box. I haven't touched "standard" personal distros for a number of years, but a quick search suggests that Mint, Ubuntu, and NixOS all use bash by default, which half surprises me, though not entirely. Anyway, if OP wants to change their default shell to something more advanced, that would be fine for common everyday usage, though asking for bash itself to be changed after decades of backwards compatibility seems a non-starter to me. There are reasons why it works as it does, and those reasons have nothing to do with it being "old", but rather b/c it "works".

And the underlying reason for that is b/c we are still using keyboards. The addition of mice as HIDs enabled drag-and-drop, and perhaps some kind of glove, fingertip reader, or eye tracker may allow the same, Minority Report (an old movie) or Iron Man style: pinching an "object", grabbing it, and letting it go. But that is basically just another style of "mouse". Afaik, there hasn't been even a hint of anything truly revolutionary in all this time. Although I can envision one such idea: combining keyboard+"mouse" in a more intelligent way, like if you start typing a command, then fix your eyes on a particular file on the screen and perhaps flick your eyes in a particular direction to indicate acceptance, and it could fill it in for you, without having to move your hands away from the keyboard. With glasses and ubiquitous cameras everywhere now, we might see something like that in a few decades? Though it would put further pressure onto privacy concerns over having a camera watching every move you make.

[–] [email protected] 6 points 6 months ago (1 children)

a quick search suggests that Mint, Ubuntu, and NixOS all use bash by default

With Debian-based distros, it's actually a bit weirder. They use dash as the global default shell (i.e. for executing sh scripts).
dash has basically no code for interactive use, so it's supposedly faster and more secure. It is POSIX-compliant, so the treatment of whitespace should be identical, but it doesn't support any of the added features of bash.

If you open up a terminal emulator, they've got that set up to use bash by default, so dash is supposed to be invisible to the user, but well, spoilers, it's not. If you switch to a TTY, for example, it launches there and makes the TTY look completely broken.

[–] [email protected] 3 points 6 months ago

Hehe thank you for the fun extra story:-).

[–] [email protected] 3 points 6 months ago (1 children)

Yeah, Apple moved to Zsh as default some years back, which is the main reason I'm familiar with its differences in terms of parameter expansion. They still ship Bash 3.2 with macOS, but they can't ship newer versions due to GPLv3 licensing, or something like that. So they had the motivation to switch.

In the Linux world, there's no great motivation to change the default, because Bash 5.x is already comparable to zsh in terms of features, and it's what everyone is already familiar with.

Perhaps I misunderstood OP's question. I figured they meant using variables. Otherwise I don't know how to make sense of it.

[–] [email protected] 2 points 6 months ago

Admittedly, I too am not certain why "noone inprove bash such that you can write a normal foor loop with whitespace in file names?" :-P I just noticed that not only was "foor" loop misspelled, and "noone" is likewise improper (should be "no one" or "nobody"), but "inprove" is also a "performance improvement company that helps clients implement their internal continuous improvement programs more effectively, and achieve better, more consistent and sustained results", according to Google's (SEO) search feature:-P

Therefore, I have little trouble believing that they wanted all of bash to be changed - for free ofc - so that they could do something like:

touch "Iron Man"; mv Iron Man The Greatest Movie of All Time!?

And the computer would auto-magically figure out that since mv is a command involving files, and "Iron Man" is a file that exists, that it should be the first argument and the rest of the text is the second argument. i.e., why learn how bash works, when you can make a post to [email protected] and put hundreds of programmers to work for you to change the entire world, at your beck and call, while also working in how ashamed they should be that they haven't done that effort preemptively?

Which ngl, might be a good idea. Or, you know, OP could learn to use tab-complete that already does that. I should have mentioned that I suppose... but it seems too late now b/c I doubt the mods will let this post remain for too much longer. Even if you were correct and they meant variables: they never actually said that, which makes this communication really difficult to both guess what OP might have meant and also solve their problem for them, on top of them being willing to learn on their own. But we can do better on our end too: perhaps we could create a community specialized in providing help to newcomers who want to learn linux - like what resources can they read/watch/play with, to help them get started? To be clear, *I'm* not offering to start that!!

[–] [email protected] 38 points 6 months ago (1 children)

My question is, how can you look at whitespace in a filename and not have your eyelid twitch?

[–] [email protected] 33 points 6 months ago

You can already write a for loop that handles whitespace in file names, just use quotes around the file name variable:

https://www.howtogeek.com/850124/spaces-in-filenames-on-linux/#how-to-use-filenames-with-spaces-in-bash-scripts

[–] [email protected] 28 points 6 months ago (1 children)

"whitespace in file names", please don't.

[–] [email protected] 0 points 6 months ago (1 children)

@TigrisMorte Why not? I don't understand the hate with using valid filename characters in filenames. If anything I would argue it makes detecting non-conformant code easier... you wouldn't want a program to skip processing a file because it has the letter Z in it would you?

[–] [email protected] 1 points 6 months ago (1 children)

Makes CLI involving the space a pain. Use an underscore if you must have a visual space, but best practices would be to use Camel Case, no punctuation (including spaces), and include the date in Gregorian format CCYYMMDD

I don't know whether any given system will have issues or handle it fine, but I know some systems cannot handle spaces in file names. There is no reason to tempt fate.

[–] [email protected] -1 points 6 months ago* (last edited 6 months ago) (1 children)

@TigrisMorte I don't think camel case would make sense for files such as books, music or movies when you want the name to look proper.

[–] [email protected] 2 points 6 months ago (1 children)

It is a filename, not a PR press release. How it looks is irrelevant to how the system handles it. Making the display show something other than the file name is easy and doesn't risk errors, unlike punctuation in filenames, which creates problems and solves none.

[–] [email protected] 0 points 6 months ago (1 children)

> How it looks is irrelevant to how the system handles it

Then IMO it shouldn't matter that there is a space in it. I disagree with your viewpoints and I think if someone wants a space or punctuation in their filename, they should be able to do that without problems.

[–] [email protected] 1 points 6 months ago

Your opinion does not matter. Nor does the presentation upon the screen, which is what you are actually asking for. The reality was explained and you are stamping your feet and holding your breath.

[–] [email protected] 26 points 6 months ago

Am I not reading your question right? Just quote the variable and filenames with spaces work fine.

for i in *; do
    cat "$i" ;
done
[–] [email protected] 15 points 6 months ago (1 children)

As others have said, if you quote your variables, they won't get split on spaces. The Unix shell unfortunately has a ton of gotchas like this, and the reason this is not changed is backwards compatibility. Lots of shell scripts depend on this behavior, e.g. there might be something like:

flags="-a -l"
ls $flags

If you quote this (ls "$flags"), ls will see it as one argument, instead of splitting it into two arguments. You could patch the shell to not split arguments by default, and invent some other syntax for when you want this splitting behavior, but that would break a ton of existing shell scripts, and confuse users who are already familiar with the way it works right now. It would also make the shell incompatible with other shells, and violate the POSIX standard.
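The difference is easy to check without involving ls at all. A small sketch, assuming bash or a POSIX sh; count_args is a made-up helper that prints its argument count:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

flags="-a -l"
count_args $flags      # 2: the shell splits the value into -a and -l
count_args "$flags"    # 1: a single argument, the literal string "-a -l"
```

Scripts that store several options in one string are relying on the first behavior.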

[–] taladar 3 points 6 months ago (1 children)

The reason for this is not backwards compatibility, the reason is that it would be stupid. Space appears a lot more often in situations where you need a separator than in filenames so why would you make the common case harder to use to save some typing in the edge case?

[–] [email protected] 2 points 6 months ago (1 children)

I disagree. The vast majority of the time when writing shell scripts, I quote variables, because that's almost always what I want. Splitting is basically only useful if you have a list of arguments, and you know for sure there are no spaces in any of the arguments (so no filenames).

(The workarounds in pure POSIX shell are btw super annoying if you want to pass a list of arguments that may have spaces in them: You can abuse the special "$@" variable. Or you could probably also construct something with xargs.)
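A sketch of what the "$@" trick can look like, assuming a POSIX sh. print_args and build_list are made-up names for illustration:

```shell
# print_args is a hypothetical helper that shows each argument on its own line.
print_args() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

# Inside a function, "set --" only replaces that function's own positional
# parameters, so "$@" can serve as a makeshift list.
build_list() {
    set -- 'a file' 'another file'
    set -- "$@" 'a third file'      # append one more element
    print_args "$@"
}

build_list
```

This should print three bracketed lines with the spaces intact. The catch is that there is only one such list per function, which is part of why it feels like abuse.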

[–] taladar 3 points 6 months ago* (last edited 6 months ago) (1 children)

Every single command, option and argument in the shell is split by spaces, regardless of what it contains. That is clearly the more common case. I am not talking about splitting when the space comes out of a variable but in general, as part of the syntax.

I am well aware of how quoting works to avoid accidental splitting and it is an absolute non-issue in practice once you get used to quoting things, about as annoying as the fact that you have to quote strings in every other programming language, i.e. not at all.

[–] [email protected] 2 points 6 months ago* (last edited 6 months ago)

Ah that's your point. Yeah I agree that splitting literal a b c is convenient. It is surprising to many (like here) that this happens after variable substitution, and that's not very convenient since you almost never want that. You could define this to happen the other way around, but then you'd obviously have to invent a new syntax for explicit splitting, which would be its own kind of annoying.

Edit: YSH (oil) does that btw. See here.

[–] [email protected] 12 points 6 months ago* (last edited 6 months ago) (1 children)

@barbara Is bash itself not already an improvement on "the basic stuff"?

...and whitespace in filenames is simply unacceptable, and should not be encouraged. 😆

What's wrong with the method we've been usin forever of working with dumbly named files? Just "enclose em", or use\ an\ escape\ char in em.

[–] [email protected] 7 points 6 months ago

Why don't we improve the basic stuff, like processor architecture?

Because if we do, everything we have working now breaks. So everything would need to be ported to this new architecture.

The same with bash or any other foundation lib.

And also these "improvements" may make these libs more complex and, therefore, more unstable and hackable.

The simple is simple for a reason: it works reliably.

[–] [email protected] 5 points 6 months ago (1 children)

i have bad news: bash is already a massively improved/extended ksh clone. ksh was a massively improved/extended sh clone. sh got a ton of improvements early on.

this is about as good as you can get without breaking compatibility completely (bash already breaks compatibility with posix sh in some ways).

anyway, once you've figured out the hermetic incantations required to work with filenames with whitespace in them, it'll be time to write scripts that can handle filenames with newlines in them :D

[–] [email protected] 2 points 6 months ago

Most shells will issue $PS2 as the continuation prompt if you open a quote around a filename and press Enter to insert a newline.

Ctrl-V Ctrl-J is the explicit keypress pair to insert a newline without triggering $PS2, but beware: if the newline is outside of quotes, that's equivalent to starting a new command, in much the same way a semicolon or a new line in a shell script would.

echo "hello^V^Jthere" [Enter] echoes hello on one line and then there on the next, but echo hello^V^Jthere [Enter] will echo hello then try to run a command called there

We'd have to assume that whatever fixes spaces in filenames would also have an option to fix this subtlety. And I say to whoever tries: Good luck with that.
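To make the newline problem concrete, here is a sketch that creates one file with a newline in its name and counts it two ways. It assumes a POSIX-style shell with mktemp -d available; newline_demo is a made-up name:

```shell
# newline_demo is a hypothetical helper. Its body runs in a subshell, so the
# cd and the temporary file do not affect the calling shell.
newline_demo() (
    tmp=$(mktemp -d) || exit 1
    cd "$tmp" || exit 1
    touch 'hello
there'                       # one file whose name contains a newline

    # A glob hands the name over as one word, newline included.
    glob_count=0
    for f in *; do
        [ -e "$f" ] && glob_count=$((glob_count + 1))
    done

    # Counting lines of find output treats the newline as a record separator.
    find_lines=$(find . -type f | wc -l | tr -d ' ')

    echo "glob loop: $glob_count, find lines: $find_lines"
    cd / && rm -rf "$tmp"
)

newline_demo
```

The glob loop sees one file while the line count reports two, so anything that parses filenames line by line gets this wrong.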

[–] [email protected] 5 points 6 months ago* (last edited 6 months ago) (1 children)

That's quite a lot of comments so far with nobody saying
IFS=' '

[–] [email protected] 1 points 6 months ago (1 children)
[–] [email protected] 1 points 6 months ago* (last edited 6 months ago)

Sure, or IFS=$'\n' if you like
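One reading of the IFS suggestion, as a sketch only: it assumes the intent is to split on newlines instead of spaces, and that the list has one name per line. count_lines_as_fields is a made-up name, and the body runs in a subshell so the changed IFS does not leak into the rest of the session:

```shell
# count_lines_as_fields is a hypothetical helper: it splits a two-line
# string on newlines only and prints how many fields it got.
count_lines_as_fields() (
    list='a file
another file'
    set -f      # no globbing while the value is split
    # Set IFS to a single literal newline character.
    IFS='
'
    set -- $list    # deliberately unquoted: split on newlines, not spaces
    printf '%s\n' "$#"
)

count_lines_as_fields
```

This should print 2, with the spaces inside each name left alone. It still falls over on names that contain newlines.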

[–] [email protected] 4 points 6 months ago

Probably backwards compatibility with existing scripts.

[–] LemoineFairclough 1 points 6 months ago

I write for loops involving white space in file names pretty frequently. For example:

export LC_ALL=POSIX && unset IFS && for file in 'a file' 'another file'; do touch -- "${file}" || exit; done

As for improving "the basic stuff", the Shell Command Language is defined at https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html but it is possible to improve it, as evidenced by things such as: