Shell scripts
29 points by wink
What's wrong with the natural progression of "Whoa, my bash script reached 100 lines" and then rewriting it in the same $LANG your project is in?
Because, as often happens, you only have enough time to add the hundredth line, and so the script remains in shell, decaying forever until some poor soul has to dive into it to fix a bug. And before you know it, it's one of those corners of the code base everyone dreads to touch but nobody has the time to fix.
I've regularly seen a specific difficulty when it comes to migrating away from shell scripts: input and output formats.
What's convenient in shell scripts is often not as convenient with other languages; moreover, it's pretty easy to depend on weird/unknown semantics of some shell script or tool and replicating that can be difficult and risky.
A long time ago I wrote thousands of lines of bash to automate a CI/CD package pipeline. Then I rewrote it in Python and treated it as a program. Some time after, I intended to rewrite it in Go, but never got around to it.
I have followed that natural progression based on what my intuition says. bash -> python -> go. Or sometimes, just straight bash -> go.
bash is great for simple things, but if you find yourself fixing bugs in bash scripts, it's time to move on to something more robust.
True, you're so correct! Though I knew an old-timer years ago who advised that anyone who ever touches code - whether dev or sysadmin, or whatever - should always know at least 2 languages... and 1 of the languages MUST be a scripting language (e.g. bash, Perl, Python), and the other MUST be compiled (e.g. C, C++, Java, etc.)... so that one prototypes with scripting, and then when the function inevitably outgrows such a script, it's much easier to migrate said function to the compiled language, etc. It's not a 1-to-1 thing all the time, but it made sense back in the day, and to me at least it still makes sense today. So, when that 100-line bash script reaches a certain point, it becomes a near-no-brainer to upgrade it to a more robust, compiled language, etc.
I tried that at $PREVIOUS_JOB. I wrote a proof-of-concept in Lua to get familiar with the problem, and unknown to me, my manager at the time put it into production. That's the problem with prototypes---you may think it's a prototype, but if it works at all, it's put into production.
unknown to me, my manager at the time put it into production.
Oof, that's a rough one! I guess there are downsides to that approach. And, yeah, besides any accidental deployments, there's the whole aspect where you only meant something as a prototype, but others take it as gospel... so, yeah, there are risks. Also, separate from this, I suppose it's possible that everyone on a team chooses different sets of languages, which doesn't help in all cases, etc. ;-)
Many (most?) shell scripts are about filesystem operations and calling/chaining commands, both of which tend to be annoying in most general purpose languages. Having to compile your scripts is also pretty annoying.
I think the "go run" command was designed to make the compile step invisible, but I never see go run on go files in practice.
have you seen the go shell script hack?
$ cat script.go
/*usr/bin/env go run "$0" "$@"; exit; */
package main
import "fmt"
func main() {
fmt.Println("Hello world")
}
and then chmod +x and run it
$ ./script.go
Hello world
lol, I hadn't, that's amazing.
I hope you won't be offended if I never use this in production though :P
Aside from the organic growth that others mention, sometimes shell scripts really are the right call.
Bash (and perl) are excellent for executing arbitrary commands and processing streams of text; if the job you have is to glue a load of binaries together, chew on some text and then use that to call another program with correct arguments (over, say, 64 processes using xargs) then doing such a thing in a "real language" will be terribly unergonomic.
I've seen a lot of bash; the best bash unfortunately comes from Perl shops, and there aren't many of those left, I think. But Perl developers know well how to manage languages that easily become "write only". Bash falls into that category.
Rewriting bash scripts is difficult in general ... shell and especially bash are obscure languages, with idioms that aren't readily available in other languages
I'd say it's harder to rewrite a bash script in Python than say Perl or Ruby script
BTW YSH is designed to be the easiest language to rewrite a bash script into ... because OSH is the most bash-compatible shell, and OSH and YSH share a runtime (they live in the same binary)
The idea is that you can put shopt --set ysh:upgrade at the top of your bash script, and start using a cleaner language ... and then gradually rewrite it, rather than all at once
Python's argument parser gives you good documentation for your script more or less immediately and cleanly. No help() metafunction you've got to keep up to date, and less "oops I didn't handle args correctly" stuff.
And also you get useful data structures like hashmaps for free.
But well... subprocess.run is annoying. from subprocess import run; run("foo bar baz".split()) works fine enough but feels silly.
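To illustrate what I mean, a rough sketch (not any real script - the flags and the pg_dump call are made up for the example):

#!/usr/bin/env python3
import argparse
import subprocess

# argparse generates --help output and argument validation for you
parser = argparse.ArgumentParser(description="Dump a database to a file")
parser.add_argument("database", help="name of the database to dump")
parser.add_argument("--outfile", default="dump.sql", help="where to write the dump")
args = parser.parse_args()

# the subprocess call is the clunky part
subprocess.run(["pg_dump", args.database, "-f", args.outfile], check=True)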
... I think it represents exactly that type of small helper script. It's simple enough it could have been a shell script, but if I needed more stuff, I could have added it, in a proper programming language.
Compared to Janet, you can just remove the ($ and ) with YSH. In fact, I think it's identical to shell for this script, without the pitfalls.
#!/usr/bin/env janet
# boring header, including the sh package
(use sh)
# One task is cleaning up old branches, so list them
($ echo "# Local branches")
($ git br)
($ echo "# Remote branches")
($ git br -r)
YSH:
#!/usr/bin/env ysh
echo "# Local branches"
git br
echo "# Remote branches"
git br -r
And it has a "real language" for control flow, loops, functions with params, dicts and lists like Python/JS, etc.
I've long been unhappy with shell scripts for anything that's more than 20 lines of glue code
But... why? What is it that makes you unhappy about "anything longer than 20 lines"?
usually that you need to fiddle with various versions of exec or popen or whatever the language calls their wrapper
Right. I often see pushback for scripts at $WORK of the form "omg this shell script should be a Go program or a Python script!!!!one!!" and then someone goes and rewrites the script. Then... what used to be ~100 lines of shell becomes a giant monster for I'm not sure what benefit.
I'm also seeing this more and more now because LLMs can supposedly convert those scripts to other languages with ease. The reality is that the resulting converted program ends up being the same as the shell script (exec-ing external commands) but with many more lines, instead of using native libraries and primitives to achieve the same outcome.
The benefit is that it doesn't contain five thousand bugs and gotchas. The bugs and gotchas it does contain are the sort you solve every day in the normal course of work and so have well-honed reflexes for.
Bashism of the day: =~ is the [[ operator for regex. [[ abc =~ a.c ]] is true. [[ abc =~ 'a.c' ]] is false. Why? Because quotes means a literal string, not regex, even though it's the regex operator. If you have spaces, escape them with \. If that sounds dumb and you want a normal regex, assign it to a variable beforehand reg='a. c' and use the variable unquoted, [[ 'ab c' =~ $reg ]]. And if your shellcheck brain sees that three months later and updates it to =~ "$reg" you've broken it again.
Now, what's the incantation for checking whether an array of strings contains each of the strings in another array, where those strings can contain spaces?
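(For contrast, once you're out of bash that check is a one-liner; a throwaway Python sketch with made-up data:)

# does haystack contain every string in needles, spaces and all?
needles = ["foo bar", "baz"]
haystack = ["foo bar", "baz", "qux quux"]
print(all(n in haystack for n in needles))  # True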
There may be a lot more than 100 lines in the final product, but those lines are more readable, more maintainable, and more tolerant of edge cases.
Yes, this is a design flaw in bash. And it's even mentioned in the bash manual, as I said here:
Oils 0.22.0 - Docs, Pretty Printing, Nix, and Zsh
https://www.oilshell.org/blog/2024/06/release-0.22.0.html#driven-by-nix
Probably the most useful part of the bash manual is the acknowledgement that there's a lexing design bug with [[ and regular expressions
https://news.ycombinator.com/item?id=38414011
It is sometimes difficult to specify a regular expression properly without using quotes, or to keep track of the quoting used by regular expressions while paying attention to shell quoting and the shell’s quote removal. Storing the regular expression in a shell variable is often a useful way to avoid problems with quoting characters that are special to the shell.
I can't give an objective answer except: I've been using shell scripts since I've been using Linux and they've always been "fine" but never great.
Testing is bad. Arrays are bad. Everything that is not pure text is bad.
Absolute line length does not matter that much to me; 20 was an arbitrary pick. The backup shell script I mentioned was 60 lines for MySQL OR Postgres (I never combined them) and the resulting Python script has 240 lines for both, and I like it so much more (I could have golfed it, but I think I also added more features).
What is it that makes you unhappy about "anything longer than 20 lines"?
My top three complaints about shell scripting (bash, etc):
The first two issues were empirically found to be the most common ShellCheck errors:
Bash in the Wild: Language Usage, Code Smells, and Bugs
https://cs.uwaterloo.ca/%7Ecnsun/public/publication/tosem22/tosem22.pdf
And I wrote a blog post mentioning that YSH addresses them specifically:
YSH Addresses Common Errors in 1.3 Million Shell Scripts
https://oils.pub/blog/2025/12/links.html#ysh-addresses-common-errors-in-13-million-shell-scripts
YSH also has a more familiar syntax for control flow, and is statically parsed, e.g.
if test --dir /tmp {
echo yes
}
versus shell
if test -d /tmp; then
echo yes
fi
Many people have noted that they have problems remembering the ; then syntax
As expected, ~andyc is in the thread providing lots of great information, but I think:
Are two awesome resources in the Oils wiki, listing both shells without the usual footguns and libraries that make scripting in usual programming languages less burdensome.
...
Right now my strategy is to port any shell script into Python as soon as set -e does not suffice for error handling, when I need to use anything which I'm not 100% sure is POSIX, or when I reach for anything like awk or jq.
The only thing that bothers me is that every script then starts with a subprocess.run wrapper that sets check=True and then grows organically to support what's needed in every script.
(This is because I never add any dependencies. Except for YAML and other similar stuff which is not JSON, really anything you can do with a shell you can do with the Python standard library, so it simplifies things to not use dependencies, even if it's a bit of a pain sometimes. And most stuff that does structured input/output supports JSON so it's fine.)
But until something better is more widespread on the environments I have to use, Python it is.
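For the curious, the wrapper in question is usually some variation of this (a sketch - the exact defaults are whatever that particular script ended up needing):

import subprocess

def run(*args, **kwargs):
    # check=True: a failing command raises instead of being silently ignored
    # text=True: stdout/stderr come back as str instead of bytes
    kwargs.setdefault("check", True)
    kwargs.setdefault("text", True)
    return subprocess.run(list(args), capture_output=True, **kwargs)

branch = run("git", "rev-parse", "--abbrev-ref", "HEAD").stdout.strip()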
I'm glad they are useful! Those pages have seen many contributions, which has made them better, so anyone should feel free to add more projects
( It sounds like we need to make Oils available in more environments :-) )
I think Python was the last thing that became widespread in systems. And likely because a lot of software is written in Python- including a considerable amount of distribution/OS plumbing (e.g. package managers).
So I think the way to make something be installed in more places by default is just... make it used a lot by other widespread software.
The other way is what is already mentioned in "alternative shells": languages that compile to what's already popular. However, I'm not entirely sure that's the best idea.
I've tentatively moved to Elvish for shell scripting (while sticking with fish as my interactive shell).
Love it so far!
Tcl actually makes a great shell scripting language, with sensible syntax (no implicit wordsplitting) and a really convenient exec function. The only reason I haven't switched to it fully is because I like using immutable OSs, and Tcl is inconvenient to install without distro packages, and those are often stuck at 8.6.
Interesting, I've only used Tcl once (for Advent of Code) and I was not impressed (for whatever reason).
Footnote: Perl
In Perl you can execute a shell script using backticks or qx:
`git log`
qx {
git log
}
E.g. here's how you can iterate over each output line:
for my $line (`git log --oneline`) {
print $line;
}
In Perl you can ...
Yes.
It's not the most readable language, but whatever it is you want to be able to do, it has at least four ways to do it.
You're absolutely right ;-) The remaining two ways I didn't mention are: system() and IPC::Open3.
IPC::Open3 is only interesting if you want to access stdin, stdout, and stderr via separate file handles.
LLMs have solved the Perl readability problem. They also solve Perl's insane syntax error problem and the ref problem.
That's like saying decompilers have solved the assembly problem.
Most of us want to (fluently, quickly) read the code that's in the file, especially in the context of (shell) scripts - not use whatever tool to make it readable.
Ah yes, I completely forgot that this also works in PHP, but I'm not enthusiastic about that either. Thanks for the pointer though.
risor looks interesting but I don't understand why you would use rsx instead of Go unless you were already heavily invested in risor and had a lot of scripts already.
I mean sure, in the end it's a solution to a past problem, but the combo looks too close to Go to justify introducing this special new tooling over just sticking with Go.
A very fair point.
I'm not heavily invested in risor, but I guess you could say I am heavily not invested in Go. I'm not much of a gopher. I'm not much of a programmer for that matter. I don't write complex software, I write a little plumbing code. My eyes glaze over when people start talking about type safety and generics. I don't care about the pros/cons of composition vs inheritance. Thinking about concurrency hurts my brain. My needs from a language are primitive.
But I do appreciate the value of readily producing static binaries for multiple platforms.
Good point. I don't love Go but in the end I am pragmatic. I've used it in the past if it made sense, e.g. to write standalone monitoring checks as static binaries. Copy over and it works, no complicated deployment.
I've been using Bun Shell more and more for this. Takes a bit to get the muscle memory for always prefixing await, but it replaces more shell features than most (like set -e vs +e), has sanitized string substitution for the exact "$var" experience, and supports redirection syntax to those substitutions. This example from the docs originally convinced me to try it, redirecting a command's output to a variable:
import { $ } from "bun";

const buffer = Buffer.alloc(100);
await $`echo "Hello World!" > ${buffer}`;
console.log(buffer.toString()); // Hello World!\n
(Un)fortunately anything JS or TS or touching npm is right out for me without further discussion, but TIL, thanks!
we have a checklist of tasks I felt could be automated, or at least be made easier, as I was doing them anyway
Do-nothing scripts are quite lovely: just document the task as comments and slowly automate it! Although Janet offers nice features, you need extra boilerplate besides just $ and parentheses for some basic things, e.g. preparing a file object for >, so a plain shell script can end up nicer. The sh library's API needs a bit of work.
Now, this could have been a shell script or even used sh, but the ["sudo" "systemctl" ...] style that you sought to avoid does just as well - because the actual work is done via more interesting Janet features, justifying it!
Good link, I had read that and it absolutely seems to fit here.
I guess my example didn't do it justice, but I think this does not really apply here. Yes, it's kind of a checklist, but everyone seems to do the steps slightly differently (which is fine). The tasks that look like "we could script that" are often such that they involve a bit more thinking and checking other not-so-easily-automatable resources, and it was really just a side remark for context :)
boiler plate ... preparing a file object for >
I forked and committed the (surprisingly easy) change: https://codeberg.org/veqq/janet-sh/pulls
the benefit for me is that I don't have to write anything like:
["git", "log", "--oneline", ...
I have some sympathy with much of the article, but this part is silly. Python has list concatenation and split on strings. So, you can just write "git log --oneline".split() + someOtherSubList + ... or if that's too verbose just def s(x): return x.split() and use s("git log --oneline") and manipulate the lists of strings. In Nim, that could even be s"git log --oneline". It doesn't have to be that hard to work with lists of strings when some are easily tokenized and others need weird quoting.
You probably don't want to use Python's string.split() method for Unix command lines. If you insist on going down this road, you would be better served with shlex's split: https://docs.python.org/3/library/shlex.html
All that said, you can make subprocess.run() do the splitting for you: it takes a string as an argument when combined with shell=True, and it will make sh/bash/etc run the command for you, so you get PATH lookup and everything. Just be sure to avoid shell injection problems when doing so: https://docs.python.org/3/library/subprocess.html#security-considerations
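A quick illustration of the difference (made-up command string):

import shlex

cmd = 'grep -r "hello world" /tmp/logs'
print(cmd.split())
# ['grep', '-r', '"hello', 'world"', '/tmp/logs']  -- the quotes end up inside the arguments
print(shlex.split(cmd))
# ['grep', '-r', 'hello world', '/tmp/logs']       -- quoted argument stays in one piece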
My point was about the syntax of argv/string list construction & manipulations being easy to make nicer, not the semantics of what kind of splitting or what kind of running. (I don't disagree that is also a valid topic - it just wasn't the one I was talking about.)
Gotcha. I agree that you can make python look prettier for doing shell stuff all while staying within the stdlib. If you stray outside of the stdlib, via say uv/x or something, then you can use any of the multitude of shell libraries available.
I normally have a little def run(cmd): that either executes and returns the output or raises.
Similarly, I have a little pipeline([cmd1, cmd2], dryRun) construction built out of subprocess stuff where cmd[12] are lists of strings. It would be easy to abbreviate that pl [cmd1, cmd2] in Nim, and while yeah it's a little more verbose than cmd1 | cmd2, it's only by a little and it bypasses a lot of problems.
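Something along these lines, for anyone wondering what that looks like (a sketch of the idea, not the actual code):

import subprocess

def pipeline(cmds, dry_run=False):
    # cmds is a list of argv lists; chain them stdout -> stdin like cmd1 | cmd2
    if dry_run:
        print(" | ".join(" ".join(c) for c in cmds))
        return ""
    procs = []
    prev_stdout = None
    for cmd in cmds:
        p = subprocess.Popen(cmd, stdin=prev_stdout, stdout=subprocess.PIPE, text=True)
        if prev_stdout is not None:
            prev_stdout.close()  # let the upstream command see SIGPIPE if downstream exits early
        prev_stdout = p.stdout
        procs.append(p)
    out = procs[-1].communicate()[0]
    for p in procs:
        p.wait()
    return out

print(pipeline([["git", "log", "--oneline"], ["head", "-5"]]))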
cid = subprocess.run(av, stdout=subprocess.PIPE).stdout.decode('utf-8').strip()
Every time I see Python like this (urgh) I puzzle over how Python displaced Perl.
subprocess.run takes a text argument that specifies that the standard streams are to be treated as text rather than binary, decoded with the default encoding (typically UTF-8). This covers common cases, avoiding the decode call.
You could argue that even with text=True this is annoying, but I think there must be some distinction between text and binary streams. And I certainly always use a subprocess.run wrapper to make things terser, because most of the time in a given script I just need a specific set of behaviors. The wrapper is just a few lines and IMHO with a wrapper, Python scripts are pleasant to read and write.
You can also just use subprocess.check_output if you don't need control over the process and pipes.
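For reference, the quoted line rewritten both ways (av here is just a stand-in, since the original argv isn't shown):

import subprocess

av = ["git", "rev-parse", "HEAD"]  # stand-in argv; the original isn't shown

# text=True gives you str back, so no .decode('utf-8') needed
cid = subprocess.run(av, stdout=subprocess.PIPE, text=True).stdout.strip()

# or, if all you want is the output plus an exception on failure:
cid = subprocess.check_output(av, text=True).strip()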