ash: a hybrid between getopts and 'sh -c'
10 points by knl
that’s a cool approach
slight downside: name collision with busybox’s shell
Busybox is one of many systems that use ash. It’s the Almquist shell, written for 4.3BSD-Net/2 as part of the effort to get rid of AT&T code. The Debian Almquist shell, dash, is a fork of it.
Thank you for sharing this. Congratulations on releasing a new tool!
I am generally skeptical of how argument parsing is done pretty much across all languages, but I admit that the simplistic style of argument parsing is genuinely useful in many circumstances. This is the general approach taken by getopt (C), zparseopts (Zsh), argparse (Python), and flag (Go), which may appear to have different APIs but, in my view, are largely interchangeable. Personally, I have totally abandoned zparseopts in my Shell scripting, but I still use argparse regularly. (I think flag is pretty terrible, though, and is the cause of design flaws like Docker -v.)
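For concreteness, here is the declarative style I mean, sketched with Python’s argparse (the flag names are invented for illustration):

```python
import argparse

# Declarative, "simplistic" parsing: declare the flags up front,
# get back a structured namespace. (Flag names are illustrative.)
parser = argparse.ArgumentParser(prog='demo', add_help=False)
parser.add_argument('--flag', action='store_true', help='some modality')
parser.add_argument('targets', nargs='*', default=['default'])

ns = parser.parse_args(['--flag', 'a', 'b'])
print(ns.flag, ns.targets)   # True ['a', 'b']
```

getopt, zparseopts, and flag all reduce to the same shape: a flat table of flags plus a bag of positional arguments.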
It occurs to me that the CLI interface is often the primary way in which a user interacts with a tool. I don’t think I’ve done more than the briefest skim of the socat source code—the only exposure I have to the tool is through using it via its CLI interface and reading its documentation. It is only the extraördinary thoroughness of possible endpoints (“address types”) that brings me back to this tool on a consistent basis; its CLI interface departs from expectations and, in doing so, introduces ambiguities that are frustrating to resolve. (e.g., socat - exec:'ssh remote socat - exec:zsh,pty,rawer' fails on launch because the embedded : is ambiguous, and even if you fix this, you still have to contend with the ,pty,rawer being ambiguous. A standard -- interface would not have such ambiguities.) Indeed, there are many tools that have poor CLI interface design, yet are the only available, comparable way to solve a problem, so we have no real choice. (Which CLI interface is more pleasant to use: gstreamer or ffmpeg?) It is also true that in many cases we reach for the first command-line tool we can find to solve a problem, and we struggle through whatever interface it provides, hoping to never revisit that problem or that tool again. In my experience, it is not that common that a CLI interface might be so unergonomic that I abandon a tool completely; however, it is definitely the case that a CLI interface may have limitations that lead me to try solving (a smaller version of) the problem myself, and it is definitely the case that an unergonomic or unpleasant CLI interface pushes me away from frequent, non-^r use of a tool.
In other words, I think it’s quite important to think through the design of your CLI interface to make it pleasant and useful for end-users. This is something that is worth paying attention to, and you may readily find that doing a good job requires departing from getopt/zparseopts/argparse/flag-style libraries! (Similarly, I think the -h/--help documentation they automatically generate is a place where we can dramatically improve user experience at low effort; however, most of these libraries let you get pretty close with a “big block of extra text to print to the screen” escape hatch. This, coupled with good error messages, usually works well enough.)
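In argparse, for instance, that escape hatch is the epilog parameter (a sketch; the example text is invented):

```python
import argparse

# The "big block of extra text" escape hatch: free-form text appended
# after the auto-generated options list, with newlines preserved.
parser = argparse.ArgumentParser(
    prog='tool',
    epilog='examples:\n  tool --flag target\n  tool -- --weird-name',
    formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument('--flag', action='store_true', help='some modality')

print(parser.format_help())
```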
It occurs to me that argument parsing has two main goals (though most tools only achieve the former):

1. transforming argv into programme-specific data structures (and identifying errors)
2. generating usage documentation for the user

While the former may look like a parsing problem, and thus we might think that a more generalised version of the getopt/zparseopts/argparse/flag style might involve writing a parser of some form, I believe that’s still a simplification. I think a lot of common CLI interfaces actually form context-sensitive grammars, so typical parsing approaches are insufficient. From this perspective, you can see why I have abandoned zparseopts in lieu of a classical while (( # )); do case "${1}" in *) ... ;; esac; shift; done loop (as you can see in other recent posts). The latter is the tersest way of encoding the necessary state machine. This is also why I group together almost all argument parser libraries in one category: despite their superficial differences, none of them allow you to directly encode the state machine that performs this transformation.
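For anyone who does not read Zsh, a sketch of that loop transliterated into Python (flag names invented):

```python
# Direct transliteration of the classical
#   while (( # )); do case "${1}" in ...; esac; shift; done
# loop: an explicit state machine over argv, free to be as
# context-sensitive as the interface demands.
def parse(argv):
    flag, targets, errors = False, [], []
    args = list(argv)
    while args:                      # while (( # ))
        head = args.pop(0)           # "${1}" ... shift
        if head == '--flag':
            flag = True
        elif head.startswith('-'):
            errors.append(f'bad flag: {head}')
        else:
            targets.append(head)
    return flag, targets, errors

print(parse(['--flag', 'a', 'b']))   # (True, ['a', 'b'], [])
```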
Setting that aside, reviewing this specific tool, I have the following feedback:

- running ./ash with no arguments deserves a better message than panic: no command: []
- usage output on a parse error should go to stderr, but output from -h/--help should go to stdout; in the latter case, this is the output the user requested, and it may be the case they want to easily grep through the output to find, e.g., the proper way to spell a flag
- ash’s command-line interface is poor, even within the limitations of flag (which is… just not very good by itself); e.g., how do we provide help-text but no default value for a mandatory argument (other than encoding this into the template)? how do we provide short vs long aliases? how do you specify boolean flags, repeated flags, &c.? (template is just also not very good…?)
- the command string needs proper shell-style tokenisation (e.g., shlex.Split)
- passing ${@} through appears to be via strings.Join, which you expose to the template as join; first, this is potentially ambiguous; second, this does not properly perform quoting; e.g., the output of /ash 'printf "arg=%s\n" {{ join .args " " }}' a 'b c' is wrong
- when the dispatched command fails, you report an ash error; I think you want to fail with an ash error on bad parsing, but dispatch to the command in all other cases; in other words, I think you don’t want to use os/exec’s Run but rather just a fork-less exec (which I think is syscall.Exec rather than anything in os/exec); once you’ve parsed arguments, your job is done
- /ash '--flag' 'command {{ .Flag }}' becomes /usr/bin/sh -c command <no value> (!!)
- it is ambiguous which flags belong to ash (e.g., is a flag for ash or is it for the command ash calls?)

† Basically, the value proposition of this tool is that rather than writing…
some-command/wrapper() {
  local -ar usage=(
    'some-command [--flag] [-h|--help]'
    'does some command'
    '--flag some modality'
    '--help print this message and exit'
  )
  local -a targets=()
  local -i flag=0
  while (( # )); do
    case "${1}" in
      --flag) flag=1 ;;
      -h|--help) for ln ( "${(@)usage}" ) <<< "${ln}" ; exit 0 ;;
      --) break ;;
      *) for ln ( "${(@)usage}" ) <<< "${ln}" >&2 ; exit 1 ;;
    esac
    shift
  done
  (( # )) && shift
  (( # )) && targets+=( "${@}" ) || targets=( default )
  if (( flag )); then
    exec some-command --mode "${(@)targets}"
  else
    exec some-command --no-mode "${(@)targets}"
  fi
}
… you write (not strictly equivalent to the above)…
# XXX: this doesn't actually work, because I believe
# the flags are passed as `*string` but `eq`
# requires a `string`…
# XXX: also, `| join` interpolation here won't do
# correct quoting
some-command/wrapper() {
  local -ar flags=( '--flag true "sets mode or no mode"' )
  local -r template="$(cat <<'EOF'
{{ if eq .flag "true" -}}
some-command --mode {{ if gt (.args | len) 0 }}{{ join .args " " }}{{ else }}default{{ end }}
{{- else -}}
some-command --no-mode {{ if gt (.args | len) 0 }}{{ join .args " " }}{{ else }}default{{ end }}
{{- end }}
EOF
)"
  exec ash "${(@)flags}" "${template}" "${@}"
}
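The quoting problem called out in the XXX comment is easy to see with Python’s shlex: a plain space-join loses word boundaries, while shell-style quoting round-trips them.

```python
import shlex

args = ['a', 'b c']                  # two arguments, one containing a space

naive = ' '.join(args)               # what a bare `join .args " "` produces
proper = shlex.join(args)            # quotes each word for the shell

print(shlex.split(naive))            # ['a', 'b', 'c'] -- three words now!
print(shlex.split(proper))           # ['a', 'b c']    -- round-trips intact
```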
Obviously, the former encoding (using just a Shell function) is much more generalisable. It is similarly obvious that there are formulations where the latter encoding (using ash) will be significantly shorter at no loss of required generality. However, just as the ash encoding needs to have certain key details fixed, so can the Shell function encoding have certain inconveniences addressed. We may be able to close the gap between the amount of boilerplate necessary in these cases, while still benefitting from the significantly greater generality of the Shell approach. Additionally, when we consider the evolution of this encoding over time, we could argue that the Shell function encoding is less discontinuous: as the complexity of our requirements for argument handling increases, the complexity of the code increases minimally. In the case of ash, as the complexity of our argument handling increases, the complexity of the template may increase substantially, up to a point where we are no longer able to encode what we want, and then we have to start over from scratch and write the Shell function. From this perspective, you can imagine how the moment we step even slightly beyond the simplest example where the terseness of ash clearly wins, we may decide to just ditch the tool for the standard approach. Thus, you can see the value proposition is quite muddled and somewhat weak.
Again, congratulations on releasing a new tool!
I haven’t written Golang in some time, but I remember I didn’t dislike it the last time I wrote it.
I liked how many third-party libraries there were for useful tasks (libraries that don’t readily exist in other languages, like userspace Wireguard or SSH servers,) the performance seemed adequate, and it was nice being able to produce a single binary for distribution.
However, in putting together the example above, I (again?) got the impression that parts of the Go standard library (like flag and template) just seem very sloppily designed. Why is that? Is it just me?
For anyone who is curious what while (( # )); do case "${1}" in *) ... ;; esac; shift; done-style parsing looks like, here is a consolidated example that shows:

- repeated, counted flags (e.g., -v)
- short and long spellings of the same flag (e.g., -c, --see, -d)
- combined short flags (-a -b or -ab; e.g., tar -xavf ...)
- flags taking multiple parameters (-d abc 123; e.g., bwrap --dev-bind / / …)
- flags in any position ($0 -a target or $0 target -a; e.g., rm / -rf)
- = parameters (e.g., head --lines=10)

It should be obvious that -- separation (common for disambiguation; e.g., rm -- --file-with-dashes) is just a --) break ;; to exit the while (( # )) loop, followed by a (( # )) && shift && args+=( "${@}" ).
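In the same transliterated style, the -- branch looks like this (a Python sketch; names invented):

```python
# `--` separation: one extra branch that drains the remaining
# words into the positional arguments and stops flag parsing.
def parse(argv):
    flags, positional = [], []
    args = list(argv)
    while args:
        head = args.pop(0)
        if head == '--':                 # --) break ;;
            positional += args           # args+=( "${@}" )
            break
        (flags if head.startswith('-') else positional).append(head)
    return flags, positional

print(parse(['-rf', '--', '--file-with-dashes']))
# (['-rf'], ['--file-with-dashes'])
```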
It should be obvious that defaults can be handled either in the parsing code or in the code that calls the parsing code.
It should also be obvious that if the command were broken into subcommands (e.g., git pull,) this would be implemented by composing multiple functions, e.g., case "${subcmd}" in ...) /:/parse/args/subcmd/. "${@}" ;; esac
It’s been a while since I last used zparseopts, but I am certain that some of the above features are simply not supported (e.g., multiple parameters.) zparseopts (unlike Python argparse) does not give you usage text for free. Since zparseopts does not give you defaults, and since it would be possible to handle defaults within this parsing code (though there are reasons why you may want to handle it in the calling code,) and since zparseopts does not give you contextual help messages (i.e., not just what went wrong but why,) we might try to be fair and say that for every additional flag, the zparseopts approach requires 1+½+½=2 lines of code (one for the argument to zparseopts, ½ for the error handling, and ½ for the default,) and the below approach requires 2½ lines of code (one for the short-form, one for the long-form, and ½ for the default.) The below code has about 10 lines of boilerplate, whereas zparseopts has about two. That means the effort for zparseopts is total lines = 2×arguments + 2 and the effort for the below approach is total lines = 2½×arguments + 10.

But the below is also relatively obvious imperative code, so we must account for the difficulty of understanding a zparseopts invocation such as zparseopts -D -E -K -M -- f+:=_f -flag:=_flag … v::=verbose -verbose=verbose h=_help -help=_help. Thus, I argue that the below option is only marginally more boilerplate, minimally more conceptual effort, and all at no loss of generality.
#!/bin/zsh
/:/parse/args/.() {
  local _1 _iarg _aarg
  local -i a=0 b=0 f=0 g=0 verbose=0
  local -a c=() d=() args=()
  local -i help=0 error=0
  local -a message=()
  while (( # )); do
    case "${1}" in
      --*) case "${1}" in
             --verbose) verbose+=1 ;;
             --see) c+=( "${2}" )
                    (( # < 2 )) && error+=1 message+=( "Missing arg: ${1}" ) || shift ;;
             --see=*) _1="${1#--see=}"
                      c+=( "${_1}" )
                      [[ -z "${_1}" ]] && error+=1 message+=( 'Missing arg' ) ;;
             --dee) d+=( "${2}" "${3}" )
                    (( # < 3 )) && error+=1 message+=( "Missing arg: ${1}" ) || shift 2 ;;
             --help) help+=1 ;;
             --*) error+=1 message+=( "Bad flag: ${1}" ) ;;
           esac ;;
      -*) for _1 ( "${(A@s..)1#-}" )
            case "${_1}" in
              a) a+=1 ;;
              b) b+=1 ;;
              c) c+=( "${2}" )
                 (( # < 2 )) && error+=1 message+=( "Missing arg: ${1}" ) || shift ;;
              f) f+=1 ;;
              g) g+=1 ;;
              h) help+=1 ;;
              v) verbose+=1 ;;
              *) error+=1 message+=( "Bad flag: ${1}" ) ;;
            esac ;;
      *) args+=( "${1}" ) && <<< "x=${(q)1}" >&2 ;;
    esac
    shift
  done
  for _iarg ( a b f g help error verbose )
    <<< "${_iarg}" <<< "${(q)"${(P)_iarg}"}"
  for _aarg ( c d args message )
    <<< "${_aarg}" <<< "${(q)"$(for x ( "${(@P)_aarg}" ) <<< "${(q)x}")"}"
}
##############################################################################
/:/util/center/. () { <<< "${(pl:40::\u2500:::r:40::\u2500:::)1}" }
/:/util/heading/.() { for ln ( "${@}" ) /:/util/center/. "${ln}" }
/:/util/scalar/.() {
  local -r name="${1}" ; shift
  local -r value="${1}" ; shift
  <<< "local ${name}=${value}"
}
/:/util/array/.() {
  local _x
  local -r name="${1}" ; shift
  local -ar values=( "${@}" )
  if (( ${#values} )); then
    <<< "local -a ${name}=("
    for _x ( "${(@)values}" ) <<< " ${(q-)_x}"
    <<< ") #⇒${#values}"
  else
    <<< "local -a ${name}=()"
  fi
}
/:/util/associative-array/.() {
  local _x _y
  local -r name="${1}" ; shift
  local -Ar values=( "${@}" )
  if (( ${#values} )); then
    <<< "local -A ${name}=("
    for _x _y ( "${(@kv)values}" ) <<< " ${(q-)_x} ${(q-)_y}"
    <<< ") #⇒${#values}"
  else
    <<< "local -A ${name}=()"
  fi
}
##############################################################################
typeset -gar tests=(
  '-vvv --verbose'
  '-c abc --see def --see xyz'
  '--dee def 456'
  '-a -b -f -abf'
  '-h'
  '--help'
  '-c'               # error: missing
  '-d abc'           # error: missing
  '--does-not-exist' # error: bad
  # '-a -b -c abc --see def --see=xyz --dee jkl 123 --dee mno 456 PRQ STU -fg -vvv --verbose'
)
() {
  for args ( "${@}" ) {
    /:/util/heading/. "${args}"
    typeset -A config=( "${(@AfQ)"$(/:/parse/args/. "${(@AzQ)args}")"}" )
    for k ( "${(@ok)config}" )
      case "${k}" in
        c|args|message) /:/util/array/. "${k}" "${(@AfQ)config[${k}]}" ;;
        d) /:/util/associative-array/. "${k}" "${(@AfQ)config[${k}]}" ;;
        *) /:/util/scalar/. "${k}" "${config[${k}]}" ;;
      esac
  }
} "${(@)tests}"
Now, I am not fooling myself that the above is readable to most people who don’t write a lot of Shell scripts. There are definitely some extreme Zsh-isms present, and most will probably consider this code very dense for the typical Shell script. However, I use this as an example to illustrate the idea of eschewing simplistic argument handling libraries which typically rob you of too much generality.
In Python, the above would look only slightly different, given the better choices we have for modelling things in Python. Likely, since our goal is to encode an arbitrary state machine, we would model this as a generator coroutine.
e.g.,
#!/usr/bin/env python3
from collections import namedtuple
from dataclasses import dataclass
from functools import wraps
from typing import Generator

Binding = namedtuple('Binding', 'path value')
Error = namedtuple('Error', 'message')
sentinel = namedtuple('Sentinel', '')()

@dataclass(frozen=True)
class Parser:
    coro: Generator

    def __call__(self, args):
        for arg in args:
            yield from self.coro.send(arg)
        while True:
            try:
                yield from self.coro.send(sentinel)
            except StopIteration:
                break

    @classmethod
    def from_coro(cls, coro):
        # wrap the coroutine so it is constructed and primed on call
        return wraps(coro)(lambda *a, **kw: [cls(ci := coro(*a, **kw)), next(ci)][0])

def take(n):
    # receive the next n argv words from the driving loop
    rv = []
    for _ in range(n):
        rv.append((yield []))
    return rv

def binding_or_error(values, binding, error):
    if any(x is sentinel for x in values):
        return error(*values)
    else:
        return binding(*values)

@Parser.from_coro
def parser():
    class Flags:
        @staticmethod
        def c():
            return binding_or_error(
                (yield from take(1)),
                lambda v: Binding('c', v),
                lambda _: Error(f'missing argument for {flag}'),
            )

        @staticmethod
        def d():
            return binding_or_error(
                (yield from take(2)),
                lambda k, v: Binding('d', (k, v)),
                lambda *_: Error(f'missing argument for {flag}'),
            )

    rv = []
    while True:
        if (flag := (yield rv)) is sentinel:
            break
        rv.clear()
        if flag.startswith('--'):
            match flag.removeprefix('--'):
                case 'verbose': rv.append(Binding('verbose', True))
                case 'help': rv.append(Binding('help', True))
                case 'see': rv.append((yield from Flags.c()))
                case 'dee': rv.append((yield from Flags.d()))
                case _ if flag.startswith('--see='):
                    rv.append(Binding('c', flag.removeprefix('--see=')))
                case _: rv.append(Error(f"unknown flag {flag}"))
        elif flag.startswith('-'):
            for f in flag.removeprefix('-'):
                match f:
                    case 'a': rv.append(Binding('a', True))
                    case 'b': rv.append(Binding('b', True))
                    case 'f': rv.append(Binding('f', True))
                    case 'g': rv.append(Binding('g', True))
                    case 'v': rv.append(Binding('verbose', True))
                    case 'h': rv.append(Binding('help', True))
                    case 'c': rv.append((yield from Flags.c()))
                    case 'd': rv.append((yield from Flags.d()))
                    case _: rv.append(Error(f"unknown flag {flag}"))

if __name__ == '__main__':
    tests = [
        ['-vvv', '--verbose'],
        ['-c', 'abc', '--see', 'def', '--see', 'xyz'],
        ['--dee', 'def', '456'],
        ['-a', '-b', '-f', '-abf'],
        ['-h'],
        ['--help'],
        ['-c'],                # error: missing
        ['-d'],                # error: missing
        ['-d', 'abc'],         # error: missing
        ['--does-not-exist'],  # error: bad
        # ['-a', '-b', '-c', 'abc', '--see', 'def', '--see=xyz', '--dee', 'jkl', '123', '--dee', 'mno', '456', 'PRQ', 'STU', '-fg', '-vvv', '--verbose'],
    ]
    for t in tests:
        bindings = parser()(t)
        print(*bindings, sep='\n')
I’ve included just enough “helper” code to illustrate how we can “build our way back up” to an interface like argparse. The design of the above is, of course, just a sketch. If we were to try this in earnest, we would want to make sure that as we “build our way back up” to the declarative interface, we do so at no loss of composition and without burying the state machine/event loop again. Additionally, we want to figure out how to nicely bifurcate the parts of the problem which are so regular that we can encode the state machine explicitly (i.e., as distinct, first-class nodes and edges,) from those which are so irregular that we have to encode the state machine as a generator coroutine. In the case of the former, we would use the explicit modelling to automatically generate help text (which is something that argparse does for us very nicely,) and in the case of the latter, we would handle ad hoc modalities (something which argparse simply cannot do to my knowledge.)

From this perspective, we could consider these approaches as lying somewhere on a spectrum measuring the explicitness or implicitness of the modelling of the state machine. argparse and the like are clearly toward the explicit extreme; the above approach is toward the implicit extreme. It is my belief that, following the above path, we can eventually discover the ideal design for an argument parsing library, which I suspect would feature not only modellings at both ends of this spectrum, but mechanisms that nicely compose these formulations.
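One possible shape of that bifurcation, sketched below: a declarative table for the regular parts (from which help text falls out for free), with a loop that the irregular cases could hook into. All names here are invented for illustration, not from any real library.

```python
from dataclasses import dataclass

# Regular flags live in an explicit, first-class table...
@dataclass(frozen=True)
class Flag:
    name: str
    nargs: int
    help: str

SPEC = [
    Flag('--see', 1, 'collect a value'),
    Flag('--verbose', 0, 'increase verbosity'),
]

def help_text(spec):
    # ...so help text is generated, argparse-style, from the table.
    return '\n'.join(f'  {f.name}\t{f.help}' for f in spec)

def parse(spec, argv):
    table = {f.name: f for f in spec}
    out, args = [], list(argv)
    while args:
        head = args.pop(0)
        f = table.get(head)
        if f is None:
            out.append(('error', head))  # irregular cases would hook in here
        else:
            out.append((f.name, args[:f.nargs] or True))
            del args[:f.nargs]
    return out

print(parse(SPEC, ['--verbose', '--see', 'x']))
# [('--verbose', True), ('--see', ['x'])]
```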
Also, for anyone wondering if I am using an LLM to draft these exceptionally long (and exceptionally tedious…) posts, rest assured.
Generator coroutine approaches in Python are so obscure (for whatever reason) that in my experience LLMs simply cannot and will not write this kind of code! If you ask, “Write me a design for an argument parsing library that uses (generator) coroutines,” you’ll probably get an async def in there somewhere. It’ll probably include PEP-484 type hints, lots of unnecessary comments, and do it all with a casual but also nauseatingly fawning attitude. Also, emoji.
This is one reason why LLMs don’t seem particularly useful if you are able to write code better than the most-represented examples in their training set. In the case of Python, this means some extremely powerful techniques (like generator coroutines) are simply unavailable to you.
Hi! Author here – just wanted to say, thanks for the detailed feedback! I’m a bit crunched for time right now, but I’ll go through your comments properly in the next couple of days. If you’re into CLIs, believe it or not, I actually have another Go project aimed at making CLIs easier to write: https://github.com/avamsi/climate. It’s built on top of Cobra, so it should better address some of the pain points you mentioned.
This is really cool. I love it.
This seems good but I don’t totally get it. Could anyone give more examples?
The author uses it this way in setting up one alias for jj:
push = [
'util', 'exec', '--', 'ash', '-r @- revision', '--remote', ''' \
jj bookmark move \
--from='heads(ancestors({{.r}}) & tracked_remote_bookmarks())' \
--to={{.r}} \
&& jj git push --revisions={{.r}} \
{{if len .remote}}--remote={{.remote}}{{end}}
''']