V6 Shell
6 points by Melkor333
The C source for this Thompson v6 shell is quite archaic, using =+ instead of += and doing a very funky syntax tree representation. Understandable since C itself was evolving (being evolved by the same person, actually), but has anyone ported it to something more modern like ANSI C 89?
Tsh by the same author would appear to be that
I agree, though that link you have (same guy/website as TFA, actually) only links to enhanced tsh / etsh or “osh” sources that I can see. The very oldest release from 2003, osh-030730.tar.gz, is starting to look closer to the structure of the v6 source version, but still has a lot more distance than I was thinking a “port” would have. Anyway, THANK YOU!
xv6 has a shell in modern C:
xv6 is a re-implementation of Dennis Ritchie’s and Ken Thompson’s Unix Version 6 (v6). xv6 loosely follows the structure and style of v6, but is implemented for a modern x86-based multiprocessor using ANSI C.
https://github.com/mit-pdos/xv6-public/blob/master/sh.c#L100
I like to look at how they implement pipelines, which is:
    case PIPE:
      pcmd = (struct pipecmd*)cmd;
      if(pipe(p) < 0)
        panic("pipe");
      if(fork1() == 0){
        close(1);
        dup(p[1]);
        close(p[0]);
        close(p[1]);
        runcmd(pcmd->left);
      }
      if(fork1() == 0){
        close(0);
        dup(p[0]);
        close(p[0]);
        close(p[1]);
        runcmd(pcmd->right);
      }
and then let’s look at Thompson shell - https://v6sh.org/src/sh.c
    case TFIL:
      f = t[DFLG];
      pipe(pv);
      t1 = t[DLEF];
      t1[DFLG] =| FPOU | (f&(FPIN|FINT|FPRS));
      execute(t1, pf1, pv);
      t1 = t[DRIT];
      t1[DFLG] =| FPIN | (f&(FPOU|FINT|FAND|FPRS));
      execute(t1, pv, pf2);
      return;
So they both do 2 recursive invocations – to runcmd() or execute(), on the left and right node of the AST.
Oh weird, but actually now I notice a difference … xv6 is less efficient?
xv6 makes a pipe() call and 2 fork() calls for every PIPE node.
So basically in xv6:
cat | cat | cat | cat
is executed like:
cat | (cat | (cat | cat))
That is, you would expect there to be 4 cat processes to run, but there are at least 2 more processes.
But not Thompson shell? I’m curious if I’m reading that right …
BTW https://oils.pub/ stores all the pipeline components in a list, so it doesn’t have this left/right recursion. I’m pretty sure most shells, like bash and dash, are like that – you can have a variable number of children in an AST node, not just 2
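To make the "list instead of left/right recursion" idea concrete, here is a rough sketch in C. It is not taken from oils, bash, or dash; run_pipeline and its argument layout are names I made up for illustration. The point is just that N pipeline stages cost exactly N forks and N-1 pipes, with no intermediate shell processes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: run cmds[0] | cmds[1] | ... | cmds[n-1].
 * Each cmds[i] is a NULL-terminated argv array.
 * Forks exactly n children (one per stage) and creates n-1 pipes.
 * Returns the exit status of the last stage, or -1 on error.
 * Sketch only: assumes fds 0 and 1 are open and the caller has no
 * other children outstanding. */
int run_pipeline(char *const *cmds[], int n)
{
    int prev_read = -1;   /* read end of the pipe feeding this stage */
    pid_t last = -1;      /* pid of the final stage */
    int status = 0, result = -1;

    for (int i = 0; i < n; i++) {
        int p[2] = {-1, -1};
        if (i < n - 1 && pipe(p) < 0)   /* no pipe after the last stage */
            return -1;

        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: wire stdin/stdout to the neighbouring pipes. */
            if (prev_read >= 0) { dup2(prev_read, 0); close(prev_read); }
            if (p[1] >= 0)      { dup2(p[1], 1); close(p[0]); close(p[1]); }
            execvp(cmds[i][0], cmds[i]);
            _exit(127);   /* exec failed */
        }

        /* Parent: drop the ends it no longer needs, keep the next input. */
        if (prev_read >= 0) close(prev_read);
        if (p[1] >= 0) close(p[1]);
        prev_read = p[0];
        last = pid;
    }

    /* Reap all n children; remember the status of the last stage. */
    for (int i = 0; i < n; i++) {
        pid_t w = wait(&status);
        if (w == last && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }
    return result;
}
```

With four cat stages this forks four processes total, versus the extra ./a.out processes the recursive xv6 approach leaves sitting between them.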
The xv6 shell also has a different syntax - no backslash escaping or any quoting and no $vars and probably other differences. It is also very different in structure and function, as you observe. It is more a re-implementation of an “even more simple” shell-like thing than even a “loose following” of the v6 shell. I’m unsure how many similar statements might apply to other aspects of the xv6 project.
EDIT: btw, yes you are reading it correctly. I patched & compiled the xv6 sh.c to just run on Linux, ran it as ./a.out, and copy-pasted cat|cat|cat|cat into it. With the programs waiting on the terminal, the ps (well, really my pd) output looks like this:
750 12M 4.6M 1.8M 196m 35j p1 -zsh
7996 1.4M 4058 10 205j 0j p1 ./a.out
7997 668K 0 0 162j 0j p1 ./a.out
7998 1.3M 4036 0 162j 0j p1 cat
7999 406K 0 0 162j 0j p1 ./a.out
8000 1.3M 4036 0 162j 0j p1 cat
8001 406K 0 0 162j 0j p1 ./a.out
8002 1.3M 4036 0 162j 0j p1 cat
8003 1.3M 4036 0 162j 0j p1 cat
Oh cool, thanks for checking! I remember noticing that quirk with xv6, and yeah it’s good to know it wasn’t present in the original Thompson shell
I think we’re splitting hairs now but AFAIU vars and backslash only came with the PWB shell?
If you read the OG v6 shell source in TFA, you can see these things handled (at least $0 $1 and $$). { Might well all be just academic/historical/hair splitting, but that’s sort of the whole topic, too. :-) }
Can anyone explain https://v6sh.org/src/exit.c? What does seek do here?
The v6 shell shares the file descriptor of the script (0, aka stdin) with if, goto, exit, which implement control flow by seeking to the appropriate position in the script. There’s no buffering, so seeking the file position in a child process directly affects the parent shell process. The exit program seeks to EOF, so that when the parent shell next tries to read, it sees it is at the end of the script so it stops. 0, 0, 2 == STDIN_FILENO, 0, SEEK_END (the middle 0 is the offset from the end)
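If the shared-offset part sounds surprising, here is a small sketch in modern C that shows the effect. It is not the v6 code; read_after_child_seek and the scratch file name are made up for the demo. A child forked from the parent inherits the same open file description, so its lseek() to SEEK_END moves the parent's read position too.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical demo: returns how many bytes the parent can still read
 * after a child seeks the shared descriptor to EOF (-1 on setup error).
 * The parent plays the shell, the child plays the exit program. */
ssize_t read_after_child_seek(void)
{
    char path[] = "/tmp/v6exit-XXXXXX";   /* scratch "script" file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* file vanishes once closed */

    const char script[] = "echo one\nexit\necho two\n";
    if (write(fd, script, sizeof script - 1) < 0)
        return -1;
    lseek(fd, 0, SEEK_SET);                /* rewind to the start */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: same idea as seek(0, 0, 2) in exit.c, but on fd. */
        lseek(fd, 0, SEEK_END);
        _exit(0);
    }
    waitpid(pid, NULL, 0);

    /* Parent: the offset is shared, so this read sees end of file. */
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);
    return n;
}
```

The parent never seeks after the rewind, yet its read() returns 0, which is exactly how the v6 shell finds itself at the end of the script after exit runs. This only works because the shell reads unbuffered; a shell that had already slurped the rest of the script into a buffer would carry on regardless.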
The “memory management” link is not to be missed! A great collection of usenet messages from well-known unix people discussing how the 7th Edition Bourne shell allocated memory by calling sbrk() in its SIGSEGV handler, after it had already started using the yet-to-be-allocated space. This caused some severe headaches when unix was ported to architectures which did not make it easy to restart instructions that trapped.