Stack Computers: the new wave
2 points by veqq
Stack-based CPU architectures seem to have been a dead end. I assume a major factor is the increasing expense of accessing RAM vs registers; note how modern ABIs even pass parameters and return values in registers, not on the stack.
I do remember a 1980s CPU (by AMD? Maybe the 29000?) that was entirely stack-based but kept the top of the stack (maybe 64 elements?) in an on-chip cache that was as fast to access as registers. When the stack overflowed or underflowed the cache, it triggered a sort of page fault that would slide the cache window by some quantum, reading from and writing to RAM.
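The spill/fill scheme described above can be sketched roughly like this. This is a hypothetical model, not the actual AMD design: the top of a conceptually unbounded stack lives in a small fixed-size on-chip buffer, and overflowing it "faults", sliding a quantum of entries out to backing memory (underflow slides them back in). The names and sizes here are illustrative.

```python
CACHE_SIZE = 8   # on-chip slots (a real part might have ~64)
QUANTUM = 4      # entries that slide to/from memory per overflow/underflow

class StackCache:
    def __init__(self):
        self.cache = []    # fast on-chip portion; top of stack at the end
        self.memory = []   # slow backing store for spilled entries

    def push(self, value):
        if len(self.cache) == CACHE_SIZE:
            # Overflow "fault": slide the bottom QUANTUM entries out to RAM.
            self.memory.extend(self.cache[:QUANTUM])
            self.cache = self.cache[QUANTUM:]
        self.cache.append(value)

    def pop(self):
        if not self.cache and self.memory:
            # Underflow "fault": slide QUANTUM entries back in from RAM.
            self.cache = self.memory[-QUANTUM:]
            del self.memory[-QUANTUM:]
        return self.cache.pop()
```

The point is that the common case (push/pop within the cached window) never touches RAM; memory traffic only happens in quantum-sized bursts at the fault boundaries.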
Obviously modern CPUs have lots of L1 cache, and the stack is hot enough to stay in L1, but using registers must still be faster, probably due to details of pipelining that are beyond my limited knowledge.
> I assume a major factor is the increasing expense of accessing RAM vs registers
A modern register machine isn't really a register machine; it's a crappy way of communicating SSA between the compiler back end and the CPU's scheduler. Stack-based ISAs were a much worse way of communicating SSA to the scheduler. In the common case, "instruction A writes register X, instruction B reads register Y" is easy to track, and tracking this across basic blocks is just a matter of some additional naming. In contrast, determining that two instructions are independent in a stack architecture is really hard.
That said, I'm somewhat surprised that GPUs aren't stack machines. GPUs typically don't bother with out-of-order execution, they just do so many threads that there's always one with instructions ready to run. I wouldn't be surprised if you could get better code density with a stack architecture for GPUs, especially if you did variable-length vectors with 'operate on the top N stack slots' encodings.
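A minimal sketch of the 'operate on the top N stack slots' idea, with an invented instruction set: one opcode carrying a length N replaces N scalar adds, which is where the code-density win would come from.

```python
def run(program, stack=None):
    """Tiny stack-machine interpreter with a variable-length vector add."""
    stack = stack if stack is not None else []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "vadd":  # pop the top 2*N slots, push N element-wise sums
            n = args[0]
            b = [stack.pop() for _ in range(n)][::-1]
            a = [stack.pop() for _ in range(n)][::-1]
            stack.extend(x + y for x, y in zip(a, b))
    return stack

# Two 3-element vectors summed by a single instruction instead of three:
prog = [("push", 1), ("push", 2), ("push", 3),
        ("push", 10), ("push", 20), ("push", 30),
        ("vadd", 3)]
print(run(prog))  # [11, 22, 33]
```

Since the operands are implicit (the top N slots), the encoding needs only the opcode and the length, with no register specifiers at all.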