How is Stack memory allocated when using 'push' or 'sub' x86 instructions?
21 points by matklad
21 points by matklad
Kinda random StackOverflow question, but I personally haven’t realized that the stack memory for the main thread on Linux is not only lazily faulted in, but also lazily mmapped!
This is where the Stack-Clash vulnerability came from and why it wasn’t exploitable on FreeBSD (where the initial stack is mapped as normal).
This has “interesting” consequences for variable-length arrays on the stack. Traditionally, alloca()
and VLAs simply adjusted the stack pointer, so they could be used as a fast bump allocator. But on modern systems stack allocation has to have a lot of extra machinery to touch pages to make sure the kernel can see what’s going on, and maybe prevent stack clash. So variable sized on-stack allocation might not be as much of a win as it might seem, plus it requires a great deal of care and attention to avoid oversize allocations: the only way VLA allocation can fail is by crashing the process, and there’s no way to find out what the limit is. VLAs are only usable in toy programs.
That feels sort of like saying that recursion is only usable in toy programs.
One difference is that recursion occurs in small increments that don’t require the compiler to insert an unpredictable number of page probes. Another difference is that recursion is typically O(log n) but allocation is O(n).
But it’s true that if you use a recursive algorithm in a program that works on untrusted data, you need to take care. In my experience it’s possible to justify reasonable recursion bounds when necessary.
If you can justify the bounds on a VLA, you now have an array with a static size, so you didn’t need a VLA.
You can maybe quibble about when it is OK for a program to crash ungracefully on large inputs, but that’s beside my point: VLAs are much more dangerous than their syntax suggests.