- a little program to help you look at your memory map
- the current situation in stack games for attackers
- actually protecting the return addresses for real
First, these are the general addressable regions of memory that you want to be able to separate out. I'll put them in the order I've been using in the other two rants:
0x7FFFFFFFFFFFFFFF
stack (dynamic variables, stack frames, return pointers)
0x7FFFxxxxxxxxxxxx ← SP
gap
guard page (Access to this page triggers OS responses.)
gap
heap (malloc()ed variables, etc.)
statically allocated variables
0x4000000000000000
application code
0x2000000000000000
operating system code, variables, etc.
0x0000000000000000
The regions we see are
- Stack (dynamic variables, stack frames, return pointers)
- Heap (malloc()ed variables, etc.)
- application code (including object code, constants, linkage tables, etc.)
Memory management hardware provides the ability to move OS code out of the application map. Let's see how that would look:
0x7FFFFFFFFFFFFFFF
stack (dynamic variables, stack frames, return pointers)
0x7FFFxxxxxxxxxxxx ← SP
gap
guard page (Access to this page triggers OS responses.)
gap
heap (malloc()ed variables, etc.)
statically allocated variables
0x4000000000000000
application code
0x0000000000000000
We used to talk about the problems of accidentally using small integers as pointers. Basically, when pointer variables get overwritten with random integers, the overwriting values tend to be relatively small. When those values are then used as pointers, they access arbitrary stuff in low memory. We can guard against that by refusing to allocate the small-integer address space. And we realize that we have already dealt with small negative integers by buffering the wraparound into highest memory:
0xFFFFFFFFFFFFFFFF
gap (wraparound and small negative integers)
0x8000000000000000
stack (dynamic variables, stack frames, return pointers)
0x7FFFxxxxxxxxxxxx ← SP
gap
guard page (Access to this page triggers OS responses.)
gap
heap (malloc()ed variables, etc.)
statically allocated variables
0x4000000000000000
application code
0x0000000100000000
gap (small integers)
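A rough sketch of the check this map implies, in Python for brevity. The function name and constant names are made up for illustration; the boundary values are the ones in the map above. A stray integer is reduced to 64 bits, and small positive or small negative values both land in an unallocated gap.

```python
# Boundaries copied from the map above; the names are illustrative only.
LOW_GAP_END = 0x0000000100000000     # first address above the small-integer gap
HIGH_GAP_START = 0x8000000000000000  # first address of the wraparound gap
MASK64 = 0xFFFFFFFFFFFFFFFF

def in_integer_gap(value):
    """Return True if `value`, used as a 64-bit pointer, hits an unallocated gap."""
    addr = value & MASK64  # two's-complement view of a negative integer
    return addr < LOW_GAP_END or addr >= HIGH_GAP_START

for stray in (0, 42, -1, -4096, 0x4000000000001000):
    print(hex(stray & MASK64), in_integer_gap(stray))
```

Only the last value, which lies in the statically allocated region, falls outside the gaps.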
I've posted a rant about using a split stack, with a little of the explanation for why at the end. Basically, that would allow us to move those local buffers that can overflow, crash, and/or smash the stack far away from the return address stack.
Thus, even if the attacker could muck in the local variables, he would still be at least one step from overwriting a return address. That means he has to use some harder method to get control of the instruction pointer.
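To make that concrete, here is a toy model in Python. It is a sketch under loud assumptions: the "stacks" are just byte arrays, the return address 0x401000 is hypothetical, and the separation that real hardware would get from address distance and guard pages is modelled by using two separate arrays.

```python
RETURN_ADDRESS = 0x401000  # hypothetical return address, for illustration

def unchecked_copy(dest, payload):
    """Copy payload into dest with no regard for where the buffer ends."""
    for i, byte in enumerate(payload):
        dest[i] = byte

def single_stack(payload):
    # One stack: an 8-byte local buffer sits directly below the return address.
    stack = bytearray(16)
    stack[8:16] = RETURN_ADDRESS.to_bytes(8, "little")
    unchecked_copy(stack, payload)
    return int.from_bytes(stack[8:16], "little")

def split_stack(payload):
    # Two stacks: the buffer and its neighbours live on the locals stack,
    # the return address lives on a separate return stack.
    locals_stack = bytearray(16)
    return_stack = bytearray(RETURN_ADDRESS.to_bytes(8, "little"))
    unchecked_copy(locals_stack, payload)
    return int.from_bytes(return_stack, "little")

payload = b"A" * 16  # 8 bytes fill the buffer, 8 more run past it
print(hex(single_stack(payload)))  # 0x4141414141414141, return address smashed
print(hex(split_stack(payload)))   # 0x401000, return address intact
```

The same overrun that rewrites the return address on the single stack only tramples other local variables when the stacks are split.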
Stack usage patterns actually point us to using a third stack, or a stack-organized heap separate from the random allocation heap. Parameters and small local variables could be on one stack, and large local variables on the other.
In other words, scalar local/dynamic variables would be on the second stack and vector/structure local/dynamic variables on the third. This would be especially convenient for Forth and C run-times, virtually eliminating all need of function preamble and cleanup, and simplifying stack management.
Another way to use the third stack would be to just put all the local variables on it. It might be easier to understand it this way, and I'll use the parameter/locals division below. As far as the discussion below goes, the two divisions can be interchanged. (The run-time details are significant, but I'll leave that for another day. Besides, there is no reason for a single computer to limit itself to one or the other. With a little care, the approaches could even be mixed in a running process.)
But the third stack could be optional, and its use determined by the language run-time support. The OS run-time support really doesn't need to see it other than as a region to be separated from the others. Here is a possible general map, using 64-bit addressing:
0xFFFFFFFFFFFFFFFF
gap (wraparound and all negative integers)
0x8000000000000000
gap (large positive integers)
0x7FFFFF0000000000
gap
return stack ← RP
gap
0x7FFFFE0000000000
guard page (2^40 addresses)
0x7FFFFD0000000000
gap
parameter stack ← SP
gap
0x7FFFFC0000000000
guard page (2^40 addresses)
0x7FFFFB0000000000
gap
local stack ← LP
gap
0x7FFFFA0000000000
guard page (really huge)
0x7000000000000000
gap
heap (malloc()ed variables, etc.)
gap
0x4000020000000000
guard page (2^40 addresses)
0x4000010000000000
gap
statically allocated variables
gap
0x4000000000000000
gap
application code
gap
0x0000010000000000
gap (small positive integer pointer guard)
0x0000000000000000
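The map can be written down as a lookup table. This is only a sketch, in Python: the boundary addresses are the ones in the map, each named region is taken to include its surrounding gaps, and the table and function names are mine.

```python
from bisect import bisect_right

# (lowest address of region, label), in ascending order, from the map above.
REGIONS = [
    (0x0000000000000000, "gap (small positive integers)"),
    (0x0000010000000000, "application code"),
    (0x4000000000000000, "statically allocated variables"),
    (0x4000010000000000, "guard page"),
    (0x4000020000000000, "heap"),
    (0x7000000000000000, "guard page (really huge)"),
    (0x7FFFFA0000000000, "local stack"),
    (0x7FFFFB0000000000, "guard page"),
    (0x7FFFFC0000000000, "parameter stack"),
    (0x7FFFFD0000000000, "guard page"),
    (0x7FFFFE0000000000, "return stack"),
    (0x7FFFFF0000000000, "gap (large positive integers)"),
    (0x8000000000000000, "gap (wraparound and negative integers)"),
]
STARTS = [start for start, _ in REGIONS]

def region_of(addr):
    """Name the region that a 64-bit address falls in."""
    addr &= 0xFFFFFFFFFFFFFFFF  # treat negative integers as wrapped pointers
    return REGIONS[bisect_right(STARTS, addr) - 1][1]

print(region_of(0x7FFFFE8000000000))  # return stack
print(region_of(-1))                  # gap (wraparound and negative integers)
```

The guard pages between the stacks each span 0x7FFFFE0000000000 - 0x7FFFFD0000000000 = 2^40 addresses.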
If we choose to have stack frames, we could manage them very simply on the return stack by just pushing the local and/or parameter stack pointer when we push the IP. Then we just discard them when we pop the IP. Or we can pop them, to force-balance the stack. This gets rid of pretty much all the complexity of walking the stack.
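A toy model of that frame handling, in Python. Everything here is illustrative: the class and method names are mine, the stacks are Python lists, and the locals stack pointer is modelled as the list depth. This version takes the second option, popping the saved pointer to force-balance the locals stack on return.

```python
class SplitStackMachine:
    """Toy model: a return stack plus a locals stack. All names are illustrative."""

    def __init__(self):
        self.return_stack = []  # holds (return address, saved LP) pairs
        self.locals_stack = []  # holds local variables
        self.ip = 0             # instruction pointer

    @property
    def lp(self):
        # The locals stack pointer is modelled as the current depth.
        return len(self.locals_stack)

    def call(self, target):
        # Push the locals stack pointer together with the return address.
        self.return_stack.append((self.ip + 1, self.lp))
        self.ip = target

    def alloc_local(self, value):
        self.locals_stack.append(value)

    def ret(self):
        # Pop the IP and restore LP, dropping the callee's locals in one step.
        self.ip, saved_lp = self.return_stack.pop()
        del self.locals_stack[saved_lp:]

m = SplitStackMachine()
m.ip = 100
m.alloc_local("caller local")
m.call(200)
m.alloc_local("callee buffer")
m.alloc_local("callee counter")
m.ret()
print(m.ip, m.locals_stack)  # 101 ['caller local']
```

There is no frame chain to walk: each return stack entry already says where the locals stack stood at the time of the call.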
The gaps should be randomized, to make it harder for attacker code to find anything to abuse.
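One way the randomizing might go, sketched in Python. The 4096-byte page size, the 1 MiB stack size, and the function name are assumptions for illustration. The window is the return stack's slot from the map above, and the leftover space becomes the gaps on either side.

```python
import secrets

PAGE = 4096  # assumed page size

def place_region(lo, hi, size, page=PAGE):
    """Pick a random page-aligned base so that [base, base + size) fits in [lo, hi)."""
    if lo % page or size > hi - lo:
        raise ValueError("window must be page-aligned and large enough")
    slots = (hi - lo - size) // page + 1  # number of possible page-aligned bases
    return lo + secrets.randbelow(slots) * page

# Place a 1 MiB return stack somewhere inside its 2**40-address window.
base = place_region(0x7FFFFE0000000000, 0x7FFFFF0000000000, 1 << 20)
print(hex(base))
```

With a 2^40 window and 4096-byte pages there are about 2^28 possible positions for the region.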
The regions we now have are
- Return Stack (return address and maybe frame pointers)
- Parameter stack (parameters only)
- Locals Stack (dynamically allocated local variables)
- Heap (malloc()ed variables, etc.)
- Statically allocated process variables (globally and locally visible)
- application code (including object code, constants, linkage tables, etc.)
What's missing?
Multiprocessing requires a region of memory dedicated to process (or thread) shared variables, semaphores, resource monitor counters and such. This is a separate topic, but basically the statically allocated variable area would have a section which could be protected from bare writes, with only reads and locked read-modify-write cycle instructions allowed. These would be a separate region, so their addresses could be somewhat randomized.
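The access policy for that section can be modelled in a few lines of Python. This is a sketch only: the class name is made up, and a `threading.Lock` stands in for what would be a locked read-modify-write instruction on real hardware.

```python
import threading

class SharedCell:
    """Toy shared variable: reads and locked read-modify-write only."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        return self._value

    def read_modify_write(self, update):
        # Stands in for a locked read-modify-write instruction.
        with self._lock:
            old = self._value
            self._value = update(old)
            return old

    def write(self, value):
        # Bare writes are what the protected section would refuse.
        raise PermissionError("bare write to shared variable")

counter = SharedCell(0)

def worker():
    for _ in range(1000):
        counter.read_modify_write(lambda v: v + 1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.read())  # 4000
```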
I'm not sure that it makes sense to manage allocation of shared variables in the malloc() sense, but there is room with this kind of scheme, and modern processors should support that many different regions of memory.
Also, regions of memory shared mmap-style would be in a separate region, or perhaps a guarded region for each. I'm not sure whether they would be protected in the same way as semaphores and monitor counters. It would seem, rather, that the CPU instructions would be ordinary instructions, and the mmap region would be a resource protected by semaphore- or monitor-controlled access.
We can do the same sort of thing with 32-bit addressing, although, instead of guard pages 2^40 or so in size, we would be looking at guard pages between 2^20 and maybe 2^24 in size. This would be more appropriate for some controller applications.
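For a sense of scale, here is the arithmetic as a short Python sketch (the function name is mine). It counts how many guards of a given size would tile the whole address space.

```python
def guards_per_space(guard_bits, address_bits):
    """How many guards of 2**guard_bits addresses fit in 2**address_bits addresses."""
    return 1 << (address_bits - guard_bits)

for guard_bits, address_bits in ((40, 64), (20, 32), (24, 32)):
    print(f"2^{guard_bits} guard in a 2^{address_bits} space: "
          f"1/{guards_per_space(guard_bits, address_bits)} of the space")
```

A 2^40 guard takes 1/2^24 of a 64-bit space, while a 2^20 to 2^24 guard takes between 1/4096 and 1/256 of a 32-bit space, so the guards are proportionally much more expensive at 32 bits.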
We could do the same thing with 16-bit addressing, but it wouldn't leave us much room for the variables and code. On the other hand, looking twice at 16-bit addressing will give us clues for further refinement of these ideas. But I think I'll save that for another rant, probably another day. I have burned up enough of today on this prolonged rant.