Examining the Stack to Debug Segfaults with gdb

Earlier, while writing my compare strings method, I made a mistake in the code and hit a segmentation fault. Based on how the program executed I was pretty sure of approximately where the error was occurring, but rather than go and find the mistake by reading the code, I thought it would be a lot more useful to step through the program in the debugger and examine the problem that way. Doing this should make it easier for me to debug similar (more complex) problems in the future.

Segmentation Faults

What are they?

Wikipedia more or less defines a segfault as an attempt to access memory that the program is not allowed to access (a memory access violation). Typically the hardware notifies the operating system about the violation, and the kernel then sends a signal (SIGSEGV on Linux) to the process that caused it.

In English

Your program is trying to access something in memory. The hardware, OS, or some other component has decided that the memory you want to access does not belong to you or could be potentially harmful for you to access. So it politely tells you that you are not allowed.

How does this happen?

Well, it could be that you’re just being a dick and trying to access memory that doesn’t belong to you. Is that what you’re doing? … No? … OK, well then probably you just made a mistake when you were performing some memory-related operation. For instance, perhaps you treated an integer as a pointer and passed it to a string-related operation. Or maybe you copied 150 bytes of data into a 100-byte buffer and smashed the stack. Whatever the case may be, you can be certain it’s related to some sort of memory operation; unfortunately programming involves a lot of those.
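As a minimal sketch of the first mistake above (treating an integer as a pointer), here is a tiny program in the same GNU assembler syntax used in the rest of this post. It assumes a 64-bit Linux machine, and the value 42 is arbitrary; any integer that isn't a mapped address would do.

```asm
.text
    .globl _start

    _start:
        # an ordinary integer, not the address of anything we own
        movq $42, %rax

        # treat it as a pointer and read 8 bytes from it; address 42 is
        # not mapped into the process, so the kernel delivers SIGSEGV here
        movq (%rax), %rbx

        # never reached
        movl $1, %eax
        movl $0, %ebx
        int $0x80
```

Assembled and linked with as and ld like the main example further down, this should die with "Segmentation fault" on the second movq.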

Example code

Instead of using the large code sample that I was working on when the problem occurred, I’ve created a shorter sample in assembly that will generate a segfault. The sample could be made even shorter, but I wanted a realistic example and I also wanted to keep the comments in the code so it’s easier to follow.

.data
    Str1:
        .asciz "Segfault's are awesome!\n"

.text
    .globl _start

    .type PrintString, @function

    _start:
        # for stepping through the debugger
        nop

        # string length is stored in rbx
        movq $25, %rbx

        # push the arguments onto the stack
        pushq %rbx
        pushq $Str1

        # print the string
        call PrintString

        # restore the stack pointer
        addq $8, %rsp

        # exit()
        jmp ExitProgram

    ######################################
    # print the string
    # @param str1
    # @param strlength
    ######################################
    PrintString:
        # save the current base pointer by pushing it onto the stack
        pushq %rbp

        # move the base pointer to the top of the stack
        movq %rsp, %rbp

        # retrieve our arguments from the stack
        movq 16(%rbp), %rcx
        movq 24(%rbp), %rdx

        # print the string
        movl $4, %eax
        movl $1, %ebx
        int $0x80

        # return
        ret

    ExitProgram:
        movl $1, %eax
        movl $0, %ebx
        int $0x80

Running the example

Here is how to run the example and what that looks like:

# as -gstabs -o Segfault.o Segfault.s
# ld -o Segfault Segfault.o
# ./Segfault
Segfault's are awesome!
Segmentation fault

Stepping through with the debugger (gdb)

Digging in

So clearly we have a problem in this code. Let’s load up gdb and find out why.

gdb ./Segfault

Setting a breakpoint

Let’s make some inferences about where to set our breakpoint. Str1 definitely prints out before the program crashes. The program is pretty short, so why not just set our breakpoint on the line that prints it and step from there? We’ll use list and break to do this.

(gdb) list PrintString
32	    # @param str1
33	    # @param strlength
34	    ######################################
35	    PrintString:
36	        # save the current base pointer by pushing it onto the stack
37	        pushq %rbp
38	
39	        # move the base pointer to the top of the stack
40	        movq %rsp, %rbp
41	
(gdb) 
42	        # retrieve our arguments from the stack
43	        movq 16(%rbp), %rcx
44	        movq 24(%rbp), %rdx
45	
46	        # print the string
47	        movl $4, %eax
48	        movl $1, %ebx
49	        int $0x80
50	
51	        # return
(gdb) break 49
Breakpoint 1 at 0x4000df: file Segfault.s, line 49.

Find the segfault

Now that we have a breakpoint just before the string is printed out, let’s run the program and find the exact line that causes the segfault.

(gdb) run
Starting program: /root/assembly/Segfault 

Breakpoint 1, PrintString () at Segfault.s:49
49	        int $0x80
(gdb) s
Segfault's are awesome!
PrintString () at Segfault.s:52
52	        ret
(gdb) s

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()

So after the breakpoint we stepped twice, and we can see that the ret on line 52 is what triggers the segmentation fault. More specifically, it looks like the instruction pointer (RIP on this 64-bit machine, EIP on 32-bit) ends up pointing at 0x0.

Examining the stack

Well, we know that when a function is called, the address of the next instruction in the calling function is stored on the stack. That way, when the function returns, it can simply load that address from the stack into the instruction pointer (RIP here, EIP on 32-bit) and we’re suddenly back at the position the function was called from. In other words, ret pops the return address off the stack and into RIP. So why don’t we try restarting the program and examining the top of the stack (the memory RSP points to) just before the ret instruction is run.
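To make the call/ret mechanics concrete, here is a small sketch (not part of the original program) in which the callee returns without using ret at all. The label Callee and the choice of %rcx as a scratch register are my own. It assumes 64-bit Linux and the same as/ld build steps as the main example.

```asm
.text
    .globl _start

    _start:
        # call pushes the address of the next instruction (the movl
        # below) onto the stack, then jumps to Callee
        call Callee

        # execution resumes here after the "return"; exit(0)
        movl $1, %eax
        movl $0, %ebx
        int $0x80

    Callee:
        # roughly what ret does, spelled out by hand:
        # pop the return address off the top of the stack...
        popq %rcx

        # ...and jump to it (unlike ret, this clobbers %rcx)
        jmp *%rcx
```

If anything other than the return address is on top of the stack when that pop happens, the jump goes wherever that value points, which is exactly the situation we are about to find.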

(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /root/assembly/Segfault 

Breakpoint 1, PrintString () at Segfault.s:49
49	        int $0x80
(gdb) s
Segfault's are awesome!
PrintString () at Segfault.s:52
52	        ret
(gdb) x/2xg $rsp
0x7fffffffe4a0:	0x0000000000000000	0x00000000004000c3

This is a 64-bit machine, so here we are examining two 64-bit values on the stack. We can see that the first value is 0x0; this is the value at the top of the stack. The following value, 0x00000000004000c3, is the next value on the stack, and if we examined further we could review the full stack if we wanted to (including other frames). For now let’s focus on 0x0, since that seems to be what is popped off and what’s causing our problem.

Verify the stack value is being popped into EIP/RIP

Let’s demonstrate that executing the ret actually does pop that value into EIP/RIP:

(gdb) print /x $rip
$1 = 0x4000e1
(gdb) s

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) x/2xg $rsp
0x7fffffffe4a8:	0x00000000004000c3	0x00000000006000f0
(gdb) print /x $rip
$2 = 0x0

We can see that the RIP register clearly changes from 0x4000e1 to 0x0 and that 0x0 was removed from the stack. Notice how 0x00000000004000c3 is now at the top of the stack instead of being the second value on the stack like we saw before.

Looking up the memory address

Based on the fact that we know 0x0 is an invalid memory location, let’s see if this next value on the stack is valid using the disassemble command in gdb.

(gdb) disassemble 0x00000000004000c3
Dump of assembler code for function _start:
   0x00000000004000b0 <+0>:	nop
   0x00000000004000b1 <+1>:	mov    $0x19,%rbx
   0x00000000004000b8 <+8>:	push   %rbx
   0x00000000004000b9 <+9>:	pushq  $0x6000f0
   0x00000000004000be <+14>:	callq  0x4000c9 <PrintString>
   0x00000000004000c3 <+19>:	add    $0x8,%rsp
   0x00000000004000c7 <+23>:	jmp    0x4000e2 <ExitProgram>

Well, what do you know. The line at offset <+19> above shows that 0x00000000004000c3 is the address of the instruction right after our call to PrintString. So when PrintString executes ret, it should actually be popping off 0x00000000004000c3 and not 0x0.

Finding the mistake

At this point we know that something has been added to the stack AFTER the return address and has not been popped back off. Since it was added after the return address we can be pretty confident that it was added inside of the PrintString function. Let’s take a look at the PrintString code.

(gdb) list PrintString
32	    # @param str1
33	    # @param strlength
34	    ######################################
35	    PrintString:
36	        # save the current base pointer by pushing it onto the stack
37	        pushq %rbp
38	
39	        # move the base pointer to the top of the stack
40	        movq %rsp, %rbp
41	
(gdb) 
42	        # retrieve our arguments from the stack
43	        movq 16(%rbp), %rcx
44	        movq 24(%rbp), %rdx
45	
46	        # print the string
47	        movl $4, %eax
48	        movl $1, %ebx
49	        int $0x80
50	
51	        # return

Sure enough, on line 37 above you can see where we pushed the base pointer onto the stack using pushq %rbp. At the time the base pointer was set to 0x0, so that’s what we’re seeing on the stack in gdb. Unfortunately we never popped this value back off, so ret pops it instead of the return address, and jumping to 0x0 causes the segfault.

Fixing the code

Fixing this problem is incredibly easy as it turns out. Just pop the stored value of the RBP register back into RBP before returning from the function. It’s a one line fix that should be placed just before the ret. Now when ret executes it will pull the correct return address off the stack and everything will run as expected.

popq %rbp
ret
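For reference, here is a sketch of what the whole program might look like with the fix applied. The stack-layout comments are my own annotation. I have also changed the caller's cleanup from addq $8 to addq $16, which is not in the original: two 8-byte arguments were pushed, so 16 bytes need to be released. (In the original this is harmless because the program exits right afterwards.)

```asm
.data
    Str1:
        .asciz "Segfault's are awesome!\n"

.text
    .globl _start

    .type PrintString, @function

    _start:
        # push the arguments: length first, then the string pointer
        movq $25, %rbx
        pushq %rbx
        pushq $Str1

        call PrintString

        # release both 8-byte arguments (the original used $8)
        addq $16, %rsp

        jmp ExitProgram

    PrintString:
        # prologue: save the caller's base pointer, set up our own
        pushq %rbp
        movq %rsp, %rbp

        # stack at this point (each slot is 8 bytes):
        #    0(%rbp)  saved rbp
        #    8(%rbp)  return address
        #   16(%rbp)  pointer to the string
        #   24(%rbp)  string length
        movq 16(%rbp), %rcx
        movq 24(%rbp), %rdx

        # write(1, str, len)
        movl $4, %eax
        movl $1, %ebx
        int $0x80

        # epilogue: undo the pushq so the return address is on top again
        popq %rbp
        ret

    ExitProgram:
        movl $1, %eax
        movl $0, %ebx
        int $0x80
```

The rule of thumb is that every push inside a function needs a matching pop (or an equivalent adjustment of RSP) before ret, so that the return address is the value on top of the stack when ret runs.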

You’re done!

Hopefully this guide helped you understand how you can examine the stack to find segfaults and other problems with your code. The GNU debugger is a powerful tool and the more you know the easier it becomes and the faster you can get back to writing code!

Have fun!

