If you’re an experienced developer, you may have encountered a strange phenomenon: a program can run perfectly fine under a debugger, yet still crash when executed standalone, i.e. outside a debugger session. The mere act of observing the program under a debugger yields different results than running the program as-is.

Even very simple programs can be affected by this observer effect, which one would usually expect in the field of physics (where it is often mentioned alongside Heisenberg’s Uncertainty Principle) rather than in that of our deterministic machines and computers!

In this post, I’ll illustrate the observer effect with a very simple assembly program, written both for FreeBSD/amd64 and FreeBSD/i386 platforms.

A decreasing counter

We start by writing a simple program in assembly that displays a decreasing counter. Let’s say we want it to display the digits 5, 4, 3, 2, 1, 0, one per line, and then exit.

There are many ways to write such a program: the counter can be stored in a register, on the stack, or in some other memory location. Of course, register-based counters are the fastest and the easiest to set up, and stack-based counters are the second-easiest because we already have stack memory at our disposal. However, to demonstrate the observer effect, we start by putting the counter in the .data section of our program.

The FreeBSD/amd64 version

Without much ado, here’s the program for FreeBSD/amd64. We use macros for the syscalls, as we did in the previous episode.

// selfmod0_amd64.S -- (not yet) self modifying code, FreeBSD/amd64 version
 
/* Some constants */
 
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4             /* WRITE syscall */
.equiv SYS_EXIT, 1              /* EXIT syscall */
 
/*
 * Syscalls implemented as macros for max performance.
 */
 
.macro write fd, buf, len
        movq \fd, %rdi
        movq \buf, %rsi
        movq \len, %rdx
 
        movq $SYS_WRITE, %rax
        syscall
.endm
 
.macro exit retcode
        movq \retcode, %rdi
 
        movq $SYS_EXIT, %rax
        syscall
.endm
 
/* The .data section contains our counter, and output buffer */
        .section .data
 
counter:
        .byte 0
buffer:
        .byte 'X', '\n'
 
/* Code section */
        .text
        .global _start
_start:
        /* set up a stack frame */
        pushq %rbp
        movq  %rsp, %rbp
 
init_counter:
        movb $6, (counter)        /* initial value of counter, + 1 */
 
loop_counter:
        movb (counter), %al
        decb %al
        je   end_loop
        movb %al, (counter)
        call write_counter
        jmp  loop_counter
 
end_loop:
        movb %al, (counter)       /* one last time to print 0 */
        call write_counter
 
bye:
        exit $0
 
        /* NOT REACHED */
        popq  %rbp
 
write_counter:
        pushq %rbp
        movq  %rsp, %rbp
 
        movb  (counter), %al     /* convert counter to ASCII */
        addb  $0x30, %al
        movb  %al, (buffer)
     
        write $1, $buffer, $2    /* write 2 bytes of buffer: value and \n */
     
        popq  %rbp
        ret

We define the WRITE and EXIT syscalls as macros write and exit. The counter counter is stored in the .data section, as well as an output buffer.

After initializing the counter to 6 at init_counter, we simply decrease it in loop_counter, until it reaches 0. Each time we decrease the counter, we call the assembly function write_counter to display it. Finally, we exit cleanly.

The write_counter function converts the counter to ASCII by adding 0x30 to it. Note that we assume that the counter is always less than 10, so it can be rendered in a single byte! We stuff the ASCII byte into the zeroth byte of the output buffer, and call write to display both the ASCII code, and the next byte of that buffer (which was already initialized to newline \n). This way, we display one digit per line.

Assembling, linking and running the assembly program is trivial:

% as --gstabs --64 -o selfmod0_amd64.o selfmod0_amd64.S
% ld -o selfmod0_amd64 selfmod0_amd64.o
 
% ./selfmod0_amd64
5
4
3
2
1
0

If you want, try running the same program under the debugger gdb(1). The output is the same.

The FreeBSD/i386 version

The FreeBSD/i386 version parallels the above program. We simply adapt to the different kernel call ABI and use 32-bit registers:

// selfmod0_i386.S -- (not yet) self modifying code, FreeBSD/i386 version
 
/* Some constants */
 
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4             /* WRITE syscall */
.equiv SYS_EXIT, 1              /* EXIT syscall */
 
/*
 * Syscalls implemented as macros for max performance.
 */
 
.macro write fd, buf, len
        sub  $0x10, %esp
        movl \fd, (%esp)
        movl \buf, 0x4(%esp)
        movl \len, 0x8(%esp)
 
        movl $SYS_WRITE, %eax
        call do_syscall
        add  $0x10, %esp
.endm
 
.macro exit retcode
        sub  $0x10, %esp
        movl \retcode, (%esp)
 
        movl $SYS_EXIT, %eax
        call do_syscall
        add  $0x10, %esp   /* NOTREACHED */
.endm
 
/* The .data section contains our counter, and output buffer */
        .section .data
 
counter:
        .byte 0
buffer:
        .byte 'X', '\n'
 
/* Code section */
        .text
        .global _start
_start:
        /* set up a stack frame */
        pushl %ebp
        movl  %esp, %ebp
 
init_counter:
        movb $6, (counter)        /* initial value of counter, + 1 */
 
loop_counter:
        movb (counter), %al
        decb %al
        je   end_loop
        movb %al, (counter)
        call write_counter
        jmp  loop_counter
 
end_loop:
        movb %al, (counter)       /* one last time to print 0 */
        call write_counter
 
bye:
        exit $0
 
        /* NOT REACHED */
        popl  %ebp
 
write_counter:
        pushl %ebp
        movl  %esp, %ebp
 
        movb  (counter), %al     /* convert counter to ASCII */
        addb  $0x30, %al
        movb  %al, (buffer)
     
        write $1, $buffer, $2    /* write 2 bytes of buffer: value and \n */
     
        popl  %ebp
        ret
 
do_syscall:
        int   $0x80
        jnc   ret_from_syscall
        neg   %eax
ret_from_syscall:
        ret

The program logic is exactly the same as above: the counter is located in the .data section, alongside the output buffer. Here too, assembling, linking and running the program doesn’t yield any surprises:

$ as --gstabs --32 -o selfmod0_i386.o selfmod0_i386.S
$ ld -o selfmod0_i386 selfmod0_i386.o
 
$ ./selfmod0_i386
5
4
3
2
1
0

No surprises, which is good.

A self-modifying decreasing counter

To trigger the dreaded observer effect, we modify both selfmod0_amd64.S and selfmod0_i386.S in such a way that counter is in .text instead of .data. Note that buffer remains in .data in this case.

The FreeBSD/amd64 version

The program is nearly identical to the previous one; only the location of counter has changed:

// selfmod1_amd64.S -- self modifying code, FreeBSD/amd64 version
//                     Runs ONLY under gdb(1), and NOT standalone.
     
/* Some constants */
 
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4             /* WRITE syscall */
.equiv SYS_EXIT, 1              /* EXIT syscall */
 
/*
 * Syscalls implemented as macros for max performance.
 */
 
.macro write fd, buf, len
        movq \fd, %rdi
        movq \buf, %rsi
        movq \len, %rdx
 
        movq $SYS_WRITE, %rax
        syscall
.endm
 
.macro exit retcode
        movq \retcode, %rdi
 
        movq $SYS_EXIT, %rax
        syscall
.endm
 
/* The .data section contains our output buffer */
        .section .data
 
buffer:
        .byte 'X', '\n'
 
/* Code section */
        .text
 
// The counter is in the .text code section!
counter:
        .byte 0
 
        .align 8
        .global _start
_start:
        /* set up a stack frame */
        pushq %rbp
        movq  %rsp, %rbp
 
init_counter:
        movb $6, (counter)        /* initial value of counter, + 1 */
 
loop_counter:
        movb (counter), %al
        decb %al
        je   end_loop
        movb %al, (counter)
        call write_counter
        jmp  loop_counter
 
end_loop:
        movb %al, (counter)       /* one last time to print 0 */
        call write_counter
 
bye:
        exit $0
 
        /* NOT REACHED */
        popq  %rbp
 
write_counter:
        pushq %rbp
        movq  %rsp, %rbp
 
        movb  (counter), %al     /* convert counter to ASCII */
        addb  $0x30, %al
        movb  %al, (buffer)
     
        write $1, $buffer, $2    /* write 2 bytes of buffer: value and \n */
     
        popq  %rbp
        ret

Let’s assemble and link it:

% as --gstabs --64 -o selfmod1_amd64.o selfmod1_amd64.S
% ld -o selfmod1_amd64 selfmod1_amd64.o

And now, prepare to witness the observer effect (drum roll)!

% ./selfmod1_amd64
Bus error (core dumped)
 
% gdb -q ./selfmod1_amd64
(gdb) r
Starting program: /users/farid/Devel/ASM/selfmod1_amd64
5
4
3
2
1
0
 
Program exited normally.
(gdb) q

What do we see here? The program crashes when it executes standalone, but it doesn’t crash when executed under the debugger gdb(1)!

We can try to trace it with ktrace(1) and kdump(1):

% ktrace ./selfmod1_amd64
Bus error (core dumped)
 
% kdump
  2457 ktrace   RET   ktrace 0
  2457 ktrace   CALL  execve(0x7fffffffe94f,0x7fffffffe690,0x7fffffffe6a0)
  2457 ktrace   NAMI  "./selfmod1_amd64"
  2457 selfmod1_amd64 RET   execve 0
  2457 selfmod1_amd64 PSIG  SIGBUS SIG_DFL
  2457 selfmod1_amd64 NAMI  "selfmod1_amd64.core"

There’s not much to see here, except that the process got a SIGBUS signal very early on, and that the signal was handled with the default action (SIG_DFL). A SIGBUS signal is usually a hint that the process tried to access a location in memory that was either not available or protected. In other words: the operating system kernel terminated our process because we tried to access memory in a way that was forbidden or impossible.

The sneaky (and nasty) part here is that even if you run the program under the debugger, there’s no way to reach the instruction that would cause the core dump! Remember: under the debugger, the program runs just fine!

The mere act of observing selfmod1_amd64 under the debugger causes it to behave differently than when run standalone. This is a typical instance of the observer effect in information technology.

The FreeBSD/i386 version

Let’s repeat the experiment with FreeBSD/i386:

// selfmod1_i386.S -- self modifying code, FreeBSD/i386 version
//                    Runs ONLY under gdb(1), and NOT standalone.
     
/* Some constants */
 
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4             /* WRITE syscall */
.equiv SYS_EXIT, 1              /* EXIT syscall */
 
/*
 * Syscalls implemented as macros for max performance.
 */
 
.macro write fd, buf, len
        sub  $0x10, %esp
        movl \fd, (%esp)
        movl \buf, 0x4(%esp)
        movl \len, 0x8(%esp)
 
        movl $SYS_WRITE, %eax
        call do_syscall
        add  $0x10, %esp
.endm
 
.macro exit retcode
        sub  $0x10, %esp
        movl \retcode, (%esp)
 
        movl $SYS_EXIT, %eax
        call do_syscall
        add  $0x10, %esp   /* NOTREACHED */
.endm
 
/* The .data section contains our output buffer */
        .section .data
 
buffer:
        .byte 'X', '\n'
 
/* Code section */
        .text
 
// The counter is in the .text code section!
counter:
        .byte 0
 
        .align 8
 
        .global _start
_start:
        /* set up a stack frame */
        pushl %ebp
        movl  %esp, %ebp
 
init_counter:
        movb $6, (counter)        /* initial value of counter, + 1 */
 
loop_counter:
        movb (counter), %al
        decb %al
        je   end_loop
        movb %al, (counter)
        call write_counter
        jmp  loop_counter
 
end_loop:
        movb %al, (counter)       /* one last time to print 0 */
        call write_counter
 
bye:
        exit $0
 
        /* NOT REACHED */
        popl  %ebp
 
write_counter:
        pushl %ebp
        movl  %esp, %ebp
 
        movb  (counter), %al     /* convert counter to ASCII */
        addb  $0x30, %al
        movb  %al, (buffer)
     
        write $1, $buffer, $2    /* write 2 bytes of buffer: value and \n */
     
        popl  %ebp
        ret
 
do_syscall:
        int   $0x80
        jnc   ret_from_syscall
        neg   %eax
ret_from_syscall:
        ret

The program is the same as before; only counter has migrated to the .text section. We assemble and link it as usual:

$ as --gstabs --32 -o selfmod1_i386.o selfmod1_i386.S
$ ld -o selfmod1_i386 selfmod1_i386.o

Will there be an observer effect here too?

$ ./selfmod1_i386
Bus error (core dumped)
 
$ gdb -q ./selfmod1_i386
(gdb) r
Starting program: /users/farid/selfmod1_i386
5
4
3
2
1
0
 
Program exited normally.
(gdb) q

Yes, there is (applause)! Tracing with ktrace(1) and kdump(1):

$ ktrace ./selfmod1_i386
Bus error (core dumped)
 
$ kdump
 64073 ktrace   RET   ktrace 0
 64073 ktrace   CALL  execve(0xbfbfedc7,0xbfbfeca4,0xbfbfecac)
 64073 ktrace   NAMI  "./selfmod1_i386"
 64073 selfmod1_i386 RET   execve 0
 64073 selfmod1_i386 PSIG  SIGBUS SIG_DFL
 64073 selfmod1_i386 NAMI  "selfmod1_i386.core"

We see again that the operating system kernel terminated our process with a SIGBUS signal, i.e. we accessed memory in a wrong way.

What causes this Heisenbug?

You may now be asking yourself what caused this strange observer effect (a Heisenbug), right? In order not to spoil the fun, I’ll postpone the discussion of this aspect to a (probable) subsequent post.

Please stop reading now if you want to do some serious thinking on your own and if you love challenges!

Alright, you’re still totally at a loss? Here are a few hints that may help set you on the right track:

  • The only difference between the working and non-working examples was that counter moved from .data to .text.
  • Memory in .data can be modified freely, but memory in .text is (supposed to be) read-only.
  • Debuggers must have a way to insert opcodes in the code they debug, so that they can set breakpoints, or even single step through the code. They obviously can’t do that in a read-only section of memory!
  • Operating systems like FreeBSD provide system calls that can modify the protection attributes of memory regions in a certain way. Debuggers like gdb(1) make extensive use of those system calls. Hint: ptrace(2).
  • On FreeBSD, you may want to investigate the mprotect(2) C function (and associated system call), and use that to deprotect the .text region.

With all those hints, you should be able to extend selfmod1_amd64.S and selfmod1_i386.S in such a way that they run both outside and inside a debugger session, while still keeping counter where it is, i.e. inside the .text section. In the following post, I provide a solution to this little exercise.

Conclusion

When the mere act of observing a running program (e.g. in a debugger session) changes its behavior, we have an instance of the observer effect. While this effect can be annoying for developers seeking to find obscure bugs, it can be extremely useful for developers who would like to programmatically detect if their programs are actually being reverse-engineered in a debugger session.

As we’ve seen in this post, the read-only property of the .text section disappears when the program is run under a debugger. Programs that would like to detect debugger sessions (e.g. DRM, or tamper-proofing) can intentionally test for this property, and refuse to execute properly when .text is writable.

Of course, this trick won’t prevent truly determined reverse-engineering efforts, and we’ve got to be thankful for that. But there are more advanced anti-tampering techniques: imagine code that creates op-codes on the fly. With a judicious choice of interdependent op-codes and relative addressing, the modifications that a debugger inserts into that code could effectively break it, and you may have to run the program in a complete virtual machine environment in order to debug it. A real emulator in software like Bochs, while extremely slow, is preferable to hybrid emulators like VirtualBox for this kind of reverse-engineering.