In the previous post, we explored the observer effect in IT by writing a program that behaved differently in a debugger session than when run standalone. In this post, we’ll extend selfmod1_amd64.S and selfmod1_i386.S so that they no longer crash, irrespective of the environment they run in.

Analyzing the problem

Why did we see different behavior inside and outside a debugging environment?

Code pages are read-only by default

The programs selfmod1_amd64.S and selfmod1_i386.S crashed because they tried to modify counter, a memory location that resides in the read-only .text section. The operating system prevented this by killing the processes with a SIGBUS signal.

This is perfectly desirable behavior of Unix systems: code pages are usually shared across many processes (imagine many shells, httpd processes etc. running concurrently and executing the same binary), and if all those pages are mapped read-only in the processes’ memory address space, they won’t have to be physically duplicated, thus saving a lot of RAM.

Furthermore, write-protecting code pages by default is also a safeguard against buggy programs: self-modifying code is usually unintended and an excellent way to shoot yourself in the foot, so it is disabled by default.

So is self-modifying code impossible?

While write-protecting code pages is considered best practice, some processes need to modify code pages under certain circumstances (our program, for one, needs to do just that). Debuggers are the classic example: in order to implement breakpoints and single stepping, they need to inject special op-codes into the code pages of the debugged processes. The operating system obviously needs to provide debuggers with a mechanism to (temporarily) lift the read-only protection of code pages.

In practice, debuggers on Unix systems use the venerable ptrace(2) system call (it appeared as early as Version 7 AT&T Unix!). While it is available on virtually all Unix systems, it is notoriously unportable and has an ugly interface.

Some Unix systems that rely on Mach’s VM subsystem or a variant thereof provide an additional way to set the permissions of pages. FreeBSD is one such system, and it provides the mprotect(2) system call. With the following pseudo-code, we can turn a read-only code page into a read-write page:

#include <sys/mman.h>
 
void  *addr   = get_address_of_instruction_pointer();  /* pseudo-code helper */
size_t length = 4096;   /* size of a page */
 
if (mprotect(addr, length, PROT_READ | PROT_WRITE | PROT_EXEC) == 0) {
    /* current code page is now writable */
}
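
A minimal, compilable C sketch of the pseudo-code above. The helper name protect_page_of is an assumption made for illustration and is not part of any library. It asks sysconf(3) for the page size instead of hard-coding 4096, and it rounds the address down to a page boundary, which FreeBSD tolerates either way but other systems may insist on.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Change the protection of the single page that contains addr.
 * Returns 0 on success and -1 on failure (errno is set by mprotect).
 * protect_page_of is a hypothetical helper name.
 */
int protect_page_of(const void *addr, int prot)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    /* Round down to the start of the page; page sizes are powers of two. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page_size - 1);

    return mprotect((void *)start, (size_t)page_size, prot);
}
```

To make the page that holds some function writable, one would pass that function’s address together with PROT_READ | PROT_WRITE | PROT_EXEC. Whether the kernel grants the request depends on the system’s security policy.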

Solving the challenge

So, to solve the challenge of the previous post, we simply need to translate the C call to mprotect(2) into the corresponding assembly. Basically, we’ll call mprotect(2) with appropriate arguments to set the PROT_WRITE bit on the .text page that contains the variable counter. As soon as that happens (i.e. if the FreeBSD kernel lets us change the protection at all), the counter should decrease even outside a debugging session, without causing a SIGBUS.

What do we need precisely? We have to

  • detect the address of the current code page, and
  • execute the mprotect(2) syscall.

The first point is not as obvious as one may think. We can’t simply stuff the address of counter into mprotect(2)’s addr argument. Why not? Because at this point the assembler doesn’t know the address at which the program will be loaded! counter, being at the very beginning of the .text section, has the address 0 at the time of assembly. Obviously, at run time, that address will be something else, determined by the linker and by the kernel.

It all boils down to this: we need to detect at run time where counter is located. We use the following observation to achieve this feat:

Since counter is very close to the rest of the code (certainly no farther away than, say, 4K), we use the address in the instruction pointer %rip (or %eip) instead!

If we call mprotect(2) with %rip or %eip, we will not hit the very beginning of the page where counter is. But since this system call operates at page granularity on FreeBSD (you can’t change the protection of only part of a page), unprotecting the page that %rip or %eip points into unprotects counter too, as long as both lie in the same page.
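
Being close is not quite the same as sharing a page: a page boundary could fall between counter and the instruction that %rip points to. The check can be sketched in C as follows; same_page is a hypothetical helper, and the page size is passed in explicitly rather than queried from the system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if both addresses fall into the same page of page_size bytes.
 * same_page is a hypothetical helper; page_size must be a power of two.
 */
bool same_page(uintptr_t a, uintptr_t b, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t mask = ~(page_size - 1);   /* clears the offset within a page */

    return (a & mask) == (b & mask);
}
```

With 4K pages and the address 0x4000c1 that shows up in the ktrace output further down, every address from 0x400000 through 0x400fff shares a page with it, while 0x401000 does not.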

There’s a little technical issue to resolve though:

Suppose we want to pass %rbx or %ebx as the addr argument of the mprotect macro. Unfortunately, we can’t execute movq %rip, %rbx or movl %eip, %ebx: there’s simply no op-code for mov with this combination of registers in the i386 or amd64 instruction set! Fortunately, we can do it via the stack. Consider the following code on FreeBSD/amd64:

        /* simulating movq %rip, %rbx */
        call  next_location
next_location: 
        popq  %rbx

This strange-looking idiom transfers %rip (which contains the address following the call next_location instruction) into %rbx! How? Remember that the call instruction is normally meant to call a subroutine. As part of that instruction, the instruction pointer %rip is implicitly pushed on the stack, so that a ret in the called subroutine returns to the address following the call. Instead of executing ret, we can just as well popq that return address off the stack into a register of our choice (here: %rbx); execution then simply continues at next_location.

This idiom on FreeBSD/i386 looks the same, except that we use different registers and pop a long instead of a quad:

        /* simulating movl %eip, %ebx */
        call next_location
next_location:
        popl  %ebx
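
For comparison, compilers in the GCC family expose the same return address through a builtin. The sketch below assumes GCC or Clang; current_address is a made-up name, and the noinline attribute is there so that a real call instruction (and hence a pushed return address on x86) exists.

```c
#include <stdint.h>

/*
 * Return the address of the instruction following the call to this
 * function, i.e. the value that call pushed on the stack on x86.
 * GCC/Clang specific; current_address is a hypothetical name.
 */
__attribute__((noinline))
uintptr_t current_address(void)
{
    return (uintptr_t)__builtin_return_address(0);
}
```

The returned value points into the caller’s code, just like %rbx or %ebx after the call/pop pair.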

The other arguments to mprotect(2) are easy: the length can be constant, i.e. 4K (4096), and the protection bits will be PROT_READ, PROT_WRITE and PROT_EXEC or-ed together.
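
To see how the three or-ed bits relate to the symbolic form that kdump(1) prints later in this post, here is a small C sketch that decodes a protection mask. prot_to_string is a hypothetical helper and only knows the three bits used here.

```c
#include <stdio.h>
#include <sys/mman.h>

/*
 * Write the names of the bits set in prot into buf, separated by '|'.
 * Returns buf. prot_to_string is a hypothetical helper.
 */
const char *prot_to_string(int prot, char *buf, size_t size)
{
    static const struct { int bit; const char *name; } names[] = {
        { PROT_READ,  "PROT_READ"  },
        { PROT_WRITE, "PROT_WRITE" },
        { PROT_EXEC,  "PROT_EXEC"  },
    };
    size_t used = 0;

    if (size == 0)
        return buf;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if ((prot & names[i].bit) == 0)
            continue;
        int n = snprintf(buf + used, size - used, "%s%s",
                         used ? "|" : "", names[i].name);
        if (n < 0 || (size_t)n >= size - used)
            break;                      /* output was truncated */
        used += (size_t)n;
    }
    return buf;
}
```

With the values 0x01, 0x02 and 0x04 from <sys/mman.h>, the combined mask passed to mprotect(2) is 0x07.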

We’ll implement the invocation of the mprotect(2) syscall as a macro, just as we did for other syscalls.

Now, we’re ready to modify selfmod1_amd64.S and selfmod1_i386.S accordingly, and test our hypothesis.

The FreeBSD/amd64 version

The program for the self-modifying .text code section on FreeBSD/amd64 is thus:

// selfmod2_amd64.S -- self modifying code, FreeBSD/amd64 version
//                     Runs under gdb(1) AND standalone.
     
/* Some constants */
 
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4             /* WRITE syscall */
.equiv SYS_EXIT, 1              /* EXIT syscall */
.equiv SYS_MPROTECT, 74         /* MPROTECT syscall */
 
// These constants for mprotect(2) are from <sys/mman.h>
.equiv PROT_READ,  0x01         /* pages can be read */
.equiv PROT_WRITE, 0x02         /* pages can be written */
.equiv PROT_EXEC,  0x04         /* pages can be executed */
 
.equiv PROT_ALL, PROT_READ | PROT_WRITE | PROT_EXEC
.equiv PAGE_SIZE, 4096
 
/*
 * Syscalls implemented as macros for max performance.
 */
 
.macro write fd, buf, len
        movq \fd, %rdi
        movq \buf, %rsi
        movq \len, %rdx
 
        movq $SYS_WRITE, %rax
        syscall
.endm
 
.macro exit retcode
        movq \retcode, %rdi
 
        movq $SYS_EXIT, %rax
        syscall
.endm
 
.macro mprotect addr, len, prot
        movq \addr, %rdi
        movq \len, %rsi
        movq \prot, %rdx
 
        movq $SYS_MPROTECT, %rax
        syscall
.endm
 
/* The .data section contains our output buffer */
        .section .data
 
buffer:
        .byte 'X', '\n'
 
/* Code section */
        .text
 
// The counter is in the .text code section!
counter:
        .byte 0
 
        .align 8
        .global _start
_start:
        /* set up a stack frame */
        pushq %rbp
        movq  %rsp, %rbp
 
        /* simulating movq %rip, %rbx */
        call  next_location
next_location:   
        popq  %rbx
 
        /* make current code/text page writable */
        mprotect %rbx, $PAGE_SIZE, $PROT_ALL
        jc bye                    /* silently fail if we can't */
 
init_counter:
        movb $6, (counter)        /* initial value of counter, + 1 */
 
loop_counter:
        movb (counter), %al
        decb %al
        je   end_loop
        movb %al, (counter)
        call write_counter
        jmp  loop_counter
 
end_loop:
        movb %al, (counter)       /* one last time to print 0 */
        call write_counter
 
bye:
        exit $0
 
        /* NOT REACHED */
        popq  %rbp
 
write_counter:
        pushq %rbp
        movq  %rsp, %rbp
 
        movb  (counter), %al     /* convert counter to ASCII */
        addb  $0x30, %al
        movb  %al, (buffer)
     
        write $1, $buffer, $2    /* write 2 bytes of buffer: value and \n */
     
        popq  %rbp
        ret

The only additions to the previous version are the PROT_* and PAGE_SIZE constants, the mprotect macro, and our neat little trick to discover an address in the page that holds the counter variable, followed by the call to the mprotect macro:

        /* simulating movq %rip, %rbx */
        call  next_location
next_location: 
        popq  %rbx
 
        /* make current code/text page writable */
        mprotect %rbx, $PAGE_SIZE, $PROT_ALL
        jc bye                    /* silently fail if we can't */
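
The counter loop itself is unchanged, but its control flow is easy to lose in the assembly, so here is a C model of it. This is only a sketch: run_counter is a hypothetical name, and the output goes into a caller-supplied buffer instead of through write(2). It shows why the counter starts at 6 and why the final 0 gets its own store and print.

```c
#include <stddef.h>

/*
 * Model of init_counter / loop_counter / end_loop.
 * Appends one "digit\n" pair per print to out (which needs room for
 * 12 bytes) and returns the number of bytes written.
 */
size_t run_counter(char *out)
{
    unsigned char counter = 6;            /* movb $6, (counter) */
    size_t n = 0;

    for (;;) {
        unsigned char al = counter;       /* movb (counter), %al */
        al--;                             /* decb %al */
        counter = al;                     /* stored on both paths */
        out[n++] = (char)('0' + counter); /* write_counter: ASCII digit */
        out[n++] = '\n';
        if (al == 0)                      /* je end_loop, then exit */
            break;
    }
    return n;
}
```

In the assembly, the store and the call to write_counter appear twice, once inside the loop and once after end_loop; the model folds both into the loop body and leaves the loop once the value has reached zero.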

Will the program run without crashing? Let’s test it!

% as --64 -o selfmod2_amd64.o selfmod2_amd64.S
% ld -o selfmod2_amd64 selfmod2_amd64.o
 
% ./selfmod2_amd64
5
4
3
2
1
0

Not bad at all. But we’re curious, so we’ll also ktrace(1) it to see whether the mprotect(2) syscall was executed and what it returned. After executing ktrace ./selfmod2_amd64, we can kdump(1) the result (output truncated):

% kdump
  3341 ktrace   RET   ktrace 0
  3341 ktrace   CALL  execve(0x7fffffffe94f,0x7fffffffe690,0x7fffffffe6a0)
  3341 ktrace   NAMI  "./selfmod2_amd64"
  3341 selfmod2_amd64 RET   execve 0
  3341 selfmod2_amd64 CALL  mprotect(0x4000c1,0x1000,PROT_READ|PROT_WRITE|PROT_EXEC)
  3341 selfmod2_amd64 RET   mprotect 0
  3341 selfmod2_amd64 CALL  write(0x1,0x500150,0x2)
  3341 selfmod2_amd64 GIO   fd 1 wrote 2 bytes
       "5
       "
  3341 selfmod2_amd64 RET   write 2
 
  (...)
 
  3341 selfmod2_amd64 CALL  write(0x1,0x500150,0x2)
  3341 selfmod2_amd64 GIO   fd 1 wrote 2 bytes
       "0
       "
  3341 selfmod2_amd64 RET   write 2
  3341 selfmod2_amd64 CALL  exit(0)

As we can see, mprotect(2) returned 0, i.e. the call was successful.

The FreeBSD/i386 version

The 32-bit version is just as simple, and you should understand it immediately now:

// selfmod2_i386.S -- self modifying code, FreeBSD/i386 version
//                    Runs under gdb(1) AND standalone.
     
/* Some constants */
 
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4             /* WRITE syscall */
.equiv SYS_EXIT, 1              /* EXIT syscall */
.equiv SYS_MPROTECT, 74         /* MPROTECT syscall */
 
// These constants for mprotect(2) are from <sys/mman.h>
.equiv PROT_READ,  0x01         /* pages can be read */
.equiv PROT_WRITE, 0x02         /* pages can be written */
.equiv PROT_EXEC,  0x04         /* pages can be executed */
 
.equiv PROT_ALL, PROT_READ | PROT_WRITE | PROT_EXEC
.equiv PAGE_SIZE, 4096
     
/*
 * Syscalls implemented as macros for max performance.
 */
 
.macro write fd, buf, len
        sub  $0x10, %esp
        movl \fd, (%esp)
        movl \buf, 0x4(%esp)
        movl \len, 0x8(%esp)
 
        movl $SYS_WRITE, %eax
        call do_syscall
        add  $0x10, %esp
.endm
 
.macro exit retcode
        sub  $0x10, %esp
        movl \retcode, (%esp)
 
        movl $SYS_EXIT, %eax
        call do_syscall
        add  $0x10, %esp   /* NOTREACHED */
.endm
 
.macro mprotect addr, len, prot
        sub  $0x10, %esp
        movl \addr, (%esp)
        movl \len, 0x4(%esp)
        movl \prot, 0x8(%esp)
 
        movl $SYS_MPROTECT, %eax
        call do_syscall
        add  $0x10, %esp
.endm
 
/* The .data section contains our output buffer */
        .section .data
 
buffer:
        .byte 'X', '\n'
 
/* Code section */
        .text
 
// The counter is in the .text code section!
counter:
        .byte 0
 
        .align 8
 
        .global _start
_start:
        /* set up a stack frame */
        pushl %ebp
        movl  %esp, %ebp
 
        /* simulating movl %eip, %ebx */
        call next_location
next_location:
        popl  %ebx
 
        /* make current code/text page writable */
        mprotect %ebx, $PAGE_SIZE, $PROT_ALL
        cmpl $0, %eax
        js   bye                  /* silently fail if we can't */
 
init_counter:
        movb $6, (counter)        /* initial value of counter, + 1 */
 
loop_counter:
        movb (counter), %al
        decb %al
        je   end_loop
        movb %al, (counter)
        call write_counter
        jmp  loop_counter
 
end_loop:
        movb %al, (counter)       /* one last time to print 0 */
        call write_counter
 
bye:
        exit $0
 
        /* NOT REACHED */
        popl  %ebp
 
write_counter:
        pushl %ebp
        movl  %esp, %ebp
 
        movb  (counter), %al     /* convert counter to ASCII */
        addb  $0x30, %al
        movb  %al, (buffer)
     
        write $1, $buffer, $2    /* write 2 bytes of buffer: value and \n */
     
        popl  %ebp
        ret
 
do_syscall:
        int   $0x80
        jnc   ret_from_syscall
        neg   %eax
ret_from_syscall:
        ret

As a little exercise, try to spot the changes w.r.t. the previous version selfmod1_i386.S.

How about testing? Sure, why not?

$ as --32 -o selfmod2_i386.o selfmod2_i386.S
$ ld -o selfmod2_i386 selfmod2_i386.o
 
$ ./selfmod2_i386
5
4
3
2
1
0

No crashes. The program runs just fine. A kdump(1) shows no surprises:

73799 ktrace   RET   ktrace 0
73799 ktrace   CALL  execve(0xbfbfedc7,0xbfbfeca4,0xbfbfecac)
73799 ktrace   NAMI  "./selfmod2_i386"
73799 selfmod2_i386 RET   execve 0
73799 selfmod2_i386 CALL  mprotect(0x8048088,0x1000,PROT_READ|PROT_WRITE|PROT_EXEC)
73799 selfmod2_i386 RET   mprotect 0
73799 selfmod2_i386 CALL  write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO   fd 1 wrote 2 bytes
      "5
      "
73799 selfmod2_i386 RET   write 2
73799 selfmod2_i386 CALL  write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO   fd 1 wrote 2 bytes
      "4
      "
73799 selfmod2_i386 RET   write 2
73799 selfmod2_i386 CALL  write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO   fd 1 wrote 2 bytes
      "3
      "
73799 selfmod2_i386 RET   write 2
73799 selfmod2_i386 CALL  write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO   fd 1 wrote 2 bytes
      "2
      "
73799 selfmod2_i386 RET   write 2
73799 selfmod2_i386 CALL  write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO   fd 1 wrote 2 bytes
      "1
      "
73799 selfmod2_i386 RET   write 2
73799 selfmod2_i386 CALL  write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO   fd 1 wrote 2 bytes
      "0
      "
73799 selfmod2_i386 RET   write 2
73799 selfmod2_i386 CALL  exit(0)

Conclusion

Even though self-modifying code (or, more precisely, code that modifies the .text section) is rarely a good idea, it is possible nonetheless if the operating system provides a way to remove the read-only protection. FreeBSD provides the mprotect(2) system call to do just that, and we’ve made use of it, so that our counter variable could be modified, even though it was located in the .text section.

Strictly speaking, this was not an example of self-modifying code, but since we’ve modified bytes in the .text section of a running process, we can still claim to be able to modify code on the fly (at run time) by injecting bytes into the code pages, should the need arise.

3 Comments

  1. I want to quote your post in my blog. May I?
    And do you have an account on Twitter?

     
  2. Sure, why not? I don’t have an account on Twitter.

     
  3. alderz

    Thank you for these nice articles. I would love to see more articles about assembly!

    For the Linux users out there: on Linux, the address needs to be a multiple of the page size, so we can’t pass %rip as is, but we can calculate the page start address using: andq $0xfffffffffffff000, %rbx .

    Happy hacking!