Have you ever wondered what a simple hello, world! program looks like in assembly language? Or, more precisely, in assembly for the FreeBSD/i386 and FreeBSD/amd64 platforms?

In this tutorial, I’ll show you how to write such a bare-bones hello, world! program in assembler for both platforms. We’ll start with zero knowledge about the current OS architecture, using only the assembler output of the C compiler gcc. Then, we’ll gradually trim that output until we reach a minimalist assembler program, learning how to use tools like ktrace(1), kdump(1), objdump(1), gdb(1), and, of course, as(1) and ld(1).

Don’t worry, the learning curve will be more or less gentle, if you know a little bit of C and assembler (you don’t need any FreeBSD-specific knowledge). Enjoy the trip.

Hello, World in C

Unless you’re familiar with the assembly language for the target platform, you’ll want to start with a portable C program, and let the compiler generate assembler code for you. So, we start by writing the traditional hello world program in C:

/* hello0.c -- hello, world! */

#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

const char *msg = "hello, world!\n";

int
main (int argc, char *argv[])
{
  write(1, msg, 14);

  return 0;
}

C veterans will notice that we prefer to use the system function write(2) instead of the more familiar stdio library function printf(3). Why did we do this? Actually, printf() is merely a rather complex wrapper that ultimately calls write() anyway, so we cut to the chase and use the system function directly. This will result in easier-to-read assembler code.

The arguments that we passed to write(2) were:

  • 1, the file descriptor for the stdout (standard output) stream.
  • msg, the pointer to the string "hello, world!\n".
  • 14, the number of bytes from msg that write(2) should display.

You may notice that we didn’t care about write(2)’s return value, and that we didn’t do any error checking. A real world program would need that, but here, it would unnecessarily complicate the resulting assembler code, so we omitted any extra error checking. We also assumed (hoped) that msg is a short-enough string to be written entirely out in one call, instead of chunk-wise in many calls.
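For comparison, here is a minimal sketch of what the omitted error checking could look like in C. The helper name write_all is made up for this illustration (it is not a libc function); it simply retries write(2) until every byte has been written or a real error occurs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all() is a hypothetical helper, not part of libc.
 * It keeps calling write(2) until all len bytes are out.
 * Returns 0 on success, -1 on error (errno is set by write(2)).
 */
int
write_all(int fd, const void *buf, size_t len)
{
  const char *p = buf;

  assert(buf != NULL);
  while (len > 0) {
    ssize_t n = write(fd, p, len);

    if (n < 0) {
      if (errno == EINTR)
        continue;               /* interrupted by a signal: retry */
      return -1;                /* real error, give up */
    }
    p += n;                     /* skip the part already written */
    len -= (size_t)n;
  }
  return 0;
}
```

In hello0.c, the call would then read write_all(1, msg, 14), with main() returning a non-zero value if it fails. We stay with the bare write(2) call in the rest of this tutorial, precisely because the loop above would clutter the assembler output.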

So it’s time to test drive this little proggy, and to see how it fares. Compiling the program with cc(1) is trivial:

% cc -o hello0 hello0.c

% ./hello0
hello, world!

% echo $?
0

Please note that in addition to printing the string “hello, world!” (followed by a new line), the return code for the whole program in this example was 0, as requested with return 0;. We can see this by echoing the value of the C-Shell variable $?.
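What the shell does when it fills in $? can be sketched in a few lines of C: it waits for the child process and extracts the exit status from the value reported by the kernel. The helper name child_exit_status is an assumption made for this example, and the child here calls _exit(2) directly instead of running a real program.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * child_exit_status() is a hypothetical helper: it forks a child
 * that exits with the given code, and returns the exit status
 * the parent collects with waitpid(2), or -1 on failure.
 */
int
child_exit_status(int code)
{
  int status;
  pid_t pid;

  assert(code >= 0 && code <= 255);   /* only 8 bits are reported */
  pid = fork();
  if (pid < 0)
    return -1;                        /* fork failed */
  if (pid == 0)
    _exit(code);                      /* child: terminate immediately */
  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_exit_status(0) plays the role of running ./hello0 followed by echo $? in the transcript above.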

Peeking under the hood with ktrace / kdump

Of course, it works. But that’s not the whole story. Far from that. You probably know that write(2) ultimately calls the kernel to get the job done, and we’ll use the ktrace(1) and kdump(1) tools to peek under the hood:

% ktrace ./hello0
hello, world!

% kdump

  (... a lot of stuff omitted ...)

  1612 hello0   CALL  write(0x1,0x400623,0xe)
  1612 hello0   GIO   fd 1 wrote 14 bytes
       "hello, world!
       "
  1612 hello0   RET   write 14/0xe

  (... omitted more stuff ...)

  1612 hello0   CALL  exit(0)

I’ve trimmed a lot of stuff that belongs to the C runtime library, but which doesn’t matter in this example (it takes a lot of effort to set up a good and consistent C runtime, though you may not have expected it).

The most important call is, of course, that to write(0x1,0x400623,0xe), which outputs 0xe (i.e. 14) bytes of the string located in this particular case at address 0x400623 (msg) to file descriptor 0x1, i.e. to standard output stdout. The return value of this particular call to write(2), 14, is the number of bytes successfully output.

The second most interesting call for us was that to the syscall _exit(2), shown here as exit(0). Even though we didn’t call _exit(0); explicitly, the C runtime library took care of that for us, by stuffing the return code of main() into _exit(2).

So, the lesson we need to remember at this point is that we have to issue at least two syscalls to the kernel:

  • write(0x1,0x400623,0xe); (the address of the string will be different though)
  • exit(0);
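Before we descend to assembler, the same two syscalls can be issued from C without the libc wrappers write() and _exit(), by going through the generic syscall(2) function. This is only a sketch: the names raw_write and raw_exit are invented for this example, and the syscall numbers are taken from <sys/syscall.h> rather than hard-coded.

```c
#define _DEFAULT_SOURCE 1     /* only needed by some non-FreeBSD C libraries */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>      /* SYS_write, SYS_exit */
#include <unistd.h>           /* syscall(2) */

/* raw_write() is a hypothetical name; it bypasses libc's write() wrapper. */
long
raw_write(int fd, const void *buf, size_t len)
{
  assert(buf != NULL);
  return (long)syscall(SYS_write, fd, buf, len);
}

/* raw_exit() is a hypothetical name; it asks the kernel to end the program. */
void
raw_exit(int code)
{
  syscall(SYS_exit, code);
  for (;;)
    ;                         /* not reached: the kernel does not return */
}
```

A main() consisting of raw_write(1, msg, 14); followed by raw_exit(0); should produce the same two CALL lines in kdump(1) as hello0 does. What syscall(2) itself does under the hood is exactly what we are about to find out.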

Discovering the FreeBSD/amd64 ABI

To discover the assembler code needed to call the 64-bit FreeBSD/amd64 kernel, we must run the following examples on this platform. It doesn’t matter if we run them on the bare metal (AMD, Intel) or in a virtualized environment like VirtualBox, but it matters greatly that we use FreeBSD/amd64 and not FreeBSD/i386. The following examples were tested on FreeBSD 8.0-STABLE as of revision r200471.

Disassembling with gdb

While a kdump(1) output is interesting, it still doesn’t show the assembler code. All we see in kdump(1) is the interaction of our program with the kernel, i.e. the ABI boundary between user space and kernel space. One tool that can prove helpful here to get a disassembly is the debugger gdb(1). Let’s start it with hello0 and set a breakpoint at function main():

% gdb --quiet ./hello0
(no debugging symbols found)...

(gdb) b main
Breakpoint 1 at 0x400574

(gdb) r
Starting program: /users/farid/Devel/ASM/hello0 
(no debugging symbols found)...(no debugging symbols found)...
Breakpoint 1, 0x0000000000400574 in main ()

Now that we’ve hit the breakpoint, we’re inside or just before a valid stack frame, so we can request a disassembly:

(gdb) disassemble
Dump of assembler code for function main:
0x0000000000400570 <main+0>:    push   %rbp
0x0000000000400571 <main+1>:    mov    %rsp,%rbp
0x0000000000400574 <main+4>:    sub    $0x10,%rsp
0x0000000000400578 <main+8>:    mov    %edi,0xfffffffffffffffc(%rbp)
0x000000000040057b <main+11>:    mov    %rsi,0xfffffffffffffff0(%rbp)
0x000000000040057f <main+15>:    mov    1048858(%rip),%rsi        # 0x5006a0 <msg>
0x0000000000400586 <main+22>:    mov    $0xe,%edx
0x000000000040058b <main+27>:    mov    $0x1,%edi
0x0000000000400590 <main+32>:    callq  0x40041c 
0x0000000000400595 <main+37>:    mov    $0x0,%eax
0x000000000040059a <main+42>:    leaveq 
0x000000000040059b <main+43>:    retq  

(... omitted some lines ...)

End of assembler dump.

If you look at the register names (%rbp, %rsi, …), or at the long offsets (0xfffffffffffffffc, …), you can easily infer that we’re currently running on a 64-bit FreeBSD/amd64 platform (x86-64). We’ll see shortly what the output looks like on a 32-bit FreeBSD/i386 platform. Right now, we’ll stay with FreeBSD/amd64 for a while, because it has an easier kernel ABI.

So what do we see here?

  • The arguments for write(2) are loaded into registers: the file descriptor into %edi, the pointer to the string into %rsi, and the length into %edx.
  • Then, there’s a call to an assembler subroutine write.

That’s interesting, but still, we don’t see how the kernel is called. Maybe the assembler subroutine write has more details? Okay, let’s trace this by setting a breakpoint at main+32, just before the callq instruction, and using the c(ontinue) command until we reach that call:

(gdb) b *0x400590
Breakpoint 2 at 0x400590

(gdb) c
Continuing.

Breakpoint 2, 0x0000000000400590 in main ()

The next step is to execute the single machine instruction callq to the write label with the stepi gdb command:

(gdb) stepi
0x000000000040041c in ?? ()

We’ve effectively jumped to the address 0x40041c, on our way to write. The double question marks mean that gdb(1) has no symbol for this address (you probably know that write(2) itself is defined in an external dynamic library, in this case /lib/libc.so.7, right?). But there’s no reason to be deterred by that… yet. Let’s disassemble the current stack frame:

(gdb) disassemble
No function contains program counter for selected frame.

So this is a dead end (for now). There may be some black magic or heavy wizardry to get gdb(1) to disassemble some code at this point, but I’m not aware of it (please comment on it if you know how to do it), and even if I were, I wouldn’t go into it any further to keep it simple.

All that we need to know now, is that we don’t get to see the assembler code of write with gdb(1), because /lib/libc.so.7 doesn’t contain debugging and symbol information. Okay, let’s finish our gdb(1) session, before we try another approach.

(gdb) q
The program is running.  Exit anyway? (y or n) y

On dynamic linking

We take a step back, and have a look at our binary hello0 again:

% file hello0
hello0: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD),
        dynamically linked (uses shared libs), for FreeBSD 8.0 (800108),
        not stripped

As we thought, it’s a dynamically linked program. ldd(1) confirms it:

% ldd hello0
hello0:
        libc.so.7 => /lib/libc.so.7 (0x800647000)

But that’s not very satisfying. objdump(1) tells us a lot more. If you’ve never heard of that tool before, call it without arguments or with --help to get a list of options.

So, what’s in hello0?

% objdump --file-headers hello0

hello0:     file format elf64-x86-64
architecture: i386:x86-64, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000000000400460

This tells us that the binary is an ELF file, more specifically an ELF file for the 64-bit x86-64 platform (also known as amd64). That’s true: we’re running on FreeBSD/amd64 right now. It also tells us (and the loader) the entry point (start address). The first thing the loader does after mapping this file into memory, is to point the instruction pointer of the current CPU to this address. That’s where the ball gets rolling.
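The start address is stored in the ELF header itself, right after the identification bytes and three small fields. As a rough illustration, the following C sketch extracts it from the first bytes of a file. The function name elf_entry is made up for this example, and it only handles little-endian (LSB) files like the ones in this tutorial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * elf_entry() is a hypothetical helper. Given the first bytes of
 * an ELF file, it stores the entry point (e_entry) in *entry.
 * Returns 0 on success, -1 if this is not an LSB ELF header.
 *
 * Layout used: e_ident[16], e_type (2 bytes), e_machine (2 bytes),
 * e_version (4 bytes), then e_entry at offset 24, which is 4 bytes
 * wide in ELF32 and 8 bytes wide in ELF64.
 */
int
elf_entry(const unsigned char *hdr, size_t len, uint64_t *entry)
{
  size_t width, i;
  uint64_t v = 0;

  assert(hdr != NULL && entry != NULL);
  if (len < 32)
    return -1;                      /* too short to hold e_entry */
  if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
    return -1;                      /* wrong magic number */
  if (hdr[5] != 1)
    return -1;                      /* not little-endian (LSB) */
  if (hdr[4] == 1)
    width = 4;                      /* ELFCLASS32 */
  else if (hdr[4] == 2)
    width = 8;                      /* ELFCLASS64 */
  else
    return -1;

  /* Assemble the little-endian value, most significant byte first. */
  for (i = width; i > 0; i--)
    v = (v << 8) | hdr[24 + i - 1];
  *entry = v;
  return 0;
}
```

Fed with the first 32 bytes of hello0, this should yield the same value that objdump(1) prints as the start address.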

Okay, that’s all nice and so, but it’s not enough. You may know that an ELF file contains a lot of sections: there’s e.g. a section for data (named .data), another for read-only data (named .rodata), then a section for executable code (named .text) and so on, and so forth. objdump(1) can print a list of those sections, including the type of data they contain. Try the --section-headers option if you feel so inclined.

The really interesting part here is a disassembly of all sections that contain CODE, i.e. op-codes and operands for the CPU. Maybe we can get the assembly code for write this way? You can try it, but you’ll be disappointed:

% objdump --disassemble hello0

(... long output, but no code for write function ...)

Why is that so? Well, remember that this code was not located in our dynamically linked program. It was located in /lib/libc.so.7, which we neither examined nor disassembled.

Static linking to the rescue

If we really wanted to look at the assembler code for the assembler function write, we could inspect /lib/libc.so.7, object file by object file, until we find this function, but this can get very tedious.

A lot easier would be to tell the C compiler to simply include all needed external (library) functions right into the output file, so that this file doesn’t depend on external libraries anymore. The cc(1) switch to accomplish this on FreeBSD (and on any other platform supported by the GNU C compiler) is -static:

% cc -static -o hello0 hello0.c

% file hello0
hello0: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD),
        statically linked, for FreeBSD 8.0 (800108), not stripped

% ldd hello0
ldd: hello0: not a dynamic ELF executable

Alright. Let’s test it:

% ./hello0
hello, world!

Disassembling the static code with gdb

Just as we did before with the dynamic version of hello0, let’s disassemble this static file hello0 with gdb(1):

% gdb --quiet ./hello0
(no debugging symbols found)...

(gdb) b main
Breakpoint 1 at 0x400264

(gdb) r
Starting program: /users/farid/Devel/ASM/hello0 

Breakpoint 1, 0x0000000000400264 in main ()

(gdb) disassemble
Dump of assembler code for function main:
0x0000000000400260 <main+0>:    push   %rbp
0x0000000000400261 <main+1>:    mov    %rsp,%rbp
0x0000000000400264 <main+4>:    sub    $0x10,%rsp
0x0000000000400268 <main+8>:    mov    %edi,0xfffffffffffffffc(%rbp)
0x000000000040026b <main+11>:   mov    %rsi,0xfffffffffffffff0(%rbp)
0x000000000040026f <main+15>:   mov    1220002(%rip),%rsi        # 0x52a018 <msg>
0x0000000000400276 <main+22>:   mov    $0xe,%edx
0x000000000040027b <main+27>:   mov    $0x1,%edi
0x0000000000400280 <main+32>:   callq  0x41a080 <write>
0x0000000000400285 <main+37>:   mov    $0x0,%eax
0x000000000040028a <main+42>:   leaveq 
0x000000000040028b <main+43>:   retq   

(... omitted some lines ...)
End of assembler dump.

Unsurprisingly, this “static” main subroutine is the same as the “dynamic” one we saw before. Again, we see how the arguments for write are loaded into %edi, %rsi, and %edx. We carefully execute code until we reach the callq instruction, by setting another breakpoint at main+32 and continuing execution:

(gdb) b *0x400280
Breakpoint 2 at 0x400280

(gdb) c
Continuing.

Breakpoint 2, 0x0000000000400280 in main ()

At this point, we execute just one (callq) instruction to step into the new assembler function write:

(gdb) stepi
0x000000000041a080 in write ()

Ah, no more ?? here, but a real function name. That’s encouraging. Could we get a disassembly of this new function? Let’s try it:

(gdb) disassemble
Dump of assembler code for function write:
0x000000000041a080 <write+0>:   mov    $0x4,%rax
0x000000000041a087 <write+7>:   mov    %rcx,%r10
0x000000000041a08a <write+10>:  syscall 
0x000000000041a08c <write+12>:  jb     0x41a08f <write+15>
0x000000000041a08e <write+14>:  retq   
0x000000000041a08f <write+15>:  jmpq   0x41a2d4 <.cerror>

(... omitted some lines ...)
End of assembler dump.

Interesting. We’ll learn later that syscall jumps right into the kernel, and the kernel will execute the syscall with the ID saved in register %rax (in this case the WRITE syscall). Just to prove us right, we set a breakpoint near retq, and continue executing:

(gdb) b *0x41a08e
Breakpoint 3 at 0x41a08e

(gdb) c
Continuing.
hello, world!

Breakpoint 3, 0x000000000041a08e in write ()

Aha, the syscall instruction was indeed the point where the kernel was called. Can you see the output string?

To finish this debugging session, we jump right out of the function write by stepping a single (retq) instruction:

(gdb) stepi
0x0000000000400285 in main ()

We’re back in main now.

Disassembling the static code with objdump

Another way to disassemble the static hello0 binary is with objdump(1)’s --disassemble flag. Omitting a lot of irrelevant sections and functions, we get this for main:

0000000000400260 <main>:
  400260:       55                      push   %rbp
  400261:       48 89 e5                mov    %rsp,%rbp
  400264:       48 83 ec 10             sub    $0x10,%rsp
  400268:       89 7d fc                mov    %edi,0xfffffffffffffffc(%rbp)
  40026b:       48 89 75 f0             mov    %rsi,0xfffffffffffffff0(%rbp)
  40026f:       48 8b 35 a2 9d 12 00    mov    1220002(%rip),%rsi        # 52a018 <msg>
  400276:       ba 0e 00 00 00          mov    $0xe,%edx
  40027b:       bf 01 00 00 00          mov    $0x1,%edi
  400280:       e8 fb 9d 01 00          callq  41a080 <__sys_write>
  400285:       b8 00 00 00 00          mov    $0x0,%eax
  40028a:       c9                      leaveq 
  40028b:       c3                      retq

and the following code for __sys_write (located at 0x41a080, the same address that gdb(1) labeled write):

000000000041a080 <__sys_write>:
  41a080:       48 c7 c0 04 00 00 00    mov    $0x4,%rax
  41a087:       49 89 ca                mov    %rcx,%r10
  41a08a:       0f 05                   syscall 
  41a08c:       72 01                   jb     41a08f <__sys_write+0xf>
  41a08e:       c3                      retq   
  41a08f:       e9 40 02 00 00          jmpq   41a2d4 <.cerror>
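To see the same mechanism from the C side, here is a hedged sketch that loads the registers and issues the syscall instruction through GCC-style inline assembly (an assumption: your compiler must support this extension). The function name raw_write_asm is invented for this example. The syscall number is taken from <sys/syscall.h> instead of being hard-coded, error handling is omitted entirely, and on non-amd64 machines the code simply falls back to syscall(2).

```c
#define _DEFAULT_SOURCE 1     /* only needed by some non-FreeBSD C libraries */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>      /* SYS_write */
#include <unistd.h>

/*
 * raw_write_asm() is a hypothetical helper, not part of libc.
 * On amd64 it puts the syscall ID in %rax, the three arguments
 * in %rdi, %rsi and %rdx, and executes the syscall instruction.
 * The result comes back in %rax. No error handling is done here.
 */
long
raw_write_asm(int fd, const void *buf, size_t len)
{
  assert(buf != NULL);
#if defined(__x86_64__)
  long ret;
  size_t rdx = len;           /* %rdx is declared in/out, to be safe */

  __asm__ volatile ("syscall"
                    : "=a" (ret), "+d" (rdx)
                    : "0" ((long)SYS_write), "D" ((long)fd), "S" (buf)
                    /* syscall destroys %rcx and %r11; the other
                       registers are listed as a precaution. */
                    : "rcx", "r8", "r9", "r10", "r11", "memory", "cc");
  return ret;
#else
  /* Not on amd64: let the C library issue the syscall for us. */
  return (long)syscall(SYS_write, fd, buf, len);
#endif
}
```

Compare the operand list with the disassembly of main above: "D" is %rdi (file descriptor), "S" is %rsi (pointer to the string), "d" is %rdx (length), and "a" is %rax (syscall ID).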

Hello, World in FreeBSD/amd64 Assembler

All that’s left to do is to put the disassembled code we’ve gathered from gdb(1) or objdump(1) into a file:

// hello_amd64.S -- hello world in GNU ASM (GAS). FreeBSD/amd64 version.

// Some constants
.equiv STDOUT, 1                /* file descriptor for stdout */
.equiv RETVAL, 0                /* argument for exit() */
.equiv SYS_WRITE, 4             /* WRITE syscall ID */
.equiv SYS_EXIT, 1              /* EXIT syscall ID */
        
// Data section (read-only)
        .section        .rodata
msg:
        .string "hello, world!\n"
        len = . - msg
        
// Code section
        .text
        .global _start
_start:

        /*
         * According to http://www.x86-64.org/documentation/abi.pdf
         * page 124 (A.2.1), the arguments for a LINUX syscall are
         * passed in the following regs:
         *   %rdi, %rsi, %rdx, %r10, %r8 and %r9
         * One has to issue a syscall instruction,
         * which destroys %rcx.
         * %rax contains the number of the syscall.
         * Returning from syscall, register %rax contains the result.
         *
         * Note that FreeBSD/amd64 seems to follow the LINUX convention,
         * unlike FreeBSD/i386 which uses the stack for its arguments.
         */

        // Call write(STDOUT,msg,len);
        movl    $len,%edx
        movq    $msg,%rsi
        movl    $STDOUT,%edi
        
        movq    $SYS_WRITE,%rax
        movq    %rcx,%r10         /* syscall destroys %rcx! */
        syscall
        
bye:
        // Call exit(0);
        movl    $RETVAL,%edi

        movq    $SYS_EXIT,%rax
        movq    %rcx,%r10         /* syscall destroys %rcx! */
        syscall

We stuffed the string “hello, world!\n” into the .rodata section of the ELF file, because it is read-only data (we don’t intend to modify it), at the label msg.

Instead of hard coding the length 14 into a variable, we’ve calculated len with the as(1) expression len = . - msg, where dot refers to the current location pointer. This way, we could change the string in the source file, without having to adjust len accordingly.

Note well that len is now 15, instead of 14, because the as(1) directive .string automatically \0-terminates the string, i.e. it appends a NUL byte to the 14 characters. We’ll see this below in the kdump output.
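C programmers know this off-by-one from the difference between sizeof and strlen(3) on a string literal. The sketch below (with made-up function names len_with_nul and len_text) shows both values for our message. If the trailing NUL byte is unwanted in the assembler version, as(1) also offers the .ascii directive, which emits the string without a terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char msg[] = "hello, world!\n";

/* Size of the array, including the terminating NUL: like .string */
size_t
len_with_nul(void)
{
  return sizeof(msg);
}

/* Number of characters before the NUL: like .ascii */
size_t
len_text(void)
{
  assert(msg[sizeof(msg) - 1] == '\0');   /* the last byte is the NUL */
  return strlen(msg);
}
```

Here len_with_nul() corresponds to the 15 bytes our assembler program writes, and len_text() to the 14 bytes written by hello0.c.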

The code itself goes into the .text section, as is customary for ELF files. Furthermore, the linker ld(1) needs an entry point, i.e. a label somewhere in the code, that is supposed to be executed first when the code is loaded into memory. If we don’t want to specify the entry point as an option to ld(1), we can use the label _start, which is the default entry point. ld(1) will generate an ELF file with a header that points (among other things) to this specific entry point.

Testing hello_amd64.S

We start by assembling this file into an object file, and linking that object file to an ELF file:

% as --64 -o hello_amd64.o hello_amd64.S
% ld -o hello_amd64 hello_amd64.o

Without further ado, let’s try it:

% ./hello_amd64
hello, world!

How about tracing it?

% ktrace ./hello_amd64
hello, world!

% kdump
  1885 ktrace   RET   ktrace 0
  1885 ktrace   CALL  execve(0x7fffffffe967,0x7fffffffe5f8,0x7fffffffe608)
  1885 ktrace   NAMI  "./hello_amd64"
  1885 hello_amd64 RET   execve 0
  1885 hello_amd64 CALL  write(0x1,0x4000de,0xf)
  1885 hello_amd64 GIO   fd 1 wrote 15 bytes
       "hello, world!
        \0"
  1885 hello_amd64 RET   write 15/0xf
  1885 hello_amd64 CALL  exit(0)

Now ain’t that sweet?

Can we learn more?

% file hello_amd64
hello_amd64: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD),
             statically linked, not stripped

Of course, it’s a static program.

And we can disassemble it with objdump(1) too:

% objdump --disassemble hello_amd64

Disassembly of section .text:

00000000004000b0 <_start>:
  4000b0:       ba 0f 00 00 00          mov    $0xf,%edx
  4000b5:       48 c7 c6 de 00 40 00    mov    $0x4000de,%rsi
  4000bc:       bf 01 00 00 00          mov    $0x1,%edi
  4000c1:       48 c7 c0 04 00 00 00    mov    $0x4,%rax
  4000c8:       49 89 ca                mov    %rcx,%r10
  4000cb:       0f 05                   syscall

00000000004000cd <bye>:
  4000cd:       bf 00 00 00 00          mov    $0x0,%edi
  4000d2:       48 c7 c0 01 00 00 00    mov    $0x1,%rax
  4000d9:       49 89 ca                mov    %rcx,%r10
  4000dc:       0f 05                   syscall

And it is a truly minimalistic ELF file:

% objdump --full-contents hello_amd64

Contents of section .text:
 4000b0 ba0f0000 0048c7c6 de004000 bf010000  .....H....@.....
 4000c0 0048c7c0 04000000 4989ca0f 05bf0000  .H......I.......
 4000d0 000048c7 c0010000 004989ca 0f05      ..H......I....  

Contents of section .rodata:
 4000de 68656c6c 6f2c2077 6f726c64 210a00    hello, world!..

There are four sections, but section .data and section .bss are empty (size 0):

% objdump --section-headers hello_amd64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         0000002e  00000000004000b0  00000000004000b0  000000b0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       0000000f  00000000004000de  00000000004000de  000000de  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .data         00000000  00000000005000f0  00000000005000f0  000000f0  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  3 .bss          00000000  00000000005000f0  00000000005000f0  000000f0  2**2
                  ALLOC

Discovering the FreeBSD/i386 ABI

The following section will be a lot shorter than the previous one, because now, we know what to do. So we fire up FreeBSD/i386, either on bare metal or in an emulator, and compile hello0.c with the -static flag. Then, a gdb(1) session will help us understand what’s going on.

Disassembling with gdb

$ gdb --quiet ./hello0
(no debugging symbols found)...

(gdb) b main
Breakpoint 1 at 0x80481f0

(gdb) r
Starting program: /users/farid/hello0 

Breakpoint 1, 0x080481f0 in main ()

Alright, let’s disassemble main:

(gdb) disassemble
Dump of assembler code for function main:
0x080481f0 <main+0>:    lea    0x4(%esp),%ecx
0x080481f4 <main+4>:    and    $0xfffffff0,%esp
0x080481f7 <main+7>:    pushl  0xfffffffc(%ecx)
0x080481fa <main+10>:   push   %ebp
0x080481fb <main+11>:   mov    %esp,%ebp
0x080481fd <main+13>:   push   %ecx
0x080481fe <main+14>:   sub    $0x14,%esp
0x08048201 <main+17>:   mov    0x807000c,%eax
0x08048206 <main+22>:   movl   $0xe,0x8(%esp)
0x0804820e <main+30>:   mov    %eax,0x4(%esp)
0x08048212 <main+34>:   movl   $0x1,(%esp)
0x08048219 <main+41>:   call   0x8061088 <write>
0x0804821e <main+46>:   mov    $0x0,%eax
0x08048223 <main+51>:   add    $0x14,%esp
0x08048226 <main+54>:   pop    %ecx
0x08048227 <main+55>:   pop    %ebp
0x08048228 <main+56>:   lea    0xfffffffc(%ecx),%esp
0x0804822b <main+59>:   ret

Unlike FreeBSD/amd64, FreeBSD/i386’s kernel ABI requires us to push the arguments of write onto the stack, in reverse order:

0x08048201 <main+17>:   mov    0x807000c,%eax
0x08048206 <main+22>:   movl   $0xe,0x8(%esp)
0x0804820e <main+30>:   mov    %eax,0x4(%esp)
0x08048212 <main+34>:   movl   $0x1,(%esp)

Calling the function write:

(gdb) b *0x08048219
Breakpoint 2 at 0x8048219

(gdb) c
Continuing.

Breakpoint 2, 0x08048219 in main ()

(gdb) stepi
0x08061088 in write ()

We’re now in write, so let’s disassemble that:

(gdb) disassemble
Dump of assembler code for function write:
0x08061088 <write+0>:   mov    $0x4,%eax
0x0806108d <write+5>:   int    $0x80
0x0806108f <write+7>:   jb     0x8061080 <close+12>
0x08061091 <write+9>:   ret

Aha, the kernel is invoked in a different way. Instead of the syscall instruction, we call int $0x80. The 4 in %eax is (again) the syscall ID for the WRITE kernel syscall. Let’s make sure that’s the point where the string “hello, world!” is output:

(gdb) b *0x08061091
Breakpoint 3 at 0x8061091

(gdb) c
Continuing.
hello, world!

Breakpoint 3, 0x08061091 in write ()

(gdb) c
Continuing.

Program exited normally.

(gdb) q

That’s all we need to know to write a bare-bones i386 hello world program.

Hello, World in FreeBSD/i386 Assembler

Okay, here’s the program:

// hello_i386.S -- hello world in GNU ASM (GAS). FreeBSD/i386 version.

// Some constants
.equiv STDOUT, 1                /* file descriptor for stdout */
.equiv RETVAL, 0                /* argument for exit() */
.equiv SYS_WRITE, 4             /* WRITE syscall ID */
.equiv SYS_EXIT, 1              /* EXIT syscall ID */
.equiv DUMMY, 0xdeadbeef        /* Used for stack padding */
        
// Data section (read-only)
        .section .rodata
        .align 4
msg:
        .string "hello, world!\n"
        len = . - msg
        
// Code section
        .text
        .align 4
        .global _start
_start:
        // Setup a new stack frame
        push    %ebp
        movl    %esp,%ebp
        sub     $0x14,%esp

        // Prepare arguments of write(STDOUT,msg,len);
        movl    $len,0x8(%esp)
        movl    $msg,0x4(%esp)
        movl    $STDOUT,(%esp)

        // WRITE SYSCALL
        pushl   $DUMMY
        movl    $SYS_WRITE,%eax
        int     $0x80
        popl    %ebx                /* pop DUMMY from stack */
        
        // Unwind stack frame
        add     $0x14,%esp
        pop     %ebp
        
bye:
        // Setup a new stack frame
        push    %ebp
        movl    %esp,%ebp
        sub     $0x8,%esp

        // Prepare arguments of exit(RETVAL);
        movl    $RETVAL,(%esp)

        // EXIT SYSCALL
        pushl   $DUMMY
        movl    $SYS_EXIT,%eax
        int     $0x80
        popl    %ebx                 /* NOTREACHED: pop DUMMY from stack */

        // NOTREACHED: No need to unwind stack frame
        add     $0x8,%esp
        pop     %ebp

Note the dummy on the stack? This is because we didn’t want to put the call to the WRITE syscall in an assembler function of its own (like it used to be in __sys_write), so we needed to shift the stack somewhat to make room for a (fictitious) return address. We’ve used the catchy steak^W0xdeadbeef as a placeholder, but we must ensure that we pop it from the stack right after the syscall returns. We burn one register for this: %ebx.
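If the stack layout is hard to picture, the following toy model in C may help. It is only a model: an array of four 32-bit slots stands in for the memory starting at %esp, and the function name build_write_frame is invented for this illustration. It shows what the kernel expects to find just before int $0x80 is executed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DUMMY 0xdeadbeefu     /* placeholder for the return address */

/*
 * build_write_frame() is a hypothetical helper that fills a model
 * of the stack, lowest address first (frame[0] is where %esp points).
 */
void
build_write_frame(uint32_t frame[4], uint32_t fd, uint32_t buf, uint32_t len)
{
  assert(frame != NULL);
  frame[0] = DUMMY;           /*   (%esp): fake return address    */
  frame[1] = fd;              /*  4(%esp): 1st argument of write  */
  frame[2] = buf;             /*  8(%esp): 2nd argument of write  */
  frame[3] = len;             /* 12(%esp): 3rd argument of write  */
}
```

The three movl instructions in _start fill slots 1 to 3 (at that moment still at offsets 0, 4 and 8), and the pushl $DUMMY then moves %esp down by four bytes, so that the dummy ends up in slot 0.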

Testing hello_i386.S

All that remains to be done is to test this program on a 32-bit platform, just as we did earlier on the 64-bit platform.

$ as --32 -o hello_i386.o hello_i386.S
$ ld -o hello_i386 hello_i386.o

Tracing hello_i386

Again, let’s try it:

$ ./hello_i386
hello, world!

Tracing could be useful too:

$ ktrace ./hello_i386
hello, world!

$ kdump
 47541 ktrace   RET   ktrace 0
 47541 ktrace   CALL  execve(0xbfbfedc7,0xbfbfeca4,0xbfbfecac)
 47541 ktrace   NAMI  "./hello_i386"
 47541 hello_i386 RET   execve 0
 47541 hello_i386 CALL  write(0x1,0x80480c0,0xf)
 47541 hello_i386 GIO   fd 1 wrote 15 bytes
       "hello, world!
        \0"
 47541 hello_i386 RET   write 15/0xf
 47541 hello_i386 CALL  exit(0)

Okay, it looks just like it did under FreeBSD/amd64.

Disassembling with objdump

Learning more? Sure:

$ file hello_i386
hello_i386: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD),
            statically linked, not stripped

But this, we knew already.

Disassembling with objdump(1)? I hear ya!

$ objdump --disassemble hello_i386
hello_i386:     file format elf32-i386-freebsd

Disassembly of section .text:

08048074 <_start>:
 8048074:       55                      push   %ebp
 8048075:       89 e5                   mov    %esp,%ebp
 8048077:       83 ec 14                sub    $0x14,%esp
 804807a:       c7 44 24 08 0f 00 00    movl   $0xf,0x8(%esp)
 8048081:       00 
 8048082:       c7 44 24 04 c0 80 04    movl   $0x80480c0,0x4(%esp)
 8048089:       08 
 804808a:       c7 04 24 01 00 00 00    movl   $0x1,(%esp)
 8048091:       68 ef be ad de          push   $0xdeadbeef
 8048096:       b8 04 00 00 00          mov    $0x4,%eax
 804809b:       cd 80                   int    $0x80
 804809d:       5b                      pop    %ebx
 804809e:       83 c4 14                add    $0x14,%esp
 80480a1:       5d                      pop    %ebp

080480a2 <bye>:
 80480a2:       55                      push   %ebp
 80480a3:       89 e5                   mov    %esp,%ebp
 80480a5:       83 ec 08                sub    $0x8,%esp
 80480a8:       c7 04 24 00 00 00 00    movl   $0x0,(%esp)
 80480af:       68 ef be ad de          push   $0xdeadbeef
 80480b4:       b8 01 00 00 00          mov    $0x1,%eax
 80480b9:       cd 80                   int    $0x80
 80480bb:       5b                      pop    %ebx
 80480bc:       83 c4 08                add    $0x8,%esp
 80480bf:       5d                      pop    %ebp

It is again a truly minimalistic ELF file:

$ objdump --full-contents hello_i386

hello_i386:     file format elf32-i386-freebsd

Contents of section .text:
 8048074 5589e583 ec14c744 24080f00 0000c744  U......D$......D
 8048084 2404c080 0408c704 24010000 0068efbe  $.......$....h..
 8048094 addeb804 000000cd 805b83c4 145d5589  .........[...]U.
 80480a4 e583ec08 c7042400 00000068 efbeadde  ......$....h....
 80480b4 b8010000 00cd805b 83c4085d           .......[...]    
Contents of section .rodata:
 80480c0 68656c6c 6f2c2077 6f726c64 210a00    hello, world!..

There are four sections here, two of them empty:

$ objdump --section-headers hello_i386

hello_i386:     file format elf32-i386-freebsd

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000004c  08048074  08048074  00000074  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       0000000f  080480c0  080480c0  000000c0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .data         00000000  080490d0  080490d0  000000d0  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  3 .bss          00000000  080490d0  080490d0  000000d0  2**2
                  ALLOC

Running hello_i386 under gdb

Unlike on FreeBSD/amd64, we do want to run this program in a gdb(1) session, so we can examine the stack and the registers just before the WRITE and EXIT syscalls:

$ gdb --quiet ./hello_i386
(no debugging symbols found)...

(gdb) b _start
Breakpoint 1 at 0x804807a

(gdb) r
Starting program: /users/farid/hello_i386 

Breakpoint 1, 0x0804807a in _start ()

We’re now ready to examine _start:

(gdb) disassemble
Dump of assembler code for function _start:
0x08048074 <_start+0>:  push   %ebp
0x08048075 <_start+1>:  mov    %esp,%ebp
0x08048077 <_start+3>:  sub    $0x14,%esp
0x0804807a <_start+6>:  movl   $0xf,0x8(%esp)
0x08048082 <_start+14>: movl   $0x80480c0,0x4(%esp)
0x0804808a <_start+22>: movl   $0x1,(%esp)
0x08048091 <_start+29>: push   $0xdeadbeef
0x08048096 <_start+34>: mov    $0x4,%eax
0x0804809b <_start+39>: int    $0x80
0x0804809d <_start+41>: pop    %ebx
0x0804809e <_start+42>: add    $0x14,%esp
0x080480a1 <_start+45>: pop    %ebp
End of assembler dump.

We want to look at the stack, just before executing int $0x80, i.e. just before we call the kernel. So we set a breakpoint there, and continue executing until we reach that point:

(gdb) b *0x804809b
Breakpoint 2 at 0x804809b

(gdb) c
Continuing.

Breakpoint 2, 0x0804809b in _start ()

Let’s first make sure that %eax contains 4, the syscall ID of write:

(gdb) info registers
eax            0x4      4
ecx            0x0      0
edx            0x0      0
ebx            0x0      0
esp            0xbfbfec3c       0xbfbfec3c
ebp            0xbfbfec54       0xbfbfec54
esi            0x0      0
edi            0x0      0
eip            0x804809b        0x804809b
eflags         0x282    642
cs             0x33     51
ss             0x3b     59
ds             0x3b     59
es             0x3b     59
fs             0x3b     59
gs             0x3b     59

Yup, %eax is set to 4. How about the stack? We look at 4 double words, starting at %esp (which is here, according to info registers equal to 0xbfbfec3c):

(gdb) x/4x 0xbfbfec3c
0xbfbfec3c:     0xdeadbeef      0x00000001      0x080480c0      0x0000000f

We see here the dummy 0xdeadbeef, then the write arguments (1 is stdout, 0x080480c0 is probably the string, and 0xf is the length of the string). To confirm that the 3rd double word points to the string, we examine that too:

(gdb) x/s 0x080480c0
0x80480c0 <msg>:         "hello, world!\n"

Very good.

We continue executing until we reach the EXIT syscall:

(gdb) b bye
Breakpoint 3 at 0x80480a8

(gdb) c
Continuing.
hello, world!

Breakpoint 3, 0x080480a8 in bye ()

Okay, let’s disassemble that too:

(gdb) disassemble
Dump of assembler code for function bye:
0x080480a2 <bye+0>:     push   %ebp
0x080480a3 <bye+1>:     mov    %esp,%ebp
0x080480a5 <bye+3>:     sub    $0x8,%esp
0x080480a8 <bye+6>:     movl   $0x0,(%esp)
0x080480af <bye+13>:    push   $0xdeadbeef
0x080480b4 <bye+18>:    mov    $0x1,%eax
0x080480b9 <bye+23>:    int    $0x80
0x080480bb <bye+25>:    pop    %ebx
0x080480bc <bye+26>:    add    $0x8,%esp
0x080480bf <bye+29>:    pop    %ebp
End of assembler dump.

So we need to add a breakpoint at bye+23, just before invoking the kernel with the EXIT syscall:

(gdb) b *0x080480b9
Breakpoint 4 at 0x80480b9

(gdb) c
Continuing.

Breakpoint 4, 0x080480b9 in bye ()

The registers should show that %eax contains 1 (the syscall ID of the EXIT syscall), and the stack should contain a dummy and 0 (because we want _exit(2) to exit the program with a return code of 0). Really?

(gdb) info registers
eax            0x1      1
ecx            0x0      0
edx            0x0      0
ebx            0xdeadbeef       -559038737
esp            0xbfbfec48       0xbfbfec48
ebp            0xbfbfec54       0xbfbfec54
esi            0x0      0
edi            0x0      0
eip            0x80480b9        0x80480b9
eflags         0x292    658
cs             0x33     51
ss             0x3b     59
ds             0x3b     59
es             0x3b     59
fs             0x3b     59
gs             0x3b     59

%eax is correct. And what’s in the stack at %esp?

(gdb) x/2x 0xbfbfec48
0xbfbfec48:     0xdeadbeef      0x00000000

Okay. The instructions after int $0x80 (i.e. at bye+25 and below) are never reached: the program exits as soon as the syscall is made. We confirm this by executing just a single instruction with gdb(1)’s stepi command:

(gdb) stepi

Program exited normally.
(gdb) q

Summary

In this tutorial, you’ve learned how to write a bare-bones assembler program, starting with nothing more than the assembler output of a C compiler. FreeBSD/amd64’s kernel ABI requires us to put the arguments of a syscall in specific registers, and then to execute the syscall machine instruction. FreeBSD/i386’s kernel ABI uses the stack instead, and the kernel is invoked with int $0x80. In both cases, we are required to put the syscall ID in %rax or %eax, respectively. We also need to call the EXIT syscall to exit the program cleanly, avoiding a core dump.
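
The idea common to both ABIs (a syscall ID plus its arguments) can also be seen from C, without writing any assembly. The following is only a sketch: it assumes your system provides the generic syscall(2) function and the SYS_write constant in <sys/syscall.h>, and the helper name raw_write is made up for this example.

```c
#include <sys/syscall.h>  /* SYS_write, the syscall ID of WRITE */
#include <unistd.h>       /* syscall(2) */

/* Invoke WRITE by its syscall ID instead of through the write(2) wrapper.
 * Returns the number of bytes written, or -1 on error. */
static long
raw_write(int fd, const void *buf, unsigned long nbytes)
{
    /* libc places the ID and the arguments wherever the platform's
     * kernel ABI wants them, then enters the kernel. */
    return (long)syscall(SYS_write, fd, buf, nbytes);
}
```

Calling raw_write(1, "hello, world!\n", 14) from main() should behave like the write(2) call in hello0.c; disassembling it is one more way to watch the syscall ID travel into the register.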

We’ve used cc(1)’s -static flag to create a binary that doesn’t use dynamic shared libraries, which makes disassembly easier, both with gdb(1) and objdump(1). Learning from the disassembled code, we finally looked at a bare-bones hello world program in assembly language in hello_amd64.S and hello_i386.S (see above). Assembling those with as(1) and linking them into regular ELF files with ld(1) was trivial.

If you want to dive deeper into assembly language programming, try to write a program that outputs “hello, world!\n” into a file. Discover the syscall IDs of OPEN, WRITE, CLOSE and EXIT, and how they are supposed to be called on your platform (i386 or amd64). Then write the bare-bones assembly program(s) for that. Update 2010/02/05: here’s the solution for FreeBSD/amd64 and FreeBSD/i386.
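
As with the rest of this tutorial, a portable C version is a convenient starting point for that exercise: compile it, look at the generated assembler, and trim from there. Here is one possible sketch; the function name hello_to_file, the flags, and the file mode 0644 are choices made for this example, not something prescribed by the exercise.

```c
#include <fcntl.h>   /* open(2) and the O_* flags */
#include <unistd.h>  /* write(2), close(2) */

/* Write "hello, world!\n" into the file at path, using OPEN, WRITE
 * and CLOSE. Returns 0 on success, -1 on any error. */
static int
hello_to_file(const char *path)
{
    static const char msg[] = "hello, world!\n";
    ssize_t n;
    int fd;

    /* Create the file if needed, truncate it otherwise. */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    n = write(fd, msg, sizeof msg - 1);  /* 14 bytes, without the NUL */

    if (close(fd) != 0 || n != (ssize_t)(sizeof msg - 1))
        return -1;
    return 0;
}
```

A main() that calls hello_to_file() with a path of your choice and then returns (or calls _exit(2)) gives you all four syscalls to hunt for in the disassembly.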

You may also want to try what you’ve learned in this tutorial on the 32-bit and 64-bit versions of Linux and OpenSolaris, both of which have different kernel ABIs.

Update 2010/02/07: Just a quick note: hello_amd64.S is compatible with OpenSolaris/amd64, running in VirtualBox in 64-bit mode. Even though it is not possible (anymore?) to statically link files on OpenSolaris, the gdb(1) disassembly of “write” and “__write” (getting there takes quite some time and a lot of stepi) shows that OpenSolaris uses the x86-64 ABI convention. It also happens that the SYS_WRITE ID is the same as under FreeBSD (SunOS’s BSD heritage shows!), so we can use hello_amd64.S unchanged!

To assemble and link:

$ as --64 -o hello_amd64.o hello_amd64.S
$ /usr/ccs/bin/ld -o hello_amd64 hello_amd64.o

Note that the Solaris linker creates 64-bit binaries if the first object file (here hello_amd64.o) is 64-bit, and 32-bit binaries otherwise:

$ file hello_amd64
hello_amd64: ELF 64-bit LSB executable AMD64 Version 1, dynamically linked,
not stripped, no debugging information available

$ ./hello_amd64
hello, world!

Update 2010/02/05: If you are already familiar with the Intel syntax (e.g. of NASM), you may be confused by the somewhat archaic AT&T syntax of GAS. You may want to read a side by side comparison of Intel and AT&T syntax to get up to speed.

Update 2010/05/25: I just finished a tutorial about writing Hello, World on the Bare Metal (Real Mode); an environment where no operating system kernel is available.

Happy hacking!
