x86 Assembly

If you don't assemble the (assembly) code, it's complete gibberish to the computer. Wikibooks

Introduction

Why Learn Assembly?

Basic FAQ

How Computer Reads Assembly

The computer cannot read the assembly language that you write. Your assembler will convert the assembly language into a form of binary information called "machine code" that your computer uses to perform its operations.

Platform differences

The basic x86 machine code is dependent only on the processor. The x86 versions of Windows and Linux are obviously built on the x86 machine code. There are a few differences between Linux and Windows programming in x86 Assembly:

  1. On a Linux computer, the most popular assemblers are the GNU Assembler (GAS), which uses the AT&T syntax for writing code, and the Netwide Assembler, also known as NASM, which uses a syntax similar to MASM.
  2. On a Windows computer, the most popular assembler is MASM, which uses the Intel syntax.
  3. The available software interrupts, and their functions, are different on Windows and Linux.
  4. The available code libraries are different on Windows and Linux.

x86 Family

The term "x86" can refer both to an instruction set architecture and to microprocessors which implement it. The name x86 is derived from the fact that many of Intel's early processors had names ending in "86".

The x86 instruction set architecture originated at Intel and has evolved over time by the addition of new instructions as well as the expansion to 64 bits. As of 2009, x86 primarily refers to IA-32 (Intel Architecture, 32-bit) and/or x86-64, the extension to 64-bit computing.

Versions of the x86 instruction set architecture have been implemented by Intel, AMD and several other vendors, with each vendor having its own family of x86 processors.

x86 Architecture

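The x86 architecture has 8 General-Purpose Registers (GPR), 6 Segment Registers, 1 Flags Register and an Instruction Pointer. 64-bit x86 has additional registers.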

General-Purpose Registers (GPR) (32-bit naming conventions)

The 8 GPRs are:

  1. Accumulator register (AX). Used in arithmetic operations.
  2. Counter register (CX). Used in shift/rotate instructions and loops.
  3. Data register (DX). Used in arithmetic operations and I/O operations.
  4. Base register (BX). Used as a pointer to data (located in segment register DS, when in segmented mode).
  5. Stack Pointer register (SP). Pointer to the top of the stack.
  6. Stack Base Pointer register (BP). Used to point to the base of the stack.
  7. Source Index register (SI). Used as a pointer to a source in stream operations.
  8. Destination Index register (DI). Used as a pointer to a destination in stream operations.

The order in which they are listed here is for a reason: it is the same order that is used in a push-to-stack operation, which will be covered later.

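All registers can be accessed in 16-bit, 32-bit and 64-bit modes: the 16-bit form uses the two-letter abbreviation from the list above (for example AX), the 32-bit form prefixes it with an E (EAX), and the 64-bit form prefixes it with an R instead (RAX).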

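It is also possible to address the first four registers (AX, CX, DX and BX) in their 16-bit size as two 8-bit halves: the least significant byte (LSB), or low half, is named by replacing the X with an L, and the most significant byte (MSB), or high half, is named by replacing the X with an H.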

For example, CL is the LSB of the counter register, whereas CH is its MSB.

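The following table summarizes five ways to access the accumulator, counter, data and base registers: 64-bit, 32-bit, 16-bit, 8-bit MSB, and 8-bit LSB:

+-----------+-------------+---------+------+------+
| Access    | Accumulator | Counter | Data | Base |
+-----------+-------------+---------+------+------+
| 64-bit    | RAX         | RCX     | RDX  | RBX  |
+-----------+-------------+---------+------+------+
| 32-bit    | EAX         | ECX     | EDX  | EBX  |
+-----------+-------------+---------+------+------+
| 16-bit    | AX          | CX      | DX   | BX   |
+-----------+-------------+---------+------+------+
| 8-bit MSB | AH          | CH      | DH   | BH   |
+-----------+-------------+---------+------+------+
| 8-bit LSB | AL          | CL      | DL   | BL   |
+-----------+-------------+---------+------+------+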
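The relationship between a 16-bit register and its two 8-bit halves can be modelled in C. This is only an illustrative sketch: the helper names low_half, high_half and set_low_half are invented for this example, and a uint16_t stands in for a register such as CX.

```c
#include <stdint.h>

/* Hypothetical helpers; `reg` stands in for a 16-bit register such as CX. */

/* Low half (CL): the least significant byte. */
uint8_t low_half(uint16_t reg) { return (uint8_t)(reg & 0xFFu); }

/* High half (CH): the most significant byte. */
uint8_t high_half(uint16_t reg) { return (uint8_t)(reg >> 8); }

/* Writing the low half, like "mov cl, value", leaves the high half unchanged. */
uint16_t set_low_half(uint16_t reg, uint8_t value)
{
    return (uint16_t)((reg & 0xFF00u) | value);
}
```

With CX holding 0x1234, low_half gives 0x34 (CL) and high_half gives 0x12 (CH).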

Segment Registers

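The 6 Segment Registers are:

  1. Stack Segment (SS). Pointer to the stack.
  2. Code Segment (CS). Pointer to the code.
  3. Data Segment (DS). Pointer to the data.
  4. Extra Segment (ES). Pointer to extra data.
  5. F Segment (FS). Pointer to more extra data.
  6. G Segment (GS). Pointer to still more extra data.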

Most applications on most modern operating systems (FreeBSD, Linux or Microsoft Windows) use a memory model that points nearly all segment registers to the same place and uses paging instead, effectively disabling their use. Typically the use of FS or GS is an exception to this rule, instead being used to point at thread-specific data.

EFLAGS Register

The EFLAGS is a 32-bit register used as a collection of bits representing Boolean values to store the results of operations and the state of the processor.

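The different uses of these flags are:

  Bit 0, CF: Carry Flag. Set if the last arithmetic operation carried (addition) or borrowed (subtraction) a bit beyond the size of the register.
  Bit 2, PF: Parity Flag. Set if the number of set bits in the least significant byte of the result is even.
  Bit 4, AF: Adjust Flag. Carry out of the low four bits, used for binary-coded decimal arithmetic.
  Bit 6, ZF: Zero Flag. Set if the result of an operation is zero.
  Bit 7, SF: Sign Flag. Set if the result of an operation is negative.
  Bit 8, TF: Trap Flag. Set to execute one instruction at a time (single-stepping).
  Bit 9, IF: Interrupt Enable Flag. Set if maskable interrupts are enabled.
  Bit 10, DF: Direction Flag. If set, string operations run from the highest address to the lowest.
  Bit 11, OF: Overflow Flag. Set if a signed arithmetic operation produced a result too large for the register.
  Bits 12-13, IOPL: I/O Privilege Level of the current process.
  Bit 14, NT: Nested Task Flag.
  Bit 16, RF: Resume Flag.
  Bit 17, VM: Virtual-8086 Mode Flag.
  Bit 18, AC: Alignment Check.
  Bit 19, VIF: Virtual Interrupt Flag.
  Bit 20, VIP: Virtual Interrupt Pending Flag.
  Bit 21, ID: Identification Flag. The ability to change it indicates support for the CPUID instruction.

The remaining bits (1, 3, 5, 15 and 22 to 31) are reserved, hold a fixed 0 or 1, and shouldn't be modified.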

Instruction Pointer

The EIP register contains the address of the next instruction to be executed if no branching is done.

EIP can only be read through the stack after a call instruction.

Memory

The x86 architecture is little-endian, meaning that multi-byte values are written least significant byte first. (This refers only to the ordering of the bytes, not to the bits.)

Little Endian

The 32-bit value B3B2B1B0 (base 16) on an x86 would be represented in memory as:

+----+----+----+----+
| B0 | B1 | B2 | B3 |
+----+----+----+----+

The 32-bit double word 0x1BA583D4 (the 0x denotes hexadecimal) would be written in memory as:

+----+----+----+----+
| D4 | 83 | A5 | 1B |
+----+----+----+----+

This will be seen as 0xD4 0x83 0xA5 0x1B when doing a memory dump.
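The byte order shown above can be reproduced in C without relying on the machine the code runs on. The following is a minimal sketch; store_le32 is a hypothetical helper written for this example, and it simply writes the bytes in the order an x86 processor would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes a 32-bit value into out[0..3] in the order
   an x86 processor stores it, least significant byte first. */
void store_le32(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xFFu);          /* lowest address: LSB  */
    out[1] = (uint8_t)((value >> 8) & 0xFFu);
    out[2] = (uint8_t)((value >> 16) & 0xFFu);
    out[3] = (uint8_t)((value >> 24) & 0xFFu);  /* highest address: MSB */
}
```

Calling store_le32 with 0x1BA583D4 fills the array with 0xD4, 0x83, 0xA5, 0x1B, matching the memory dump described above.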

Two's Complement Representation

Two's complement is the standard way of representing negative integers in binary. The sign is changed by inverting all of the bits and adding one.

For example,

+----------+------+
| Start:   | 0001 |
+----------+------+
| Invert:  | 1110 |
+----------+------+
| Add One: | 1111 |
+----------+------+
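The invert-and-add-one rule can be checked with a few lines of C. This is a sketch only: twos_complement_negate is a made-up helper, and the bits parameter is an assumption added so that the 4-bit example above can be reproduced exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: negates a value held in the low `bits` bits by
   inverting every bit and adding one, discarding any carry out. */
uint32_t twos_complement_negate(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    uint32_t inverted = ~value & mask;  /* Invert:  0001 -> 1110 */
    return (inverted + 1u) & mask;      /* Add one: 1110 -> 1111 */
}
```

Applying the function twice returns the original value, which is why the same rule converts a negative number back to its positive counterpart.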

Addressing modes

The addressing mode indicates how the operand is presented.

Register Addressing

Operand address R is in the address field.

mov ax, bx  ; moves contents of register bx into ax

Immediate

The actual value is in the field.

mov ax, 1   ; moves value of 1 into register ax

Or:

mov ax, 010Ch ; moves value of 0x010C into register ax

Direct memory addressing

Operand address is in the address field.

.data
my_var dw 0abcdh ; my_var = 0xabcd
.code
mov ax, [my_var] ; copy my_var content in ax (ax=0xabcd)

Direct offset addressing

Uses arithmetic to modify the address.

byte_tbl db 12,15,16,22,..... ; Table of bytes
mov al,[byte_tbl+2]
mov al,byte_tbl[2] ; same as the former

Register Indirect

Field points to a register that contains the operand address.

mov ax,[di]

In 16-bit mode, the registers used for indirect addressing are BX, BP, SI and DI.

Base-index

mov ax,[bx + di]

For example, if we are talking about an array, BX contains the address of the beginning of the array, and DI contains the index into the array.

Base-index with displacement

mov ax,[bx + di + 10]
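In 16-bit code, the offset used by the base-index forms is the sum of the parts, wrapped to 16 bits. The sketch below models that sum in C. The function name effective_offset is invented for this example, and the segment part of the address is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical helper modelling the 16-bit offset used by
   "mov ax,[bx + di + 10]": base + index + displacement, wrapping at 64 KiB.
   The segment register is not modelled here. */
uint16_t effective_offset(uint16_t base, uint16_t index, int16_t displacement)
{
    uint32_t sum = (uint32_t)base + index + (uint16_t)displacement;
    return (uint16_t)(sum & 0xFFFFu);
}
```

With BX = 0x1000 and DI = 4, the displacement of 10 gives the offset 0x100E, that is, the element 14 bytes past the start of the array.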

General-Purpose Registers (GPR) (64-bit naming conventions)

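64-bit x86 adds 8 more general-purpose registers, named R8, R9, R10 and so on up to R15. It also introduces a new naming convention that must be used for these new registers and can also be used for the old ones (except that AH, CH, DH and BH have no equivalents). In the new convention:

  1. R0 to R7 are the old registers in their encoding order: RAX, RCX, RDX, RBX, RSP, RBP, RSI and RDI.
  2. R8 to R15 are the new 64-bit registers.
  3. The suffix D selects the lowest 32 bits of a register (for example R8D).
  4. The suffix W selects the lowest 16 bits of a register (for example R8W).
  5. The suffix B, written L by some assemblers, selects the lowest 8 bits of a register (for example R8B).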

For 128-bit registers, see SSE.

Stack

The stack is a Last In First Out (LIFO) data structure; data is pushed onto it and popped off of it in the reverse order.

mov ax, 006Ah
mov bx, 0F79Ah
mov cx, 1124h
; Push the value in AX, BX, and CX onto the top of the stack
push ax
push bx
push cx

Now the stack has 0x006A, 0xF79A, and 0x1124.

call do_stuff

Do some stuff. The function is not forced to save the registers it uses, which is why we save them ourselves.

pop cx ; Pop the last element pushed onto the stack into CX, 0x1124; the stack now has 0x006A and 0xF79A.
pop bx ; Pop the last element pushed onto the stack into BX, 0xF79A; the stack now has just 0x006A.
pop ax ; Pop the last element pushed onto the stack into AX, 0x006A; the stack is empty.

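The stack has two common uses:

  1. Passing arguments to functions or procedures, and keeping track of control flow when the call instruction is used.
  2. Temporarily holding the contents of registers.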

CPU Operation Modes

Real Mode

Real Mode is a holdover from the original Intel 8086. The Intel 8086 accessed memory using 20-bit addresses. But, as the processor itself was 16-bit, Intel invented an addressing scheme that provided a way of mapping a 20-bit addressing space into 16-bit words. Today's x86 processors start in the so-called Real Mode, which is an operating mode that mimics the behavior of the 8086, with some very tiny differences, for backwards compatibility.
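The addressing scheme is not spelled out in this section. The sketch below assumes the usual 8086 rule, in which a 16-bit segment is multiplied by 16 and added to a 16-bit offset to form a 20-bit address. The function name real_mode_address is hypothetical.

```c
#include <stdint.h>

/* Assumed 8086 rule: physical address = segment * 16 + offset,
   truncated to 20 bits because the 8086 has only 20 address lines. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu;
}
```

One consequence of this scheme is that many segment:offset pairs name the same byte: 0x1234:0x0010 and 0x1235:0x0000 both map to 0x12350.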

Protected Mode

Flat Memory Model

If programming in a modern operating system (such as Linux or Windows), you are basically programming in flat 32-bit mode. Any register can be used in addressing, and it is generally more efficient to use a full 32-bit register instead of a 16-bit register part. Additionally, segment registers are generally unused in flat mode, and it is generally a bad idea to touch them.

Multi-Segmented Memory Model

Using a 32-bit register to address memory, the program can access (almost) all of the memory in a modern computer. For earlier processors (with only 16-bit registers) the segmented memory model was used. The 'CS', 'DS', and 'ES' registers are used to point to the different chunks of memory. For a small program (small model), CS=DS=ES. For larger memory models, these 'segments' can point to different locations.

Comments

When writing code, it is very helpful to use some comments explaining what is going on. A comment is a section of regular text that the assembler ignores when turning the assembly code into the machine code. In assembly, comments are usually denoted with a semicolon ";", although GAS uses "#" for single-line comments and "/* ... */" for multi-line comments.

For example:

Label1:
   mov ax, bx    ;move contents of bx into ax
   add ax, bx    ;add the contents of bx to ax
   ...

16 32 and 64 Bits

Intrinsic Data Types

Strictly speaking, assembly has no predefined data types like higher-level programming languages. Any general purpose register can hold any sequence of two or four bytes, whether these bytes represent numbers, letters, or other data. In the same way, there are no concrete types assigned to blocks of memory; you can assign to them whatever value you like.

That said, one can group data in assembly into two categories: integer and floating point. While you could load a floating point value into a register and treat it like an integer, the results would be unexpected, so it is best to keep them separate.
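To see why the results are unexpected, it helps to look at the raw bits of a floating point value. The sketch below assumes the IEEE 754 single-precision format used on x86; float_bits is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw 32 bits of a float, assuming the
   IEEE 754 single-precision format used on x86. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);  /* copy the bytes, no numeric conversion */
    return bits;
}
```

The value 1.0 has the bit pattern 0x3F800000, so a register loaded with it and treated as an integer holds 1065353216 rather than 1.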

Integer

An integer represents a whole number, either positive or negative.

Some assembly instructions behave slightly differently with regard to the sign bit; as such, there is a minor distinction between signed and unsigned integers.

Floating Point Numbers

Floating point numbers are used to approximate the real numbers that usually contain digits before and after the decimal point (like π, 3.14159...). Unlike integers where the decimal point is understood to be after all digits, in floating point numbers the decimal point floats anywhere in the sequence of digits. The precision of floating point numbers is limited and thus a number like π can only be represented approximately.

Originally, floating point was not part of the main processor, requiring the use of emulating software. However, there were floating point coprocessors that allowed operations on this data type and, starting with the 486DX, these were integrated directly into the CPU.

As such, floating point operations are not necessarily compatible with all processors. If you need to perform this type of arithmetic, you may want to use a software library as a backup code path.

x86 Instructions

Conventions

Instructions that take no operands:

Instr

Instructions that take 1 operand:

Instr arg

Instructions that take 2 operands. Notice how the format of the instruction is different for different assemblers.

Instr src, dest    # GAS Syntax
Instr dest, src    ; Intel syntax

Instructions that take 3 operands. Notice how the format of the instruction is different for different assemblers.

Instr aux, src, dest   # GAS Syntax
Instr dest, src, aux   ; Intel syntax

Suffixes

Operation Suffixes

Some instructions require the use of suffixes to specify the size of the data which will be the subject of the operation, such as:

  1. b (byte) = 8 bits
  2. w (word) = 16 bits
  3. l (long) = 32 bits
  4. q (quad) = 64 bits

An example of the usage with the mov instruction on a 32-bit architecture, GAS syntax:

movl $0x000F, %eax          # Store the value F into the eax register

Data Transfer Instructions

Move: mov

mov src, dest  # GAS Syntax
mov dest, src  ; Intel Syntax

The mov instruction copies the src operand into the dest operand.

Operands

Modified flags: No FLAGS are modified by this instruction.

Data swap: xchg and cmpxchg

xchg src, dest  # GAS Syntax
xchg dest, src  ; Intel Syntax

The xchg instruction swaps the src operand with the dest operand. It's like doing three move operations: from dest to a temporary (another register), then from src to dest, then from the temporary to src, except that no register needs to be reserved for temporary storage.

If one of the operands is a memory address, then the operation has an implicit LOCK prefix, that is, the exchange operation is atomic. This can have a large performance penalty.

It's also worth noting that the common NOP (no op) instruction, 0x90, is the opcode for xchgl %eax, %eax.

Operands.

Modified flags: No FLAGS are modified by this instruction.

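cmpxchg arg2, arg1  # GAS Syntax
cmpxchg arg1, arg2  ; Intel Syntax

Compare and exchange. The accumulator (AL, AX or EAX, depending on the operand size) is compared with arg1. If they are equal, arg2 is copied into arg1 and ZF is set; otherwise arg1 is copied into the accumulator and ZF is cleared.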
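The behaviour of cmpxchg can be modelled in C for 32-bit operands. The sketch below is not atomic and tracks only the zero flag. The name cmpxchg32_model and its parameter names are invented for this example, with accumulator standing in for EAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Non-atomic model of cmpxchg for 32-bit operands; `accumulator` stands
   in for EAX. Returns the value the zero flag (ZF) would take. */
bool cmpxchg32_model(uint32_t *accumulator, uint32_t *dest, uint32_t src)
{
    assert(accumulator != NULL && dest != NULL);
    if (*accumulator == *dest) {
        *dest = src;          /* equal: store src in dest, ZF = 1 */
        return true;
    }
    *accumulator = *dest;     /* not equal: load dest into EAX, ZF = 0 */
    return false;
}
```

This pattern is the building block for lock-free updates: a caller reads a value, computes a replacement, and retries until the exchange succeeds.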

Move with zero extend

movz src, dest   # GAS Syntax
movzx dest, src  ; Intel Syntax

Sign Extend

movs src, dest   # GAS Syntax
movsx dest, src  ; Intel Syntax
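The difference between zero extension and sign extension shows up when a small value is widened. The following C sketch models both for an 8-bit source and a 32-bit destination; the function names zero_extend8 and sign_extend8 are hypothetical.

```c
#include <stdint.h>

/* Model of movzx: the upper 24 bits are filled with zeros. */
uint32_t zero_extend8(uint8_t value)
{
    return (uint32_t)value;
}

/* Model of movsx: the upper 24 bits are filled with copies of bit 7,
   the sign bit of the source byte. */
uint32_t sign_extend8(uint8_t value)
{
    return (value & 0x80u) ? (0xFFFFFF00u | value) : (uint32_t)value;
}
```

The byte 0xFF becomes 0x000000FF (255) when zero-extended and 0xFFFFFFFF (-1 in two's complement) when sign-extended.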

Move String

movsb

movsb: Move byte

movsw

movsw: Move word

Load Effective Address

lea src, dest  # GAS Syntax
lea dest, src  ; Intel Syntax

Control Flow Instructions

Almost all programming languages have the ability to change the order in which statements are evaluated, and assembly is no exception. The instruction pointer (EIP) register contains the address of the next instruction to be executed. To change the flow of control, the programmer must be able to modify the value of EIP. This is where control flow instructions come in.

Comparison: test and cmp

test arg1, arg2  # GAS Syntax
test arg2, arg1  ; Intel Syntax

Performs a bit-wise logical AND on arg1 and arg2, the result of which we will refer to as Temp, and sets the ZF (zero), SF (sign) and PF (parity) flags based on Temp. Temp is then discarded.
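The flag computation described above can be written out in C for 32-bit operands. This is a sketch: the struct test_flags and the function test32_model are invented names, and only the three flags mentioned in the text are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

struct test_flags {
    bool zf;  /* zero flag: Temp is zero */
    bool sf;  /* sign flag: most significant bit of Temp */
    bool pf;  /* parity flag: even number of 1 bits in the low byte of Temp */
};

/* Hypothetical model of "test" on 32-bit operands. */
struct test_flags test32_model(uint32_t arg1, uint32_t arg2)
{
    uint32_t temp = arg1 & arg2;  /* Temp is used for the flags, then discarded */
    struct test_flags flags;
    unsigned ones = 0;

    for (uint8_t low = (uint8_t)temp; low != 0; low >>= 1)
        ones += low & 1u;         /* count the 1 bits in the low byte */

    flags.zf = (temp == 0);
    flags.sf = (temp >> 31) != 0;
    flags.pf = (ones % 2 == 0);
    return flags;
}
```

A common use is testing a register against itself: the zero flag is then set exactly when the register holds zero.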

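cmp arg2, arg1  # GAS Syntax
cmp arg1, arg2  ; Intel Syntax

Performs a comparison by subtracting arg2 from arg1 and setting the flags according to the result, which is then discarded.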

Jump Instructions

Unconditional Jumps

jmp loc

Jump on Equality

je loc

Jump on Inequality

jne loc

Jump if Greater

jg loc

Jump if Less

jl loc

Jump on Zero

jz loc

Jump on Sign

js loc

Function Calls

call proc

Pushes the address of the next instruction onto the top of the stack, and jumps to the specified location. This is used mostly for subroutines.

ret [val]

Loads the next value on the stack into EIP, and then pops the specified number of bytes off the stack. If val is not supplied, the instruction will not pop any values off the stack after returning.

Loop Instructions

loop arg

The loop instruction decrements ECX and jumps to the address specified by arg unless decrementing ECX caused its value to become zero. For example:

mov ecx, 5
start_loop:
; the code here would be executed 5 times
loop start_loop
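The decrement-then-test order matters, and a small C model makes it visible. The function name loop_iterations is hypothetical; it counts how often the body runs for a given starting value of ECX.

```c
#include <stdint.h>

/* Hypothetical model: counts how many times the body between start_loop
   and the loop instruction runs for a given initial ECX. */
uint64_t loop_iterations(uint32_t ecx)
{
    uint64_t executed = 0;
    do {
        executed++;      /* the loop body runs before ECX is examined */
        ecx--;           /* loop: decrement ECX (wraps around if it was 0) */
    } while (ecx != 0);  /* jump back unless ECX became zero */
    return executed;
}
```

Because the body runs before the decrement, starting with ECX equal to zero does not skip the loop. ECX wraps around and the body runs 2^32 times, so the count is usually checked before entering the loop.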

Enter and Leave

enter arg

enter creates a stack frame with the specified amount of space allocated on the stack.

leave

leave destroys the current stack frame, and restores the previous frame. Using Intel syntax this is equivalent to:

mov esp, ebp
pop ebp

Other Control Instructions

hlt

Halts the processor. Execution will be resumed after processing the next hardware interrupt, unless IF is cleared.

nop

No operation. This instruction doesn't do anything, but wastes an instruction cycle in the processor. This instruction is often represented as an XCHG operation with the operands EAX and EAX.

lock

Asserts #LOCK prefix on next instruction.

wait

Waits for the FPU to finish its last calculation.

Arithmetic Instructions

Logic Instructions

Shift and Rotate Instructions

Other Instructions

Stack Instructions

push arg

This instruction decrements the stack pointer and stores the data specified as the argument into the location pointed to by the stack pointer.

pop arg

This instruction loads the data stored in the location pointed to by the stack pointer into the argument specified and then increments the stack pointer.
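The order of the two steps in push and pop can be modelled with a toy stack in C. Everything here is illustrative: struct toy_stack, toy_init, toy_push, toy_pop and STACK_WORDS are invented names, and the stack pointer is an array index rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy 16-bit stack: `sp` is an index into `memory` and starts one past the
   end, so the stack grows toward lower addresses as it does on x86. */
struct toy_stack {
    uint16_t memory[STACK_WORDS];
    unsigned sp;
};

void toy_init(struct toy_stack *s) { s->sp = STACK_WORDS; }

void toy_push(struct toy_stack *s, uint16_t value)
{
    assert(s->sp > 0);                  /* no room left in the toy stack */
    s->sp--;                            /* decrement the stack pointer first */
    s->memory[s->sp] = value;           /* then store at the new top */
}

uint16_t toy_pop(struct toy_stack *s)
{
    assert(s->sp < STACK_WORDS);        /* popping an empty stack is a bug */
    uint16_t value = s->memory[s->sp];  /* load from the top first */
    s->sp++;                            /* then increment the stack pointer */
    return value;
}
```

Pushing 0x006A, 0xF79A and 0x1124 and then popping three times returns the values in reverse order, which is the last-in, first-out behaviour of the stack.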

Flags instructions

x86 Interrupts

Interrupts are special routines that are defined on a per-system basis. This means that the interrupts on one system might be different from the interrupts on another system. Therefore, it is usually a bad idea to rely heavily on interrupts when you are writing code that needs to be portable.

Interrupt Instruction

int arg

This instruction issues the specified interrupt. For instance:

int 0x0A

Calls interrupt 10 (0x0A (hex) = 10 (decimal)).

Types of Interrupts

There are 3 types of interrupts: Hardware Interrupts, Software Interrupts and Exceptions.

Hardware Interrupts

Hardware interrupts are triggered by hardware devices. Hardware interrupts are typically asynchronous: their occurrence is unrelated to the instructions being executed at the time they are raised.

Software Interrupts

Software interrupts are usually used to transfer control to a function in the operating system kernel. Software interrupts are triggered by the instruction int. For example, the instruction int 14h triggers interrupt 0x14. The processor then stops the current program, and jumps to the code to handle interrupt 0x14. When interrupt handling is complete, the processor returns flow to the original program.

Exceptions

Exceptions are caused by exceptional conditions in the code which is executing, for example an attempt to divide by zero or access a protected memory area. The processor will detect this problem, and transfer control to a handler to service the exception. This handler may re-execute the offending code after changing some value (for example, the zero divisor), or if this cannot be done, the program causing the exception may be terminated.

x86 Assemblers

GAS Syntax

MASM Syntax

Interfacing with Linux: System Calls

Syscalls

Syscalls are the interface between user programs and the Linux kernel. They are used to let the kernel perform various system tasks, such as file access, process management and networking. In the C programming language, you would normally call a wrapper function which executes all required steps or even use high-level features such as the standard IO library.

On Linux, there are several ways to make a syscall. This page will focus on making syscalls by calling a software interrupt using int $0x80 (x86 and x86_64) or syscall (x86_64). This is an easy and intuitive method of making syscalls in assembly-only programs.

Making a syscall

To make a syscall using an interrupt, you have to pass all required information to the kernel by copying it into general purpose registers. Each syscall has a fixed number (note the numbers differ between int $0x80 and syscall in the following text). You specify the syscall by writing the number into the eax/rax register and pass the parameters by writing them in the appropriate registers before making the actual call. Parameters are passed in the order they appear in the function signature of the corresponding C wrapper function.

After everything is set up correctly, you call the interrupt using int $0x80 or syscall and the kernel performs the task.

The return or error value of a syscall is written to eax or rax.
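The text does not say how a return value is told apart from an error value. The sketch below assumes the usual Linux convention for raw syscalls, where a failure is reported as a negated errno code in the range -4095 to -1. The helper names syscall_failed and syscall_errno are hypothetical.

```c
#include <stdbool.h>

/* Assumption: a raw Linux syscall signals failure by returning -errno,
   a value in the range -4095..-1; anything else is a normal result. */
bool syscall_failed(long ret)
{
    return ret < 0 && ret >= -4095;
}

/* Errno code carried by a failed return value, or 0 on success. */
int syscall_errno(long ret)
{
    return syscall_failed(ret) ? (int)-ret : 0;
}
```

Assembly-only programs see this raw value in eax or rax. C wrapper functions usually hide it by returning -1 and storing the code in errno instead.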

The kernel uses its own stack to perform the actions. The user stack is not touched in any way.

int 0x80

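On both Linux x86 and Linux x86_64 systems you can make a syscall by calling interrupt 0x80 using the int $0x80 instruction. Parameters are passed by setting the general purpose registers as follows:

+-----------+---------+---------+---------+---------+---------+---------+
| Syscall # | Param 1 | Param 2 | Param 3 | Param 4 | Param 5 | Param 6 |
+-----------+---------+---------+---------+---------+---------+---------+
| eax       | ebx     | ecx     | edx     | esi     | edi     | ebp     |
+-----------+---------+---------+---------+---------+---------+---------+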

The return value is in the eax register.

The syscall numbers are described in the Linux source file arch/x86/include/asm/unistd_32.h.

All registers other than eax, which receives the return value, are preserved during the syscall.

syscall

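The x86_64 architecture introduced a dedicated instruction to make a syscall. It does not access the interrupt descriptor table and is faster. Parameters are passed by setting the general purpose registers as follows:

+-----------+---------+---------+---------+---------+---------+---------+
| Syscall # | Param 1 | Param 2 | Param 3 | Param 4 | Param 5 | Param 6 |
+-----------+---------+---------+---------+---------+---------+---------+
| rax       | rdi     | rsi     | rdx     | r10     | r8      | r9      |
+-----------+---------+---------+---------+---------+---------+---------+

The syscall numbers are described in the Linux source file arch/x86/include/asm/unistd_64.h.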

The return value is in the rax register.

All registers, except rcx and r11, are preserved during the syscall.

Hello World example

This example will write the text "Hello World" to stdout using the write syscall and quit the program using the _exit syscall.

Syscall signatures:

ssize_t write(int fd, const void *buf, size_t count);
void _exit(int status);

The following is the C program of this example:

#include <unistd.h>

int main(int argc, char *argv[])
{
    write(1, "Hello World\n", 12); /* write "Hello World" to stdout */
    _exit(0);                      /* exit with error code 0 (no error) */
}

Both of the assembly examples start alike: a string stored in the data segment and _start as a global symbol.

.data
msg: .ascii "Hello World\n"

.text
.global _start

int 0x80

As defined in arch/x86/include/asm/unistd_32.h, the syscall numbers for write and _exit are:

#define __NR_exit 1
#define __NR_write 4

The parameters are passed exactly as one would in a C program, using the correct registers. After everything is set up, the syscall is made using int $0x80.

_start:
    movl $4, %eax   # use the write syscall
    movl $1, %ebx   # write to stdout
    movl $msg, %ecx # use string "Hello World"
    movl $12, %edx  # write 12 characters
    int $0x80       # make syscall

    movl $1, %eax   # use the _exit syscall
    movl $0, %ebx   # error code 0
    int $0x80       # make syscall

syscall

In arch/x86/include/asm/unistd_64.h, the syscall numbers are defined as follows:

#define __NR_write 1
#define __NR_exit 60

Parameters are passed just like in the int $0x80 example, except that the order of the registers is different. The syscall is made using syscall.

_start:
    movq $1, %rax   # use the write syscall
    movq $1, %rdi   # write to stdout
    movq $msg, %rsi # use string "Hello World"
    movq $12, %rdx  # write 12 characters
    syscall         # make syscall

    movq $60, %rax  # use the _exit syscall
    movq $0, %rdi   # error code 0
    syscall         # make syscall
