Chapter 2: Instructions and Assembly Language Fundamentals

Instructions: Language of the Computer

Welcome to the world of assembly language and machine instructions. This chapter explores how computers actually speak at their most fundamental level — not in C++ or Python, but in the raw language of operations that processors understand. By the end of these notes, you'll understand how high-level code becomes the instructions your CPU executes, how different types of operations work, and how computers manage memory, procedures, and even parallel execution.

Why Assembly Language Matters

You might wonder: why learn assembly language when modern programmers rarely write it directly? The answer is that assembly language is the bridge between human-readable code and what your processor actually does. Understanding it reveals the underlying constraints and opportunities in computer architecture. Every high-level program you write gets translated into assembly instructions, and knowing this language helps you write more efficient code, debug problems more effectively, and understand performance bottlenecks.

Think of assembly language as the vocabulary your computer uses to communicate. Just as learning English grammar helps you write better sentences, learning assembly helps you understand how computers work at a fundamental level.

Operations and Operands: The Building Blocks

Understanding Operations

Every processor understands a fixed set of operations: basic tasks it can perform. For most processors, these operations are simple: add two numbers, compare values, load data from memory, or store data back to memory. The MIPS (Microprocessor without Interlocked Pipelined Stages) processor, which we'll use as our example throughout this chapter, is a classic architecture that demonstrates these concepts clearly.

The most basic operation in nearly every processor is addition. In MIPS assembly, the add instruction performs this task:

add $s0, $s1, $s2

This instruction says: "Take the values in registers $s1 and $s2, add them together, and store the result in register $s0." All three operands are registers: special, ultra-fast storage locations inside the processor itself.

Working with Registers

A register is like a variable that lives directly in the CPU. MIPS processors have 32 registers, each 32 bits wide (4 bytes). The "$" symbol indicates a register, and names like $s0 and $s1 refer to different registers. The 's' prefix stands for "saved": by convention, these registers preserve their values across procedure calls, making them ideal for variables you want to keep safe.

Here's the key constraint: MIPS can only perform arithmetic operations on register operands. You cannot directly add two numbers stored in memory — you must first load them into registers, perform the operation, and then store the result back if needed. This design choice simplifies the processor but creates the need for load and store instructions.

The lw (load word) instruction brings a value from memory into a register:

lw $t0, 12($sp)

This loads a 32-bit word from the memory address formed by adding 12 to the value in register $sp, and places it in $t0. The sw (store word) instruction does the reverse, writing a register's value to memory:

sw $t0, 8($sp)

Registers come in different categories. The $t0–$t9 registers (temporary registers) hold short-lived values that a called function is free to overwrite. The $s0–$s7 registers (saved registers) must keep their values across function calls. $sp (stack pointer) tracks the top of the stack, $ra (return address) stores where to return after a function call, and $a0–$a3 (argument registers) pass parameters to functions.
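The load-store constraint can be sketched with a toy model in Python. The dictionaries, helper names, and starting value of $sp below are illustrative assumptions, not part of any real MIPS tool; the point is only that adding two values held in memory takes two loads, one add, and one store.

```python
# Toy model of a load-store machine. The dictionaries and helper names are
# illustrative assumptions, not a real MIPS simulator.
regs = {"$sp": 0x7FFFFFF0, "$t0": 0, "$t1": 0, "$t2": 0}
memory = {}  # maps a word-aligned byte address to a 32-bit value


def lw(rt, offset, base):
    """lw rt, offset(base): copy the word at base + offset into rt."""
    regs[rt] = memory.get(regs[base] + offset, 0)


def sw(rt, offset, base):
    """sw rt, offset(base): copy rt into the word at base + offset."""
    memory[regs[base] + offset] = regs[rt]


def add(rd, rs, rt):
    """add rd, rs, rt: 32-bit addition (the overflow exception is omitted)."""
    regs[rd] = (regs[rs] + regs[rt]) & 0xFFFFFFFF


# Computing c = a + b for values that live in memory takes four instructions.
memory[regs["$sp"] + 12] = 7    # a
memory[regs["$sp"] + 8] = 35    # b
lw("$t0", 12, "$sp")            # $t0 = a
lw("$t1", 8, "$sp")             # $t1 = b
add("$t2", "$t0", "$t1")        # $t2 = a + b
sw("$t2", 4, "$sp")             # c = $t2
print(memory[regs["$sp"] + 4])  # 42
```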

📝 Section Recap: Processors perform operations on registers — special, fast storage directly in the CPU. MIPS has 32 registers with different purposes. Loading and storing are needed to move data between registers and main memory, creating a critical constraint: all arithmetic happens in registers, never directly in memory.

Representing Numbers: Binary, Hex, and Signed Values

Binary and Hexadecimal Representation

Computers store all data as binary digits (bits) — 0s and 1s. A byte is 8 bits, and a word in MIPS is 32 bits. Writing 32-bit numbers in pure binary (like 11010110101001011010110110110101) is tedious and error-prone, so we use hexadecimal (base 16) as a shorthand. Each hexadecimal digit represents exactly 4 binary bits:

Hex	Binary
0	0000
1	0001
...	...
F	1111

For example, the 32-bit binary number 11010110101001011010110110110101 splits into the 4-bit groups 1101 0110 1010 0101 1010 1101 1011 0101 and becomes 0xD6A5ADB5 in hexadecimal, which is much cleaner. The "0x" prefix tells us we're using hex notation.
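The grouping rule (4 bits per hex digit) is easy to check mechanically. The following is a small Python sketch; the function name bits_to_hex is an illustrative choice, not a standard routine.

```python
def bits_to_hex(bits):
    """Convert a binary string to hex by translating each 4-bit group."""
    if len(bits) % 4 != 0:
        raise ValueError("pad the bit string to a multiple of 4 bits first")
    digits = [format(int(bits[i:i + 4], 2), "X") for i in range(0, len(bits), 4)]
    return "0x" + "".join(digits)


word = "11010110101001011010110110110101"
print(bits_to_hex(word))  # 0xD6A5ADB5
```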

Signed and Unsigned Integers

Numbers can be interpreted in two ways: unsigned (representing only non-negative values) or signed (representing both positive and negative values). For unsigned integers, all bit patterns simply represent their numeric value. A 32-bit unsigned integer ranges from 0 to $2^{32} - 1$ (roughly 4.3 billion).

Signed integers use two's complement representation, a clever scheme where the most significant bit (leftmost bit) indicates the sign: 0 for positive, 1 for negative. In two's complement:

Non-negative numbers (0 to $2^{31} - 1$) have a leading 0 and work like normal binary
Negative numbers ($-2^{31}$ to $-1$) have a leading 1, because the most significant bit carries a weight of $-2^{31}$ rather than $+2^{31}$

To negate a number in two's complement, flip all bits and add 1. For example:

The number 5 in 32-bit binary is 00000000000000000000000000000101
Flip all bits: 11111111111111111111111111111010
Add 1: 11111111111111111111111111111011 which represents -5
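The flip-and-add-one rule can be reproduced in Python, which has unbounded integers, by masking every result to 32 bits. This is a sketch; negate32 and to_signed32 are names chosen here for illustration.

```python
MASK32 = 0xFFFFFFFF


def negate32(x):
    """Flip all 32 bits, then add 1, keeping the result within 32 bits."""
    return ((x ^ MASK32) + 1) & MASK32


def to_signed32(pattern):
    """Interpret a 32-bit pattern as a signed two's complement value."""
    return pattern - (1 << 32) if pattern & 0x80000000 else pattern


minus_five = negate32(5)
print(format(minus_five, "032b"))  # 11111111111111111111111111111011
print(to_signed32(minus_five))     # -5
```

Negating twice returns the original pattern, which is a quick sanity check on the rule.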

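This representation is elegant because the same binary addition produces the correct bit pattern for both signed and unsigned values, so the processor needs only one adder. (MIPS does provide both add and addu, but they compute the same bits and differ only in whether signed overflow raises an exception.)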

Two's Complement: A system for representing signed integers in a fixed number of bits, where the most significant bit indicates the sign. For 32-bit numbers the range is $-2^{31}$ to $2^{31} - 1$.

Working with Constants

Often we need to perform operations with constants (fixed numbers) rather than just values in registers. MIPS provides the addi (add immediate) instruction:

addi $s0, $s1, 100

This adds the constant 100 to the value in $s1 and stores the result in $s0. The word "immediate" means the constant is part of the instruction itself, making it immediately available without needing to load it first.

However, MIPS instructions are only 32 bits long, and the operation code and two register specifiers take 16 of those bits, leaving only 16 bits for the constant. This means immediate values can only range from $-2^{15}$ to $2^{15} - 1$ (-32,768 to 32,767). For larger constants, you must build the value in a register using more than one instruction, typically lui (load upper immediate) followed by ori (or immediate).
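As a rough illustration of the 16-bit limit, the Python sketch below checks whether a constant fits in a signed immediate and, if not, splits it into the two 16-bit halves that a lui/ori pair would carry. The helper names and the example constant 4,000,000 are illustrative choices.

```python
def fits_in_imm16(value):
    """True if value fits in a signed 16-bit immediate field."""
    return -(1 << 15) <= value <= (1 << 15) - 1


def split_for_lui_ori(value):
    """Split a 32-bit constant into its upper and lower 16-bit halves."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


print(fits_in_imm16(100))        # True
print(fits_in_imm16(4_000_000))  # False

upper, lower = split_for_lui_ori(4_000_000)
print(upper, lower)              # 61 2304
```

With these halves, lui $s0, 61 followed by ori $s0, $s0, 2304 would leave 4,000,000 in $s0.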

📝 Section Recap: Numbers are represented in binary and displayed in hexadecimal for readability. Signed integers use two's complement, allowing negative and positive numbers to coexist. Constants can be used directly in instructions (immediates) but are limited to 16 bits, requiring special handling for larger values.

Representing Instructions: From Assembly to Machine Code

How Instructions Are Encoded

MIPS instructions are 32 bits long, split into distinct fields that tell the processor what to do. For arithmetic operations (R-type instructions), the format is:

[opcode (6 bits)] [rs (5 bits)] [rt (5 bits)] [rd (5 bits)] [shamt (5 bits)] [funct (6 bits)]

opcode: Identifies the instruction type (like arithmetic, load, store, branch)
rs, rt: Register source operands
rd: Register destination
shamt: Shift amount (for shift operations)
funct: Function code, distinguishing between add, subtract, and other operations

For example, the instruction add $t0, $s1, $s2 has these fields:

opcode = 0 (arithmetic)
rs = 17 (register number for $s1)
rt = 18 (register number for $s2)
rd = 8 (register number for $t0)
shamt = 0 (not shifting)
funct = 32 (add operation)
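Packing these six fields into one 32-bit word is a matter of shifting each field to its position. The Python sketch below assumes the standard MIPS register numbers ($t0 = 8, $s1 = 17, $s2 = 18); the function name encode_r is illustrative.

```python
REG = {"$zero": 0, "$t0": 8, "$s1": 17, "$s2": 18}


def encode_r(opcode, rs, rt, rd, shamt, funct):
    """Pack R-type fields: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2
word = encode_r(0, REG["$s1"], REG["$s2"], REG["$t0"], 0, 32)
print(f"0x{word:08X}")  # 0x02324020
```

Note that the destination register appears first in the assembly text but third among the register fields of the encoding.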

Load and store instructions (I-type) use a different format:

[opcode (6 bits)] [rs (5 bits)] [rt (5 bits)] [immediate (16 bits)]

The 16-bit immediate field holds a constant or an address offset. For lw $t0, 24($sp), the 24 is the offset stored in the immediate field, $sp is the base register (rs), and $t0 is the register being loaded (rt).
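The I-type layout can be sketched the same way. The example below assumes the standard values for lw (opcode 35) and for the registers ($sp = 29, $t0 = 8); encode_i is an illustrative name. Negative offsets are stored as 16-bit two's complement.

```python
LW_OPCODE = 35


def encode_i(opcode, rs, rt, imm):
    """Pack I-type fields: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $t0, 24($sp): rs is the base register, rt is the register being loaded
word = encode_i(LW_OPCODE, 29, 8, 24)
print(f"0x{word:08X}")  # 0x8FA80018

# lw $t0, -4($sp): the offset -4 becomes 0xFFFC in the immediate field
negative = encode_i(LW_OPCODE, 29, 8, -4)
print(f"0x{negative:08X}")  # 0x8FA8FFFC
```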

The Translation Hierarchy

Your code travels through several translation stages before execution:

Compilation converts high-level code (like C) into assembly language
Assembly converts assembly mnemonics into machine code (binary instructions)
Linking combines multiple assembled files and resolves external references
Loading places the final program into memory and begins execution

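Each stage takes the output of the previous one and moves it closer to a form the hardware can run. Because the stages are separate, you can inspect and optimize a program at whichever level suits the problem: source code, assembly, or machine code.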

📝 Section Recap: Instructions are 32-bit patterns with fields specifying the operation, operands, and additional parameters. R-type instructions work on registers, while I-type instructions include 16-bit immediate values. Code travels through compilation, assembly, linking, and loading stages before the CPU executes it.

Logical Operations: Bit Manipulation

AND, OR, and NOR Operations

Beyond arithmetic, processors support logical operations that manipulate individual bits. These are essential for tasks like setting flags, checking conditions, and extracting specific bit fields.

The AND operation produces a 1 bit only when both input bits are 1:

A	B	A AND B
0	0	0
0	1	0
1	0	0
1	1	1

The OR operation produces a 1 when at least one input bit is 1:

A	B	A OR B
0	0	0
0	1	1
1	0	1
1	1	1

The NOR operation is NOT-OR: it produces a 1 only when both input bits are 0. This is surprisingly useful because any logical function can be constructed from NOR operations alone.

In MIPS, you use and, or, and nor instructions just like arithmetic operations:

and $t0, $s0, $s1
or $t1, $s2, $s3
nor $t2, $s4, $s5

These operate on entire 32-bit registers, applying the logical operation to each bit pair independently. Additionally, NOT (inverting all bits) can be achieved using NOR with identical operands:

nor $t0, $s0, $s0    # Inverts all bits of $s0
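These bitwise operations map directly onto Python's &, | and ~ operators, provided results are masked to 32 bits (Python integers are unbounded). The operand values below are arbitrary examples, and nor32 is an illustrative helper.

```python
MASK32 = 0xFFFFFFFF


def nor32(a, b):
    """32-bit NOR: a result bit is 1 only where both input bits are 0."""
    return ~(a | b) & MASK32


s0 = 0x0000FFFF
s1 = 0x00FF00FF

print(f"{s0 & s1:08X}")        # 000000FF  (and)
print(f"{s0 | s1:08X}")        # 00FFFFFF  (or)
print(f"{nor32(s0, s1):08X}")  # FF000000  (nor)
print(f"{nor32(s0, s0):08X}")  # FFFF0000  (NOT via nor with identical operands)
```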

Shift Operations

Shift operations move bits left or right within a register. A left shift by one position multiplies the value by 2, provided no significant bits fall off the top of the register. A logical right shift by one position divides an unsigned value by 2, discarding the remainder.

sll $t0, $s0, 2    # Shift left by 2 bits (multiply by 4)
srl $t1, $s1, 3    # Shift right by 3 bits (divide by 8, unsigned)

Shifts are efficient ways to multiply or divide by powers of 2, much faster than using multiply or divide instructions if you know the exact power of 2.
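A short Python sketch of the two shifts, masked to 32 bits, shows both the multiply/divide behavior and its limits. The function names mirror the MIPS mnemonics but are otherwise illustrative.

```python
MASK32 = 0xFFFFFFFF


def sll(value, shamt):
    """Shift left logical: bits shifted past bit 31 are lost."""
    return (value << shamt) & MASK32


def srl(value, shamt):
    """Shift right logical: zeros are shifted in from the left."""
    return (value & MASK32) >> shamt


print(sll(5, 2))                  # 20  (5 * 4)
print(srl(40, 3))                 # 5   (40 / 8)
print(sll(0x80000000, 1))         # 0   (the top bit is shifted out)
print(hex(srl(0xFFFFFFF8, 3)))    # 0x1fffffff
```

The last line is why srl divides only unsigned values: 0xFFFFFFF8 is -8 when read as a signed number, but shifting zeros into the top produces a large positive pattern rather than -1.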

Why Logical Operations Matter

Logical operations enable bit-level manipulation essential for setting hardware flags, extracting fields from packed data structures, and implementing efficient algorithms. For instance, checking if a bit is set uses AND with a mask; toggling a bit uses XOR.

📝 Section Recap: Logical operations (AND, OR, NOR) work on individual bits across entire registers. Shift operations efficiently multiply or divide by powers of 2. These operations are fundamental for bit manipulation, flag handling, and data structure implementation.

Making Decisions: Conditional Branching

Conditional Branch Instructions

Programs must make decisions: executing different code based on conditions. MIPS provides branch instructions that change the program counter (the address of the currently executing instruction) based on a comparison.

The beq (branch if equal) instruction compares two registers:

beq $s0, $s1, target_label

If $s0 equals $s1, the processor jumps to the instruction at target_label. Otherwise, it continues to the next instruction. Similarly, bne (branch if not equal) jumps when values are different.

For inequalities, MIPS provides slt (set if less than):

slt $t0, $s0, $s1

This compares $s0 and $s1 as signed numbers. If $s0 is less than $s1, it sets $t0 to 1; otherwise, $t0 becomes 0. You then use beq or bne to compare $t0 against $0 and branch based on this result.
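The behavior of slt can be modeled in a few lines of Python. The sketch treats register contents as 32-bit patterns and compares them as signed values, which is what slt does (the separate sltu instruction compares them as unsigned). Helper names are illustrative.

```python
def to_signed32(pattern):
    """Interpret a 32-bit pattern as a signed two's complement value."""
    pattern &= 0xFFFFFFFF
    return pattern - (1 << 32) if pattern & 0x80000000 else pattern


def slt(rs_value, rt_value):
    """Return 1 if rs < rt under a signed comparison, otherwise 0."""
    return 1 if to_signed32(rs_value) < to_signed32(rt_value) else 0


print(slt(3, 7))           # 1
print(slt(7, 3))           # 0
print(slt(0xFFFFFFFF, 1))  # 1, because 0xFFFFFFFF is -1 when signed
```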

Unconditional Jumps

Sometimes control must transfer unconditionally. The j (jump) instruction does this:

j target_label

This always jumps to target_label, regardless of any condition. For jumping to an address stored in a register, use jr (jump register):

jr $ra    # Jump to address in $ra (common for returning from functions)

Loop and Conditional Structures

These instructions enable familiar programming structures. A loop uses a branch to jump backward to previously executed code:

loop: addi $s0, $s0, -1    # Decrement counter
      bne $s0, $0, loop    # Jump back if not zero

If-else structures use conditional branches to skip portions of code:

      beq $s0, $s1, else_branch  # If equal, skip ahead to the else case
      # ... code for if case (runs when $s0 != $s1) ...
      j end_if                   # Skip over the else case
else_branch:
      # ... code for else case (runs when $s0 == $s1) ...
end_if:

📝 Section Recap: Conditional branches (beq, bne) change program flow based on comparisons. The slt instruction enables less-than comparisons. Unconditional jumps (j, jr) provide absolute control flow. These instructions implement loops, if-else statements, and other control flow patterns.

Procedures: Function Calls and the Stack

Calling Functions

When one part of your code calls a function, several things must happen: arguments must be passed, the function must receive control, and when the function finishes, control must return to the caller. MIPS uses calling conventions — agreed-upon rules about how this handoff occurs.

Arguments are passed in registers $a0–$a3, and results come back in $v0–$v1. When a function finishes, it returns control using the jr $ra instruction, where $ra (return address) contains the address of the next instruction after the function call. The jal (jump and link) instruction performs a function call:

jal function_name    # Jump to function, save return address in $ra

This instruction both jumps to the function and saves the return address, enabling the function to return with jr $ra.

The Stack and Local Variables

Functions need space for local variables — variables used only within that function. This space comes from the stack, a region of memory that grows and shrinks as functions are called and return. The stack pointer ($sp) tracks the top of the stack.

When a function begins, it allocates space on the stack by subtracting from $sp:

addi $sp, $sp, -32    # Allocate 32 bytes on stack

Because the stack grows toward lower addresses, subtracting from $sp enlarges it. Local variables are then accessed relative to $sp:

sw $s0, 20($sp)    # Store $s0 at offset 20 from stack top
lw $t0, 20($sp)    # Load it back

Before returning, the function deallocates the stack space:

addi $sp, $sp, 32    # Deallocate 32 bytes
jr $ra               # Return

Saved Registers and Procedure Protocol

Here's a critical rule: if a function modifies a saved register ($s0–$s7), it must save and restore the original value. This preserves the registers for the caller, maintaining the contract that saved registers remain unchanged across function calls.

When a function calls another function, it must also save $ra because the called function will overwrite it:

addi $sp, $sp, -8      # Allocate space for $s0 and $ra
sw $ra, 4($sp)         # Save return address
sw $s0, 0($sp)         # Save $s0

# ... do work, possibly calling other functions ...

lw $s0, 0($sp)         # Restore $s0
lw $ra, 4($sp)         # Restore return address
addi $sp, $sp, 8       # Deallocate
jr $ra                 # Return to caller
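The save/restore protocol can be traced with a toy Python model. Everything here (the register dictionary, the addresses, the prologue and epilogue helpers) is an illustrative assumption; the sketch only mirrors the eight-byte frame shown above and checks that the caller's view of $s0, $ra, and $sp is unchanged afterward.

```python
# Hypothetical starting state for the caller.
regs = {"$sp": 0x7FFFFFF0, "$ra": 0x00400018, "$s0": 1234}
stack = {}  # maps a byte address to the saved 32-bit word


def prologue():
    regs["$sp"] -= 8                       # addi $sp, $sp, -8
    stack[regs["$sp"] + 4] = regs["$ra"]   # sw $ra, 4($sp)
    stack[regs["$sp"] + 0] = regs["$s0"]   # sw $s0, 0($sp)


def epilogue():
    regs["$s0"] = stack[regs["$sp"] + 0]   # lw $s0, 0($sp)
    regs["$ra"] = stack[regs["$sp"] + 4]   # lw $ra, 4($sp)
    regs["$sp"] += 8                       # addi $sp, $sp, 8


before = dict(regs)
prologue()
regs["$s0"] = 99           # the function uses $s0 for its own work
regs["$ra"] = 0x00400100   # a nested jal overwrites $ra
epilogue()
print(regs == before)      # True
```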

This protocol ensures that function calls don't corrupt values needed by callers, enabling reliable recursion and function composition.

📝 Section Recap: Functions use jal to call and jr $ra to return. Arguments are passed in $a0–$a3. The stack holds local variables and saved registers. Functions must preserve saved registers and $ra by storing and restoring them, maintaining the calling convention.

Character and String Handling

ASCII and Unicode Representation

Text is stored in memory using character encodings. ASCII (American Standard Code for Information Interchange) assigns each character a 7-bit code, which is normally stored in one 8-bit byte:

Character	ASCII Value (Decimal)	Hex
'A'	65	0x41
'a'	97	0x61
'0'	48	0x30
space	32	0x20

MIPS uses the lb (load byte) and sb (store byte) instructions to handle individual characters:

lb $t0, 0($s0)     # Load the byte at the address in $s0, sign-extended to 32 bits
sb $t0, 8($sp)     # Store the low byte of $t0 at address $sp + 8

These load or store single bytes rather than full 32-bit words, enabling character-by-character processing.
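One detail worth seeing in code is what happens to the upper 24 bits of the register when a single byte is loaded. In MIPS, lb sign-extends the byte, while the related lbu (load byte unsigned) zero-extends it. The Python functions below are an illustrative model of that difference.

```python
def lb(byte):
    """Model of lb: sign-extend an 8-bit value to a 32-bit register pattern."""
    byte &= 0xFF
    return (byte | 0xFFFFFF00) if byte & 0x80 else byte


def lbu(byte):
    """Model of lbu: zero-extend an 8-bit value to 32 bits."""
    return byte & 0xFF


print(hex(lb(0x41)))   # 0x41        ('A' is below 0x80, so nothing changes)
print(hex(lb(0xE9)))   # 0xffffffe9  (top bit set, so the sign is extended)
print(hex(lbu(0xE9)))  # 0xe9
```

For plain 7-bit ASCII the two instructions give the same result; they differ only for byte values of 0x80 and above.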

Strings are sequences of characters, typically terminated with a null byte (0x00) marking the end. This convention allows code to process strings without knowing their length in advance.
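The null-terminator convention implies a simple scanning loop: read one byte at a time until a zero byte appears. The Python sketch below models memory as a bytes object; c_strlen and the sample contents are illustrative.

```python
def c_strlen(memory, start):
    """Count bytes from start up to, but not including, the first 0x00."""
    length = 0
    while memory[start + length] != 0:
        length += 1
    return length


mem = b"Hi!\x00junk"      # the bytes after the terminator are ignored
print(c_strlen(mem, 0))   # 3
print(c_strlen(mem, 3))   # 0  (starting at the terminator itself)
```

In MIPS the loop body would be an lb, a beq against $0, and an addi to advance the pointer.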

For larger character sets (like international languages), Unicode provides a richer encoding. UTF-8, a variable-length encoding, represents ASCII characters in one byte and all other characters in two to four bytes, maintaining backward compatibility with ASCII.
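Python's built-in str.encode makes the variable-length property easy to observe. The four sample characters are arbitrary choices, written as escapes so the example does not depend on the source file's encoding.

```python
def utf8_length(ch):
    """Number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


samples = {
    "A": "A",                # U+0041, plain ASCII
    "e-acute": "\u00e9",     # U+00E9
    "euro": "\u20ac",        # U+20AC
    "emoji": "\U0001F600",   # U+1F600
}
for name, ch in samples.items():
    print(name, utf8_length(ch), ch.encode("utf-8").hex())
# A 1 41
# e-acute 2 c3a9
# euro 3 e282ac
# emoji 4 f09f9880
```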

📝 Section Recap: Characters are encoded as numeric values — ASCII for English text, Unicode for international support. Load byte (lb) and store byte (sb) instructions handle individual characters. Strings are sequences of characters terminated with a null byte, enabling dynamic-length text processing.

Program Translation and Execution

From High-Level Code to Machine Instructions

The journey from your C code to executing machine instructions involves four stages:

Compilation transforms high-level language (like C) into assembly language. The compiler reads your code, understands its structure and meaning, and generates equivalent MIPS assembly instructions. During this stage, high-level constructs like loops and function calls become branches and jumps.

Assembly converts assembly mnemonics into binary machine instructions. The assembler translates human-readable instructions like add $t0, $s0, $s1 into the 32-bit encoding the processor understands. Assemblers also handle pseudo-instructions — convenient shortcuts that expand into one or more real instructions. For example, move $t0, $s0 (copying one register to another) becomes add $t0, $s0, $0.

Linking combines separately assembled files and resolves external references. When one module calls a function defined in another, the linker finds that function's address and patches the calling instruction with the correct target address.

Loading places the final executable into memory and starts execution. The operating system allocates memory regions for code (read-only), initialized and uninitialized data, the heap, and the stack, and handles any final address adjustments.

Addressing Modes and Memory Layout

Programs don't exist in isolation — they must coexist in memory alongside other programs. Addressing modes specify how to calculate the address of an operand:

Immediate addressing: The operand is part of the instruction
Register addressing: The operand is in a register
Base addressing: The address is a register plus an offset (like 24($sp))
PC-relative addressing: The address is relative to the program counter, used for branches
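PC-relative addressing is the least obvious of these modes, so here is a small Python sketch of the arithmetic. In MIPS the 16-bit branch field counts instructions (words) relative to the instruction after the branch, so the target is PC + 4 + (offset × 4). The addresses and helper names are hypothetical.

```python
def sign_extend16(imm):
    """Interpret a 16-bit field as a signed value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(branch_pc, imm16):
    """Address a taken branch jumps to, given its 16-bit offset field."""
    return (branch_pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def branch_offset(branch_pc, target):
    """The 16-bit field an assembler would store for this branch."""
    return ((target - (branch_pc + 4)) >> 2) & 0xFFFF


# A two-instruction loop: the label is at 0x00400000, the bne at 0x00400004.
field = branch_offset(0x00400004, 0x00400000)
print(hex(field))                             # 0xfffe  (that is, -2 words)
print(hex(branch_target(0x00400004, field)))  # 0x400000
```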

Memory layout divides addresses into regions:

Text segment: Your executable code (read-only)
Data segment: Initialized global variables
Heap: Dynamically allocated memory (grows upward)
Stack: Local variables and saved registers (grows downward)

📝 Section Recap: Compilation, assembly, linking, and loading transform high-level code into executing machine instructions. Addressing modes specify how operand addresses are calculated. Memory is divided into segments for code, data, heap, and stack, enabling programs to coexist and manage resources.

Parallelism and Synchronization

Why Synchronization Matters

Modern processors often execute multiple instructions simultaneously or split execution across multiple cores. When multiple processes or threads access shared data, they must synchronize — coordinate their access to prevent data corruption.

Consider two processes both trying to increment a shared counter. Process A reads the value (5), Process B reads the same value (5), Process A increments to 6 and writes back, Process B increments to 6 and writes back. The counter should be 7, but it's only 6 — we lost an increment. This is a race condition: the result depends on the timing of events.
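The lost update can be replayed deterministically in Python by writing out the interleaving by hand. No real threads are involved; the two functions below are just an illustrative trace of the unlucky ordering and of the intended one.

```python
def interleaved_increments(start):
    """Both processes read before either writes, so one update is lost."""
    counter = start
    a_copy = counter       # Process A reads
    b_copy = counter       # Process B reads the same value
    counter = a_copy + 1   # Process A writes back
    counter = b_copy + 1   # Process B overwrites A's result
    return counter


def serialized_increments(start):
    """Each process finishes its read-modify-write before the next begins."""
    counter = start
    counter = counter + 1  # Process A
    counter = counter + 1  # Process B
    return counter


print(interleaved_increments(5))  # 6
print(serialized_increments(5))   # 7
```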

Atomic Operations and Locks

The solution is atomic operations: read-modify-write sequences that appear to complete without interruption. MIPS builds them from a pair of instructions, ll (load-linked) and sc (store-conditional):

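retry: ll $t0, 0($s0)       # Load-linked: load and mark address for monitoring
       addi $t0, $t0, 1     # Increment
       sc $t0, 0($s0)       # Store-conditional: store only if address unchanged; $t0 becomes 1 on success, 0 on failure
       beq $t0, $0, retry   # Retry the whole sequence if the store failed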

If another process modifies the word at that address between ll and sc, the sc fails (it writes 0 into its register operand instead of storing), and you must retry from the ll. This prevents lost updates.
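The retry behavior can be sketched with a small Python model. The LinkedMemory class is a deliberate simplification (one link for the whole memory, where real hardware tracks a link per processor), and all names and addresses are illustrative. The interfere callback stands in for another processor writing between the ll and the sc on the first attempt.

```python
class LinkedMemory:
    """Word-addressed memory with a single link register (a simplification)."""

    def __init__(self):
        self.words = {}
        self.link = None  # address being monitored since the last ll

    def ll(self, addr):
        self.link = addr
        return self.words.get(addr, 0)

    def store(self, addr, value):
        """An ordinary store, e.g. from another processor; breaks a matching link."""
        self.words[addr] = value
        if self.link == addr:
            self.link = None

    def sc(self, addr, value):
        """Store only if the link is intact; return 1 on success, 0 on failure."""
        if self.link != addr:
            return 0
        self.words[addr] = value
        self.link = None
        return 1


def atomic_increment(mem, addr, interfere=None):
    """Retry ll/sc until it succeeds; return how many attempts were needed."""
    attempts = 0
    while True:
        attempts += 1
        value = mem.ll(addr)
        if interfere is not None and attempts == 1:
            interfere()  # simulate a competing write between ll and sc
        if mem.sc(addr, value + 1):
            return attempts


mem = LinkedMemory()
mem.words[0x100] = 5
tries = atomic_increment(mem, 0x100, interfere=lambda: mem.store(0x100, 6))
print(tries, mem.words[0x100])  # 2 7
```

The first attempt fails because of the competing write, the second succeeds, and the counter ends at 7 with neither increment lost.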

Another approach uses mutual exclusion locks (mutexes). Before accessing shared data, a process acquires the lock; after finishing, it releases it. Only one process holds the lock at a time, ensuring exclusive access:

acquire_lock: ll $t1, 0($s0)             # $s0 holds the lock's address; read the lock word
              bne $t1, $0, acquire_lock  # Retry if locked
              addi $t2, $0, 1
              sc $t2, 0($s0)             # Try to write 1 (locked)
              beq $t2, $0, acquire_lock  # Retry if store failed

# ... access shared data safely ...

sw $0, 0($s0)    # Release lock

Synchronization primitives are essential for correct behavior in parallel systems, where multiple processors or threads compete for shared resources.

📝 Section Recap: Race conditions occur when multiple processes access shared data simultaneously. Load-linked and store-conditional instructions provide atomic operations. Mutual exclusion locks (mutexes) ensure only one process accesses shared data at a time. These primitives are critical for correct parallel program behavior.

Putting It All Together

You've now explored the complete landscape of how computers execute programs at the instruction level. MIPS assembly language, while not commonly written by hand anymore, reveals the fundamental constraints and opportunities in computer architecture.

The key insight is this: Everything reduces to simple operations on registers and memory. High-level abstractions — functions, loops, if-statements, even object-oriented programming — ultimately become sequences of loads, stores, arithmetic, logical operations, and branches. Understanding these fundamentals transforms you from someone who writes code to someone who understands how code works.

As you write programs going forward, remember that each line of high-level code becomes multiple machine instructions. Optimizing your programs means understanding this translation process and making choices that lead to efficient instruction sequences. Debugging complex issues often requires dropping down to see what's actually happening at the instruction level. And designing new processors or computer systems demands deep understanding of how instructions flow through execution pipelines and how to coordinate massive parallelism safely.

The instruction set is the language of the computer, and now you speak it fluently.