
Chapter 1: Foundations of Computer Systems

Introduction

Welcome to the world of computer systems—a field that is anything but static or dreary. Computers represent one of the most vibrant technological domains, driving nearly 10% of the United States' gross national product. The computer industry evolves at a breathtaking pace, with innovations appearing at speeds that would astound those in other fields.

To appreciate the rate of advancement, imagine if the transportation industry had kept pace with computers over the last 30 years. If it had, you could travel from New York to London in a single second for just a penny. Such improvements would fundamentally reshape how we live and work. The computer revolution has become a third major advancement for civilization, alongside the agricultural and industrial revolutions. Today's science fiction becomes tomorrow's reality: augmented reality glasses, cashless societies, and autonomous vehicles are already emerging.

Moore's Law: A principle stating that integrated circuit resources double every 18–24 months, driving the rapid advancement of computing technology.

As computing becomes more powerful and affordable, previously impossible applications become practical. Mobile phones, computer-controlled automobiles with safety features, genome sequencing, web search, and social networking all owe their existence to Moore's Law and the corresponding improvements in hardware capabilities.


Understanding Computing Applications and Their Classes

Not all computers are created equal. While they may use similar underlying technologies, different computing applications have distinct design requirements and characteristics. Understanding these distinctions is essential to appreciating how computers vary.

Personal Computers

Personal computers (PCs) are the most familiar class of computing devices. They emphasize delivering good performance to individual users at low cost. Most readers have extensive experience with PCs, which typically include a graphics display, keyboard, and mouse. The PC industry is remarkably young—only about 35 years old—yet it has driven the evolution of many computing technologies.

Servers

Servers represent the modern evolution of what were once massive, expensive computing systems. Unlike PCs, servers are accessed primarily through networks rather than directly. They are designed to handle large workloads, which may be single complex applications (such as scientific simulations) or many smaller jobs (like requests to a web server).

Servers span an enormous range in cost and capability. At the low end, a server might be little more than a desktop computer without a display, costing around a thousand dollars. These entry-level servers handle file storage, small business applications, or basic web services. At the high end are supercomputers, which consist of tens of thousands of processors and many terabytes of memory, costing tens to hundreds of millions of dollars. Supercomputers tackle problems like weather forecasting, oil exploration, and protein structure determination.

Compared to PCs, servers emphasize dependability more heavily. When a server crashes, the consequences are far more costly than when a personal computer fails, because servers support many users simultaneously.

Embedded Computers

Embedded computers represent the largest class by quantity and span the widest range of applications. These are the processors hidden inside automobiles, televisions, and the control systems of modern airplanes and cargo ships. Embedded systems are designed to run one application or a specific set of related applications, integrated seamlessly with the hardware.

Most users never realize they are interacting with embedded computers, despite their ubiquity. Embedded applications often have unique requirements, combining minimum necessary performance with strict cost or power constraints. For example, a music player processor only needs to be fast enough for its limited function; beyond that, minimizing cost and power are paramount.

One interesting aspect of embedded systems is their tolerance for failure. While consumer-oriented devices like televisions can tolerate occasional crashes, critical embedded systems (like those in aircraft) must be highly dependable. This dependability is often achieved through simplicity in consumer products and redundancy in large industrial systems.

📝 Section Recap: Computing devices fall into three main classes—personal computers optimized for individual user performance, servers designed for large workloads and multiple simultaneous users, and embedded computers representing the largest quantity but hidden within other devices with specific application requirements.


Eight Great Ideas in Computer Architecture

Throughout the 60-year history of computer design, architects have developed powerful ideas that have endured far beyond their initial implementation. These eight great ideas serve as foundational principles that recur throughout this book and guide modern computer design.

1. Design for Moore's Law

The most consistent factor in computer design is rapid technological change, driven by Moore's Law. This principle, articulated by Gordon Moore in 1965, states that integrated circuit resources double every 18–24 months. Computer architects must anticipate where technology will be when a design is complete, not where it is when design begins—like a skeet shooter predicting where a moving target will be, not shooting at where it currently stands.
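The doubling behind Moore's Law lends itself to a quick back-of-the-envelope projection. The sketch below is illustrative only; the starting transistor count and the 24-month doubling period are assumptions, not figures from the text:

```python
def projected_transistors(start_count, years, doubling_period_years=2.0):
    """Project on-chip resources after `years`, doubling every period."""
    return start_count * 2 ** (years / doubling_period_years)

# Starting from 1 billion transistors, a 6-year design horizon
# spans three doublings, so an architect should target ~8 billion:
print(projected_transistors(1_000_000_000, 6))
```

This is the skeet-shooter arithmetic: a design aimed at today's 1 billion transistors would be badly undersized by the time it ships.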

2. Use Abstraction to Simplify Design

Both hardware and software engineers employ abstraction to manage complexity and increase productivity. Abstraction means representing designs at multiple levels of detail, with lower-level details hidden to provide simpler models at higher levels. This hierarchical approach prevents design complexity from growing unmanageably as technology resources expand.

3. Make the Common Case Fast

An essential design principle states that optimizing the common case improves performance more than optimizing rare cases. Interestingly, the common case is often simpler than the rare case, making it easier to enhance. Applying this principle requires careful measurement and experimentation to identify what is truly common in a given application. Think of a sports car: most trips carry one or two passengers, so optimizing for that common case yields better overall performance than trying to build a fast minivan.
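This principle is often quantified with Amdahl's Law, which the text above does not state explicitly; it is used here as an illustrative assumption to show why rare cases are poor optimization targets:

```python
def overall_speedup(fraction_affected, local_speedup):
    """Amdahl's Law: overall speedup when only part of execution improves."""
    return 1.0 / ((1.0 - fraction_affected) + fraction_affected / local_speedup)

# Doubling the speed of a case covering 80% of execution time:
print(overall_speedup(0.8, 2.0))    # ~1.67x overall
# A 10x speedup of a rare 5% case barely helps:
print(overall_speedup(0.05, 10.0))  # ~1.05x overall
```

The arithmetic makes the measurement requirement concrete: without knowing `fraction_affected`, you cannot predict whether an optimization matters.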

4. Performance via Parallelism

Since computing's earliest days, architects have achieved performance improvements by performing multiple operations simultaneously. This principle appears throughout computer design in various forms.

5. Performance via Pipelining

Pipelining is a particularly prevalent form of parallelism, deserving its own principle. The concept resembles a bucket brigade: rather than one person running back and forth to move water from source to fire, people form a chain, passing buckets along. Pipelining similarly breaks a task into stages, with each stage handling part of the work while other stages handle other work.
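The bucket-brigade arithmetic can be sketched in a few lines. The stage counts and stage times below are illustrative assumptions, not values from the text:

```python
def unpipelined_total_time(num_items, num_stages, stage_time):
    """Each item passes through all stages before the next one starts."""
    return num_items * num_stages * stage_time

def pipelined_total_time(num_items, num_stages, stage_time):
    """First item fills the pipeline; then one result per stage_time."""
    return (num_stages + num_items - 1) * stage_time

print(unpipelined_total_time(100, 4, 1))  # 400 time units
print(pipelined_total_time(100, 4, 1))    # 103 time units, ~3.9x faster
```

Note that pipelining does not make any single item finish sooner (each still takes four stage-times); it improves throughput, not latency.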

6. Performance via Prediction

Sometimes it is faster to guess and proceed rather than wait for certainty, provided the recovery cost from misprediction is reasonable and the prediction accuracy is good. This principle underlies speculative execution in modern processors.

7. Hierarchy of Memories

Programmers want memory that is simultaneously fast, large, and cheap—three conflicting demands. Architects satisfy these competing requirements through a memory hierarchy: the fastest, smallest, and most expensive memory per bit sits at the top, while the slowest, largest, and cheapest memory occupies the bottom. Caches create the illusion that main memory is nearly as fast as the top of the hierarchy while being nearly as large and cheap as the bottom. A pyramidal shape visually represents this concept: the narrow top indicates speed and expense, while the wider base indicates capacity and lower cost.
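The cache illusion can be quantified with the standard average-memory-access-time (AMAT) model; the formula and the timing numbers below are conventional assumptions for illustration, not figures from the text:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Illustrative: 1 ns cache hit, 5% miss rate, 60 ns penalty to reach DRAM.
print(amat(1.0, 0.05, 60.0))  # ~4 ns on average, vs 60 ns without a cache
```

A small, fast memory that hits 95% of the time makes the whole hierarchy look roughly 15 times faster than DRAM alone, which is exactly the illusion the principle describes.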

8. Dependability via Redundancy

Beyond speed, computers must be dependable. Since any physical device can fail, systems become dependable by including redundant components that can take over when failures occur. This principle resembles the dual tires on truck rear axles: if one tire fails, the truck can continue operating (at least until reaching a service facility).

📝 Section Recap: The eight great ideas—Moore's Law, abstraction, making the common case fast, parallelism, pipelining, prediction, memory hierarchy, and redundancy—form the foundation of modern computer architecture and recur throughout computer design as guiding principles.


The Layers Below Your Program

Understanding how a high-level program transforms into machine instructions requires understanding the software layers that enable this translation. A typical application like a word processor or database system consists of millions of lines of code and relies on sophisticated software libraries.

Yet hardware executes only extremely simple, low-level instructions. Bridging this enormous gap requires multiple layers of software that interpret or translate high-level operations into simple computer instructions—a practical application of abstraction.

Systems Software and its Components

Software is organized hierarchically, with applications on the outermost ring and systems software between applications and hardware. Two types of systems software are central to every modern computer:

Operating systems interface between user programs and hardware, providing essential services:

  • Handling input and output operations
  • Allocating storage and memory
  • Managing program execution
  • Providing security and protection

Compilers translate programs written in high-level languages (like C or Java) into the machine language that hardware actually executes. This translation process determines how many machine instructions are required for each source-level statement, significantly affecting program performance.

The Instruction Set Architecture

Among the most important abstractions is the instruction set architecture (ISA), representing the interface between hardware and the lowest-level software. The ISA encompasses all information programmers need to make machine language programs work correctly, including instructions, I/O devices, and system functions.

The operating system typically encapsulates I/O details, memory allocation, and other low-level functions so that application programmers needn't worry about them. The combination of the basic instruction set and operating system interface is called the application binary interface (ABI).

An ISA allows designers to discuss computer functions independently from their implementations. Just as you can discuss a digital clock's functions (keeping time, displaying time, setting alarms) separately from its hardware (quartz crystal, LED displays, buttons), computer designers distinguish architecture (the ISA abstraction) from implementation (the specific hardware realizing that abstraction).

📝 Section Recap: Software layers organize computer systems hierarchically, with compilers and operating systems translating high-level programs into machine instructions through the instruction set architecture—a crucial abstraction enabling different hardware implementations to run identical software.


Under the Covers: Computer Hardware Components

To understand how software achieves its goals, we must examine the physical components that execute instructions and store data.

Classic Computer Components

Computers fundamentally consist of five classic components:

  1. Input devices — mechanisms for users to communicate with the computer (keyboard, mouse, touchscreen)
  2. Output devices — mechanisms for computers to communicate with users (display, speakers, printer)
  3. Memory — storage for programs and data
  4. Datapath — the component performing arithmetic operations
  5. Control — the component directing the datapath, memory, and I/O according to program instructions

Understanding Integrated Circuits

The components visible when opening a computer contain integrated circuits, also called chips—devices combining dozens to millions of transistors on a single piece of silicon. These tiny rectangles contain the technology driving modern computing advancement.

Within a processor integrated circuit reside two main logical components:

The datapath performs arithmetic operations—additions, subtractions, and other calculations. The control acts as the processor's brain, commanding the datapath, memory, and I/O devices according to program instructions.

Memory Technologies

Memory stores the running programs and the data they need. Different memory technologies occupy different levels in the memory hierarchy:

Dynamic Random Access Memory (DRAM): The primary technology for main memory. DRAM provides random access (access time is essentially constant regardless of location), typical access times of 50–70 nanoseconds, and costs of $5–$10 per gigabyte as of 2012.

Static Random Access Memory (SRAM): Faster and less dense than DRAM, used for cache memory. SRAM is more expensive per bit than DRAM.

Cache memory: A small, fast memory acting as a buffer between the processor and slower DRAM. Cache holds recently used data and instructions, dramatically improving performance.

Volatile and Nonvolatile Memory

A critical distinction exists between two memory categories:

Volatile memory, such as DRAM, retains data only while receiving power. When power is lost, all data disappears—a significant limitation for any computer.

Nonvolatile memory retains data even without power:

Flash memory: A nonvolatile semiconductor memory used as secondary storage in personal mobile devices. Flash is slower than DRAM but much cheaper, more compact, more rugged, and more power-efficient than magnetic disks. Access times range from 5 to 50 microseconds, with 2012 costs of $0.75–$1.00 per gigabyte. However, flash memory degrades after 100,000–1,000,000 writes, requiring file systems to manage wear carefully.

Magnetic disk: A rotating platter coated with magnetic material, still dominating secondary storage in servers. Access times of 5–20 milliseconds make disks much slower than DRAM, but costs of $0.05–$0.10 per gigabyte make them economical.

To distinguish memory roles, main memory (or primary memory) holds programs and data while running, while secondary memory stores data and programs between runs.

Graphics and Display Systems

Modern computers include specialized graphics hardware. Displays are composed of millions of pixels (the smallest picture elements), arranged in a matrix. A color display might use 8 bits for each primary color (red, blue, green), totaling 24 bits per pixel and allowing millions of different colors.

The frame buffer stores the bit pattern for each pixel, with the image representation refreshed continuously at the display refresh rate. The challenge in graphics systems arises because the human eye readily detects subtle visual changes.
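The bits-per-pixel figure above translates directly into frame-buffer memory requirements. A quick sketch, where the display resolution is an illustrative assumption:

```python
def frame_buffer_bytes(width, height, bits_per_pixel=24):
    """Memory needed to store one full frame at the given color depth."""
    return width * height * bits_per_pixel // 8

# A 1920x1080 display at 24 bits per pixel:
size = frame_buffer_bytes(1920, 1080)
print(size)            # 6,220,800 bytes
print(size / 10**6)    # ~6.2 MB per frame
```

Refreshing that buffer 60 times per second is why graphics systems need substantial dedicated memory bandwidth.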

Touchscreen technology, popular in tablets and smartphones, often uses capacitive sensing. Since humans conduct electricity, a transparent conductor on an insulator (like glass) can detect when touched by measuring changes in the electrostatic field.

Storage Capacity Terminology

Computing involves significant quantities of data, requiring precise terminology for storage capacity. The traditional decimal system and the modern binary system create ambiguity:

| Decimal Term | Abbreviation | Value | Binary Term | Abbreviation | Value | Difference |
|---|---|---|---|---|---|---|
| Kilobyte | KB | 10^3 | Kibibyte | KiB | 2^10 | 2% |
| Megabyte | MB | 10^6 | Mebibyte | MiB | 2^20 | 5% |
| Gigabyte | GB | 10^9 | Gibibyte | GiB | 2^30 | 7% |
| Terabyte | TB | 10^12 | Tebibyte | TiB | 2^40 | 10% |
| Petabyte | PB | 10^15 | Pebibyte | PiB | 2^50 | 13% |
| Exabyte | EB | 10^18 | Exbibyte | EiB | 2^60 | 15% |
| Zettabyte | ZB | 10^21 | Zebibyte | ZiB | 2^70 | 18% |
| Yottabyte | YB | 10^24 | Yobibyte | YiB | 2^80 | 21% |

The differences compound as the numbers grow. These prefixes apply to bits as well as bytes, so a gigabit (Gb) equals 10^9 bits while a gibibit (Gib) equals 2^30 bits.
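The decimal-versus-binary differences are easy to verify directly:

```python
# Compute how much larger each binary prefix is than its decimal counterpart.
prefixes = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]
for i, name in enumerate(prefixes, start=1):
    decimal = 10 ** (3 * i)
    binary = 2 ** (10 * i)
    diff_pct = (binary - decimal) / decimal * 100
    print(f"{name:>6}: binary is {diff_pct:.0f}% larger")
```

Running this reproduces the Difference column: about 2% at the kilobyte level, compounding to roughly 21% at the yottabyte level.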

Networking and Communication

The final component enabling modern computing is networking. Networks interconnect computers, allowing them to share information and resources. Networks provide:

  • Communication: Rapid exchange of information between computers
  • Resource sharing: Computers on a network can share I/O devices rather than each maintaining separate ones
  • Nonlocal access: Users need not be near the physical computer they are using

Local area networks (LANs) connect computers within geographically confined areas, typically a single building. Wide area networks (WANs) span hundreds of kilometers or continents, forming the backbone of the Internet. Ethernet, a popular LAN technology, can span up to a kilometer and transfer data at up to 40 gigabits per second.

Networks have dramatically transformed computing over the past 30 years, evolving from rare, limited-capacity systems to ubiquitous, high-performance infrastructure. The Internet and web have fundamentally changed how information is accessed, stored, and shared globally.

📝 Section Recap: Computer hardware comprises five classic components—input, output, memory, datapath, and control—implemented as integrated circuits. Memory hierarchies combine volatile RAM for current operations with nonvolatile secondary storage for persistence, while networking enables computers to communicate and share resources globally.


Technologies for Building Processors and Memory

The foundation of modern computing rests on integrated circuit technology. Understanding these technologies illuminates why certain design decisions make sense and how the field will evolve.

Silicon and Transistors

Modern processors and memory are built using silicon as the semiconductor material. The transistor, a device that acts as an electronic switch, forms the basic building block. Millions or billions of transistors on a single chip create the logic and memory of modern computers.

The process technology describes the minimum feature size on a chip, measured in nanometers. A 45-nanometer process means the smallest transistors measure 45 nanometers. Smaller process technologies allow more transistors per chip, which correlates directly with improved performance and efficiency.

Manufacturing and Yield

Integrated circuits are created through a complex photolithographic process on circular silicon wafers. Each wafer contains hundreds of small rectangles called dies, each of which will become a single chip. Manufacturing is not perfect—defects occur randomly across the wafer.

Yield represents the percentage of dies that function correctly. Yield depends on two factors:

  • The number of defects per unit area on the wafer
  • The size of each die

A larger die has a higher probability of containing a defect. Improving yield requires either reducing defects per area (through better manufacturing processes) or designing smaller dies.

Cost per die is calculated as:

Cost per die = Wafer cost / (Dies per wafer × Yield)

This equation shows why minimizing die size is economically important: smaller dies fit more per wafer and reduce defect impact.
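A quick worked sketch of the cost-per-die equation makes the die-size effect concrete. The wafer cost, die counts, and yields below are illustrative assumptions, not data from the text:

```python
def cost_per_die(wafer_cost, dies_per_wafer, die_yield):
    """Cost per die = wafer cost / (dies per wafer * yield)."""
    return wafer_cost / (dies_per_wafer * die_yield)

# Same $5000 wafer, two die sizes:
# a small die fits 400 per wafer and yields well...
print(cost_per_die(5000, 400, 0.90))  # ~$13.9 per die
# ...while a die 4x larger fits only 100 and yields worse.
print(cost_per_die(5000, 100, 0.60))  # ~$83.3 per die
```

Doubling the die's linear dimensions hurts twice: fewer candidates per wafer, and each candidate is more likely to contain a defect, so the cost penalty is much worse than linear.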

Power and Heat Considerations

Modern processors dissipate significant power as heat. Understanding power consumption involves two components:

Dynamic power results from transistors switching between states. When a transistor switches, it charges and discharges capacitive loads, consuming energy.

Static power (or leakage power) flows continuously through transistors, even when they are not switching. Modern processes generate significant leakage current.

Total power consumption drives cooling requirements and system cost. Understanding power is particularly important for mobile devices, where battery life depends on efficiency.
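The switching mechanism described above is conventionally modeled as dynamic power ≈ ½ · C · V² · f. The formula itself is an assumption here (the text gives only the mechanism), but it shows why supply voltage dominates the power budget:

```python
def dynamic_power(capacitive_load, voltage, switching_frequency):
    """Conventional CMOS dynamic-power model: ~1/2 * C * V^2 * f."""
    return 0.5 * capacitive_load * voltage ** 2 * switching_frequency

# Illustrative numbers: same capacitive load and 2 GHz switching rate,
# comparing a 1.0 V supply against 0.8 V.
p_high = dynamic_power(1e-9, 1.0, 2e9)
p_low = dynamic_power(1e-9, 0.8, 2e9)
print(p_high, p_low)  # ~1.0 W vs ~0.64 W
```

Because power scales with the square of voltage, a 20% voltage reduction cuts dynamic power by 36%, which is why process generations have historically lowered voltage alongside feature size.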

📝 Section Recap: Integrated circuit technology relies on transistors built in silicon using increasingly smaller process technologies. Manufacturing yield depends on defect density and die size, affecting cost per die, while power consumption involves both dynamic switching power and static leakage power.


The Post-PC Era

A significant shift is occurring in computing. Personal mobile devices (PMDs) and cloud computing are transforming the landscape that PCs dominated for decades.

Personal Mobile Devices

Personal mobile devices are small, battery-powered, wireless devices connecting to the Internet. Smartphones and tablets represent the current generation. Unlike PCs with keyboards and mice, PMDs typically use touch-sensitive screens or speech input.

Cloud Computing: Large collections of servers (known as Warehouse Scale Computers or WSCs) providing services over the Internet. Companies like Amazon and Google operate these massive datacenters with hundreds of thousands of servers, which they rent to companies needing computational resources. Software as a Service (SaaS) deployed via the cloud is revolutionizing software development.

The rise of PMDs and cloud computing represents generational change comparable to the shift to personal computers 30 years earlier. Modern software developers often architect applications with portions running on the PMD and portions in the cloud.

📝 Section Recap: The Post-PC era features personal mobile devices as primary user interface points and cloud computing through warehouse-scale computers replacing traditional server architecture, fundamentally changing how software is developed and deployed.


Program Performance: Why It Matters and What Affects It

Programmers have always cared about performance. In the 1960s and 1970s, memory constraints limited program size, driving the principle "minimize memory space to make programs fast." Today, advances in memory technology have diminished memory size as a critical constraint for most applications beyond embedded systems.

Modern programmers must understand different performance bottlenecks: the parallel nature of modern processors and the hierarchical nature of memory systems. As explained in a subsequent section, energy efficiency has become increasingly important for both personal mobile devices and cloud computing.

Understanding the factors affecting program performance requires understanding what lies below your code. Performance depends on multiple factors working in concert:

| Component | Role in Performance | Where Covered |
|---|---|---|
| Algorithm | Determines both source-level statement count and I/O operations | Other courses |
| Programming language, compiler, architecture | Determines machine instruction count for each source-level statement | Chapters 2–3 |
| Processor and memory system | Determines execution speed for instructions | Chapters 4–6 |
| I/O system and devices | Determines speed of I/O operations | Chapters 4–6 |

To improve program performance, you must understand which component is your bottleneck. Sometimes the limitation is the algorithm. Sometimes it is how efficiently the compiler translates your code. Often it is the processor and memory system characteristics.
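These factors are often tied together with the classic CPU-time equation (execution time = instruction count × cycles per instruction ÷ clock rate). The equation is developed formally later in texts like this one; it is used here as a hedged sketch, with all numbers illustrative:

```python
def cpu_time(instruction_count, cycles_per_instruction, clock_rate_hz):
    """CPU time = instructions * CPI / clock rate."""
    return instruction_count * cycles_per_instruction / clock_rate_hz

# Same program on a 2 GHz machine, before and after improvements:
print(cpu_time(10**9, 2.0, 2e9))       # 1.0 s baseline
# A better compiler cuts the instruction count by 20%,
# and a better processor cuts CPI from 2.0 to 1.5:
print(cpu_time(8 * 10**8, 1.5, 2e9))   # 0.6 s
```

The decomposition mirrors the table: the compiler and architecture set the instruction count, while the processor and memory system set how many cycles each instruction effectively takes.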

📝 Section Recap: Program performance depends on the combined effectiveness of algorithms, language compilers, processor hardware, and I/O systems. Understanding what limits performance requires examining all layers from high-level code through hardware execution, with optimization efforts focused on actual bottlenecks rather than assumed limitations.


Key Concepts Summary

As you continue through this chapter and subsequent ones, you will encounter specialized terminology. Do not panic—there is significant terminology in computer architecture, but it enables precise description of function and capability. Computer designers love acronyms; once you know what the letters stand for, the terminology becomes intuitive.

Key definitions you will encounter include:

Multicore microprocessor: A microprocessor containing multiple processors ("cores") in a single integrated circuit, increasingly common in modern design.

Acronym: A word constructed from the initial letters of a string of words, such as RAM (Random Access Memory) or CPU (Central Processing Unit).

The foundation laid in this chapter supports the entire remainder of the book. By understanding how software and hardware interact, how different performance factors combine, and the great ideas guiding computer architecture, you are prepared to explore these areas in depth.

Understanding these fundamentals enables programmers and designers to move beyond trial-and-error approaches to scientifically driven optimization and evaluation. Rather than guessing whether a particular optimization will improve performance, you can measure, analyze, and verify results based on architectural principles.

📝 Chapter Summary: This introductory chapter establishes the foundation for understanding modern computers by surveying the three classes of computing devices, introducing eight great architectural ideas, examining the layers of software and hardware, and explaining why understanding computer organization matters for program performance and system design. These concepts recur throughout the book as guiding principles for deeper exploration of computer architecture and design.