The 5th edition of Computer Organization and Design moves forward into the post-PC era with new examples, exercises, and material highlighting the emergence of mobile computing and the cloud. This generational change is emphasized and explored with updated content featuring tablet computers, cloud infrastructure, and the ARM (mobile computing devices) and x86 (cloud computing) architectures. Because an understanding of modern hardware is essential to achieving good performance and energy efficiency, this edition adds a new concrete example, "Going Faster," used throughout the text to demonstrate extremely effective optimization techniques. Also new to this edition is a discussion of the "Eight Great Ideas" of computer architecture. As with previous editions, a MIPS processor is the core used to present the fundamentals of hardware technologies, assembly language, computer arithmetic, pipelining, memory hierarchies, and I/O. Instructors looking for 4th Edition teaching materials should e-mail [email protected].

Includes new examples, exercises, and material highlighting the emergence of mobile computing and the cloud.
Covers parallelism in depth with examples and content highlighting parallel hardware and software topics.
Features the Intel Core i7, ARM Cortex-A8, and NVIDIA Fermi GPU as real-world examples throughout the book.
Adds a new concrete example, "Going Faster," to demonstrate how understanding hardware can inspire software optimizations that improve performance by a factor of 200.
Discusses and highlights the "Eight Great Ideas" of computer architecture: Performance via Parallelism; Performance via Pipelining; Performance via Prediction; Design for Moore's Law; Hierarchy of Memories; Abstraction to Simplify Design; Make the Common Case Fast; and Dependability via Redundancy.
Includes a full set of updated and improved exercises.

David A Patterson

Google LLC

John L. Hennessy

Stanford University

Index Terms

Computer Organization and Design, Fifth Edition: The Hardware/Software Interface

Reviews

Reviewer: Isil Oz

With the advent of the post-personal computer (PC) era, computing systems have been evolving from traditional desktop computers to cloud-based mobile devices. Energy and reliability are gaining importance alongside performance in modern computer architectures. Because computing hardware and requirements change frequently from one generation to the next, it is not easy to explain both the basics and state-of-the-art techniques in a single text. However, this book achieves that balance in its fifth edition (as it did in previous editions). It is the fundamental computer organization book, both as an introduction for readers with no experience in computer architecture topics and as an up-to-date reference for computer architects. It covers a broad spectrum, from the very basics of computers to advanced research topics in computer architecture.

Chapter 1 attracts the reader's attention with plain and friendly language by talking about the impact of computers on our everyday lives. It introduces basic definitions related to computer systems and explains the main concepts of computer architecture from a high-level perspective. The chapter presents software abstraction levels and hardware components to build a general view of a complete computer system. Metrics for measuring and comparing performance and power are also provided. The authors also introduce parallelism and point to the related sections throughout the book, which serves as a good reference for the reader who wants to jump to parallelism concepts.

In chapter 2, the authors introduce instruction sets and present the details of the chosen MIPS instruction set. The chapter includes both theoretical concepts (arithmetic, memory, logic, branch operations, and number representations) and many practical source code examples. It builds up the MIPS instruction set step by step, and this approach makes the language easier to understand thoroughly, even for a reader with no background in assembly languages.

The authors provide insight into the implementation of binary arithmetic operations on computers in chapter 3. While the chapter presents algorithms and hardware for these operations, it also gives numerical examples for each operation. Floating-point representation and related operations are also covered.

The details of processor design based on a MIPS implementation are explained in chapter 4. It not only explains the basic single-cycle processor concepts, but also covers complex pipeline design. Datapath design is explained in a step-by-step manner, starting from the basics of logic design and ending with a complete datapath. With this gradual approach, a complex concept becomes easier and more pleasant to learn. The chapter explains the datapath and control details of the pipelining approach, and also describes the data and control hazards that cause incorrect computation results.

Chapter 5 presents a summary of memory technologies and details of caches, virtual memory, and memory performance issues. It includes a short section on dependable memory hierarchies to emphasize that reliability, as well as performance, is critical for contemporary architectures.

In chapter 6, the authors draw attention to parallel computer architectures. They start with a classification of parallel systems, and describe the basic concepts of vector processors, shared memory systems, graphics processing units, and clusters. The chapter presents performance results for benchmark applications to show the parallel speedup and the motivation for parallel systems.

The book includes two appendices. Appendix A presents details of the MIPS assembly language, and Appendix B provides the fundamental logic design background necessary to understand the concepts in the book.

Visual icons offer a quick reference to basic ideas throughout the book. Moreover, each chapter includes "Real Stuff" and "Fallacies and Pitfalls" sections. The former introduces a real computer system that implements the related concepts in the chapter, and the latter explains common misconceptions and mistakes. Real system examples in this edition are mostly updated to the mobile device processor ARM Cortex-A8 and the latest Intel microarchitecture, Intel Core i7 (AMD Opteron X4 and Intel Nehalem were used in the previous edition [1]). This edition's chapter on parallel processors has been reorganized by adding a discussion of warehouse-scale computers and cluster networking, which provides background information related to cloud computing systems. Moreover, the separate chapter on storage and input/output (I/O) is omitted in this edition; instead, the I/O-related concepts have been spread throughout the book. All of the exercises have been replaced with more comprehensive and instructive ones.

As an undergraduate student with no background, and later as a more experienced graduate student studying computer architecture, I enjoyed reading the earlier editions of this book. As I take my first steps as a computer architecture researcher, the same is true for the fifth edition. The language and the presentation of the book make the concepts easier to understand. The material could be an excellent resource for both novice and experienced readers.
