"Is Parallel Programming Hard, And, If So, What Can You Do About It?", Paul E. McKenney, 2014 https://www.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html → 2-column typesetting PDF file, [1-column typesetting PDF file](http: //kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook-1c-e1p.pdf)
Chapter 1 - How To Use This Book

The purpose of this book is to help you program shared-memory parallel machines without risking your sanity. We hope that this book's design principles will help you avoid at least some parallel-programming pitfalls. That said, you should think of this book as a foundation on which to build, rather than as a completed cathedral. Your mission, should you choose to accept it, is to help make further progress in the exciting field of parallel programming, progress that will in time render this book obsolete. Parallel programming is not as hard as some say, and we hope that this book makes your parallel-programming projects easier and more fun.

In short, where parallel programming once focused on science, research, and grand-challenge projects, it is quickly becoming an engineering discipline. We therefore examine specific parallel-programming tasks and describe how to approach them. In some surprisingly common cases, they can even be automated.

This book is written in the hope that presenting the engineering discipline underlying successful parallel-programming projects will free a new generation of parallel hackers from the need to slowly and painstakingly reinvent old wheels, enabling them to instead focus their energy and creativity on new frontiers. We sincerely hope that parallel programming brings you at least as much fun, excitement, and challenge as it has brought to us!
**Table of Contents** (up to Level 2)

1 How To Use This Book
1.1 Roadmap
1.2 Quick Quizzes
1.3 Alternatives to This Book
1.4 Sample Source Code
1.5 Whose Book Is This?
2 Introduction
2.1 Historic Parallel Programming Difficulties
2.2 Parallel Programming Goals
2.3 Alternatives to Parallel Programming
2.4 What Makes Parallel Programming Hard?
2.5 Discussion
3 Hardware and its Habits
3.1 Overview
3.2 Overheads
3.3 Hardware Free Lunch?
3.4 Software Design Implications
4 Tools of the Trade
4.1 Scripting Languages
4.2 POSIX Multiprocessing
4.3 Atomic Operations
4.4 Linux-Kernel Equivalents to POSIX Operations
4.5 The Right Tool for the Job: How to Choose?
5 Counting
5.1 Why Isn't Concurrent Counting Trivial?
5.2 Statistical Counters
5.3 Approximate Limit Counters
5.4 Exact Limit Counters
5.5 Applying Specialized Parallel Counters
5.6 Parallel Counting Discussion
6 Partitioning and Synchronization Design
6.1 Partitioning Exercises
6.2 Design Criteria
6.3 Synchronization Granularity
6.4 Parallel Fastpath
6.5 Beyond Partitioning
6.6 Partitioning, Parallelism, and Optimization
7 Locking
7.1 Staying Alive
7.2 Types of Locks
7.3 Locking Implementation Issues
7.4 Lock-Based Existence Guarantees
7.5 Locking: Hero or Villain?
7.6 Summary
8 Data Ownership
8.1 Multiple Processes
8.2 Partial Data Ownership and pthreads
8.3 Function Shipping
8.4 Designated Thread
8.5 Privatization
8.6 Other Uses of Data Ownership
9 Deferred Processing
9.1 Reference Counting
9.2 Sequence Locks
9.3 Read-Copy Update (RCU)
9.4 Which to Choose?
9.5 What About Updates?
10 Data Structures
10.1 Motivating Application
10.2 Partitionable Data Structures
10.3 Read-Mostly Data Structures
10.4 Non-Partitionable Data Structures
10.5 Other Data Structures
10.6 Micro-Optimization
10.7 Summary
11 Validation
11.1 Introduction
11.2 Tracing
11.3 Assertions
11.4 Static Analysis
11.5 Code Review
11.6 Probability and Heisenbugs
11.7 Performance Estimation
11.8 Summary
12 Formal Verification
12.1 What are Promela and Spin?
12.2 Promela Example: Non-Atomic Increment
12.3 Promela Example: Atomic Increment
12.4 How to Use Promela
12.5 Promela Example: Locking
12.6 Promela Example: QRCU
12.7 Promela Parable: dynticks and Preemptible RCU
12.8 Simplicity Avoids Formal Verification
12.9 Formal Verification and Memory Ordering
12.10 Summary
13 Putting It All Together
13.1 Counter Conundrums
13.2 RCU Rescues
13.3 Hashing Hassles
14 Advanced Synchronization
14.1 Avoiding Locks
14.2 Memory Barriers
14.3 Non-Blocking Synchronization
15 Ease of Use
15.1 What is Easy?
15.2 Rusty Scale for API Design
15.3 Shaving the Mandelbrot Set
16 Conflicting Visions of the Future
16.1 The Future of CPU Technology Ain't What it Used to Be
16.2 Transactional Memory
16.3 Hardware Transactional Memory
16.4 Functional Programming for Parallelism
A Important Questions
A.1 What Does "After" Mean?
A.2 What Time Is It?
B Synchronization Primitives
B.1 Organization and Initialization
B.2 Thread Creation, Destruction, and Control
B.3 Locking
B.4 Per-Thread Variables
B.5 Performance
C Why Memory Barriers?
C.1 Cache Structure
C.2 Cache-Coherence Protocols
C.3 Stores Result in Unnecessary Stalls
C.4 Store Sequences Result in Unnecessary Stalls
C.5 Read and Write Memory Barriers
C.6 Example Memory-Barrier Sequences
C.7 Memory-Barrier Instructions For Specific CPUs
C.8 Are Memory Barriers Forever?
C.9 Advice to Hardware Designers
D Read-Copy Update Implementations
D.1 Sleepable RCU Implementation
D.2 Hierarchical RCU Overview
D.3 Hierarchical RCU Code Walkthrough
E Read-Copy Update in Linux
E.1 RCU Usage Within Linux
E.2 RCU Evolution
F Answers to Quick Quizzes
G Glossary and Bibliography
H Credits
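As a taste of the pitfalls the book tackles (cf. Section 4.3, "Atomic Operations", and Section 5.1, "Why Isn't Concurrent Counting Trivial?"), here is a minimal C/pthreads sketch, not taken from the book itself, contrasting a racy plain-`int` counter with one updated via the GCC/Clang `__atomic_fetch_add` builtin. The names and thread/iteration counts are illustrative assumptions.

```c
/* Minimal sketch (assumed example, not from perfbook): two threads
 * increment both a plain counter and an atomic counter. The plain
 * counter loses updates because ++ is a non-atomic read-modify-write. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 2
#define NITERS   1000000

static int plain_counter;   /* racy: updates can be lost */
static int atomic_counter;  /* updated with an atomic builtin */

static void *worker(void *arg)
{
	for (int i = 0; i < NITERS; i++) {
		plain_counter++;  /* data race */
		__atomic_fetch_add(&atomic_counter, 1, __ATOMIC_RELAXED);
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[NTHREADS];

	for (int i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);

	/* plain_counter typically falls short; atomic_counter is exact. */
	printf("plain: %d  atomic: %d  expected: %d\n",
	       plain_counter, atomic_counter, NTHREADS * NITERS);
	return 0;
}
```

Built with `gcc -pthread`, the plain counter usually comes up short of the expected total, which is exactly the failure mode Chapter 5 dissects before building statistical and limit counters that avoid it.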