Parallel and Distributed Programming
Kenjiro Taura

1 What’s new (newest first)

(Posted: Jan. 20, 2025) Plan for Jan. 20
- Divide and Conquer
(Posted: Jan. 06, 2025) Plan for Jan. 06
- Analyzing Data Access of Algorithms and How to Make Them Cache-Friendly
(Posted: Jan. 06, 2025) Details about how to get credit are released
(Posted: Dec. 23, 2024) Plan for Dec. 23
- What You Must Know about Memory, Caches, and Shared Memory
- pd07_mem_ext (an extended version of pd07_mem)
- Analyzing Data Access of Algorithms and How to Make Them Cache-Friendly
(Posted: Dec. 16, 2024) Plan for Dec. 16
- What You Must Know about Memory, Caches, and Shared Memory
- pd07_mem and pd30_mlp
(Posted: Dec. 9, 2024) Plan for Dec. 9
- pd30_mlp
- What You Must Know about Memory, Caches, and Shared Memory
(Posted: Dec. 2, 2024) Plan for Dec. 2
- pd06_ilp
- Instruction Level Parallelism or How to Get Nearly Peak FLOPS on CPU
- pd30_mlp
(Posted: Nov. 25, 2024) Plan for Nov. 25
- SIMD high level approach (recap)
- pd04_simd_high_level
- SIMD low level approach
- pd05_simd_low_level
- pd30_mlp
(Posted: Nov. 16, 2024)
- I released a new Jupyter notebook, pd30_mlp
- It will be the next assignment you submit
- See the notebook, as well as this GitHub page, for more details and further updates
- As the due date is still undecided, there is no entry in UTOL yet; I will create one once the due date is fixed
(Posted: Nov. 13, 2024) Plan for Nov. 13
- pd03_omp_gpu
- SIMD high level approach
- pd04_simd_high_level
- SIMD low level approach
- pd05_simd_low_level
(Posted: Nov. 11, 2024) Plan for Nov. 11
- CUDA
- pd02_cuda
- OpenMP for GPU
- pd03_omp_gpu
(Posted: Nov. 11, 2024)
- We’ll have another class this week, on Wednesday, Nov. 13
- No class next Monday, Nov. 18; I’ll deliver on-demand materials instead
(Posted: Nov. 02, 2024)
- Example answer for pd01_omp released
- Example answer for pd02_cuda released
(Posted: Oct. 27, 2024) Plan for Oct. 28
- CUDA
- pd02_cuda
(Posted: Oct. 20, 2024) Plan for Oct. 21
- OpenMP
- pd01_omp
(Posted: Oct. 06, 2024) Plan for Oct. 07
- Make sure you can see the course page in UTOL and check the instructor’s comments on “Assignment 0: Jupyter password”
- Introduction
- Play with Jupyter
- pd00_intro
- OpenMP
- pd01_omp
(Posted: Sep. 28, 2024) Site up

2 Slides

3 Languages

All written materials (slides, home pages, etc.) will be in English
Lectures will be in English

4 Hands-on programming exercise

You will have access to the latest CPU and GPU machines and gain hands-on experience in parallel programming
This year, I emphasize a programming model targeting both CPUs and GPUs (OpenMP + GPU offloading)
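
To give a rough idea of what "OpenMP + GPU offloading" looks like, here is a minimal sketch. It is not taken from the course notebooks; the function name saxpy and its parameters are chosen for illustration only. It assumes a compiler with OpenMP support: when GPU offloading is available, the loop may run on the device; otherwise the same source typically falls back to the host CPU, and without OpenMP enabled the pragma is simply ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] for 0 <= i < n.
 * The single pragma asks OpenMP to run the loop on a target device
 * (e.g., a GPU), copying x to the device and y in both directions.
 * Without offloading support, the loop runs on the host instead. */
void saxpy(size_t n, float a, const float *x, float *y) {
  assert(x != NULL && y != NULL);
#pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
  for (size_t i = 0; i < n; i++) {
    y[i] = a * x[i] + y[i];
  }
}
```

The point of the sketch is that the loop body is ordinary C; whether it runs on CPU cores or on a GPU is decided by the directive and the compiler flags, not by rewriting the code.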

5 How to get the credit

Jan. 6: Details released on a separate page

6 Topics covered

Parallel Programming in Practice
- It’s easy! — a quick and gentle introduction to parallel problem solving
- Some examples of parallel problem solving and programming
Taxonomy of parallel machines and programming models
- What today’s machines look like — parallel computer architecture
  - Distributed memory machines
  - Multi-core / multi-socket nodes
  - SIMD instructions
- Parallel programming models
  - Finding and expressing parallelism
  - Mapping computation onto compute resources
  - Coordination and communication
  - Examples of parallel programming languages/models
Understanding performance of parallel programs (and achieving high performance)
- The maximum performance of your CPU/GPU, and why your program doesn’t reach it
- The maximum performance of memory, and why your program doesn’t reach it
- How to reason about memory traffic of your programs
- Provable bounds of greedy schedulers
- Provable bounds of work-stealing schedulers
- Cache miss bounds of work-stealing schedulers
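
For orientation, the scheduler bounds listed above are usually stated in the following standard textbook form. This is a sketch of the well-known results, not a quotation from the course slides; the symbols are assumptions introduced here: T_1 is the total work, T_\infty is the critical path length (span), and T_P is the running time on P processors.

```latex
\begin{align*}
  % Lower bound that holds for any scheduler
  T_P &\ge \max\left( \frac{T_1}{P},\; T_\infty \right) \\
  % Greedy scheduler (Graham, Brent)
  T_P &\le \frac{T_1}{P} + T_\infty \\
  % Randomized work stealing (Blumofe and Leiserson), in expectation
  \mathrm{E}[T_P] &= \frac{T_1}{P} + O(T_\infty)
\end{align*}
```

Taken together, the first two lines say that a greedy schedule is within a factor of two of the best possible running time.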
