
<aside> 🚨 Quizzes 6, 7, and 8 are significantly more difficult than the first 5 quizzes. They assume that you have understood the lectures and thoroughly read the required papers.

</aside>

Introduction

In this module, we continue our exploration of GPU architecture, focusing on how GPU memory can be optimized. The emphasis is on virtual memory and warp scheduling policies.

Objectives

Understand memory management issues in GPUs.

Readings

Required

Optional

Notes / Things that stand out

Lesson 1 - GPU Virtual Memory

Supporting Virtual Memory

Supporting virtual memory involves two aspects:
- Address Translation
- Page Allocations
On a CPU, the OS handles page allocation.
On a GPU, the CPU and the GPU driver collaborate on page allocation.
- Host (CPU) code drives GPU page allocation and data transfer through CUDA calls such as cudaMalloc, cudaMallocManaged, and cudaMemcpy.

Review: Address Translations

Programmers write programs in terms of virtual addresses, and each virtual address must be mapped to a physical address before memory can be accessed.
Some terms:
- VPN: Virtual Page Number
- PFN / PPN: Physical Frame Number (Physical Page Number)
- PO: Page Offset
Address Translation Process
During address translation, the VPN is replaced by the corresponding PFN, while the page offset is carried over unchanged.
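
The VPN-to-PFN step can be sketched in a few lines of C++. This is only an illustration, not how any particular GPU or driver implements translation. It assumes 4 KiB pages (a 12-bit page offset) and a single flat lookup table. The names PageTable, vpn_of, offset_of, join, and translate are hypothetical and do not come from the lecture.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Assumption for illustration: 4 KiB pages, so the low 12 bits are the page offset (PO).
constexpr std::uint64_t kPageOffsetBits = 12;
constexpr std::uint64_t kPageSize = 1ULL << kPageOffsetBits;
constexpr std::uint64_t kOffsetMask = kPageSize - 1;

// Hypothetical flat page table mapping VPN -> PFN.
// Real page tables are typically multi-level; a map keeps the sketch short.
using PageTable = std::unordered_map<std::uint64_t, std::uint64_t>;

// Upper bits of the virtual address: the virtual page number.
inline std::uint64_t vpn_of(std::uint64_t vaddr) {
    return vaddr >> kPageOffsetBits;
}

// Lower bits of the virtual address: the page offset.
inline std::uint64_t offset_of(std::uint64_t vaddr) {
    return vaddr & kOffsetMask;
}

// Rebuild an address from a frame number and an offset within the page.
inline std::uint64_t join(std::uint64_t pfn, std::uint64_t offset) {
    assert(offset < kPageSize);
    return (pfn << kPageOffsetBits) | offset;
}

// Translate a virtual address: look up the PFN for its VPN and reuse the offset.
// Returns std::nullopt when the VPN has no mapping in the table.
inline std::optional<std::uint64_t> translate(const PageTable& pt,
                                              std::uint64_t vaddr) {
    auto it = pt.find(vpn_of(vaddr));
    if (it == pt.end()) {
        return std::nullopt;
    }
    return join(it->second, offset_of(vaddr));
}
```

For example, with a table that maps VPN 0x12345 to PFN 0xABC, translate(pt, 0x12345678) yields 0xABC678: the upper bits change from 0x12345 to 0xABC, and the offset 0x678 is unchanged.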