May 14
- Project 2 is the most difficult according to previous students, but also the most rewarding.
- Avoid working on Project 2 near the deadline, as there will be contention on the cluster and the performance metrics might fluctuate unnecessarily.
- Projects 3, 4, and 5 don’t need a GPU.
- All videos will unlock in the next week.
May 28
- Hardware specs can change between generations of GPUs.
- Therefore, to hyper-optimize code while keeping it portable, we need to write custom kernels for each architecture, which is what libraries like cuBLAS and cuDNN often do.
- Not sure if there is a way to get the exact detailed specs out of a GPU, but looking at the hardware spec sheet should help.
- Things are usually consistent within a generation.
- Added to the OH post - “It's likely going to be vendor specific. For Nvidia, there is cudaGetDeviceProperties that has a lot of info.”
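A minimal sketch of that query, assuming the CUDA runtime API is available. The helper name printDeviceSpecs is made up for illustration, and the fields printed are only a small subset of what cudaDeviceProp exposes.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a few properties of every visible CUDA device.
// Returns the number of devices found, or -1 if a runtime call fails
// (for example, when no CUDA-capable device is present).
int printDeviceSpecs() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        return -1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            return -1;
        }
        std::printf("Device %d: %s\n", dev, prop.name);
        std::printf("  compute capability: %d.%d\n", prop.major, prop.minor);
        std::printf("  SM count: %d\n", prop.multiProcessorCount);
        std::printf("  warp size: %d\n", prop.warpSize);
        std::printf("  max threads per block: %d\n", prop.maxThreadsPerBlock);
        std::printf("  shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
        std::printf("  global memory: %zu bytes\n", prop.totalGlobalMem);
    }
    return count;
}
```

Values such as the SM count and shared memory size are the kind of numbers that differ between GPU generations, so a kernel tuned for one set may need different launch parameters on another.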
- If-else in CUDA only reduces performance if the threads within a warp diverge and do a lot of processing while diverged. If the number of divergences is low, or the diverging threads only execute, say, a single instruction, then the performance loss from reduced parallelism is small.
- For example, diverging statements during a tree traversal would be bad, as after a few levels, all threads would be doing different things, thereby serializing the whole traversal.
- Boundary condition if-else in Module 3, Lesson 4 only has to set the values to 0, which is a minor operation and doesn’t affect the performance as much.
- To solve the if-else divergence dilemma, if the branches can be pre-computed, you can split the work into two kernels and run each branch separately.
- Profilers can help with this analysis by showing whether divergence is actually causing a performance issue.
- Compilers are also quite smart: if they can figure out that a branch will never be taken, or can pre-compute a branch, they will optimize the code accordingly.
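The two-kernel idea can be sketched as follows, under the assumption that the branch depends only on index parity and is therefore known before launch. All names here (branchA, branchB, branchedKernel, evenKernel, oddKernel, runSplit) are invented for illustration, and the tiny branch bodies stand in for much heavier work.

```cuda
#include <cuda_runtime.h>

// Placeholder branch bodies, invented for this sketch.
__device__ float branchA(float x) { return x * 2.0f + 1.0f; }
__device__ float branchB(float x) { return x * x - 1.0f; }

// Single-kernel version, shown for contrast: neighbouring threads in a warp
// take different branches, so the warp may have to execute both paths.
__global__ void branchedKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // cheap boundary check
        if (i % 2 == 0) data[i] = branchA(data[i]);
        else            data[i] = branchB(data[i]);
    }
}

// Split version: each kernel handles one branch, leaving only the boundary check.
__global__ void evenKernel(float* data, int n) {
    int i = 2 * (blockIdx.x * blockDim.x + threadIdx.x);
    if (i < n) data[i] = branchA(data[i]);
}

__global__ void oddKernel(float* data, int n) {
    int i = 2 * (blockIdx.x * blockDim.x + threadIdx.x) + 1;
    if (i < n) data[i] = branchB(data[i]);
}

// Hypothetical host wrapper: copies data to the device, runs both split
// kernels, and copies the result back. Returns false on any CUDA error.
bool runSplit(float* hostData, int n) {
    if (n <= 0) return true;
    float* d = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    if (cudaMalloc(&d, bytes) != cudaSuccess) return false;
    if (cudaMemcpy(d, hostData, bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(d);
        return false;
    }
    int threads = 256;
    int half = (n + 1) / 2;  // number of even indices; odd indices need no more
    int blocks = (half + threads - 1) / threads;
    evenKernel<<<blocks, threads>>>(d, n);
    oddKernel<<<blocks, threads>>>(d, n);
    cudaError_t err = cudaMemcpy(hostData, d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return err == cudaSuccess;
}
```

Whether the split wins is not guaranteed: it adds a second launch and changes the memory access pattern to a stride of two, so it is worth comparing both versions in a profiler before committing to either.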
July 23
- Some P5 discussion.
- TLDR: Just make a private post with your code if you need really specific help with the project.