The options pricing library I work on at CSIRO is both computationally intensive and highly mathematical, making it a natural fit for moving calculations to the GPU to improve performance. In this talk I will discuss my experience adjusting our existing code to use CUDA and some lessons learned along the way.
Adjusting existing code to use CUDA is not yet as simple as recompiling with the NVIDIA compiler. Working with the GPU imposes two primary constraints. First, the GPU cannot access just any memory address. Second, the GPU cannot execute just any function in your program. These constraints can make it hard to bring code to CUDA when that code was not designed with the GPU in mind.
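To make the two constraints concrete, here is a minimal sketch of my own (the kernel `scale`, the helper `host_only`, and the variable names are illustrative, not code from the library): a pointer into ordinary host memory cannot be handed to a kernel, and a function without a `__device__` annotation cannot be called from device code at all.

```cpp
#include <vector>
#include <cuda_runtime.h>

__global__ void scale(double* data, double factor) {
    data[threadIdx.x] *= factor;    // data must point at GPU-accessible memory
}

double host_only(double x) { return x * 2.0; }  // no __device__ annotation:
                                                // not callable from scale()

int main() {
    std::vector<double> prices(256, 1.0);
    // scale<<<1, 256>>>(prices.data(), 1.05);  // constraint 1: host heap
    //                                          // pointer, kernel would fault
    double* d_prices = nullptr;
    cudaMalloc(&d_prices, prices.size() * sizeof(double));
    cudaMemcpy(d_prices, prices.data(), prices.size() * sizeof(double),
               cudaMemcpyHostToDevice);
    scale<<<1, 256>>>(d_prices, 1.05);          // fine: device allocation
    cudaDeviceSynchronize();
    cudaFree(d_prices);
}
```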
In this talk I will first address the constraints on memory layout that working with the GPU imposes. I will then show how I adjusted data structures and used std::pmr to place data within GPU-accessible memory. Next, I will discuss the constraint that only functions annotated with __device__ can be called on the GPU. I will then show how the CUDA feature for treating constexpr functions as __device__ functions makes it easier to get code running on the GPU and to debug it once it is there. Lastly, I will compare the performance of the calculations on the CPU against the same calculations run through CUDA on the GPU.
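One common way to apply std::pmr here, sketched below under my own assumptions rather than as the library's actual code, is a memory_resource backed by CUDA unified (managed) memory, which both the CPU and the GPU can address. The class name `managed_resource` and the vector `prices` are made up for illustration, and the sketch ignores over-aligned requests for brevity.

```cpp
#include <cstddef>
#include <memory_resource>
#include <new>
#include <vector>
#include <cuda_runtime.h>

// A pmr resource that hands out CUDA unified (managed) memory.
// cudaMallocManaged returns memory the GPU and CPU can both access.
class managed_resource : public std::pmr::memory_resource {
    void* do_allocate(std::size_t bytes, std::size_t /*alignment*/) override {
        void* p = nullptr;
        if (cudaMallocManaged(&p, bytes) != cudaSuccess)
            throw std::bad_alloc{};
        return p;  // note: over-aligned requests are not handled here
    }
    void do_deallocate(void* p, std::size_t, std::size_t) override {
        cudaFree(p);
    }
    bool do_is_equal(const std::pmr::memory_resource& other)
        const noexcept override {
        return this == &other;
    }
};

int main() {
    managed_resource mr;
    // The vector's elements now live in GPU-accessible memory; the code
    // that fills and reads the vector does not need to change.
    std::pmr::vector<double> prices(1024, 0.0, &mr);
}
```

Because std::pmr containers take the resource as a runtime constructor argument, existing container-based code can be redirected into GPU-accessible memory without changing its element types or call sites.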
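The constexpr feature referred to above is exposed through nvcc's --expt-relaxed-constexpr flag, which allows device code to call constexpr functions that carry no __device__ annotation. A minimal sketch, again with illustrative names (`discount`, `price`) rather than the library's API:

```cpp
#include <cuda_runtime.h>

// Compile with: nvcc --expt-relaxed-constexpr example.cu

// An ordinary constexpr function with no __device__ annotation.
constexpr double discount(double rate, double t) {
    return 1.0 / (1.0 + rate * t);
}

__global__ void price(double* out, double rate) {
    // With --expt-relaxed-constexpr, device code may call the
    // un-annotated constexpr function above.
    out[threadIdx.x] = discount(rate, threadIdx.x * 0.25);
}

int main() {
    double* d_out = nullptr;
    cudaMalloc(&d_out, 32 * sizeof(double));
    price<<<1, 32>>>(d_out, 0.05);
    cudaDeviceSynchronize();
    cudaFree(d_out);
}
```

Because the same constexpr function still compiles for the host, it can be tested and debugged on the CPU before being exercised inside a kernel.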
If you are considering porting existing code to CUDA, by the end of this talk you should have a better idea of the impediments to running that code on the GPU and a high-level view of the strategy I used to overcome them.