GPUCC - An Open-Source GPGPU Compiler
Venue
Proceedings of the 2016 International Symposium on Code Generation and Optimization, ACM, New York, NY, pp. 105-116
Publication Year
2016
Authors
Jingyue Wu, Artem Belevich, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng, Robert Hundt
Abstract
Graphics Processing Units have emerged as powerful accelerators for massively
parallel, numerically intensive workloads. The two dominant software models for
these devices are NVIDIA’s CUDA and the cross-platform OpenCL standard. Until now,
there has not been a fully open-source compiler targeting the CUDA environment,
hampering general compiler and architecture research and making deployment
difficult in datacenter or supercomputer environments. In this paper, we present
gpucc, an LLVM-based, fully open-source, CUDA-compatible compiler for
high-performance computing. It performs various general and CUDA-specific
optimizations to generate high-performance code. The Clang-based frontend supports modern
language features such as those in C++11 and C++14. Compilation with gpucc is 8%
faster than with NVIDIA’s toolchain (nvcc), and gpucc reduces compile time by up to 2.4x for
pathological compilations (>100 secs), which tend to dominate build times in
parallel build environments. Compared to nvcc, gpucc’s runtime performance is on
par on several open-source benchmarks, such as Rodinia (0.8% faster), SHOC (0.5%
slower), and Tensor (3.7% faster). It outperforms nvcc on internal large-scale
end-to-end benchmarks by up to 51.0%, with a geometric mean of 22.9%.