Date of Award
5-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
School of Computing
Committee Chair/Advisor
Rong Ge
Committee Member
Kai Liu
Committee Member
Feng Luo
Committee Member
Prasanna Balaprakash
Committee Member
Xingfu Wu
Abstract
Continuous increases in high performance computing (HPC) throughput have served as catalysts for industrial and scientific advancement in countless ways that have fundamentally shaped our modern world. Our demands on compute resources continue to scale, but the limitations of Amdahl’s law and Dennard scaling have proven increasingly difficult to overcome when approached solely through hardware or software design. Furthermore, many HPC applications fail to utilize the collective system’s performance, even on the most advanced supercomputers.
However, the resurgence of AI in industry has prompted an explosion of hardware and software codesign that has fueled massive improvements in GPU design and novel ASICs. These performance improvements are maximized on myriad heterogeneous systems by specially tuning applications. Replicating these developments across the whole of computing will require similarly holistic approaches that combine specialty hardware, software designed around the hardware's greatest strengths, and fine-tuning on individual systems to maximize performance.
We use three distinct perspectives to address scalable system performance holistically. First, we analyze the impacts of liquid immersion cooling technologies on sustained application performance and energy efficiency. Next, we present a case study where intentional algorithmic redesign for GPU acceleration permits robust performance improvements that endure through multiple generations of hardware. We find that memory latency forms a primary bottleneck for GPU-accelerated performance and demonstrate how algorithm-specific optimizations can significantly improve performance over multiple architecture generations. Finally, we tie these concepts together through performance optimization techniques that respect software- and hardware-based performance constraints. We improve the reusability of performance insights with novel transfer learning techniques that make performance optimization costs more predictable and more successful in the short term. Our insights demonstrate the necessity of systemic approaches for performance tuning in HPC.
Recommended Citation
Randall, Thomas L., "A Systemic Approach to Maximize Heterogeneous System Performance" (2025). All Dissertations. 3935.
https://open.clemson.edu/all_dissertations/3935
Author ORCID Identifier
0000-0002-1213-1011