
Power and Performance based Autotuning of Heterogeneous Applications

MASC Thesis Oral Exam Announcement
By: Sunbal Cheema
August 26, 2022

ABSTRACT
Recent advances in artificial intelligence (AI) and deep learning owe much to the use of general-purpose graphics processing units (GPUs) to implement AI applications. GPU programming became easier with the advent of high-level abstraction API frameworks such as OpenCL and CUDA. The portability of these frameworks, however, comes at a performance cost. GPU kernel performance is highly dependent on the underlying hardware architecture, so application kernels need to be retuned each time they execute on a new device. The work presented in this thesis focuses on OpenCL kernels running on heterogeneous CPU-GPU systems. First, we present an analytical approach to estimate the power and performance of a convolutional neural network (CNN) on a heterogeneous system, which is useful for power- and performance-based auto-tuning. Then we present our main contribution, multi-objective tuning of OpenCL kernels, and propose the Multi-Objective Kernel Auto-Tuner (MOKAT) for power and performance tuning. MOKAT tunes an OpenCL kernel without sacrificing either of the two objectives (power and performance) and provides a final set of Pareto-optimal kernels. MOKAT offers an integrated power calculation methodology for both online and offline tuning. It uses the Non-dominated Sorting Genetic Algorithm II (NSGA-II) as its multi-objective evolutionary algorithm. We describe the MOKAT API and the internal structure of our framework. Two case studies of kernel tuning are presented: 2D convolution and general matrix multiplication (GEMM).
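To illustrate what a set of Pareto-optimal kernels means when both power and execution time are minimized, the following is a minimal Python sketch of Pareto-front filtering. It is not MOKAT's API or implementation. The function names, the `(power_watts, runtime_ms)` tuple layout, and the measurement values are all hypothetical and chosen for illustration only. The sketch covers just the non-domination test that underlies non-dominated sorting; a full NSGA-II would also rank the later fronts and apply crowding distance, selection, crossover, and mutation.

```python
from typing import List, Tuple

# Hypothetical layout: each kernel variant is (power_watts, runtime_ms).
# Both objectives are minimized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up measurements for five kernel variants (illustrative only).
measurements = [
    (45.0, 12.0),
    (60.0, 8.0),
    (50.0, 12.5),  # dominated by (45.0, 12.0)
    (70.0, 8.0),   # dominated by (60.0, 8.0)
    (40.0, 20.0),
]

front = pareto_front(measurements)
print(front)
```

In this made-up data, three variants survive: each trades lower power for longer runtime or the reverse, so none is better than another on both objectives at once.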

(Anyone can attend this Zoom session. Please send an email to Dr. Gul Khan requesting the Zoom URL and passcode.)