Compilation of Generalized Matrix Chains with Symbolic Sizes
Generalized Matrix Chains (GMCs) are products of matrices where each matrix carries features (e.g., general, symmetric, triangular, positive-definite) and is optionally transposed and/or inverted.
GMCs are commonly evaluated via sequences of calls to BLAS and LAPACK kernels.
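As a concrete illustration (not taken from the paper), consider the hypothetical chain X = A^T B^{-1} C, where B is symmetric positive-definite. A minimal Python/SciPy sketch of two possible kernel sequences, with hypothetical variable names and fixed sizes, might look as follows:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Hypothetical GMC: X = A^T * inv(B) * C, with B symmetric positive-definite.
# Sizes are fixed here for illustration; the paper's setting keeps them symbolic.
rng = np.random.default_rng(0)
n, m, k = 300, 200, 50
A = rng.standard_normal((m, n))      # general matrix, used transposed
B = rng.standard_normal((m, m))
B = B @ B.T + m * np.eye(m)          # symmetric positive-definite by construction
C = rng.standard_normal((m, k))

# Kernel sequence 1: Cholesky factorization (LAPACK potrf), triangular solves
# (potrs), then a matrix product (BLAS gemm). inv(B) is never formed explicitly.
factor = cho_factor(B)
X = A.T @ cho_solve(factor, C)       # solve with k right-hand sides, then (n x m)(m x k)

# Kernel sequence 2: solve against A instead of C; same result, different FLOP count.
X_alt = cho_solve(factor, A).T @ C   # solve with n right-hand sides, then (n x m)(m x k)
assert np.allclose(X, X_alt)
```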
When matrix sizes are known, one can craft a sequence of kernel calls that evaluates the GMC while minimizing some cost, e.g., the number of floating-point operations (FLOPs).
Even in these circumstances, the high-level languages and libraries that users typically rely on often map the input GMC onto a suboptimal sequence of kernels.
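To make the known-size case concrete, the sketch below applies the textbook dynamic program for matrix-chain ordering, which finds the FLOP-minimal parenthesization of a plain product of general matrices; the paper's cost model additionally accounts for features, transposition, and inversion, so this is only an illustrative baseline:

```python
def optimal_chain_order(dims):
    """Classic matrix-chain ordering by dynamic programming.

    dims[i] and dims[i+1] are the row and column counts of matrix i, so a
    chain of p matrices is described by p + 1 integers. Returns (min_flops,
    split), where split[i][j] records where to parenthesize subchain i..j.
    Multiplying an (a x b) by a (b x c) matrix is costed as 2*a*b*c FLOPs.
    """
    p = len(dims) - 1
    cost = [[0] * p for _ in range(p)]
    split = [[0] * p for _ in range(p)]
    for length in range(2, p + 1):            # subchain length
        for i in range(p - length + 1):
            j = i + length - 1
            cost[i][j] = float("inf")
            for k in range(i, j):             # candidate split point
                c = (cost[i][k] + cost[k + 1][j]
                     + 2 * dims[i] * dims[k + 1] * dims[j + 1])
                if c < cost[i][j]:
                    cost[i][j], split[i][j] = c, k
    return cost[0][p - 1], split


# Example: A (10x100), B (100x5), C (5x50); (A B) C is far cheaper than A (B C).
flops, _ = optimal_chain_order([10, 100, 5, 50])
print(flops)   # 2 * (10*100*5 + 10*5*50) = 15000
```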
In this work, we go one step further and consider matrix sizes to be symbolic (unknown); this changes the nature of the problem, since no single sequence of kernel calls is optimal for all possible combinations of matrix sizes.
We design and evaluate a code generator for GMCs with symbolic sizes that relies on multi-versioning.
At compile-time, when the GMC is known but the sizes are not, code is generated for a few carefully selected sequences of kernel calls.
At run-time, when sizes become known, the best generated variant for the matrix sizes at hand is selected and executed.
The code generator combines new theoretical results, which guarantee that the cost is within a constant factor of optimal for all matrix sizes, with an empirical tuning component that further tightens the gap to optimality in practice.
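A minimal sketch of the multi-versioning dispatch, with hypothetical variant and cost functions for a three-matrix chain: each pre-generated variant carries a symbolic FLOP expression, and once the sizes are known the cheapest variant is selected and executed. This illustrates only the dispatch mechanism, not the paper's actual variant selection or tuning:

```python
import numpy as np

# Hypothetical pre-generated variants for the chain X = A * B * C, where
# A is n0 x n1, B is n1 x n2, C is n2 x n3, and n0..n3 are unknown at
# compile time. Each variant pairs a kernel sequence with its symbolic
# FLOP count as a function of the sizes.
def variant_left(A, B, C):      # (A B) C
    return (A @ B) @ C

def variant_right(A, B, C):     # A (B C)
    return A @ (B @ C)

VARIANTS = [
    (variant_left,  lambda n0, n1, n2, n3: 2 * (n0*n1*n2 + n0*n2*n3)),
    (variant_right, lambda n0, n1, n2, n3: 2 * (n1*n2*n3 + n0*n1*n3)),
]

def run_chain(A, B, C):
    """At run time, evaluate each variant's cost for the actual sizes and dispatch."""
    n0, n1 = A.shape
    n2, n3 = B.shape[1], C.shape[1]
    kernel, _ = min(VARIANTS, key=lambda v: v[1](n0, n1, n2, n3))
    return kernel(A, B, C)

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal(s) for s in [(10, 100), (100, 5), (5, 50)])
X = run_chain(A, B, C)          # picks (A B) C for these sizes
```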
In experiments, we found that, for 95% of the tested chains, the generated code exceeded the optimum in both FLOPs and execution time by less than 15%.
This program is tentative and subject to change.
Tue 3 Feb (displayed time zone: Hobart)

09:50 - 11:10
  09:50 (20m, Talk) TPDE: A Fast Adaptable Compiler Back-End Framework. Main Conference. Pre-print, Media Attached.
  10:10 (20m, Talk) Synthesizing Instruction Selection Back-Ends from ISA Specifications Made Practical. Main Conference. Pre-print.
  10:30 (20m, Talk) SparseX: Synergizing GPU Libraries for Sparse Matrix Multiplication on Heterogeneous Processors. Main Conference. Ruifeng Zhang (North Carolina State University), Xiangwei Wang (North Carolina State University), Ang Li (Pacific Northwest National Laboratory), Xipeng Shen (North Carolina State University). Pre-print, Media Attached.
  10:50 (20m, Talk) Compilation of Generalized Matrix Chains with Symbolic Sizes. Main Conference. Pre-print, Media Attached.