GRANII: Selection and Ordering of Primitives in GRAph Neural Networks using Input Inspection
Over the years, many frameworks and optimization techniques have been proposed to accelerate graph neural networks (GNNs).
In contrast to the optimizations explored in these systems, we observe that different matrix re-associations of GNN computations exhibit input-sensitive performance behavior that these systems do not exploit.
We leverage this observation to propose GRANII, a system that \textit{exposes} different compositions of sparse and dense matrix primitives arising from different matrix re-associations of GNN computations and \textit{selects} the best among them based on input attributes. GRANII executes in two stages: (1) an offline compilation stage that enumerates all valid re-associations, each yielding a different sparse-dense matrix composition, and applies input-oblivious pruning to discard clearly unprofitable candidates, and (2) an online runtime system that explores the remaining candidates and uses lightweight cost models to select the best re-association based on the input graph and the embedding sizes.
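To illustrate the kind of re-association GRANII exploits, consider a single GCN-style layer $H' = A X W$, where $A$ is the sparse (normalized) adjacency matrix, $X$ holds the node embeddings, and $W$ is a dense weight matrix. Computing $(A X) W$ versus $A (X W)$ invokes the sparse and dense primitives in different orders, and which is cheaper depends on the graph's nonzeros and the embedding sizes. The sketch below is a minimal, hypothetical illustration of such input-based selection using a simple FLOP count; it is not GRANII's cost model or implementation.

\begin{lstlisting}[language=Python]
# Minimal sketch (hypothetical, not GRANII's implementation): pick between two
# re-associations of a GCN-style layer H' = A @ X @ W using a FLOP-based cost
# estimate computed from input attributes (graph nonzeros, embedding sizes).
import numpy as np
import scipy.sparse as sp

def gcn_layer(A: sp.csr_matrix, X: np.ndarray, W: np.ndarray) -> np.ndarray:
    n, d = X.shape          # number of nodes, input embedding size
    k = W.shape[1]          # output embedding size
    nnz = A.nnz             # nonzeros of the sparse adjacency matrix

    # Re-association 1: (A @ X) @ W  -> SpMM over d columns, then dense GEMM.
    cost_aggregate_first = 2 * nnz * d + 2 * n * d * k
    # Re-association 2: A @ (X @ W)  -> dense GEMM, then SpMM over k columns.
    cost_transform_first = 2 * n * d * k + 2 * nnz * k

    if cost_aggregate_first <= cost_transform_first:   # cheaper when k >= d
        return (A @ X) @ W
    return A @ (X @ W)

# Example: a shrinking layer (d=256 -> k=16) on a sparse graph favors A @ (X @ W).
A = sp.random(10_000, 10_000, density=1e-3, format="csr")
X = np.random.rand(10_000, 256)
W = np.random.rand(256, 16)
H = gcn_layer(A, X, W)
\end{lstlisting}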
On a wide range of configurations, GRANII achieves geometric-mean speedups of $1.56\times$ for inference and $1.4\times$ for training across multiple GNN models and systems.
We also show that GRANII's approach generalizes across diverse GNN implementations and remains effective when combined with techniques such as sampling.