GRANII: Selection and Ordering of Primitives in GRAph Neural Networks using Input Inspection
Over the years, many frameworks and optimization techniques have been proposed to accelerate graph neural networks (GNNs).
In contrast to the optimizations explored in these systems, we observe that different matrix re-associations of GNN computations exhibit input-sensitive performance behavior that these systems do not exploit.
We leverage this observation to propose GRANII, a system that \textit{exposes} different compositions of sparse and dense matrix primitives arising from different matrix re-associations of GNN computations and \textit{selects} the best among them based on input attributes. GRANII executes in two stages: (1) an offline compilation stage that enumerates all valid re-associations, each yielding a different sparse-dense matrix composition, and applies input-oblivious pruning to discard clearly unprofitable candidates, and (2) an online runtime system that explores the remaining candidates and uses lightweight cost models to select the best re-association based on the input graph and the embedding sizes.
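To illustrate the kind of re-association GRANII exploits, consider a single GCN-style layer $H' = A X W$, where $A$ is the sparse (normalized) adjacency matrix, $X$ holds the node embeddings, and $W$ is a dense weight matrix. Computing $(A X) W$ versus $A (X W)$ invokes the sparse and dense primitives in different orders, and which is cheaper depends on the graph's nonzeros and the embedding sizes. The sketch below is a minimal, hypothetical illustration of such input-based selection using a simple FLOP count; it is not GRANII's cost model or implementation.

\begin{lstlisting}[language=Python]
# Minimal sketch (hypothetical, not GRANII's implementation): pick between two
# re-associations of a GCN-style layer H' = A @ X @ W using a FLOP-based cost
# estimate computed from input attributes (graph nonzeros, embedding sizes).
import numpy as np
import scipy.sparse as sp

def gcn_layer(A: sp.csr_matrix, X: np.ndarray, W: np.ndarray) -> np.ndarray:
    n, d = X.shape          # number of nodes, input embedding size
    k = W.shape[1]          # output embedding size
    nnz = A.nnz             # nonzeros of the sparse adjacency matrix

    # Re-association 1: (A @ X) @ W  -> SpMM over d columns, then dense GEMM.
    cost_aggregate_first = 2 * nnz * d + 2 * n * d * k
    # Re-association 2: A @ (X @ W)  -> dense GEMM, then SpMM over k columns.
    cost_transform_first = 2 * n * d * k + 2 * nnz * k

    if cost_aggregate_first <= cost_transform_first:   # cheaper when k >= d
        return (A @ X) @ W
    return A @ (X @ W)

# Example: a shrinking layer (d=256 -> k=16) on a sparse graph favors A @ (X @ W).
A = sp.random(10_000, 10_000, density=1e-3, format="csr")
X = np.random.rand(10_000, 256)
W = np.random.rand(256, 16)
H = gcn_layer(A, X, W)
\end{lstlisting}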
On a wide range of configurations, GRANII achieves geometric-mean speedups of $1.56\times$ for inference and $1.4\times$ for training across multiple GNN models and systems.
We also show that GRANII's approach generalizes across diverse GNN implementations and remains effective when combined with techniques such as sampling.