DyPARS: Dynamic-Shape DNN Optimization via Pareto-Aware MCTS for Graph Variants (CGO 2026 - Main Conference)

Who

Hao Qian, Guangli Li, Qiuchu Yu, Xueying Wang, Jingling Xue

Track

CGO 2026 Main Conference

This program is tentative and subject to change.

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 3 Feb 2026 16:10 - 16:30 at Bronte - Compiling for ML 2. Chair(s): Fabrice Rastello

Abstract

Dynamic-shape DNNs are widely used in applications such as variable-resolution image processing and language modeling with variable-length sequences. Existing deep learning (DL) compilers apply rule-based rewriting either to transform a subgraph into a fixed variant at compile time, which leads to sub-optimal performance, or to generate multiple variants at runtime, which incurs significant overhead. The challenge is to discover and apply shape-dependent subgraph variants that maintain high efficiency across diverse inputs with minimal runtime cost.
We propose DyPARS, a dynamic-shape DL compiler approach that discovers high-performance subgraph variants at compile time and applies the best ones at runtime. Leveraging Pareto-aware MCTS, DyPARS identifies shape-aware variants, incorporating shape-dependent kernel adaptations. These variants are integrated into a prediction-enhanced computational graph, enabling efficient variant selection based on input shapes with minimal overhead. DyPARS achieves average speedups of 1.31x and 1.80x over TorchInductor (JIT) and BladeDISC (non-JIT), respectively, across five DNN models, demonstrating robust efficiency across diverse inputs.
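The idea of keeping only Pareto-optimal variants at compile time and picking one per input shape at runtime can be illustrated with a small sketch. This is not the DyPARS implementation: it omits the MCTS search entirely, and it replaces the prediction-enhanced computational graph with a plain nearest-shape lookup table. The variant names, shapes, latency numbers, and all function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]

def dominates(a: List[float], b: List[float]) -> bool:
    """True if cost vector a is no worse than b on every shape and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(latency: Dict[str, List[float]]) -> List[str]:
    """Keep the variants whose per-shape latencies are not dominated by another variant."""
    return [
        name for name, cost in latency.items()
        if not any(dominates(other, cost)
                   for other_name, other in latency.items() if other_name != name)
    ]

def build_dispatch(shapes: List[Shape], latency: Dict[str, List[float]],
                   front: List[str]) -> Dict[Shape, str]:
    """Compile-time step: record the fastest surviving variant for each profiled shape."""
    return {shape: min(front, key=lambda v: latency[v][i])
            for i, shape in enumerate(shapes)}

def select_variant(table: Dict[Shape, str], shape: Shape) -> str:
    """Runtime step: reuse the choice made for the nearest profiled shape (L1 distance).
    Assumption: a simple stand-in for a learned or analytical predictor."""
    nearest = min(table, key=lambda s: sum(abs(p - q) for p, q in zip(s, shape)))
    return table[nearest]

# Hypothetical profiled shapes (batch, sequence length) and latencies in ms.
shapes = [(1, 128), (1, 512), (1, 2048)]
latency = {
    "fused": [0.10, 0.50, 2.40],
    "tiled": [0.20, 0.40, 1.50],
    "naive": [0.30, 0.60, 2.50],  # slower than "fused" on every shape
}

front = pareto_front(latency)
table = build_dispatch(shapes, latency, front)
```

With these made-up numbers, "naive" is dropped because "fused" is at least as fast on every profiled shape, while "fused" and "tiled" both survive because each wins on some shape. The runtime cost of `select_variant` is a scan over the profiled shapes, which stays small when only a handful of shapes are profiled.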

Link to Preprint

https://www.conference-publishing.com/Proc/CGO26/cgo26/cgo26main-p118-p

Hao Qian

University of New South Wales

Australia

Guangli Li

Institute of Computing Technology, Chinese Academy of Sciences

China

Qiuchu Yu

Institute of Computing Technology at Chinese Academy of Sciences

China

Xueying Wang

Beijing University of Posts and Telecommunications

China

Jingling Xue

University of New South Wales

Australia

Session Program

Tue 3 Feb
Displayed time zone: Hobart

15:50 - 17:10	Compiling for ML 2 (Main Conference) at Bronte. Chair(s): Fabrice Rastello (University Grenoble Alpes - Inria - CNRS - Grenoble INP - LIG)

15:50 20m Talk		QIGen: A Kernel Generator for Inference on Nonuniformly Quantized Large Language Models
	Tommaso Pegolotti (ETH Zürich), Dan Alistarh (IST Austria), Markus Püschel (ETH Zurich)
16:10 20m Talk		DyPARS: Dynamic-Shape DNN Optimization via Pareto-Aware MCTS for Graph Variants
	Hao Qian (University of New South Wales), Guangli Li (Institute of Computing Technology, Chinese Academy of Sciences), Qiuchu Yu (Institute of Computing Technology at Chinese Academy of Sciences), Xueying Wang (Beijing University of Posts and Telecommunications), Jingling Xue (University of New South Wales)
16:30 20m Talk		Compiler-Runtime Co-operative Chain of Verification for LLM-Based Code Optimization
	Hyunho Kwon (Yonsei University), Sanggyu Shin (SAIT), Ju Min Lee (Yonsei University), Hoyun Youm (Yonsei University), Seungbin Song (SAIT), Seongho Kim (Yonsei University), Hanwoong Jung (Samsung Advanced Institute of Technology), Seungwon Lee (Samsung Advanced Institute of Technology), Hanjun Kim (Yonsei University)
16:50 20m Talk		Hexcute: A Compiler Framework for Automating Layout Synthesis in GPU Programs
	Xiao Zhang (University of Toronto; NVIDIA), Yaoyao Ding (University of Toronto; Vector Institute; NVIDIA), Bolin Sun (University of Toronto; NVIDIA), Yang Hu (NVIDIA), Tatiana Shpeisman (Google), Gennady Pekhimenko (University of Toronto / Vector Institute)
