FORTE: Online DataFrame Query Optimizer (CGO 2026 - Main Conference)

Sat 31 January - Wed 4 February 2026 Sydney, Australia

co-located with HPCA/CGO/PPoPP/CC 2026

Who

Yoonho Choi, Kyoungtae Lee, Minji Kim, Hyungsoo Jung, Hyojin Sung

Track

CGO 2026 Main Conference

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 2 Feb 2026, 14:10 - 14:30, at Bronte (session: DSLs). Chair(s): Olivia Hsu

Abstract

DataFrame libraries are widely adopted in data science for their flexible, Pythonic interfaces, but their fragmented APIs and unstructured query patterns limit systematic optimization.
Existing work has explored parallel execution or SQL-style logical rewrites, yet these approaches fall short in capturing DataFrame-specific semantics and Python control-flow context. We present FORTE, the first online, source-to-source query optimizer that unifies multiple DataFrame libraries under a shared intermediate representation (DFL).
DFL makes DataFrame semantics explicit, enabling composable and portable rewriting rules such as user-defined function (UDF) lifting/lowering, loop lifting, and API tuning, alongside classical rewrites (e.g., predicate pushdown). FORTE employs a lightweight, learned cost model and greedy search to apply these rewrites with negligible overhead, while supporting both intra-library optimization and cross-library transpilation. Our evaluation on TPC-H workloads and real-world Kaggle/GitHub workloads shows that FORTE consistently delivers substantial speedups—up to 52.53× (3.7× on average) across Pandas, Modin, Polars, and Pandas-on-Spark—demonstrating that online, IR-guided rewriting can significantly outperform existing DataFrame engines and rewriters, while enabling cross-library retargetability.
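To make the idea of cost-guided greedy rewriting concrete, the following is a minimal sketch of one classical rewrite the abstract names, predicate pushdown, applied by a greedy loop over a toy plan representation. It is not FORTE's DFL, rule set, or cost model: the `Scan`, `Filter`, and `Join` node types, the `estimate` function, and all row counts are hypothetical, and the hand-written cost formula stands in for the learned cost model the abstract describes. The sketch also assumes an inner join, for which pushing a single-side filter below the join preserves the result.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical toy plan nodes (illustrative only, not FORTE's DFL).
@dataclass(frozen=True)
class Scan:
    name: str
    rows: int               # assumed row count of the table
    columns: frozenset

@dataclass(frozen=True)
class Filter:
    column: str             # column the predicate reads
    keep_percent: int       # assumed share of rows that pass the predicate
    child: "Plan"

@dataclass(frozen=True)
class Join:                 # assumed to be an inner join
    left: "Plan"
    right: "Plan"

Plan = Union[Scan, Filter, Join]

def columns(plan: Plan) -> frozenset:
    if isinstance(plan, Scan):
        return plan.columns
    if isinstance(plan, Filter):
        return columns(plan.child)
    return columns(plan.left) | columns(plan.right)

def estimate(plan: Plan) -> tuple:
    """Return (output_rows, total_cost) under a hand-written toy cost model."""
    if isinstance(plan, Scan):
        return plan.rows, plan.rows
    if isinstance(plan, Filter):
        rows, cost = estimate(plan.child)
        return rows * plan.keep_percent // 100, cost + rows
    left_rows, left_cost = estimate(plan.left)
    right_rows, right_cost = estimate(plan.right)
    return max(left_rows, right_rows), left_cost + right_cost + left_rows + right_rows

def cost(plan: Plan) -> int:
    return estimate(plan)[1]

def push_filter_below_join(plan: Plan) -> Optional[Plan]:
    """Predicate pushdown: move a filter onto the join input that owns its column."""
    if isinstance(plan, Filter) and isinstance(plan.child, Join):
        join = plan.child
        if plan.column in columns(join.left):
            return Join(Filter(plan.column, plan.keep_percent, join.left), join.right)
        if plan.column in columns(join.right):
            return Join(join.left, Filter(plan.column, plan.keep_percent, join.right))
    return None

RULES = [push_filter_below_join]

def candidates(plan: Plan):
    """Yield every plan obtained by applying one rule at one node."""
    for rule in RULES:
        rewritten = rule(plan)
        if rewritten is not None:
            yield rewritten
    if isinstance(plan, Filter):
        for child in candidates(plan.child):
            yield Filter(plan.column, plan.keep_percent, child)
    elif isinstance(plan, Join):
        for left in candidates(plan.left):
            yield Join(left, plan.right)
        for right in candidates(plan.right):
            yield Join(plan.left, right)

def greedy_optimize(plan: Plan) -> Plan:
    """Repeatedly take the cheapest single rewrite until none lowers the cost."""
    best = plan
    while True:
        options = list(candidates(best))
        if not options:
            return best
        cheapest = min(options, key=cost)
        if cost(cheapest) >= cost(best):
            return best
        best = cheapest

# Usage: a filter on `price` written after the join is moved onto `orders`.
orders = Scan("orders", 1000, frozenset({"order_id", "region_id", "price"}))
regions = Scan("regions", 100, frozenset({"region_id", "region_name"}))
plan = Filter("price", 10, Join(orders, regions))
optimized = greedy_optimize(plan)
print(cost(plan), cost(optimized))
```

Under these assumed numbers the rewritten plan filters `orders` before joining, so the join processes fewer rows and the estimated cost drops. A greedy loop of this kind only ever evaluates single-step rewrites, which keeps the search cheap but gives no guarantee of reaching the globally cheapest plan.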

Link to Preprint

https://www.conference-publishing.com/Proc/CGO26/cgo26/cgo26main-p98-p

Yoonho Choi

POSTECH

Kyoungtae Lee

Seoul National University

Minji Kim

Ewha Womans University

Hyungsoo Jung

Seoul National University

Hyojin Sung

Seoul National University

Session Program

Mon 2 Feb
Displayed time zone: Hobart

14:10 - 15:30	DSLs (Main Conference) at Bronte. Chair(s): Olivia Hsu (Stanford University)

14:10 20m Talk	FORTE: Online DataFrame Query Optimizer (Main Conference). Yoonho Choi (POSTECH), Kyoungtae Lee (Seoul National University), Minji Kim (Ewha Womans University), Hyungsoo Jung (Seoul National University), Hyojin Sung (Seoul National University)
14:30 20m Talk	LEGO: A Layout Expression Language for Code Generation of Hierarchical Mapping (Main Conference). Amir Mohammad Tavakkoli (University of Utah), Cosmin E. Oancea (University of Copenhagen, Denmark), Mary Hall (University of Utah)
14:50 20m Talk	Pushing Tensor Accelerators beyond MatMul in a User-Schedulable Language (Main Conference). Yihong Zhang (University of Washington), Derek Gerstmann (Adobe), Andrew Adams (Adobe Research), Maaz Bin Safeer Ahmad (University of Washington, Seattle)
15:10 20m Talk	Tawa: Automatic Warp Specialization for Modern GPUs with Asynchronous References (Main Conference). Hongzheng Chen (Cornell University), Bin Fan (NVIDIA), Alexander Collins (NVIDIA), Bastian Hagedorn (NVIDIA), Evghenii Gaburov (NVIDIA), Masahiro Masuda (NVIDIA), Matthew Brookhart (NVIDIA), Chris Sullivan (NVIDIA), Jason Knight (NVIDIA), Zhiru Zhang (Cornell University, USA), Vinod Grover (NVIDIA)