Dr.avx: A Dynamic Compilation System for Seamlessly Executing Hardware-Unsupported Vectorization Instructions
This program is tentative and subject to change.
Modern processors are breaking a fundamental rule: backward compatibility within their own ISA families. We term this Generational ISA Fragmentation (GIF), where newer processors cannot execute instructions supported by prior generations within the same ISA family. This phenomenon is exemplified by Intel's removal of AVX-512 from Alder Lake processors after years of deployment, ARM's inconsistent support for SVE across cores, and RISC-V's incompatible vector specifications. GIF causes illegal instruction crashes when running applications optimized for earlier processors on newer hardware, threatening the foundation of software portability that has underpinned decades of computing evolution.
We introduce Dr.avx, a dynamic compilation system that enables seamless execution of AVX-512 instructions on hardware that lacks native support. Dr.avx addresses the most instructive GIF instance, x86 AVX-512 fragmentation, by targeting the integer and floating-point operations that dominate real workloads. Our rewrite engine performs a fine-grained classification of AVX-512 opcode-operand patterns and employs three complementary strategies: Instr Mirroring, AVX Lowering, and Scalar Fallback. Experiments show that Dr.avx incurs a geometric mean overhead of 1.44× on SPEC CINT2017 relative to native AVX-512 execution, 17.3% better than Intel's closed-source SDE. On production databases, Dr.avx sustains 75%–88% (MySQL) and 86%–99% (MongoDB) of native throughput, yielding 2.0–2.7× higher throughput than SDE. For LLM inference (llama.cpp), Dr.avx keeps 95%–99% of native tokens/s and delivers a 2.5–4.8× speedup over SDE. Unlike Intel's proprietary SDE, which provides no visibility into its implementation details, Dr.avx achieves functional correctness while providing an open, extensible, near-native performance implementation. Our work offers both a remedy for AVX-512 fragmentation and a blueprint for addressing similar compatibility challenges emerging across all major ISAs.
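To make the rewriting idea concrete, the following is a minimal sketch, not Dr.avx's implementation. The abstract names the strategies but does not define them, so the interpretations here are assumptions: "lowering" is modeled as splitting one 512-bit packed add into two 256-bit packed adds, and "scalar fallback" as a per-lane loop that honors an AVX-512-style write mask with merge-masking. Instr Mirroring is not modeled. All function names (`add_256`, `lowered_add_512`, `scalar_masked_add_512`) are hypothetical, and registers are represented as plain Python lists of 32-bit lanes.

```python
# Illustrative model only: names and semantics are assumptions, not Dr.avx code.
LANES_512 = 16          # a 512-bit register viewed as 16 x 32-bit lanes
LANES_256 = 8           # a 256-bit register viewed as 8 x 32-bit lanes
MASK32 = 0xFFFFFFFF     # 32-bit lanes wrap modulo 2**32


def add_256(a, b):
    """Stand-in for a single 256-bit packed 32-bit integer add."""
    assert len(a) == len(b) == LANES_256
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def lowered_add_512(a, b):
    """Assumed 'lowering': one 512-bit add becomes two 256-bit adds."""
    assert len(a) == len(b) == LANES_512
    lo = add_256(a[:LANES_256], b[:LANES_256])   # lower eight lanes
    hi = add_256(a[LANES_256:], b[LANES_256:])   # upper eight lanes
    return lo + hi


def scalar_masked_add_512(a, b, dst, k):
    """Assumed 'scalar fallback': per-lane loop with a 16-bit write mask k.

    Lanes whose mask bit is clear keep the old destination value
    (merge-masking).
    """
    assert len(a) == len(b) == len(dst) == LANES_512
    out = []
    for i in range(LANES_512):
        if (k >> i) & 1:
            out.append((a[i] + b[i]) & MASK32)
        else:
            out.append(dst[i])
    return out


if __name__ == "__main__":
    a = list(range(LANES_512))
    b = [1] * LANES_512
    print(lowered_add_512(a, b))
    print(scalar_masked_add_512(a, b, [99] * LANES_512, 0x00FF))
```

With an all-ones mask the two paths produce the same lanes, which is the kind of equivalence a rewriter would need to preserve. Operations that have no direct 256-bit counterpart are the cases where a slower scalar path would plausibly be needed.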
Tue 3 Feb (displayed time zone: Hobart)
09:50 - 11:10
09:50 (20m) Talk: Binary Diffing via Library Signatures. Main Conference. Andrei Rimsa (CEFET-MG), Anderson Faustino da Silva (State University of Maringá), Camilo Santana Melgaço (Federal University of Minas Gerais), Fernando Magno Quintão Pereira (Federal University of Minas Gerais)
10:10 (20m) Talk: BIT: Empowering Binary Analysis through the LLVM Toolchain. Main Conference. Puzhuo Liu (Ant Group & Tsinghua University), Peng Di (Ant Group & UNSW), Jingling Xue (University of New South Wales), Yu Jiang (Tsinghua University)
10:30 (20m) Talk: Dr.avx: A Dynamic Compilation System for Seamlessly Executing Hardware-Unsupported Vectorization Instructions. Main Conference. Yue Tang, Mianzhi Wu, Yufeng Li, Haoyu Liao, Jianmei Guo, Bo Huang (all East China Normal University)
10:50 (20m) Talk: Practical: Are Abstract-Interpreter Baseline JITs Worth It? An Empirical Evaluation through Metacompilation. Main Conference. Nahuel Palumbo (Université Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStAL); Guillermo Polito (Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL); Stéphane Ducasse (Inria; University of Lille; CNRS; Centrale Lille; CRIStAL); Pablo Tesone (Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, Pharo Consortium)