Unifying Sparse Processing Frameworks for Medium Sparsity
Tensor computation plays a critical role in scientific computing, graph analytics, and machine learning; many of these applications, such as web link mining, graph-based social networks, and pruned neural networks, rely on operations over sparse data structures.
Several data formats have been introduced over the past decades to compress and store sparse tensors. Sparse tensor frameworks such as TACO and MKL are designed to efficiently process extremely sparse data by minimizing memory traffic and irregular accesses. However, these frameworks perform poorly on medium-sparsity tensors (60–95% zero entries), which are common in modern workloads such as sparse neural networks.
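To make the trade-off concrete, the sketch below (plain Python, with illustrative names not taken from any of the cited frameworks) shows a minimal CSR (Compressed Sparse Row) representation and a matrix-vector product over it. The inner loop's index-driven access to `x` is the irregularity these frameworks must hide; at medium sparsity, the per-nonzero index overhead and irregular accesses can outweigh the savings from skipping zeros.

```python
# Minimal CSR sketch; function and variable names are illustrative.
def dense_to_csr(dense):
    """Compress a dense row-major matrix into CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # one entry per row boundary
    return values, col_idx, row_ptr

def csr_spmv(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A @ x over CSR arrays."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]  # irregular, index-driven access
        y.append(acc)
    return y

A = [[0, 2, 0],
     [3, 0, 4],
     [0, 0, 5]]
print(csr_spmv(*dense_to_csr(A), [1, 1, 1]))  # → [2, 7, 5]
```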
Several approaches build on the insight that a tensor's structural sparsity pattern (1) can be exploited to improve performance and (2) is often fixed. For example, the weight matrix of a pruned neural network typically remains unchanged after training, so a kernel specialized to that specific matrix can accelerate tensor operations. Although a variety of techniques have been proposed based on this insight, the existing work is fragmented, with each study addressing only a subset of the broader design space.
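As a hedged illustration of this insight (not SETAM's actual interface), the sketch below generates a matrix-specific SpMV kernel in which the fixed sparsity pattern is baked into straight-line code: nonzero positions become literal index constants, so the kernel carries no index arrays and no zero-skipping branches at run time.

```python
# Sketch of pattern specialization; names are hypothetical.
def specialize_spmv(A):
    """Emit and compile Python source for y = A @ x with A's
    sparsity pattern (and values) hardcoded into the kernel."""
    lines = ["def spmv(x):", "    return ["]
    for row in A:
        terms = [f"{v} * x[{j}]" for j, v in enumerate(row) if v != 0]
        lines.append("        " + (" + ".join(terms) or "0") + ",")
    lines.append("    ]")
    ns = {}
    exec("\n".join(lines), ns)  # compile once, reuse for many inputs
    return ns["spmv"]

# A pruned weight matrix whose pattern is fixed after training.
W = [[0, 2, 0],
     [3, 0, 4]]
kernel = specialize_spmv(W)
print(kernel([1, 1, 1]))  # → [2, 7]
```

The one-time code-generation cost is amortized because the same pattern is reused across every inference call, which is what makes matrix-specific kernels attractive in this setting.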
This work proposes SETAM, a unified sparse processing framework with a particular focus on medium sparsity. SETAM’s primary goal is to integrate and generalize the ideas from prior work within a single compiler-driven system. This design enables each technique to be applied where it is most beneficial, and it allows the compiler to synthesize novel algorithms by composing techniques.