ACM International Conference on Supercomputing 2025

June 8-11, 2025 Salt Lake City, U.S.A.

Preliminary Program

06/09/2025 Monday

Time Track A Track B
09:00-09:10 Opening
09:10-10:20 Keynote
Title:
10:20-10:40 Coffee Break
10:40-12:00 Session: Approximation
Chair:
Session: Graph Neural Networks
Chair:
  • Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers
    Chen Zhuang:Institute of Science Tokyo,RIKEN Center for Computational Science;Lingqi Zhang:RIKEN Center for Computational Science;Du Wu:Institute of Science Tokyo,RIKEN Center for Computational Science;Peng Chen:RIKEN Center for Computational Science;Jiajun Huang:University of South Florida;Xin Liu:National Institute of Advanced Industrial Science & Technology;Rio Yokota:Institute of Science Tokyo;Nikoli Dryden:Lawrence Livermore National Laboratory;Toshio Endo:Institute of Science Tokyo;Satoshi Matsuoka:RIKEN Center for Computational Science,Institute of Science Tokyo;Mohamed Wahib:RIKEN Center for Computational Science
  • CoLa: Towards Communication-efficient Distributed Sparse Matrix-Matrix Multiplication on GPUs
    Lixing Zhang:Beijing University of Posts and Telecommunications;Yingxia Shao:Beijing University of Posts and Telecommunications;Shigang Li:Beijing University of Posts and Telecommunications
  • Cherry: Breaking the GPU Memory Wall for Large-Scale GNN Training via Micro-Batching
    Yan Wang:Guangzhou Institute of Technology, Xidian University;Qinghua Guo:Guangzhou Institute of Technology, Xidian University;Haoran Kong:Institute of Computing Technology, Chinese Academy of Sciences;Kai Sheng:Guangzhou Institute of Technology, Xidian University;Zhen Xie:Binghamton University;Hao Chen:College of Computer Science and Electronic Engineering, Hunan University;Weile Jia:Institute of Computing Technology, Chinese Academy of Sciences;Dingwen Tao:Institute of Computing Technology, Chinese Academy of Sciences;Xin He:Guangzhou Institute of Technology, Xidian University
  • Fused3S: Fast Sparse Attention on Tensor Cores
    Zitong Li:University of California, Irvine;Aparna Chandramowlishwaran:University of California, Irvine
12:00-1:40 Lunch
1:40-3:00 Session: Sparse Linear Algebra
Chair:
Session: Acceleration
Chair:
3:00-3:20 Coffee Break
3:20-5:20 Session: Applications
Chair:
Session: GPU Scheduling
Chair:

06/10/2025 Tuesday

Time Track A Track B
9:00-10:20 Session: Solvers & Sparsity
Chair:
  • CRAMG: A Communication-Reduced Algebraic Multigrid Method
    Fan Yuan:School of Mathematics and Computer Science, Xiangtan University;Xiaojian Yang:National University of Defense Technology;Yunqing Huang:School of Mathematics and Computer Science, Xiangtan University;Dezun Dong:National University of Defense Technology;Chuanfu Xu:National University of Defense Technology;Jie Liu:National University of Defense Technology;Xiaoqiang Yue:School of Mathematics and Computer Science, Xiangtan University;Shengguo Li:National University of Defense Technology;Hongxia Wang:Department of Mathematics, National University of Defense Technology
  • An Efficient 2D Fusion Method for High-Performance Two-Stage Eigensolvers on Modern Heterogeneous Architectures
    Yongxiao Zhou:Tsinghua University;Yi Zong:Tsinghua University;Yuyang Jin:Tsinghua University;Heng Li:Tsinghua University;Wei Xue:Tsinghua University, Beijing, China; Qinghai University, Xining, China
  • XSolver: Optimizing Sparse Direct Solvers for Heterogeneous Systems
    Chaewon Kim:Department of Seoul National University;Jaehwan Lee:Department of Seoul National University;Jinpyo Kim:Department of Seoul National University;Dohyun Kim:Institute of Computer Technology, Seoul National University;Kyusu Ahn:Department of Data Science, Seoul National University,Research Center, Samsung Display Co., Ltd.;Hyung Uk Cho:Research Center, Samsung Display Co., Ltd.;Seungin Baek:Research Center, Samsung Display Co., Ltd.;Jaejin Lee:Dept. of Data Science, Seoul National University,Dept. of Seoul National University
  • MAGNUS: Generating Data Locality to Accelerate Sparse Matrix-Matrix Multiplication on CPUs
    Jordi Wolfson-Pou:Intel Labs;Jan Laukemann:Friedrich-Alexander-Universität Erlangen-Nürnberg;Fabrizio Petrini:Intel Labs
Session: Processing-in-Memory
Chair:
10:20-10:40 Coffee Break
10:40-12:00 Session: Efficiency
Chair:
Session: Optimizing Compilation
Chair:
12:00-1:40 Lunch
1:40-3:40 Session: Best Papers
Chair:
  • Pushing the Limits of GPU Lossy Compression: A Hierarchical Delta Approach
    Boyuan Zhang:Indiana University;Yafan Huang:University of Iowa;Sheng Di:Argonne National Laboratory;Fengguang Song:Indiana University;Guanpeng Li:University of Iowa;Franck Cappello:Argonne National Laboratory
  • Parallel Contraction Hierarchies Can Be Efficient and Scalable
    Zijin Wan:University of California, Riverside;Xiaojun Dong:University of California, Riverside;Letong Wang:University of California, Riverside;Enzuo Zhu:University of California, Davis;Yan Gu:University of California, Riverside;Yihan Sun:University of California, Riverside
  • BMQSim: Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework
    Boyuan Zhang:Indiana University;Bo Fang:Pacific Northwest National Laboratory;Fanjiang Ye:Indiana University;Luanzheng Guo:Pacific Northwest National Laboratory;Fengguang Song:Indiana University;Nathan Tallent:Pacific Northwest National Laboratory;Dingwen Tao:Indiana University
  • DIV: An Index & Value compression method for SpMV on large matrices
    Dimitrios Galanopoulos:National Technical University of Athens;Panagiotis Mpakos:National Technical University of Athens;Petros Anastasiadis:National Technical University of Athens;Nectarios Koziris:National Technical University of Athens;Georgios Goumas:National Technical University of Athens
  • DIMPLES: Distributed Influence Maximization for Pandemic pLanning on Exascale Systems
    Marco Minutoli:Pacific Northwest National Laboratory;Reece Neff:North Carolina State University;Naw Safrin Sattar:Oak Ridge National Laboratory;Hao Lu:Oak Ridge National Laboratory;John Feo:Pacific Northwest National Laboratory;Henning Mortveit:University of Virginia;Anil Vullikanti:University of Virginia;Dawen Xie:University of Virginia;Mandy L Wilson:University of Virginia;Gregor von Laszewski:University of Virginia;Parantapa Bhattacharya:University of Virginia;S M Ferdous:Pacific Northwest National Laboratory;Ananth Kalyanaraman:Washington State University;Michela Becchi:North Carolina State University;Madhav Marathe:University of Virginia;Mahantesh Halappanavar:Pacific Northwest National Laboratory
  • Light-FP: Analyze Floating-Point Error in a Highly Condensed Approach
    Jiazhi Mi:Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences;Li Chen:Institute of Computing Technology, Chinese Academy of Sciences,Laboratory for Advanced Computing and Intelligence Engineering;Haoyu Wang:Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences;Ruixiang Gao:Shandong University of Science and Technology;Hongze Zhang:Shandong University of Science and Technology;Ronghong Shen:Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences;Kai Lin:Beijing Institute of Technology;You Fu:Shandong University of Science and Technology;Huimin Cui:Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences
After 3:40 Excursion

06/11/2025 Wednesday

Time Track A Track B
9:00-10:20 Session: Performance Analysis
Chair:
Session: Heterogeneity
Chair:
  • Understanding the Idiosyncrasies of Emerging BlueField DPUs
    Arjun Kashyap:University of California, Merced;Yuke Li:University of California, Merced;Darren Ng:University of California, Merced;Xiaoyi Lu:University of California, Merced
  • Multi-node Multi-GPU Datalog
    Ahmedur Rahman Shovon:University of Illinois Chicago;Yihao Sun:Syracuse University;Kristopher Micinski:Syracuse University;Thomas Gilray:Washington State University;Sidharth Kumar:University of Illinois Chicago
  • SmartNIC-GPU-CPU Heterogeneous System for Large Machine Learning Model with Software-Hardware Codesign
    Anqi Guo:Boston University;Yuchen Hao:Meta Platforms;Xiteng Yao:Boston University;Shining Yang:Boston University;Jianyu Huang:Meta Platforms;Tony (Tong) Geng:Department of Electrical and Computer Engineering, University of Rochester;Martin Herbordt:Boston University
  • D-Rex: Heterogeneity-Aware Reliability Framework and Adaptive Algorithms for Distributed Storage
    Maxime Gonthier:University of Chicago,Argonne National Laboratory;Dante D. Sanchez-Gallegos:University Carlos III of Madrid;Haochen Pan:University of Chicago;Bogdan Nicolae:Argonne National Laboratory;Sicheng Zhou:Southern University of Science and Technology;Hai Duc Nguyen:University of Chicago,Argonne National Laboratory;Valerie Hayot-Sasson:University of Chicago,Argonne National Laboratory;J. Gregory Pauloski:University of Chicago;Jesus Carretero:University Carlos III of Madrid;Kyle Chard:University of Chicago,Argonne National Laboratory;Ian Foster:University of Chicago,Argonne National Laboratory
10:20-10:40 Coffee Break
10:40-12:00 Session: Resource Management
Chair:
Session: Code Optimization
Chair:
12:00-1:40 Lunch
1:40-3:00 Session: Energy & Servers
Chair:
Session: Potpourri
Chair:
  • PortFC: Designing High-performance Deadlock-free BCube Networks
    Peirui Cao:Nanjing University;Rui Ning:Nanjing University;Hongwei Yang:China Mobile;Zhaochen Zhang:Nanjing University;Chang Liu:Nanjing University;Rui Li:Nanjing University;Yongqi Yang:Nanjing University;Yunzhuo Liu:Nanjing University;Chengyuan Huang:Nanjing University;Tao Sun:China Mobile;Xiaodong Duan:China Mobile;Guihai Chen:Nanjing University;Chen Tian:Nanjing University
  • Auto-Healer: Self-Healing Hardware for Perception Stage Faults in Autonomous Driving Systems
    Ali Suvizi:George Washington University;Guru Venkataramani:George Washington University
  • OpaQue: Program Output Obfuscation for Quantum Software Circuits in Quantum Clouds
    Tirthak Patel:Rice University;Aditya Ranjan:Northeastern University;Daniel Silver:Northeastern University;Harshitta Gandhi:QBit Solutions Research;William Cutler:Oxford University;Devesh Tiwari:Northeastern University
  • JBSA: A Bit-Serial Accelerator for Deep Neural Networks Using Superconducting SFQ Logic
    Yang Su:ShanghaiTech University; Shanghai Innovation Center for Processor Technologies;Sheng Li:ShanghaiTech University; Shanghai Innovation Center for Processor Technologies;Huilong Jiang:State Key Lab of Processors, Institute of Computing Technology, CAS;Haofei Yin:ShanghaiTech University; Shanghai Innovation Center for Processor Technologies;Rongliang Fu:The Chinese University of Hong Kong;Junying Huang:State Key Lab of Processors, Institute of Computing Technology, CAS;Xiaochun Ye:State Key Lab of Processors, Institute of Computing Technology, CAS;Zhimin Zhang:State Key Lab of Processors, Institute of Computing Technology, CAS;Jie Ren:State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, CAS; University of Chinese Academy of Sciences;Xiaoping Gao:State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, CAS;Tsung-Yi Ho:The Chinese University of Hong Kong;Dongrui Fan:State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences
3:20-5:00 Session: Graph Algorithms
Chair:
Session: Memory Systems
Chair: