  • Unleashing Multi-GPU Computing to the Next-Level. University of California, Los Angeles
  • Unleashing Multi-GPU Computing to the Next-Level. University of California, Irvine
  • Unleashing Multi-GPU Computing to the Next-Level. University of Southern California
  • Expediting Continual Online Learning on Edge Platforms through Software-Hardware Co-designs. NSF PI Meeting
  • Toward Fault-tolerant and Scalable Quantum Computing. QCE 2025
  • Unleashing Multi-GPU Computing to the Next-Level. ISCA 2025 Forum
  • Reinforcement Learning-Guided Graph State Generation in Photonic Quantum Computers. ISCA 2025
  • Towards High-Fidelity, Scalable, and Accessible Quantum Computing Systems. StableQ Workshop co-located with MICRO 2024
  • Towards High-Fidelity, Scalable, and Accessible Quantum Computing Systems. Pitt SCI Dean's Spotlight
  • Compilation in Measurement-Based Quantum Computing. Pitt Quantum Institute 2024
  • Towards Efficient and Scalable Computing for Multi-GPUs. Shanghai Jiao Tong University
  • Towards Efficient and Scalable Computing for Multi-GPUs. University of Science and Technology of China
  • Towards Efficient and Scalable Computing for Multi-GPUs. Sun Yat-sen University
  • Towards Efficient and Scalable Computing for Multi-GPUs. The Hong Kong University of Science and Technology
  • CEGMA: Coordinated Elastic Graph Matching Acceleration for Graph Matching Networks. HPCA 2023
  • Embracing Heterogeneity in Modern GPUs. Pitt Momentum 2021
  • Mix and Match: Reorganizing Tasks for Enhancing Data Locality. SIGMETRICS 2021
  • Optimizing Quantum Circuit Simulation/Emulation on HPC Platforms. Pitt Quantum Institute 2020
  • Enhancing Address Translation in GPUs through Compression. Intel
  • Enhancing Address Translations in Throughput Processors via Compression. PACT 2020
  • Co-Optimizing Memory-Level Parallelism and Cache-Level Parallelism. PLDI 2019
  • Computing with Near Data. SIGMETRICS 2019
  • Irregularity-aware Computation and Data Management in Manycore Systems. Job talk at multiple universities, Spring 2019
  • Quantifying and Optimizing Data Access Parallelism on Manycores. MASCOTS 2018
  • Scheduling in the Cloud. MASCOTS 2018
  • Enhancing Computation-to-Core Assignment with Physical Location Information. PLDI 2018
  • Data Movement Aware Computation Partitioning. MICRO 2017
  • DEMM: A Dynamic Energy-Saving Mechanism for Multicore. MASCOTS 2017
  • Controlled Kernel Launch for Dynamic Parallelism in GPUs. HPCA 2017
  • Improving Bank-Level Parallelism for Irregular Applications. MICRO 2016
  • Memory Row Reuse Distance and its Role in Optimizing Application Performance. SIGMETRICS 2015