Research
Computer architecture & systems — software–hardware co-design for emerging applications
Current Research Focus
My research interests span compilers and computer architecture. More specifically, my group works on software–hardware co-design for emerging applications, exploring both algorithm-level innovations and advanced architecture designs.
Multi-GPU & LLM Infrastructure
Accelerating machine learning and deep learning workloads at scale on single- and multi-GPU systems. We explore DNN workload characteristics, compiler optimizations, runtime management, and next-generation GPU architecture features.
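A first step in characterizing a DNN workload is deciding whether a kernel is compute- or memory-bound. The sketch below applies a simple roofline-style test to a GEMM layer; the GPU peak throughput and bandwidth numbers are illustrative assumptions, not measurements of any particular device.

```python
# Minimal sketch: arithmetic intensity of a dense GEMM layer and a
# roofline-style compute-vs-memory-bound test. Hardware numbers below
# are hypothetical placeholders.

def gemm_arithmetic_intensity(m, n, k, bytes_per_elem=4):
    """FLOPs per byte moved for an M x K by K x N matrix multiply."""
    flops = 2 * m * n * k                                # one multiply + one add per MAC
    traffic = (m * k + k * n + m * n) * bytes_per_elem   # read A and B, write C
    return flops / traffic

def is_compute_bound(intensity, peak_flops, mem_bandwidth):
    """Roofline test: compute-bound if intensity exceeds machine balance."""
    return intensity > peak_flops / mem_bandwidth

ai = gemm_arithmetic_intensity(1024, 1024, 1024)
# Hypothetical GPU: 10 TFLOP/s peak, 1 TB/s memory bandwidth.
print(ai, is_compute_bound(ai, peak_flops=10e12, mem_bandwidth=1e12))
```

A large square GEMM lands well above the machine balance point, which is why GEMM-heavy layers tend to be compute-bound while elementwise and memory-reshuffling operators are bandwidth-bound.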
Quantum Computing Systems
Building efficient quantum computing ecosystems. Leveraging system optimizations to simulate large quantum circuits, developing front-end/back-end compiler support, and exploring heterogeneous quantum–classical system designs.
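The core of statevector-based quantum circuit simulation is applying small gate matrices to an exponentially large state, which is what makes system-level optimization necessary. A minimal sketch of the basic operation, using a big-endian qubit-ordering convention that is an assumption of this example:

```python
import numpy as np

# Minimal statevector-simulation sketch: apply a single-qubit gate to an
# n-qubit state by viewing the state as a rank-n tensor and contracting
# the gate against the target qubit's axis. The 2^n memory footprint of
# `state` is what limits how many qubits can be simulated classically.

def apply_gate(state, gate, target, n_qubits):
    """Apply a 2x2 gate to the `target` qubit of an n-qubit statevector."""
    psi = state.reshape([2] * n_qubits)
    psi = np.tensordot(gate, psi, axes=([1], [target]))  # contract target axis
    psi = np.moveaxis(psi, 0, target)                    # restore axis order
    return psi.reshape(-1)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate

# |00> --H on qubit 0--> (|00> + |10>) / sqrt(2)
state = np.zeros(4)
state[0] = 1.0
out = apply_gate(state, H, target=0, n_qubits=2)
```

Because each gate touches only one tensor axis, production simulators fuse gates, exploit sparsity, and distribute the state across nodes; this sketch shows only the unoptimized baseline they start from.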
Edge AI
Software–hardware co-design to address performance bottlenecks on edge platforms. Developing compiler-assisted paging, zero-copy remapping, and high-performance persistence support for secure non-volatile memory.
Past Research (Ph.D.)
- Optimizing Dynamic Parallelism for Irregular Applications on GPGPUs. Designed a runtime control system that dynamically decides whether and when to launch child kernels, better interleaving parent and child kernels to hide launch overhead and improve GPU utilization. Analyzed data reuse across kernels and designed locality-aware schedulers.
- Compiler-assisted Optimization on Manycore Platforms. Proposed a loop iteration scheduling strategy that considers both bank-level parallelism (inter-core) and bank reuse (intra-core) for irregular applications. Designed a compiler algorithm that partitions computations into subcomputations scheduled to minimize on-chip network distance to data.
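The bank-aware scheduling idea above can be sketched as follows. This is an illustrative simplification, not the published algorithm: the address-to-bank mapping and the round-robin assignment of bank groups to cores are stand-in assumptions.

```python
from collections import defaultdict

# Illustrative sketch of bank-aware loop iteration scheduling: group
# iterations by the memory bank their data maps to, so each core runs one
# bank's iterations back-to-back (intra-core bank reuse) while different
# cores touch different banks concurrently (inter-core bank-level
# parallelism). Mapping and assignment policies are simplified stand-ins.

NUM_BANKS = 4

def bank_of(addr, line_size=64):
    """Stand-in mapping: interleave cache lines across banks."""
    return (addr // line_size) % NUM_BANKS

def schedule(iterations, addr_of, num_cores):
    """Return a per-core iteration list, grouped by bank."""
    by_bank = defaultdict(list)
    for it in iterations:
        by_bank[bank_of(addr_of(it))].append(it)
    # Assign whole bank groups to cores round-robin.
    plan = [[] for _ in range(num_cores)]
    for i, (bank, its) in enumerate(sorted(by_bank.items())):
        plan[i % num_cores].extend(its)
    return plan

# Example: 8 iterations whose accesses fall on consecutive cache lines.
plan = schedule(range(8), lambda i: i * 64, num_cores=2)
```

Here each core ends up with two full bank groups, so consecutive iterations on a core hit the same bank while the two cores stream from disjoint banks; for genuinely irregular applications the grouping would be driven by compiler analysis of the indirect access pattern rather than a fixed affine mapping.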
Experience
- Internship at AMD Research, Aug–Dec 2017
- Internship at Samsung Research America, Summer 2015
- Internship at Institute of Computing Technology, Chinese Academy of Sciences, 2011–2013