Yiwei Jiang

Computer Sciences, UW–Madison

I design and optimize GPU-centric machine learning systems through hardware-software co-design that spans systems, networking, and architecture. My goal is to enable scalable and sustainable ML platforms.

At UW–Madison, I am fortunate to be advised by Prof. Shivaram Venkataraman and Prof. Matt Sinclair. Previously, I worked as a research associate at AMD Research and Advanced Development (RAD), where I was mentored by Dr. Srilatha (Bobbie) Manne on power-efficient LLM inference and GPU workload optimization.

Research Interests

ML Systems, GPU Architecture, Power-Efficient Inference, Hardware-Software Co-Design, Workload Characterization

Research Experience

Research Intern, UW–Madison (Dec 2025 – Present)
Continuing GPU-centric ML systems research with Prof. Shivaram Venkataraman and Prof. Matt Sinclair.
Power-Efficient LLM Inference, AMD Research (May – Dec 2025)
Enabled a prefill/decode disaggregation runtime, achieving a 20% energy reduction and 2× higher SLO attainment.
GPU Workload Characterization, UW–Madison (Mar 2024 – May 2025)
Proposed a dual classification scheme for GPU workloads that reduces the profiling time needed to predict frequency-capping impact by 90%.
SIGMETRICS 2026 (co-first author)

Education

M.S. in Computer Sciences
University of Wisconsin–Madison
Advisors: Shivaram Venkataraman, Matt Sinclair
2022 – 2025
B.S. in Environmental Science & Engineering
Shanghai Jiao Tong University
2017 – 2021