Hongil Yoon

Technical Lead & Manager, Google | Visiting Associate Professor, SNU

Mountain View, CA, USA • hongilyoon@google.com • LinkedIn

About Me

I am a Technical Lead & Manager at Google and a Visiting Associate Professor at the AI Institute of Seoul National University (AIIS). My research bridges the gap between computer architecture and machine learning, with a specific focus on hardware-software co-design for efficient AI computing. At Google, I lead engineering teams developing next-generation TPU architectures and system optimizations for cloud-scale training and serving. Previously, I architected the EdgeTPU and memory subsystems for the Google Tensor SoC, powering the Pixel series.

Currently, my academic research explores efficient ML training and inference infrastructure and 3D perception systems, tackling the computational and memory challenges of large-scale 3D Gaussian Splatting and point cloud processing via novel host-offloading and pruning techniques. I received my Ph.D. in Computer Sciences from the University of Wisconsin–Madison, where I was advised by Professor Gurindar S. Sohi.

Work Experience

Google Inc.

Software Engineer, Technical Lead & Manager Apr 2025 – Present

TPU hardware-software co-design and system architecture exploration for cloud-scale ML deployments.
Leading specialized engineering teams focused on optimizing large-scale pre-training and serving for next-generation AI architectures.

ML Accelerator Architect, Technical Lead Nov 2018 – Apr 2025

Technical Lead for Generative AI Compute & Memory Architecture for the Tensor SoC, defining long-term strategic directions.
Spearheaded development of EdgeTPU and advanced memory systems for the Google Pixel ecosystem (Pixel 7 through Pixel 11).

SoC Performance Architect Apr 2017 – Nov 2018

Conducted architectural modeling and performance analysis for the future-generation Tensor SoC for Pixel 6.

Seoul National University

Visiting Associate Professor (June 2024–Present), Artificial Intelligence Institute of Seoul National University (AIIS), South Korea

Awarded Google Gift (2024/2025) for Seoul National University ARC LAB.
GS-Scale: Unlocking Large-Scale 3D Gaussian Splatting Training via Host Offloading, ASPLOS 2026.
FastPoint: Accelerating 3D Point Cloud Model Inference via Sample Point Distance Prediction, ICCV 2025.
Frugal 3D Point Cloud Model Training via Sample Recycling and Fused Aggregation, ECCV 2024.

University of Wisconsin–Madison

Research Assistant Jun 2010 – Mar 2017

Developed novel virtual caching and memory hierarchy techniques to significantly reduce address translation overheads in modern processors.

Education

University of Wisconsin–Madison May 2017

Ph.D. in Computer Sciences

Dissertation: Reducing Address Translation Overheads with Virtual Caching. [Dissertation and Defense Talk]
Committee: Gurindar S. Sohi (Chair), Mark D. Hill, Mikko H. Lipasti, Karthikeyan Sankaralingam, and David A. Wood.

University of Wisconsin–Madison May 2012

M.S. in Computer Sciences

Korea University February 2007

B.E. in Computer Sciences and Engineering

Publications

My publications span diverse areas within computer architecture and machine learning. My early work focused on enhancing caching mechanisms and memory systems. I later expanded this scope to mobile computer architecture, addressing the unique constraints of on-device ML. Currently, my research focuses on the hardware-software co-design of efficient 3D perception systems, specifically optimizing 3D point cloud processing and large-scale 3D Gaussian Splatting through novel ML training techniques and host-offloading strategies.

GS-Scale: Unlocking Large-Scale 3D Gaussian Splatting Training via Host Offloading. Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2026. Donghyun Lee, Dawoon Jeong, Jae W. Lee*, and Hongil Yoon*.
FastPoint: Accelerating 3D Point Cloud Model Inference via Sample Point Distance Prediction. International Conference on Computer Vision (ICCV) 2025. Donghyun Lee, Dawoon Jeong, Jae W. Lee*, and Hongil Yoon*.
Frugal 3D Point Cloud Model Training via Sample Recycling and Fused Aggregation. The 18th European Conference on Computer Vision (ECCV), 2024. Donghyun Lee, Yejin Lee, Jae W. Lee*, and Hongil Yoon*.
Not All Neighbors Matter: Point Distribution-Aware Pruning for 3D Point Cloud. Thirty-Seventh AAAI Conference on Artificial Intelligence, February 2023. Yejin Lee, Donghyun Lee, JungUk Hong, Jae W. Lee*, and Hongil Yoon*.
ASAP: Fast Mobile Application Switch via Adaptive Prepaging. USENIX ATC'21, July 2021. Sam Son, Seung Yul Lee, Yunho Jin, Jonghyun Bae, Jinkyu Jeong, Tae Jun Ham, Jae W. Lee*, and Hongil Yoon*.
The gem5 Simulator: Version 20.0+: A new era for the open-source computer architecture simulator. arXiv:2007.03152, July 2020.
Filtering Translation Bandwidth with Virtual Caching. The 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2018. Hongil Yoon, Jason Lowe-Power, and Gurindar S. Sohi.
Two Billion Devices and Counting: An Industry Perspective on the State of Mobile Computer Architecture. IEEE MICRO, February 2018. Vijay Janapa Reddi, Hongil Yoon, and Allan Knies.
Revisiting Virtual L1 Caches: A Practical Design Using Dynamic Synonym Remapping. The 22nd IEEE International Symposium on High-Performance Computer Architecture (HPCA-22), March 2016. Hongil Yoon and Gurindar S. Sohi.
Region-level Tracking for Scalable Directory Cache. Tech Report TR-1823, UW-Madison, April 2015. Hongil Yoon and Gurindar S. Sohi.
Reducing Coherence Overheads with Multi-Line Invalidation (MLI) Messages. Tech Report TR-1816, UW-Madison, May 2013. Hongil Yoon and Gurindar S. Sohi.
SIP: Speculative Insertion Policy for High Performance Caching. Tech Report TR-1676, UW-Madison, June 2010. Hongil Yoon, Tan Zhang, and Mikko H. Lipasti.

*Corresponding authors

Patents

25 patent applications at Google on ML accelerator and in-memory computing architectures.
Way Partitioning for a System-Level Cache, US10884959B2. Vinod Chamarty, Xiaoyu Ma, Hongil Yoon, Keith Robert Pflederer, Weiping Liao, Benjamin Dodge, Albert Meixner, Allan Douglas Knies, Manu Gulati, Rahul Jagdish Thakur, Jason Redgrave.
Cache Accessed Using Virtual Addresses, US20160188486A1. Hongil Yoon and Gurindar S. Sohi.
Computer Cache System Providing Multi-Line Invalidation Messages, US9223717B2. Hongil Yoon and Gurindar S. Sohi.

Professional Activities and Service

Organized the MICRO 2023 Workshop on System and Architecture for Generative AI on the Edge/Mobile Platforms (SAGE 2023).
External Review Committee: The 58th IEEE/ACM International Symposium on Microarchitecture 2025 (MICRO-58).
External Review Committee: The 30th IEEE International Symposium on High-Performance Computer Architecture 2024 (HPCA-30).
Program Committee: The 49th ACM/IEEE International Symposium on Computer Architecture 2023 (ISCA-49).
External Review Committee: The 29th IEEE International Symposium on High-Performance Computer Architecture 2023 (HPCA-29).
Program Committee: The 28th IEEE International Symposium on High-Performance Computer Architecture 2022 (HPCA-28).
Program Committee: The 27th IEEE International Symposium on High-Performance Computer Architecture 2021 (HPCA-27).