Joel Hestness

hestness <at>
jthestness <at>

In December 2016, I completed my PhD at The University of Wisconsin - Madison, working with the Wisconsin Multifacet Group (Multifacet). I was co-advised by Dr. Stephen Keckler and Prof. David Wood.

I work at Baidu's Silicon Valley AI Lab (SVAIL) on scaling out deep learning models to large-scale high-performance computing systems.


My research interests include performance, power consumption, programmability, and reliability of data communication in future highly parallel systems. In particular, I have investigated quality-of-service and heterogeneous traffic effects in on-chip networks, and my PhD research focused on memory synchronization and coordinated computation in heterogeneous processors integrating CPU and GPU cores.

My dissertation can be found here (PDF) and my defense talk here (PPTX).

I am a primary contributor to the gem5-gpu simulator, which integrates gem5 and GPGPU-Sim for heterogeneous CPU-GPU processor studies.

I also contribute to the gem5 simulator, including infrastructure for running PARSEC 2.1 under gem5 ALPHA.
Disk images and scripts are available here: PARSEC on M5.

My current work investigates performance scaling and memory management for deep learning applications. Many of these applications are developed in TensorFlow, to which I have contributed a few merged pull requests.


Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting
Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates. INTERSPEECH, Aug. 2017.

Dissertation: Synchronization and Coordination in Heterogeneous Processors.
Joel Hestness. The University of Wisconsin - Madison, Dec. 2016.

GPU Computing Pipeline Inefficiencies and Optimization Opportunities in Heterogeneous CPU-GPU Processors.
Joel Hestness, Stephen W. Keckler, David A. Wood. The 2015 IEEE International Symposium on Workload Characterization (IISWC), Oct. 2015.

A Comparative Analysis of Microarchitecture Effects on CPU and GPU Memory System Behavior.
Joel Hestness, Stephen W. Keckler, David A. Wood. The 2014 IEEE International Symposium on Workload Characterization (IISWC), Oct. 2014.

gem5-gpu: A Heterogeneous CPU-GPU Simulator.
Jason Power, Joel Hestness, Marc S. Orr, Mark D. Hill, David A. Wood. Computer Architecture Letters, vol. 13, no. 1, Jan.-June 2014. PDF

A QoS-Enabled On-Die Interconnect Fabric for Kilo-Node Chips.
Boris Grot, Joel Hestness, Stephen W. Keckler, Onur Mutlu. IEEE Micro, Top Picks 2012 Special Issue, vol. 32, no. 3, 2012. (Originally published in ISCA 2011.)

Kilo-NOC: A Heterogeneous Network-on-Chip Architecture for Scalability and Service Guarantees.
Boris Grot, Joel Hestness, Stephen W. Keckler, Onur Mutlu. The 38th International Symposium on Computer Architecture (ISCA), June 2011.

The gem5 Simulator.
Nathan Binkert, Bradford Beckmann, Gabriel Black, Steven K. Reinhardt, Ali Saidi, Arkaprava Basu, Joel Hestness, Derek R. Hower, Tushar Krishna, Somayeh Sardashti, Rathijit Sen, Korey Sewell, Muhammad Shoaib, Nilay Vaish, Mark D. Hill, and David A. Wood. ACM SIGARCH Computer Architecture News (CAN), May 2011.

FFTW and Complex Ambiguity Function Performance on the Maestro Processor.
Karandeep Singh, John P. Walters, Joel Hestness, Jinwoo Suh, Craig M. Rogers, Steve P. Crago. In 32nd IEEE Aerospace Conference, March 2011.

3 Day Startup: Molding Student Entrepreneurs for Fun and Non-profit.
Joel Hestness, Thomas Finsterbusch, Cameron Houser, Eli Mercer, Jeremy Guillory. In 5th International Technology, Education and Development Conference (INTED), March 2011.

Netrace: Dependency-Driven Trace-Based Network-on-Chip Simulation.
Joel Hestness, Boris Grot, Stephen W. Keckler. In 3rd International Workshop on Network on Chip Architectures (NoCArc), December 2010.

Express Cube Topologies for On-Chip Interconnects.
Boris Grot, Joel Hestness, Stephen W. Keckler, Onur Mutlu. In 15th International Symposium on High Performance Computer Architecture (HPCA), February 2009.

Technical Reports

Netrace: Dependency-Tracking Traces for Efficient Network-on-Chip Experimentation.
Joel Hestness, Stephen W. Keckler. Technical Report TR-10-11, The University of Texas at Austin, Department of Computer Science, May 2011.

Running PARSEC 2.1 on M5.
Mark Gebhart, Joel Hestness, Ehsan Fatehi, Paul Gratz, Stephen W. Keckler. Technical Report TR-09-32, The University of Texas at Austin, Department of Computer Science, Oct 2009.

Other Prior Work

I developed Netrace, a set of tools and traces for enforcing dependencies between packets in trace-based network simulation.

The first work that I did as a graduate student concerned reliability trends:
Hard Reliability Projections Spreadsheet
Reliability Costs Spreadsheet
Note: This spreadsheet contains Excel macros.

About Me

I graduated from The University of Wisconsin - Madison with Bachelor's degrees in Computer Science and Mathematics.

Involvement at UT-Austin:
I was elected as a representative of the Graduate Representation Association of Computer Science (GRACS) during my first year as a graduate student at The University of Texas. I was also a founding member of the student organization Student Entrepreneurship Opportunities (SEO).

I currently act as a board member and the operations coordinator for 3 Day Startup, an entrepreneurship education nonprofit that I cofounded in 2008. 3DS has educated more than 10,000 student entrepreneurs at more than 125 universities worldwide. 3DS program alumni have founded more than 150 startups that have raised more than $100M in investment capital. Hit me up if you'd like to know more!

After 4.5 years working toward a PhD at UT-Austin, I transferred back to the University of Wisconsin - Madison, where I am - once again - enjoying the gorgeous summers and chilly winters.