Jordan Henkel's Page

To Appear At SANER 2022 #Conference Papers

Semantic Robustness of Models of Source Code

Jordan Henkel*, Goutham Ramakrishnan*, Zi Wang, Aws Albarghouthi, Somesh Jha, Thomas Reps

A framework for testing the robustness of models of source code against sequences of semantics-preserving transformations. The paper also presents new results on the interplay between the robustness of models of code and the domain-adaptation and cross-language-transfer tasks.

pdf arXiv

Published In ICSE 2021 #Conference Papers

Shipwright: A Human-in-the-Loop System for Dockerfile Repair

Jordan Henkel*, Denini Silva*, Leopoldo Teixeira, Marcelo d'Amorim, Thomas Reps

Shipwright is a human-in-the-loop system for automated repair of broken Dockerfiles. Using Shipwright, we submitted 45 pull requests with a 42.2% acceptance rate.

pdf acm arXiv datasets slides video

Published In ICSE 2020 #Conference Papers

Learning from, Understanding, and Supporting DevOps Artifacts for Docker

Jordan Henkel, Christian Bird, Shuvendu K. Lahiri, Thomas Reps

A toolset for advanced Dockerfile parsing, Dockerfile rule mining, and rule-based static analysis of Dockerfiles. This paper seeks to improve support for Docker (and, more generally, DevOps artifacts) by taking the first steps toward automated rule mining in this domain.

pdf acm arXiv datasets video

Published In MSR 2020 #Workshop Papers

A Dataset of Dockerfiles

Jordan Henkel, Christian Bird, Shuvendu K. Lahiri, Thomas Reps

Expanded details on the dataset of 178,000 unique Dockerfiles used in "Learning from, Understanding, and Supporting DevOps Artifacts for Docker." In addition to discussing the dataset, this workshop paper provides example usages and schemas for each of the representations the data passed through to enable rule mining and static checking.

pdf acm arXiv datasets video

Available on arXiv #arXiv Papers

Enabling Open-World Specification Mining via Unsupervised Learning

Jordan Henkel, Shuvendu K. Lahiri, Ben Liblit, Thomas Reps

A framework for mining specifications and usage patterns without the aid of rule templates, user-directed feedback, or predefined API surfaces. This paper leverages both learned embeddings and traditional co-occurrence statistics to disentangle trace-based data.

pdf arXiv

Published In ESEC/FSE 2018 #Conference Papers

Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces

Jordan Henkel, Shuvendu K. Lahiri, Ben Liblit, Thomas Reps

A novel technique for embedding semantically rich program artifacts. This paper explores how to combine sophisticated program analysis (in the form of lightweight symbolic execution) with off-the-shelf machine learning.

pdf acm arXiv datasets