CS 701: Course Project Presentations
Section 1
Spring 2018

Presentations on Wednesday, 4/25/2018, 3024 Engineering, 9:30 AM - 10:45 AM

Speeding up GPU Programs Using Partial Evaluation

Shuo Yang and Zihao Qiu
9:30-9:50 AM

Graphics processing unit (GPU) is widely used in many data-level parallel computing applications today. Programming a high efficient GPU application is not easy, as many programs consist of a chain of processing stages and loops can enclose another loop, and GPU lacks an efficient mechanism to support processing a chain of stages and nested loops.

Partial evaluation can be used to specialize the program on certain input. In this project, we used the ideas of partial evaluation to reschedule or specialize part of the kernel function of the GPU program to speed up the execution. We focused on the program which the similar nested loop structures. We proposed two different strategies using partial evaluation. One is based on the numbers of outer loops and inner loops. When the number of inner loops is much smaller than the outer loops, it is suitable for specializing the function in the inner loop and vice versa. Another strategy is based on GPU model. Typical GPU program models for multiple stages or nested loops application contain run to completion (different stages are implemented in one kernel function), kernel by kernel (one stage is implemented in one kernel. Using the idea of partial evaluation, we can merge several kernel functions to one functions and make it suitable for the run to completion model which has high GPU utilization and lower CPU-GPU switching overhead.

We tested our implementation of the FFT and CFD programs. The results show a 1.2X speed up. We are also trying to add more optimization and more experiments are still undergoing.

Static Analysis of Gas Estimation on Ethereum Smart Contracts

Zhicheng Cai and Heng Zhuo
9:50-10:10 AM

Ethereum smart contracts, executable objects containing their own storage, codes and balance, are raising people’s attention. Different with traditional object-oriented programming, smart contracts are triggered by transactions or messages, and their executions are bounded by a gas limit. In order to design a good contract, or set a suitable gas limit for transactions, gas estimation is important. However, the estimation provided by a full node relies on specific running environments and current static analysis efforts have some disadvantages. Our idea is that, instead of analyze a whole transaction gas usage, we give estimation for partial execution trace, which do not count external callings. We propose a symbolic execution gas analysis for EVM bytecodes, supporting a majority part of all opcodes with modified behaviors. Given a contract EVM code, we can give a gas estimation table for all possible partial execution paths, along with external calling tags. From our generated tables, we can perform inter-contract gas analysis in the future, if provided with external calling relation table.

Presentations on Friday, 4/27/2018, 3024 Engineering, 9:30 AM - 10:45 AM

A Human Motion to Robot Motion Compiler and Feasibility Checker

Daniel Rakita
9:30-9:50 AM

In this project, we propose a method to effectively translate human arm motions to robot arm motions. This special purpose compiler will take in a person's hand trace, i.e., a sequence of positions and orientations that the hand exhibited through some demonstration, and the output will consist of two components: 1) a Boolean indicating whether the robot can feasibly follow this path with its end-effector under a given error threshold, and 2) a sequence of joint angles that best reflects the demonstrated hand trace, given some definition of ``best''. Our method casts the robot motion translation problem as a dataflow path problem, where each timepoint is considered as a program point and the robot's joint state at the given time are the program's variables. Framing the problem in such a way gives rise to a well-structured graph that affords efficient motion feasibility assessment through a reachability analysis and allows for a best path calculation using a multi-source-shortest-path algorithm. Through an evaluation, we show that our approach outperforms a state-of-the-art per-frame inverse kinematics solver and global optimization-based motion synthesis approach at finding feasible, smooth, collision-free robot motions that exactly match the user's motion demonstration.

The Extension on Deep Learning Compiler Infrastructure for Capsule Networks

Tananun Songdechakraiwut and Huaye Li
9:50-10:10 AM

Recently, there is a new type of neural networks emerging so-called Capsule Networks. The new Capsule architecture can be used intuitively in deep learning to better model hierarchies inside of internal representation in a neural network. In fact, CapsuleNets have been trained and achieved state-of-the-art performance. Nonetheless, there are still challenges of the implementation of CapsuleNets, which is much less efficient than other existing deep learning models.

In this project, we will explore advanced compiler techniques that can be applied to CapsuleNet systems, targeting LLVM, DLVM and other back-ends. We will investigate two key compiler issues including domain-specific intermediate representation (IR) and automatic differentiation (AD) techniques. We propose our DLVM-based IR that we build onto the existing DLVM framework that supports Capsule-level optimization for a neural network. We also explore the use of advanced AD techniques that facilitate managing large memory footprint and provide opportunities for high-level optimization.

During the presentation, we will briefly introduce the concepts of Capsule Networks, LLVM and DLVM. Then we will present a high-level overview of our compiler architecture, AD techniques, and the proposed IR. Finally, we will share our analysis of the proposed for CapsuleNet systems.

Patch Outcome Prediction

David Bingham Brown
10:10-10:30 AM

Machine-learning techniques require vast amounts of training data to function effectively. Source control repositories provide huge amounts of potential training data, but this training data is often not specific enough to provide training-set data on the scale necessary for effective use of machine-learning techniques. We describe techniques utilizing high-throughput computing to generate virtualized training data from analysis of source-control repository histories, and demonstrate use of this technique by predicting the behavior of patches applied to code bases.