Automatic Chord Recognition

CS 761: Advanced Machine Learning - Course Final Project
May 2016, Personal Project
Chord Recognition

This project proposed a method to automatically recognize the chords in an audio recording, based on the Improved Pitch Class Profile and on template matching with circular shifts and weighted sums.

Download Report

I propose a new automatic chord recognition method, formalize every stage of its pipeline, discuss important implementation details, and demonstrate its effectiveness through experiments. The method builds on a traditional scheme and enhances it with soft-thresholding denoising, the Improved Pitch Class Profile, and template matching based on circular shifts and weighted sums.
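
As a rough illustration of template matching with circular shifts and weighted sums, the sketch below scores a 12-bin pitch class profile against major and minor triad templates. The template weights, the chord vocabulary, and the function names are illustrative assumptions, not the values or code from the report.

```python
# Pitch classes in chroma order, with C at index 0.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Hypothetical weighted templates for root C (interval -> weight).
# The weights are assumptions chosen for illustration only.
BASE_TEMPLATES = {
    "maj": {0: 1.0, 4: 0.8, 7: 0.9},
    "min": {0: 1.0, 3: 0.8, 7: 0.9},
}


def shifted_template(quality, root):
    """Circularly shift the root-C template so that it starts at `root`."""
    template = [0.0] * 12
    for interval, weight in BASE_TEMPLATES[quality].items():
        template[(root + interval) % 12] = weight
    return template


def match_chord(pcp):
    """Return the label of the template with the largest weighted sum."""
    best_label, best_score = None, float("-inf")
    for quality in BASE_TEMPLATES:
        for root in range(12):
            template = shifted_template(quality, root)
            score = sum(w * p for w, p in zip(template, pcp))
            if score > best_score:
                best_label, best_score = NOTES[root] + ":" + quality, score
    return best_label
```

Because every chord of a given quality shares one base template, only one template per quality has to be designed; the 12 roots come from the circular shift.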

ML-based Image Superresolution on GPGPU

CS 760: Machine Learning & CS 759: High Performance Computing - Course Final Project
Dec. 2015, Team Project with Shengchao Liu and Yao Song
Image Superresolution

This project was to develop an approach to recovering the 3x super-resolution version of a single image based on coupled dictionaries and neural networks, and to accelerate it on GPGPU with CUDA.

Visit Repository

We propose a single-image super-resolution method based on coupled dictionaries and neural networks. Our method models the relationship between low- and high-resolution images with a cascade of two distinct learners that both contribute to the overall performance, an idea it shares with ensemble learning. We discuss how to select the tunable parameters, and compare our method with the baseline method on two datasets. We show that our method is computationally efficient and produces high-quality super-resolution images.

We develop a CUDA program to recover the 3x super-resolution version of a given low-resolution grayscale image using a machine-learning-based method. We discuss the implementation of the program, and show that it can produce high-quality super-resolution images and achieve high performance on GPU. We also point out the limitations and potential improvements of the program.

Abnormal Crowd Behavior Detection

CS 766: Computer Vision - Course Final Project
May 2015, Team Project with Christopher Bodden and Michael Doescher
Anomaly Detection

This project was to develop an approach to detecting abnormal crowd behaviors in surveillance video streams. We proposed a method to compute social forces using dense trajectories and detect anomalies using a one-class SVM. Our approach outperformed prior works on common datasets.

Download Report

Abnormal crowd behavior detection has become a popular research topic in recent years, driven by a growing need for electronic video surveillance. Many methods have been proposed to detect abnormalities, but they rely on optical flow or classical classification techniques. We followed the general pipeline of previous works but upgraded several components with state-of-the-art techniques. Specifically, we used dense trajectories in place of optical flow, robust features such as social force, HOG, HOF, and MBH, and a one-class support vector machine. We achieved significant improvements in abnormality detection compared with prior works.

Scene Classification

CS 766: Computer Vision - Course Project 3
Apr. 2015, Team Project with Christopher Bodden
Scene Classification

This project was to implement the Locality-constrained Linear Coding method to get state-of-the-art results for scene classification. Through careful experiments and the integration of Object Bank, we achieved the highest accuracy among all submissions in the class.

Visit Repository

We developed a MATLAB application to pipeline the scene classification process. We implemented the LLC method on top of the basic spatial pyramid code, as well as the codebook optimization algorithm. Combining LLC with Object Bank gave our best accuracy of 81.34%, which is comparable to the state of the art on the dataset. Please refer to the Result page of the project wiki for more information.

Panoramic Mosaic Stitching

CS 766: Computer Vision - Course Project 2
Mar. 2015, Team Project with Christopher Bodden
Panorama Project

This project was to develop the whole process of stitching a set of images into panoramas by registering, warping, resampling and blending them together. We implemented a feature-based method to perform the panorama creation process automatically.

Visit Repository

We developed an application based on MATLAB to pipeline the panorama creation process. We developed two distinct methods to create panoramas: either via cylindrical projection or via planar (homography) projection. We employed SIFT and RANSAC to register images, and implemented a set of blending techniques such as alpha blending and pyramid blending.
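
The cylindrical projection mentioned above can be sketched with the standard forward and inverse mappings. This is a minimal Python sketch, not the project's MATLAB code; it assumes coordinates measured from the image center and a focal length `f` given in pixels, and the function names are hypothetical.

```python
import math


def to_cylindrical(x, y, f):
    """Map image-plane coordinates (x, y) onto a cylinder of radius f."""
    theta = math.atan2(x, f)      # angle around the cylinder axis
    h = y / math.hypot(x, f)      # normalized height on the cylinder
    return f * theta, f * h


def from_cylindrical(xc, yc, f):
    """Inverse mapping, used to look up source pixels during warping."""
    x = f * math.tan(xc / f)
    y = yc * math.hypot(x, f) / f
    return x, y
```

Warping usually iterates over the output (cylindrical) pixels and calls the inverse mapping, so that every output pixel receives a resampled value and no holes appear.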

We highly recommend viewing the high-resolution interactive versions of our final panoramas.
[Bascom Hill 1] [Bascom Hill 2]

HDR Recovery and Tone Mapping

CS 766: Computer Vision - Course Project 1
Feb. 2015, Team Project with Christopher Bodden
HDR Project

This project was to implement some classical algorithms to recover the High-Dynamic-Range radiance map from several LDR images with different exposure times, and tone map it to make it displayable on LDR devices while preserving contrast.

Visit Repository

We developed a MATLAB application to pipeline the HDR recovery process. We employed Debevec's algorithm to recover response curves and assemble HDR radiance maps. We offered five tone mapping algorithms, including Reinhard's, Drago's and Durand's, to turn the radiance map into displayable images. We also implemented Ward's MTB algorithm to align the input images.
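
To give a flavor of tone mapping, here is a minimal Python sketch of the global variant of Reinhard's operator applied to a flat list of luminance values. Whether the project used the global or the local variant is not stated, so treat the choice, the default `key` of 0.18, and the function name as assumptions.

```python
import math


def reinhard_global(luminance, key=0.18, eps=1e-6):
    """Map HDR luminance values into [0, 1) with Reinhard's global operator."""
    # Log-average luminance of the scene; eps avoids log(0) on black pixels.
    log_avg = math.exp(
        sum(math.log(eps + lum) for lum in luminance) / len(luminance)
    )
    # Scale so that the log-average luminance maps to the chosen key value.
    scaled = [key * lum / log_avg for lum in luminance]
    # Compress: large values approach 1 but never reach it.
    return [s / (1.0 + s) for s in scaled]
```

The compression step preserves the ordering of luminance values, which is why contrast between neighboring regions survives even when the dynamic range is reduced by several orders of magnitude.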

Graphics Town: The Island

CS 559: Computer Graphics - Course Project 2
Dec. 2014, Personal Project
Graphics Town

This project was to create a living town with various things moving around in it. I applied many graphics techniques and created a seaside scene that is small and simple, yet appealing and realistic.

Download Report

I attempted various technical challenges in this project: surfaces of revolution, generalized cylinders, subdivision surfaces, parametric patches, local lights, skybox, billboarding, environment mapping, bump mapping, fractals, particle systems, fake physics and model loader.

Virtual Roller Coaster

CS 559: Computer Graphics - Course Project 1
Nov. 2014, Personal Project
Roller Coaster

This project was to create a 3D world where users are able to see and manipulate control points that define the track, and run or pause the animation of a roller coaster running along the track. I went beyond the basic requirements and implemented many more advanced features.

Download Report

In this project, I implemented some features of note:
a) Two lighting modes are provided. Both have fine lighting effects and shadows.
b) The track curve can be created and modified easily. Users can sketch a rough shape to generate a track curve, and insert a loop or spin at any point.
c) A C2 interpolating spline is provided as one type of track curve, so users can create interpolating track curves with good continuity.
d) The roller coaster and the track are good-looking with many details.

Effectiveness of Embodiment of Video Instructor

CS 770: Human-Computer Interaction - Course Project
Dec. 2014, Team Project with Jia-Shen Boon and Ayon Sen
Video Instructor

This project investigated the question: how does the physical embodiment of the instructor in the inset frame affect the learning of students watching lecture videos? We found a marginally significant difference in learning performance between lectures with some form of embodiment and lectures with none.

Download Report

Faculty members are often busy with research and thus have little time to prepare lectures. Advances in technology offer a solution: lectures can be recorded and reused across semesters. It is common for these lecture videos to include an inset frame of the instructor giving the lecture. However, little research has been done to investigate the effect of these inset frames. In this study, we examined how the physical embodiment of the instructor in the inset frame impacts the learning process. Three types of embodiment were compared through an online between-participants experiment: lecture with no inset frame, lecture with an inset frame of a human instructor, and lecture with an inset frame of a virtual agent instructor. The results showed a marginally significant difference between having some form of embodiment and having no embodiment in terms of learning performance.

Fitting Robot and Virtual Fitting System

Undergraduate Final Term Project
Jun. 2014, Team Project with Aobo Zhou, Lei Shi and Jiahui Zhou
Fitting Robot

This project was to develop a sizable robot that can be deformed to fit various body builds, and to implement the associated control system. With such a robot, people can choose well-fitting clothes without actually trying them on, which can benefit online clothing retailers.

Download Dissertation

In this project, I was responsible for the parameterization of the fitting robot and the development of the user interface of the fitting system. The work had two main parts. The first was to provide an interface that guides users to input their size parameters and presents a real-time 3D human body model as feedback. The second was to pass the data to the control module of the fitting robot and make it deform to fit the specified body build.

3D Scanning and Parameter Extraction for Plants

Undergraduate Final Term Project
Jun. 2014, Personal Project
Plant Scan

This project was to use a Kinect sensor to capture multi-view point clouds of a plant and combine them into a full point cloud. Meaningful structural parameters can then be extracted from the full cloud, which is of great importance in many fields.

Download Dissertation

The non-destructive measurement of plant structural parameters is of great importance in fields such as growth monitoring, scientific modeling and quality control. This project used a Microsoft Kinect sensor to acquire color and 3D position data, and combined multiple point cloud processing techniques to construct full 3D point clouds of the plants. It then extracted several plant structural parameters from the clouds to achieve non-destructive measurement. The proposed measurement process has relatively low hardware and software requirements, convenient steps and a simple implementation; the measurement error is reasonable, and it can measure some parameters that are hard to obtain by physical measurement. Thus, this non-destructive measurement process has some practical significance.

Algorithms for Laminated Object Manufacturing

SUTD-ZJU Innovation Project
Dec. 2013, Team Project with Nuo Lei
LOM Algorithm

This project was to explore computational methods for partitioning a 3D structure into 2D physical units that can be manufactured by laser cutting and then assembled to represent the structure. I designed and implemented the algorithm, and finally manufactured a triceratops model.

In this project, I proposed and implemented an algorithm to partition a 3D digital model in STL format into 2D slices that can be manufactured by laser cutting. Through simple assembly, a physical model representing the original 3D model can then be obtained.

This project focused on the development of the algorithm to partition a 3D digital model into 2D slices. I adopted a relatively easy and intuitive approach: partitioning the 3D object into horizontal slice layers using parallel x-y planes along the z-axis. To prevent relative displacement between adjacent layers, I proposed a technique based on reference holes. Adjacent slice layers share several reference holes in exactly the same positions, so different layers can be aligned easily during assembly.
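
The horizontal slicing step can be sketched as intersecting every triangle of the mesh with a stack of planes of constant z. The Python sketch below is an illustration under stated assumptions, not the project's implementation: triangles are given as three (x, y, z) tuples (as one would read them from an STL file), planes are placed at the middle of each layer, the function names are hypothetical, and reference holes are not handled.

```python
def slice_triangle(triangle, z):
    """Return the 2D segment where the triangle crosses the plane at height z.

    Returns None when the triangle does not cross the plane. Edges that only
    touch the plane at a vertex are ignored, which is a simplification.
    """
    points = []
    for i in range(3):
        (x1, y1, z1), (x2, y2, z2) = triangle[i], triangle[(i + 1) % 3]
        # An edge crosses the plane when its endpoints lie on opposite sides.
        if (z1 - z) * (z2 - z) < 0:
            t = (z - z1) / (z2 - z1)
            points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return tuple(points) if len(points) == 2 else None


def slice_mesh(triangles, layer_height):
    """Cut the mesh at the mid-height of each layer; return [(z, segments)]."""
    z_values = [vertex[2] for tri in triangles for vertex in tri]
    z_min, z_max = min(z_values), max(z_values)
    layers = []
    k = 0
    while z_min + (k + 0.5) * layer_height < z_max:
        z = z_min + (k + 0.5) * layer_height
        segments = [s for s in (slice_triangle(t, z) for t in triangles) if s]
        layers.append((z, segments))
        k += 1
    return layers
```

Each layer comes back as an unordered list of segments; chaining them into closed outlines for the laser cutter would be a separate step.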

FPGA Tetris

50.002: Computation Structures - 1D Design Project
Dec. 2013, Team Project with Fanding Wei
FPGA Tetris

This project was to implement an educational electronic game on an FPGA board, with a limited budget for extra components. We first designed an 8-bit ALU (arithmetic and logic unit) and then used it to build a simplified version of the famous Tetris game.

Download Report

In the first part, we focused on the development of the ALU and its self-test circuit. We implemented an ALU with 19 main functions and arranged an instruction set that uses 5 bits of ALUFN signals. We made several important improvements over the basic design, such as adopting a better adder architecture, and designed a fully automated self-test circuit using a cascade of 3 finite state machines.

In the second part, we focused on developing the electronic game prototype: a modified version of the classic Tetris game, which we believe is educational for children. We implemented this prototype with 3 buttons and 2 switches, using 3 ALU functions. The design and implementation details are described in the report so that the work can be reproduced.

Adder Optimization and CEC

50.001 & 50.002 - 2D Design Challenge
Nov. 2013, Team Project with Fanding Wei
Adder Opt

This project was to optimize the original 32-bit adder to work with a clock period of less than 3 ns, and to implement Combinational Equivalence Checking (CEC) to verify its functionality. We finally achieved an adder with a minimum clock cycle of 2.1 ns and an area of 7152 microns^2.

Download Report

In the first part, the goal was to optimize the 32-bit adder to achieve the smallest (min clock cycle * area) value. The simulations were done in JSim. We tested many adder architectures, applied techniques such as inverting logic to optimize them further, and finally attained an adder with a minimum clock cycle of 2.1 ns and an area of 7152 microns^2. To perform CEC, the circuit netlist also had to be translated into conjunctive normal form (CNF). We did that in a semi-automated way.

In the second part, the task was to implement a SAT solver that reads a CNF file as input. Such solvers usually adopt a simple and effective algorithm named Davis-Putnam-Logemann-Loveland (DPLL). The solver was required to solve a given 6000-clause CNF file in less than 15 seconds.
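
For readers unfamiliar with DPLL, here is a minimal recursive Python sketch of the algorithm with unit propagation and branching. It is not the solver written for the project and makes no claim about meeting the 15-second target. It assumes clauses are lists of nonzero integers in the DIMACS convention (a positive integer is a variable, a negative one its negation).

```python
def dpll(clauses, assignment=None):
    """Return a satisfying assignment {var: bool} for CNF clauses, or None."""
    assignment = dict(assignment or {})

    # Simplify: drop satisfied clauses and remove falsified literals.
    simplified = []
    for clause in clauses:
        remaining, satisfied = [], False
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var not in assignment:
                remaining.append(lit)
            elif assignment[var] == wanted:
                satisfied = True
                break
        if satisfied:
            continue
        if not remaining:
            return None  # conflict: every literal of this clause is false
        simplified.append(remaining)

    if not simplified:
        return assignment  # every clause is satisfied

    # Unit propagation: a one-literal clause forces the value of its variable.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            assignment[abs(lit)] = lit > 0
            return dpll(simplified, assignment)

    # Branch on the first unassigned variable, trying True and then False.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

In an equivalence check, the two circuits are typically combined so that the formula is satisfiable only if some input makes their outputs differ; a result of `None` then indicates that the circuits are equivalent. Variables missing from a returned assignment can take either value.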

Hay Hauling Robot

ASABE 2013 Robotics Competition
Jul. 2013, Team Project with Qiang Zhong and Tianqi Gao
Hauling Robot

This project was to develop a set of robots that can automatically pick up hay bales scattered in a field, haul them to designated areas and stack them. We employed 3 small robots that are responsible for picking and hauling, and 1 big robot that is responsible for stacking.

The challenge was held on an 8 ft x 8 ft flat, smooth board. The board was marked with a grid of one-foot squares, and colored sections in the corners designated the hay stacking areas. Each stacking area was intended to hold hay bales of a specific color. Robots were tasked to pick up, haul, and then stack the hay bales within a 5-minute time limit.

In our design, we employed 1 big robot and 3 small robots. The small robots were to carry the hay bales to different edges of the board according to the colors. After the small robots had done their jobs, the big robot collected the hay bales and then stacked them in the designated hay stacking areas.

Kinect Puppet

Zhejiang Province Science & Technology Innovation Program for Undergraduates
May 2013, Team Project with Yedan Qian
Kinect Puppet

This project was to develop a system that uses a Kinect sensor to control shadow puppets, a traditional Chinese folk art, according to the user's motion. It is promising in that it greatly reduces the labor of animation production and renews people's attention to traditional culture.

In this project, a Kinect sensor captured human motion as RGBD data, and the data stream was processed to obtain 3D joint data. An intermediate program read the 3D joint data, converted them into properly mapped 2D joint data, and sent the result as OSC messages. The client program, Animata, received the OSC messages and used the joint data to produce an animation through image deformation.

Stem Detection Algorithm for Pears

Robotics for Bioproduction - Course Project
Apr. 2013, Personal Project
Stem Detect

This project was to develop an algorithm to detect the presence of a pear's stem from its profile image. I surveyed existing algorithms and designed a new algorithm based on the Monte Carlo method to address some of their drawbacks.

Download Report

This topic was about detecting stems from profile images of fruits. I first examined existing stem detection algorithms and divided them into three classes: algorithms based on the thinness feature, algorithms based on the concavity feature, and algorithms based on optical properties. I found that all of them had drawbacks in terms of applicability, accuracy or efficiency.

To resolve the trade-off between accuracy and efficiency, I applied the Monte Carlo method to estimate the center and radius of the fruit, and proposed a new approach named elliptical scanning to detect the presence of a stem. This new algorithm performed better than the other algorithms in terms of accuracy and efficiency for some specific varieties of fruit.
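
One plausible way to estimate a center and radius by random sampling is shown in the Python sketch below. The text above only states that the Monte Carlo method was used, so the specific scheme here (fit a circle through random triples of contour points and take the median of the fits) is an assumption, as are the function names; the elliptical scanning step is not shown.

```python
import math
import random
import statistics


def circumcircle(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points, or None."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:
        return None  # the points are (nearly) collinear
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def estimate_circle(contour, trials=200, seed=0):
    """Estimate (cx, cy, r) from a list of contour points by random sampling."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    fits = []
    for _ in range(trials):
        fit = circumcircle(*rng.sample(contour, 3))
        if fit is not None:
            fits.append(fit)
    if not fits:
        raise ValueError("no valid circle could be fitted")
    # The per-coordinate median discounts triples that include stem points.
    return tuple(statistics.median(values) for values in zip(*fits))
```

Sampling a fixed number of triples keeps the cost independent of the contour length, and the median keeps the estimate stable as long as most sampled triples lie on the fruit body rather than on the stem.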

copyright © 2015-2016 Ke Ma | all rights reserved