This project built a movie recommender system based on item-based collaborative filtering, using Hadoop MapReduce in Java. I multiplied the user rating matrix by the movie co-occurrence matrix to generate, for each user, a list of recommended movies that are similar to the movies that user rated highly.
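The matrix multiplication described above can be sketched in a few lines of single-machine Python. This is only an illustration of the idea, not the MapReduce code from the project: the function names, the toy ratings, and the choice to use raw (unnormalized) co-occurrence counts are all assumptions.

```python
from collections import defaultdict
from itertools import permutations

# Toy data (made up for illustration): user -> {movie: rating}
ratings = {
    "u1": {"A": 5.0, "B": 4.0},
    "u2": {"A": 4.0, "B": 5.0, "C": 2.0},
    "u3": {"B": 3.0, "C": 5.0, "D": 4.0},
}

def build_cooccurrence(ratings):
    """Count, for each ordered pair of distinct movies, how many users rated both."""
    cooc = defaultdict(int)
    for movies in ratings.values():
        for a, b in permutations(movies, 2):
            cooc[(a, b)] += 1
    return cooc

def recommend(ratings, user, top_n=2):
    """Score unseen movies as sum(co-occurrence * rating) over the user's rated movies."""
    cooc = build_cooccurrence(ratings)
    seen = ratings[user]
    scores = defaultdict(float)
    for (candidate, rated), count in cooc.items():
        if rated in seen and candidate not in seen:
            # One term of the (co-occurrence matrix) x (rating vector) product.
            scores[candidate] += count * seen[rated]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))[:top_n]

print(recommend(ratings, "u1"))
```

For user `u1`, movie `C` co-occurs with both rated movies and therefore ranks above `D`. A production version would typically normalize the co-occurrence rows so that popular movies do not dominate.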
GitHub Repository
This project implemented a Google-style search autocomplete based on an N-gram model, using Hadoop MapReduce in Java. The language model data was loaded into MySQL, and the web demo was built with PHP, jQuery, and Ajax.
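A minimal in-memory sketch of the N-gram idea follows: count which word follows each short phrase, then keep the most frequent followers as suggestions. The function name, the sample sentences, and the `max_n` and `top_k` parameters are hypothetical; in the project the equivalent table was produced with MapReduce and stored in MySQL rather than in a Python dictionary.

```python
from collections import Counter, defaultdict

def build_language_model(sentences, max_n=3, top_k=3):
    """Map each phrase of 1..max_n-1 words to its top_k most frequent next words."""
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for n in range(1, max_n):              # n = number of words in the prefix
            for i in range(len(words) - n):
                prefix = " ".join(words[i:i + n])
                follow_counts[prefix][words[i + n]] += 1
    return {
        prefix: sorted(counts, key=lambda w: (-counts[w], w))[:top_k]
        for prefix, counts in follow_counts.items()
    }

# Toy corpus (made up for illustration).
sentences = [
    "I love big data",
    "I love big cities",
    "I love big data tools",
    "big data is fun",
]
model = build_language_model(sentences)
print(model["big"])
```

Looking up what the user has typed so far (for example `"big"`) returns the suggested next words, most frequent first.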
GitHub Repository
This project crawled HTML pages of restaurants from Yelp and Yellow Pages, performed information extraction to convert the HTML data into two relational tables, and matched the restaurants across the two tables.
Project Webpage GitHub Repository
This project has three stages: data crawling and extraction, blocking, and matching.
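The blocking and matching stages can be sketched as follows. Everything specific here is an assumption made for illustration: the field names (`id`, `name`, `zip`), blocking on ZIP code, Jaccard similarity on name tokens, and the 0.5 threshold are not taken from the project.

```python
from collections import defaultdict

def tokens(name):
    """Lowercase a name, drop apostrophes, and split it into a set of words."""
    return set(name.lower().replace("'", "").split())

def jaccard(a, b):
    """Jaccard similarity of the two names' token sets."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def block_by_zip(left_rows, right_rows):
    """Blocking: yield only the cross-table pairs that share a ZIP code."""
    blocks = defaultdict(lambda: ([], []))
    for row in left_rows:
        blocks[row["zip"]][0].append(row)
    for row in right_rows:
        blocks[row["zip"]][1].append(row)
    for left, right in blocks.values():
        for a in left:
            for b in right:
                yield a, b

def match(left_rows, right_rows, threshold=0.5):
    """Matching: keep candidate pairs whose names are similar enough."""
    return [
        (a["id"], b["id"])
        for a, b in block_by_zip(left_rows, right_rows)
        if jaccard(a["name"], b["name"]) >= threshold
    ]

# Toy rows (made up for illustration).
yelp = [
    {"id": "y1", "name": "Joe's Pizza", "zip": "53703"},
    {"id": "y2", "name": "Golden Dragon", "zip": "53705"},
]
yellow_pages = [
    {"id": "p1", "name": "Joes Pizza Restaurant", "zip": "53703"},
    {"id": "p2", "name": "Golden Dragon", "zip": "53705"},
    {"id": "p3", "name": "Joes Pizza", "zip": "53711"},
]
print(match(yelp, yellow_pages))
```

Blocking cuts the six possible cross-table pairs down to two candidates, so the more expensive similarity check runs only on pairs that could plausibly refer to the same restaurant.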
This project developed a Yelp-like web application that lets users search for nearby restaurants and post reviews and ratings. We built it with JSP and the Apache Struts 2 framework, deployed it on an Apache Tomcat server, and used MySQL as the database.
GitHub Repository
This project predicted the zygosity of twin pairs from their disease histories, using the Expectation-Maximization (EM) algorithm with a Bayesian network. A two-sample t-test was then conducted on the concordance rates of identical and fraternal twins for each disease, to identify the diseases most likely to correlate with zygosity.
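The t-test step for a single disease can be sketched as below. The sketch assumes each twin pair contributes a 0/1 concordance indicator (1 if both twins have the disease) and uses Welch's unequal-variance form of the statistic; the project may have used a different variant, and the numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic: difference of means over its standard error."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error

# Hypothetical concordance indicators for one disease, one value per twin pair.
identical = [1, 1, 1, 0, 1, 1, 0, 1]   # concordance rate 0.75
fraternal = [0, 1, 0, 0, 1, 0, 0, 0]   # concordance rate 0.25

t_stat = welch_t(identical, fraternal)
print(round(t_stat, 3))
```

A large absolute value of the statistic suggests that the concordance rates of the two groups differ for that disease; turning it into a p-value requires the t distribution, which is left out here.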
Final Report
This project proposed a new approach to video retargeting that uses discontinuous seam carving in both space and time to resize videos. It addresses the slow speed of the original seam-carving algorithm by reducing computational complexity and parallelizing the work.
Project Webpage Final Report GitHub Repository
First, we implemented the seam-carving algorithms from the papers "Seam Carving for Content-Aware Image Resizing" and "Improved Seam Carving for Video Retargeting". We noticed that the video seam-carving algorithm in the latter paper runs fairly slowly. In our new approach, we therefore compute the seam for each frame separately while using the look-ahead energy to maintain temporal coherence between frames. We implemented and parallelized this algorithm with OpenCV in C++ and achieved considerable speed improvements while keeping the same carving power as the algorithms in the papers.
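The per-frame seam search at the core of this approach can be sketched with a small dynamic program. This is a plain-Python illustration on a toy energy grid, not the OpenCV/C++ implementation: it leaves out the look-ahead energy and the temporal-coherence term, and the function names are hypothetical.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the connected path of least total energy."""
    rows, cols = len(energy), len(energy[0])
    cost = [row[:] for row in energy]          # cumulative minimum energy
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam

def remove_seam(frame, seam):
    """Delete the seam's pixel from every row, shrinking the width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(frame, seam)]

# Toy energy grid (made up for illustration).
energy = [
    [3, 1, 4, 2],
    [5, 9, 1, 6],
    [2, 7, 3, 1],
]
seam = find_vertical_seam(energy)
print(seam)
print(remove_seam(energy, seam))
```

Because each frame's seam is found from that frame's own energy map, the frames can be processed independently, which is what makes this formulation straightforward to parallelize.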
This project proposed a new approach to predicting Northern Hemisphere sea ice extent using Time Lagged Neural Networks (TLNN) with added external forcing (solar radiation and CO2). Our model captures the main features of Northern Hemisphere sea ice change predicted by complex numerical models, while being simpler and less computationally expensive.
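The "time lagged" part of the model refers to feeding the network a window of past values. A minimal sketch of how such training samples could be assembled is shown below. The function name, the lag length, the choice to take the forcing values at the target time step, and all of the numbers are assumptions for illustration, not details or data from the project.

```python
def make_lagged_samples(series, forcings, lags):
    """Build (features, target) pairs: `lags` past values plus forcings at time t."""
    samples = []
    for t in range(lags, len(series)):
        features = series[t - lags:t] + [forcing[t] for forcing in forcings]
        samples.append((features, series[t]))
    return samples

# Placeholder numbers (not real observations).
ice_extent = [10.0, 9.5, 9.0, 8.0, 7.5]
solar = [1.0, 1.1, 1.2, 1.3, 1.4]
co2 = [350.0, 352.0, 354.0, 356.0, 358.0]

samples = make_lagged_samples(ice_extent, [solar, co2], lags=2)
for features, target in samples:
    print(features, "->", target)
```

Each feature vector would then serve as the input layer of the network, with the sea ice extent at the next time step as the training target.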
Final Report GitHub Repository