Full Stack Java Project (Event Search and Ticket Recommendation Web Service)

Credit Card Fraud Detection

E-commerce revenue optimization

iOS store monetization experiment design

Yelp Business Rating Prediction Using Sentiment Analysis

This is the project I completed during my financial analyst internship at China Asset Management Co., Ltd.

Project 1: Problem solving using Alteryx software

Project 2: Generate an analytical dataset using Alteryx

Project 3: Data visualization in Tableau: movie trend

Project 4: Classification models in predicting default risk

Project 5: A/B testing for new menu launch

Project 6: Time series model in forecasting video game sales

Project 7: Segmentation and clustering: the use of the K-means clustering technique in segmenting

Project 1 Intro-to-statistics

  • Statistical techniques in Python used to investigate a classic phenomenon from experimental psychology known as the Stroop Effect. Python code was used to produce descriptive statistics and data visualizations, and to run a paired-samples t-test.
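
As a rough illustration of the paired-samples t-test mentioned above, the sketch below computes the t statistic from the per-participant differences using only the standard library. The function name `paired_t_statistic` and the reaction times are hypothetical and are not taken from the project; the project itself may have used a statistics library instead.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired-samples t-test.

    Each position in `first` and `second` must belong to the same participant.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # The test works on the per-participant differences.
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    n = len(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical completion times in seconds (illustrative only, not project data).
congruent = [12.0, 14.0, 11.0, 13.0, 15.0]
incongruent = [18.0, 21.0, 16.0, 20.0, 25.0]

t_stat, dof = paired_t_statistic(congruent, incongruent)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The t value would then be compared against a critical value from the t distribution with `dof` degrees of freedom to decide whether the difference is significant.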
  Project 2 Intro-to-data-analysis

  • Application of data wrangling, statistical analysis, machine learning, and visualization techniques to a New York City Subway dataset. Includes a Mann-Whitney U-test for a significant difference between the number of people who ride the NYC subway when it is raining versus when it is not, and a regression model fitted to predict the hourly number of subway entries.
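
To make the Mann-Whitney U-test above concrete, here is a minimal sketch that computes the U statistic directly from its definition (counting pairs, with ties counted as one half) and a normal-approximation z score without tie correction. The function names and the ridership figures are hypothetical, and the approximation is only reasonable for larger samples than the toy data shown.

```python
import math


def mann_whitney_u(x, y):
    """Return (U_x, U_y): U_x counts pairs where an x value exceeds a y value."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5  # ties contribute one half
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


def normal_approx_z(u, n1, n2):
    """z score of U under the null hypothesis (no tie correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


# Hypothetical hourly entries (illustrative only, not the NYC Subway data).
rain = [150, 420, 610, 980]
no_rain = [100, 300, 500, 700, 900]

u_rain, u_no_rain = mann_whitney_u(rain, no_rain)
z = normal_approx_z(u_rain, len(rain), len(no_rain))
print(f"U_rain = {u_rain}, U_no_rain = {u_no_rain}, z = {z:.3f}")
```

A rank-based test like this is a natural choice when the ridership counts are not normally distributed, since it makes no assumption about the shape of the distribution.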
  Project 3 Data-wrangling-with-mongodb

  • Data munging techniques in Python used to clean OpenStreetMap data for Perth, Australia, create a .json map file, and load that file into a MongoDB instance.
  Project 4 Explore and Summarize Data with R

  • Exploratory analysis in R in order to examine the relationship between 11 chemical and physical properties of a sample of white wines. Includes univariate, bivariate and multivariate analysis using the ggplot function, with a focus on identifying properties which are correlated with the subjective quality ranking of each wine.
  Project 5 Intro-to-machine-learning

  • A Python-based predictive model that identifies and labels Persons of Interest (POI), i.e. Enron employees who committed fraud. Makes use of a GridSearchCV Pipeline with a StratifiedShuffleSplit cross-validation loop to select the optimal estimation algorithm (logistic regression estimator with feature scaling using MinMaxScaler and feature selection with SelectKBest).
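
The pipeline described above could look roughly like the following sketch, which assumes scikit-learn is installed. The synthetic data from `make_classification`, the parameter grid, the `f1` scoring metric, and the step names are all assumptions made for illustration; they are not the project's actual Enron features or settings.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced stand-in for the real data (assumption).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4,
    weights=[0.85, 0.15], random_state=42,
)

# Scale features, keep the k best ones, then fit a logistic regression.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical search space over the number of features and regularization.
param_grid = {
    "select__k": [3, 5, 8],
    "clf__C": [0.1, 1.0, 10.0],
}

# Stratified shuffles keep the class ratio in every train/test split.
cv = StratifiedShuffleSplit(n_splits=10, test_size=0.3, random_state=42)

search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=cv)
search.fit(X, y)

best_params = search.best_params_
print(best_params, round(search.best_score_, 3))
```

Putting scaling and feature selection inside the pipeline means they are refitted on each training split, so no information from the held-out split leaks into the model selection.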
  Project 6 Data-visualisation-and-d3js

  • An interactive data visualization from a dataset of flight delay statistics for airports based within the US, created using HTML, CSS, D3.js and dimple charts. The visualization is intended to let users easily access and interpret time-series flight delay statistics for various airports spread across the US.
  Project 7 - Design A/B Testing Experiment

  • Results of an A/B test that was run by Udacity in order to recommend whether or not to launch a change to the Udacity course enrolment webpage. Involved the selection of invariant and evaluation metrics, calculation of the duration and proportion of traffic diversion, and analysis of whether a statistically significant difference was observed between the test and control groups.
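
The duration and traffic-diversion calculation mentioned above can be sketched as follows, using a standard two-proportion sample-size formula. The baseline rate, minimum detectable effect, daily traffic, and function names are hypothetical values chosen for illustration, not figures from the Udacity experiment.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_effect,
                          alpha=0.05, power=0.8):
    """Approximate units needed per group for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def duration_days(n_per_group, daily_units, traffic_fraction):
    """Days needed when `traffic_fraction` of daily traffic enters the experiment."""
    total_needed = 2 * n_per_group  # control plus test group
    return math.ceil(total_needed / (daily_units * traffic_fraction))


# Hypothetical inputs: 10% baseline rate, detect a 2 percentage point change,
# 1000 units per day, half of the traffic diverted to the experiment.
n = sample_size_per_group(baseline_rate=0.10, min_detectable_effect=0.02)
days = duration_days(n, daily_units=1000, traffic_fraction=0.5)
print(f"{n} units per group, about {days} days")
```

Diverting a smaller fraction of traffic lowers the exposure to a risky change but lengthens the experiment proportionally, which is the trade-off the duration calculation makes visible.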
  Some of My Statistics Class Projects

    STAT 333 Linear Regression Class Project

    Body fat percentage Prediction

    STAT 461 Financial Statistics Class Final Project

    Title: Portfolio Investments On Four Stocks

    STAT 456 Multivariable Data Analysis Final Project

  • Principal Component Analysis on two given data sets
  STAT 424 Statistics Experimental Design

  • Experimental design project measuring blood pressure in mice.
  Appendix

    Kaggle Playground Competition Project