Advanced Machine Learning Final Project Presentation

Tran Le

12/02/2020

Introduction

Data

Fig1: A figure in the data

Goal of the project

Project details and outline

First, investigate PCA applied to the data set

Fig2: Number of PCs vs cumulative proportion of variance explained
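The curve in Fig2 can be computed from the standard deviations of the principal components. The following is a minimal Python sketch of that calculation, not the code used for the project. The function names and the example standard deviations are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def cumulative_variance_proportion(sdevs):
    """Cumulative proportion of variance explained by the leading PCs.

    sdevs: standard deviations of the principal components,
    ordered from the first PC to the last.
    """
    variances = [s * s for s in sdevs]
    total = sum(variances)
    if total == 0:
        raise ValueError("total variance is zero")
    return [c / total for c in accumulate(variances)]


def components_needed(sdevs, threshold):
    """Smallest number of leading PCs whose cumulative proportion reaches threshold."""
    for k, proportion in enumerate(cumulative_variance_proportion(sdevs), start=1):
        if proportion >= threshold:
            return k
    return len(sdevs)


# Hypothetical standard deviations, for illustration only (not the project data).
example_sdevs = [3.0, 2.0, 1.0, 1.0, 1.0]
proportions = cumulative_variance_proportion(example_sdevs)
k_80 = components_needed(example_sdevs, 0.80)
```

Plotting the index of each entry in `proportions` against its value gives a curve of the kind shown in Fig2. A threshold such as 0.80 is one possible way to pick the number of PCs to keep; the thresholds behind the PC counts used in this project are not stated here.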

Second, apply ML methods to the full and dimension-reduced data:

Details of how to run each method, and the results:
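Each table below reports an accuracy and a running time per model and per data set. One way to collect both numbers is sketched here in Python. It assumes the timer covers both fitting and prediction, which the results do not confirm. The `evaluate` helper and the majority-class placeholder model are hypothetical and stand in for any of the methods listed.

```python
import time


def evaluate(fit, predict, train_x, train_y, test_x, test_y):
    """Fit a model, predict on the test set, return (accuracy, elapsed_seconds)."""
    start = time.perf_counter()
    model = fit(train_x, train_y)
    predictions = predict(model, test_x)
    elapsed = time.perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, test_y))
    return correct / len(test_y), elapsed


# Placeholder model for illustration: always predicts the most common training label.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)


def predict_majority(model, test_x):
    return [model for _ in test_x]


# Toy data, for illustration only.
accuracy, seconds = evaluate(
    fit_majority, predict_majority,
    [[0], [1], [2]], ["a", "a", "b"],
    [[3], [4]], ["a", "b"],
)
```

Running the same helper once on the full features and once on each PCA-reduced feature set would give the paired accuracy and run-time columns.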

K-nearest neighbors:

Model_name  Accuracy_full  Run_time_full  Accuracy_PCA64  Run_time_PCA64  Accuracy_PCA25  Run_time_PCA25
knn         0.9367215      20.21794 secs  0.1913303       3.272247 secs   0.1893373       0.1878426 secs
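For reference, the k-nearest-neighbors rule itself can be written in a few lines. This is a minimal Python sketch with Euclidean distance and majority voting, not the implementation that produced the numbers above. The value `k=3` and the toy points are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    neighbors = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Toy data, for illustration only.
toy_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
toy_y = ["a", "a", "a", "b", "b", "b"]
label_near_origin = knn_predict(toy_x, toy_y, [0.5, 0.5], k=3)
label_near_five = knn_predict(toy_x, toy_y, [5.5, 5.5], k=3)
```

Because every prediction computes a distance to every training point, the cost grows with the number of features, which is consistent with the shorter run times reported for the PCA-reduced data.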

Classification tree:

Model_name  Accuracy_full  Run_time_full  Accuracy_PCA81  Run_time_PCA81  Accuracy_PCA64  Run_time_PCA64
rpart       0.7249626      4.64187 secs   0.7229461       3.357013 secs   0.7229461       2.732178 secs

Random Forest:

Model_name    Accuracy_full  Run_time_full  Accuracy_PCA81  Run_time_PCA81  Accuracy_PCA64  Run_time_PCA64
RandomForest  0.9407075      1.536948 mins  0.2705531       22.28762 secs   0.245142        8.512539 secs

Support Vector Machine:

Model_name  Accuracy_full  Run_time_full  Accuracy_PCA81  Run_time_PCA81  Accuracy_PCA64  Run_time_PCA64
SVM         0.941704       35.33119 secs  0.1908321       14.17465 secs   0.1908321       12.27648 secs

Neural network:

Model_name     Accuracy_full  Run_time_full  Accuracy_PCA81  Run_time_PCA81  Accuracy_PCA64  Run_time_PCA64
NeuralNetwork  0.9367215      17.57725 secs  0.1923269       1.729314 secs   0.1788739       1.759174 secs

Summary of results and observations

Limitations and future development: