Kernel PCA Using Different Kernels with Classification Using Python
You should have a working knowledge of:
* Python
* Linear Algebra
- This project is written entirely in Python. The modules needed for computation are:
* NumPy
* scikit-learn
* Matplotlib
* pandas
- The commands for installing the above modules on the Windows platform are:
pip install numpy
pip install scikit-learn
pip install matplotlib
pip install pandas
- We can verify the installation by importing the modules. For example:
import numpy
from sklearn.decomposition import KernelPCA
import matplotlib.pyplot as plt
import pandas as pd
- We perform dimensionality reduction using Kernel PCA with three different kernels: linear, RBF, and polynomial.
- Here we perform the operations on the Iris dataset.
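The reduction step can be sketched as follows. This is a minimal example, assuming scikit-learn's bundled copy of the Iris data rather than a local CSV file:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import KernelPCA

# Iris: 150 samples, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Kernel PCA with the linear kernel; swap kernel="rbf" or
# kernel="poly" for the other two runs described below
kpca = KernelPCA(n_components=4, kernel="linear")
X_kpca = kpca.fit_transform(X)

print(X.shape)       # (150, 4)
print(X_kpca.shape)  # (150, 4)
```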
- The output of Kernel PCA with the linear kernel:
- The explained variance ratio of the principal components using Kernel PCA with the linear kernel is shown as a bar graph of the 4 principal components, ordered by their variance ratios:
- Since the first two principal components carry most of the variance, we selected only those two.
- The scatter plot for the 2 principal components is:
- The dimensionally reduced data is saved to iris_after_KPCA_using_linear.csv.
The output of Kernel PCA with the Radial Basis Function (RBF) kernel:
- The explained variance ratio of the principal components using Kernel PCA with the RBF kernel is shown as a bar graph of the 4 principal components, ordered by their variance ratios:
- Since the first two principal components carry most of the variance, we selected only those two.
- The scatter plot for the 2 principal components is:
- The dimensionally reduced data is saved to iris_after_KPCA_using_rbf.csv.
The output of Kernel PCA with the polynomial kernel:
- The explained variance ratio of the principal components using Kernel PCA with the polynomial kernel is shown as a bar graph of the 4 principal components, ordered by their variance ratios:
- Since the first two principal components carry most of the variance, we selected only those two.
- The scatter plot for the 2 principal components is:
- The dimensionally reduced data is saved to iris_after_KPCA_using_poly.csv.
The classifier used for classification is a Support Vector Machine Classifier (SVC) with a linear kernel.
The results on the datasets before and after KPCA are shown below:
The classification of the dataset before Kernel PCA:

| Kernel                      | Accuracy (%) | Execution Time (s) |
|-----------------------------|--------------|--------------------|
| Linear                      | 100          | 0.00200009346      |
| Radial Basis Function (RBF) | 100          | 0.0020003318       |
| Polynomial                  | 100          | 0.0010001659       |
The classification of the dataset after Kernel PCA:

| Kernel                      | Accuracy (%) | Execution Time (s) |
|-----------------------------|--------------|--------------------|
| Linear                      | 95.55        | 0.0020003318       |
| Radial Basis Function (RBF) | 37.77        | 0.00200009346      |
| Polynomial                  | 95.55        | 0.1670093536       |
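The comparison can be sketched as below: a linear-kernel SVC trained on the original features versus on the two Kernel PCA components. The 70/30 split and `random_state` are assumptions not stated in the tables, so the exact accuracies and timings will differ:

```python
import time
from sklearn.datasets import load_iris
from sklearn.decomposition import KernelPCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_kpca = KernelPCA(n_components=2, kernel="linear").fit_transform(X)

results = {}
for name, data in [("before KPCA", X), ("after KPCA", X_kpca)]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        data, y, test_size=0.3, random_state=0)
    clf = SVC(kernel="linear")
    start = time.time()
    clf.fit(X_tr, y_tr)           # time only the training step
    elapsed = time.time() - start
    acc = accuracy_score(y_te, clf.predict(X_te))
    results[name] = acc
    print(f"{name}: accuracy={acc:.4f}, fit time={elapsed:.4f}s")
```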
- We performed KPCA with three different kernels (linear, RBF, polynomial) on the Iris dataset.
- Since the first two principal components (PCs) have the highest variance ratios, we selected only those two.
- The dataset was reduced from 150 x 5 to 150 x 3 dimensions (including the label column).
- The classification results varied considerably depending on the kernel chosen.
This project is licensed under the MIT License - see the LICENSE.md file for details.