Principal Component Analysis


Introduction

In this chapter we're going to learn about Principal Component Analysis (PCA), a tool used for dimensionality reduction. This is useful because it makes the job of classifiers easier in terms of speed, and it also helps with data visualization.

So what are principal components then? They're the underlying structure in the data: the directions of most variance, the directions where the data is most spread out.

The main limitation of this algorithm is that it works well only when the data lies on a linear manifold.

The PCA algorithm will try to fit a plane that minimizes the projection error (the sum of the lengths of all the red lines in the figure).

Imagine that PCA rotates your data, looking for the angle from which it sees the most variance.

As mentioned before, you can use PCA when your data lies on a linear manifold.

But for non-linear manifolds we're going to have a lot of projection error.

Calculating PCA

  1. Preprocess the data: $X_{prep} = \frac{X - mean(X)}{std(X)}$

  2. Calculate the covariance matrix: $\sigma = \frac{1}{m}(X^T \cdot X)$, where $m$ is the number of samples and $X$ is an $n \times p$ matrix, with $n$ the number of experiments and $p$ the number of features

  3. Get the eigenvectors of the covariance matrix: $[U,S,V] = svd(\sigma)$. Here $U$ will be an $n \times n$ matrix where every column is a principal component; if we want to reduce our data from $n$ dimensions to $k$, we choose the first $k$ columns of $U$.

The preprocessing part sometimes includes a division by the standard deviation of each column, but there are cases where this is not needed (the mean subtraction is the more important part).

Reducing input data

Now that we have calculated our principal components, which are stored in the matrix U, we will reduce our input data $X \in R^n$ from $n$ dimensions to $k$ dimensions $Z \in R^k$. Here $k$ is the number of columns we keep from $U$. Depending on how you organized the data we can have 2 different formats for Z:

$$U_{reduce}=U(:,1:k)\\ Z = U_{reduce}^T \cdot X_{prep}\\ Z = X_{prep} \cdot U_{reduce}$$

Get the data back

To reverse the transformation we do the following:

$$X = [((X_{prep} \cdot U) \cdot U^T) \cdot std(X)] + mean(X)$$

Example in Matlab

To illustrate the whole process we're going to calculate the PCA of some sample data and then restore it with fewer dimensions; at the end of the chapter we do the same with an image.

Get some example data

Here our data is a matrix with 15 samples of 3 measurements [15x3]
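For illustration, a [15x3] matrix of random values is enough to follow the steps (the variable name X below is just for this sketch):

```matlab
% Example data: 15 samples (rows) of 3 measurements (columns)
X = randn(15, 3);
```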

Data pre-processing

Now we're going to subtract the mean of each column (taken over all experiments) from every element of that column, and then also divide each element by the standard deviation of its column.

mean and std operate on all columns of X
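In Matlab/Octave this step could look like the sketch below (bsxfun is used so it also runs on versions without implicit expansion):

```matlab
% Subtract the column means and divide by the column standard deviations
Xprep = bsxfun(@minus, X, mean(X));
Xprep = bsxfun(@rdivide, Xprep, std(X));
```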

Calculate the covariance matrix
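Following the formula from the previous section, with m being the number of samples:

```matlab
% Covariance matrix of the pre-processed data: sigma = (1/m) * (X' * X)
m = size(Xprep, 1);
sigma = (1/m) * (Xprep' * Xprep);
```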

Get the principal components

Now we use "svd" to get the principal components, which are the eigen-vectors and eigen-values of the covariance matrix

There are different ways to calculate the PCA; for instance, Matlab already provides the functions pca and princomp. These may give different signs on the eigenvectors (U), but they all represent the same components.

One thing you should pay attention to is the layout of the input matrix, because some methods for finding the PCA expect your samples and measurements to be in a pre-defined order (for example, samples as rows and measurements as columns).

Recover original data

Now to recover the original data we use all the components, and also reverse the preprocessing.
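One way to write this, reusing the variables from the previous snippets:

```matlab
% Use all components (no reduction) and undo the preprocessing
Xrec = (Xprep * U) * U';                  % project onto U and back
Xrec = bsxfun(@times, Xrec, std(X));      % undo the division by std
Xrec = bsxfun(@plus, Xrec, mean(X));      % undo the mean subtraction
```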

Reducing our data

Now that we have our principal components, let's reduce the data using, for instance, k=2.

We can use the principal components Z to recreate the data X, but with some loss. The idea is that the data in Z is smaller than X, but with similar variance. In this case we have $X \in R^3$, and we could reproduce the data X_loss with $Z \in R^{k=2}$, so one dimension less.
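A sketch of this reduction and lossy reconstruction with k = 2:

```matlab
% Keep only the first k = 2 principal components
k = 2;
Ureduce = U(:, 1:k);
Z = Xprep * Ureduce;                      % [15x2] compressed representation

% Approximate reconstruction of X from Z (with some loss)
Xloss = Z * Ureduce';
Xloss = bsxfun(@times, Xloss, std(X));
Xloss = bsxfun(@plus, Xloss, mean(X));
```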

Using PCA on images

Before finishing the chapter we're going to use PCA on images.
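As a sketch of the idea (assuming a grayscale test image such as cameraman.tif from the Image Processing Toolbox, treating each row as a sample and using only the mean subtraction during preprocessing):

```matlab
% PCA compression of a grayscale image: rows are samples, columns are features
I = double(imread('cameraman.tif'));       % assumed 256x256 test image
mu = mean(I);
Iprep = bsxfun(@minus, I, mu);             % mean subtraction only
sigma = (1/size(I,1)) * (Iprep' * Iprep);
[U, ~, ~] = svd(sigma);

k = 30;                                    % keep 30 of 256 components
Ureduce = U(:, 1:k);
Z = Iprep * Ureduce;                       % compressed image data
Iapprox = bsxfun(@plus, Z * Ureduce', mu); % lossy reconstruction
imshow(uint8(Iapprox));
```

Increasing k brings the reconstruction closer to the original image, at the cost of keeping more components.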