Principal Component Analysis

Principal Component Analysis (PCA) projects the given dataset onto a lower dimensional linear space so that the variance of the projected data is maximized. PCA requires the eigenvalues and eigenvectors of the covariance matrix, which is the product where X is the data matrix.

SVD on the data matrix X is given as follows:

The following example shows PCA using SVD:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import plotly.plotly as py
import plotly.graph_objs as go
import plotly.figure_factory as FF
import pandas as pd

path = "/neuralnetwork-programming/ch01/plots"
logs = "/neuralnetwork-programming/ch01/logs"

xMatrix = np.array([[0,2,1,0,0,0,0,0],
[2,0,0,1,0,1,0,0],
[1,0,0,0,0,0,1,0],
[0,1,0,0,1,0,0,0],
[0,0,0,1,0,0,0,1],
[0,1,0,0,0,0,0,1],
[0,0,1,0,0,0,0,1],
[0,0,0,0,1,1,1,0]], dtype=np.float32)

def pca(mat):
mat = tf.constant(mat, dtype=tf.float32)
mean = tf.reduce_mean(mat, 0)
less = mat - mean
s, u, v = tf.svd(less, full_matrices=True, compute_uv=True)

s2 = s ** 2
variance_ratio = s2 / tf.reduce_sum(s2)

with tf.Session() as session:
run = session.run([variance_ratio])
return run


if __name__ == '__main__':
print(pca(xMatrix))

The output of the listing is shown as follows:

[array([  4.15949494e-01,   2.08390564e-01,   1.90929279e-01,
8.36438537e-02, 5.55494241e-02, 2.46047471e-02,
2.09326427e-02, 3.57540098e-16], dtype=float32)]