Description
Tasks
In class we talked about how to reduce data dimensionality by extracting a new set of features
using PCA, LDA and other methods. The basis of these methods is the eigendecomposition of a
matrix derived from the data (for PCA, the covariance matrix).
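As a reminder, the PCA case can be written out as follows. This is a sketch under assumed notation that does not appear in the task text: X is the n × d data matrix with column means subtracted, C is its sample covariance matrix, and v_i, λ_i are the eigenvectors and eigenvalues of C.

```latex
C = \frac{1}{n-1} X^{\top} X, \qquad
C v_i = \lambda_i v_i, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0

% Projection of the data onto the k leading eigenvectors
Z = X \, [\, v_1 \; v_2 \; \cdots \; v_k \,]
```

The variance of the data along v_i equals λ_i, which is why the sorted eigenvalues indicate how many dimensions are worth keeping.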
• Load the crime dataset and store it as a matrix. (The data is already normalized, so you should
not need to normalize it further.)
• Compute the eigenvectors and eigenvalues (you can use the mathematical formulation or call
a library in your chosen environment).
• Report a table with the top 20 eigenvalues. Is there a clear point where you could cut off the
dimensions?
• For fun (on your own): Plot all the data points in a 2D scatterplot by projecting them onto
the two eigenvectors with the highest eigenvalues. Colour the points by some dimension of
the original data (e.g. PopDens or medIncome) to see what patterns arise.
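The tasks above could be approached roughly as in the sketch below, which uses NumPy. It is one possible route, not the required one. The random matrix X is a stand-in for the crime dataset (its file name and shape are not given here, so the 200 × 30 shape is hypothetical), and the choice to take the eigendecomposition of the covariance matrix is an assumption based on the PCA setting.

```python
import numpy as np

# Stand-in for the crime dataset: random numbers, NOT the real data.
# Rows are data points, columns are features (hypothetical shape).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))

# Subtract the column means so variances are measured about the mean.
Xc = X - X.mean(axis=0)

# Sample covariance matrix of the features (30 x 30 here).
C = np.cov(Xc, rowvar=False)

# eigh is intended for symmetric matrices and returns the
# eigenvalues in ascending order, with eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(C)

# Reorder so that the largest eigenvalue comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Table of the top 20 eigenvalues and their share of total variance.
top = eigvals[:20]
share = top / eigvals.sum()
print("rank  eigenvalue  share")
for rank, (val, frac) in enumerate(zip(top, share), start=1):
    print(f"{rank:4d}  {val:10.4f}  {frac:.4f}")

# Project every data point onto the two leading eigenvectors.
Z = Xc @ eigvecs[:, :2]
```

A sharp drop between consecutive eigenvalues in the printed table is a natural candidate for the cut-off. For the scatterplot, the two columns of Z can be passed to a plotting call such as matplotlib's `plt.scatter(Z[:, 0], Z[:, 1], c=X[:, j])`, where j is the index of the column chosen for colouring (for example the PopDens column in the real data).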