Description
Problem 1 – Machine Learning
This problem uses the Iris Plants Database, which contains 3 classes of 50 instances each, where each
class refers to a type of Iris plant. The attributes/features were extended by two features for each
plant instance, for a total of 6 features. The updated data is provided as a CSV file. In this assignment,
data processing and machine learning techniques need to be implemented as follows:
1. Data Cleansing (use the iris data for cleansing.csv)
2. Data Transformation (can be combined with #6 (dimensionality reduction))
3. Generate two sets of features from the original 4 features to end up with a total of 8 features
4. Perform Feature Preprocessing
Use an outlier removal method to remove any outliers.
5. Rank the set of 6 features to determine the top two features
6. Reduce the dimensionality to two features using PCA or Kernel PCA
7. Using the following Machine Learning techniques, classify the three-class Iris data:
(a) Expectation Maximization
(b) Either Fisher Linear Discriminant (Linear Discriminant Analysis), Kernel Fisher’s Discriminant or Parzen Window
(c) Neural Network Method (Probabilistic NN, Radial Basis Function or Feed Forward)
(d) Support Vector Machine
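
As an illustration of steps 4 and 5, the following is a minimal sketch using only the Python standard library. The assignment does not prescribe a particular outlier removal or ranking method, so the choices here (Tukey IQR fences for outliers, a Fisher score for ranking) are assumptions, as are the function names and the toy data, which is not taken from the Iris file.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return the (low, high) Tukey fences for one feature column."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def inlier_indices(rows, k=1.5):
    """Indices of rows whose every feature lies inside its column's fences."""
    columns = list(zip(*rows))
    bounds = [iqr_bounds(col, k) for col in columns]
    return [
        i for i, row in enumerate(rows)
        if all(low <= x <= high for x, (low, high) in zip(row, bounds))
    ]

def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = statistics.fmean(values)
    between = within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        mean_c = statistics.fmean(group)
        between += len(group) * (mean_c - overall) ** 2
        within += sum((v - mean_c) ** 2 for v in group)
    return between / within if within else float("inf")

def rank_features(rows, labels):
    """Feature indices sorted from most to least discriminative."""
    columns = list(zip(*rows))
    scores = [fisher_score(col, labels) for col in columns]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Hypothetical toy data: two features, two classes, one obvious outlier.
rows = [
    (1.0, 5.0), (1.2, 5.2), (0.9, 4.8), (1.1, 5.1),
    (3.0, 4.9), (3.1, 5.3), (2.9, 4.7), (3.2, 5.0),
    (2.0, 50.0),  # second feature far outside the fences
]
labels = ["a"] * 4 + ["b"] * 4 + ["a"]

kept = inlier_indices(rows)              # -> [0, 1, 2, 3, 4, 5, 6, 7]
clean_rows = [rows[i] for i in kept]
clean_labels = [labels[i] for i in kept]
ranking = rank_features(clean_rows, clean_labels)  # -> [0, 1]
print(kept, ranking)
```

The sketch returns row indices rather than filtered rows so that the class labels can be filtered with the same indices. Taking the first two entries of `ranking` would give the two top features asked for in step 5, under the assumed scoring criterion.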