Description
In this lab assignment, you will experiment with ensemble classifiers.
You may use a machine learning package to perform the experiments. For example, the "caret" package
in R (https://topepo.github.io/caret/available-models.html) has all the algorithms used in this lab.
1 Dataset
• We will use the Blood Transfusion Service Center Data Set from the UC Irvine Machine Learning
Repository:
https://archive.ics.uci.edu/ml/datasets/Blood+Transfusion+Service+Center
We have split the above data into training (train.csv) and testing (test.csv) sets. In the experiments, please
train your models on the training data, and report their performance on the test data.
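As a starting point, the two files could be read with a small helper like the one below. This is only a sketch: the function name load_split is hypothetical, and it assumes that each file has a header row, that every value is numeric, and that the class label is the last column. Check the actual layout of train.csv and test.csv before relying on it.

```python
import csv


def load_split(path):
    """Read one CSV split and return (features, labels).

    Assumptions (not confirmed by the handout): the file has a header row,
    all values are numeric, and the class label is the last column.
    """
    features, labels = [], []
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row
        for row in reader:
            if not row:  # ignore blank lines
                continue
            features.append([float(value) for value in row[:-1]])
            labels.append(int(float(row[-1])))
    return features, labels
```

It would be called once per file, for example load_split("train.csv") for fitting and load_split("test.csv") for the reported results only.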
2 Tasks
2.1 Task 1
• Train two ensemble models: Random Forest (RF) and AdaBoost.M1 with decision stumps as base
classifiers. (You may use another version of AdaBoost if the machine learning package you choose
does not provide this particular version.) Experiment with different values of hyper-parameters, such as
the number of base classifiers.
• Report your experimental results. For the best models you have learned, report the corresponding
hyper-parameters and their performance on the test data, including the overall classification accuracy
and the confusion matrix. Discuss the results.
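The two metrics requested above can be computed directly from the true and predicted test labels. The following is a minimal sketch, assuming binary labels coded as 0 and 1; the function names and the example labels are illustrative and not taken from the dataset. Most machine learning packages provide equivalent functions.

```python
def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Return a matrix whose rows are true classes and columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Return the fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels, for illustration only.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 0]
print(confusion_matrix(y_true, y_pred))  # [[2, 1], [1, 1]]
print(accuracy(y_true, y_pred))          # 0.6
```

Because the classes in this kind of data may be imbalanced, the confusion matrix can show whether a model with high accuracy is simply predicting the majority class.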
2.2 Task 2
• Train 5 individual models: Neural Network (NN), KNN, Logistic Regression (LR), Naive Bayes (NB),
Decision Tree (DT). Report the confusion matrix and classification accuracy on the test data for each
of them. When training the 5 models, tune the hyper-parameters lightly to get good accuracy
(an extensive grid search is not required). Report the experiments you have done.
• Construct an ensemble classifier using unweighted majority vote over the 5 models you have trained.
Report the performance on the test data.
• Construct an ensemble classifier using weighted majority vote over the 5 models you have trained.
Report the performance on the test data. You may use weights proportional to the classification
accuracy, tune the weights as hyper-parameters, or use stacking (not required). Report the experiments
you have done.
• Discuss the results.
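The two voting schemes above can be sketched with a single function, where the unweighted vote is the special case in which every model has the same weight. This is an illustration under stated assumptions, not a required implementation: the function name weighted_vote, the tie-breaking rule, the example predictions, and the example weights are all made up.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Combine per-model label lists into one label per sample.

    predictions[m][i] is the label that model m assigns to sample i.
    weights holds one non-negative weight per model; None gives every
    model the same weight (unweighted majority vote).
    Ties go to the smallest label, which is an arbitrary choice.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    combined = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        totals = defaultdict(float)
        for label, weight in zip(votes, weights):
            totals[label] += weight
        best = max(totals.values())
        combined.append(min(label for label, total in totals.items() if total == best))
    return combined


# Made-up predictions of five models on four test samples.
predictions = [
    [1, 0, 1, 0],  # NN
    [1, 0, 0, 0],  # KNN
    [0, 0, 1, 1],  # LR
    [0, 1, 1, 1],  # NB
    [1, 1, 0, 1],  # DT
]
# Hypothetical accuracies used as weights.
weights = [0.9, 0.9, 0.5, 0.5, 0.5]

print(weighted_vote(predictions))           # [1, 0, 1, 1]
print(weighted_vote(predictions, weights))  # [1, 0, 1, 0]
```

In this example the two schemes disagree on the last sample: three weaker models outvote two stronger ones in the unweighted case, while the weights reverse that outcome. If the weights are derived from accuracy, that accuracy is better estimated on held-out training data than on the test set.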
2.3 Task 3
• Construct ensemble classifiers (unweighted and weighted) as in Task 2, but use all seven models (the 5
individual models from Task 2 plus the RF and AdaBoost models from Task 1).
• Discuss the results.
3 What to turn in
Turn in the following via Canvas as a single compressed .zip file:
• A lab report (as a PDF file) with your experimental results and discussions of these results.
• All of the commented source code that you have written.
• A Readme file with instructions on how to reproduce your experiments.