CS5590 Assignment 3 Foundations of Machine Learning

Questions: Theory
1. Neural Networks: (2+2=4 marks)
(a) The XOR function (exclusive or) returns true exactly when one of its arguments is true
and the other is false; otherwise, it returns false. Show that a two-layer perceptron (a
perceptron with one hidden layer) can solve the XOR problem. Submit a figure and a
network diagram (with associated weights). (One candidate network is checked numerically in a sketch after part (b).)
(b) x, y, and z are inputs with values -2, 5, and -4, respectively. You have a neuron q and
a neuron f computing:
q = x − y
f = q ∗ z
Show the graphical representation (computational graph), and compute the gradient of f with respect to x, y,
and z.
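For part (a), one commonly cited construction uses a hidden OR unit and a hidden AND unit with step activations, with the output firing when OR is on and AND is off. The sketch below is only a numerical check of that particular choice; the weights and thresholds shown are one valid option (an assumption for illustration), not the required answer.

def step(z):
    # Heaviside step activation: 1 if the pre-activation is positive, else 0
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hidden layer: h1 behaves like OR, h2 like AND (one valid choice of weights/thresholds)
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)
    h2 = step(1.0 * x1 + 1.0 * x2 - 1.5)
    # Output: fires when h1 is on and h2 is off, i.e. exactly one input is 1
    return step(1.0 * h1 - 1.0 * h2 - 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_net(x1, x2))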
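For part (b), the chain rule gives the gradients directly from the computational graph x, y -> q -> f. The sketch below computes them and cross-checks against central finite differences; the helper name f_of and the step size eps are illustrative choices, not part of the question.

def f_of(x, y, z):
    q = x - y          # intermediate node q
    return q * z       # output node f

x, y, z = -2.0, 5.0, -4.0

# Backward pass via the chain rule
q = x - y
df_dq, df_dz = z, q            # f = q * z
df_dx = df_dq * 1.0            # dq/dx = +1
df_dy = df_dq * (-1.0)         # dq/dy = -1

# Central finite-difference check of each partial derivative
eps = 1e-6
for name, analytic, numeric in [
    ("df/dx", df_dx, (f_of(x + eps, y, z) - f_of(x - eps, y, z)) / (2 * eps)),
    ("df/dy", df_dy, (f_of(x, y + eps, z) - f_of(x, y - eps, z)) / (2 * eps)),
    ("df/dz", df_dz, (f_of(x, y, z + eps) - f_of(x, y, z - eps)) / (2 * eps)),
]:
    print(name, analytic, numeric)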
2. Neural Networks: (4 marks) The extension of the cross-entropy error function for a
multi-class classification problem is given by:
E(w) = -\sum_{n=1}^{N} \sum_{k=1}^{K} t_{kn} \ln y_k(x_n, w)
where K is the number of classes, N is the number of data samples, and t_n is a one-hot
vector which designates the expected output for a data sample x_n (note that a one-hot vector
has a 1 in the correct class's position and zeroes elsewhere, e.g. t_n = [0, 0, 1, 0, ..., 0] for the
ground truth of the 3rd class). The network outputs y_k(x_n, w) = p(t_k = 1 | x_n) are given by
the softmax activation function:

y_k(x, w) = \frac{\exp(a_k(x, w))}{\sum_j \exp(a_j(x, w))}

which satisfies 0 ≤ y_k ≤ 1 and \sum_k y_k = 1, where the a_k are the pre-softmax activations of the
output layer neurons (also called logits). For a given input, show that the derivative of the
above error function with respect to the activation a_k of an output unit with the softmax
activation function is given by:

\frac{\partial E}{\partial a_k} = y_k - t_k
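Before deriving the result, it can be sanity-checked numerically: for random logits and a one-hot target, the finite-difference gradient of the cross-entropy should match y_k - t_k up to rounding error. The sketch below assumes numpy and an arbitrary choice of K = 5.

import numpy as np

rng = np.random.default_rng(0)

def softmax(a):
    e = np.exp(a - a.max())          # shift for numerical stability
    return e / e.sum()

def cross_entropy(a, t):
    return -np.sum(t * np.log(softmax(a)))

K = 5
a = rng.normal(size=K)               # arbitrary logits a_k
t = np.eye(K)[2]                     # one-hot target (class 3)

analytic = softmax(a) - t            # claimed gradient: y_k - t_k

# Central finite differences with respect to each a_k
eps = 1e-6
numeric = np.array([
    (cross_entropy(a + eps * np.eye(K)[k], t)
     - cross_entropy(a - eps * np.eye(K)[k], t)) / (2 * eps)
    for k in range(K)
])

print(np.max(np.abs(analytic - numeric)))   # should be tiny (~1e-9)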
3. Ensemble Methods: (2+2=4 marks) Consider a convex function f(x) = x^2. Show
that the average expected sum-of-squares error

E_{AV} = \frac{1}{M} \sum_{m=1}^{M} \mathbb{E}_x\left[ \left( y_m(x) - f(x) \right)^2 \right]

of the members of an ensemble model and the expected error

E_{ENS} = \mathbb{E}_x\left[ \left( \frac{1}{M} \sum_{m=1}^{M} y_m(x) - f(x) \right)^2 \right]

of the ensemble satisfy:

E_{ENS} \leq E_{AV}
Show further that the above result holds for any error function E(y), not just sum-of-squares,
as long as it is convex in y. (Hint: the only tool you may need is Jensen's inequality.
Read up about it, and use it!)
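A quick numerical illustration (not a proof) of the inequality: build M noisy approximations to f(x) = x^2, then compare the average member error with the error of the averaged prediction. The noise model below (per-member bias and variance) is an arbitrary assumption for illustration.

import numpy as np

rng = np.random.default_rng(0)

f = lambda x: x ** 2                 # target function
M, N = 10, 100_000                   # ensemble size, number of x samples

x = rng.uniform(-1.0, 1.0, size=N)
# Each ensemble member is the target plus its own noise (arbitrary biases/variances)
preds = np.stack([f(x) + rng.normal(loc=0.1 * m, scale=0.5, size=N)
                  for m in range(M)])

E_AV = np.mean((preds - f(x)) ** 2)                   # average of the member errors
E_ENS = np.mean((preds.mean(axis=0) - f(x)) ** 2)     # error of the ensemble average

print(E_ENS, E_AV, E_ENS <= E_AV)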
Questions: Programming
4. Random Forests: (5 + 2.5 + 2.5 = 10 marks)
(a) Write your own random forest classifier (this should be relatively easy, given that you have
written your own decision tree code) to apply to the Spam dataset [data, information].
Use 30% of the provided data as test data and the remainder for training. Compare
your results in terms of accuracy and time taken with scikit-learn's built-in random
forest classifier; a sketch of the scikit-learn side of this comparison appears after this list.
(Note that you cannot use built-in decision tree functions to implement
your code. You can modify your decision tree code from Assignment 1, or code a new
one, to implement a random forest. You can, however, use sklearn's built-in train_test_split
to divide the data into train and test sets.)
(b) Explore the sensitivity of Random Forests to the parameter m (the number of features
considered when choosing the best split).
(c) Plot the OOB (out-of-bag) error (you will have to find out what this is, and read about it!) and
the test error against a suitably chosen range of values for m. (Use your implementation
of random forest to perform this analysis.)
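For the scikit-learn side of the comparison in (a), and the shape of the m sweep in (b)/(c), a minimal sketch follows. The file name spambase.data, the header-less CSV layout (57 feature columns plus a 0/1 label), and the particular values of m are assumptions to adapt to the provided data; for part (c) the same sweep would be repeated with your own implementation.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumed layout: 57 feature columns followed by the 0/1 spam label, no header row
data = pd.read_csv("spambase.data", header=None)
X, y = data.iloc[:, :-1].values, data.iloc[:, -1].values
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for m in [1, 2, 4, 8, 16, 32, 57]:   # candidate values of m (features tried per split)
    rf = RandomForestClassifier(n_estimators=100, max_features=m,
                                oob_score=True, random_state=0, n_jobs=-1)
    rf.fit(X_tr, y_tr)
    oob_error = 1.0 - rf.oob_score_          # out-of-bag error estimate
    test_error = 1.0 - rf.score(X_te, y_te)  # held-out test error
    print(m, round(oob_error, 4), round(test_error, 4))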
Deliverables:
• Code
• Brief report (PDF) with your solutions for the above questions
5. Gradient Boosting: (3 + 5 = 8 marks) In this question, we will explore the use of
pre-processing methods and Gradient Boosting on the popular Lending Club dataset. You
are provided with two files: loan_train.csv and loan_test.csv. The dataset is almost as
provided by the original source, and you may have to make the necessary changes to make
it suitable for applying ML algorithms. (If required, you can further split loan_train.csv
to obtain a validation set for model selection.) Your task is to pre-process the data appropriately and then apply gradient boosting to classify whether a customer should be given
a loan or not. The target attribute is in the column loan_status, which has the values "Fully
Paid" (to which you can assign +1) and "Charged off" (to which you can assign -1). The
other records, with loan_status value "Current" (in both train and test), are not relevant
to this problem. You can see this link to learn more about the different attributes of the
dataset (but please use the provided data; there are several versions of the dataset online).
Your tasks are to do the following:
(a) Pre-process the data as needed to apply the classifier to the training data (you are free
to use pandas or other relevant libraries; note that the test data should not be used for
fitting the pre-processing in any way, but the same pre-processing steps can then be applied to the test data).
Some steps to consider (a minimal pandas sketch follows this list):
• Check for missing values, and how you want to handle them (you can delete the
records, or replace the missing value with mean/median of the attribute – this is
a decision you must make. Please document your decisions/choices in the final
submitted report.)
• Check whether you really need all the provided attributes, and choose the necessary
attributes. (You can employ feature selection methods if you are familiar with them;
if not, you can eyeball the data.)
• Transform categorical data into binary features, and convert any other relevant columns to
suitable datatypes
• Any other steps that help you perform better
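A minimal pandas sketch of the kind of pre-processing described above. The file name loan_train.csv, the column name loan_status, the exact class strings, and the 50% missing-value threshold are assumptions to adapt to the provided files; whatever choices you fit here (medians, modes, dummy columns) should be reused unchanged on the test file.

import pandas as pd

train = pd.read_csv("loan_train.csv")            # assumed file name

# Keep only the two relevant classes and map them to +1 / -1
# (match the exact strings used in the provided CSV)
mask = train["loan_status"].isin(["Fully Paid", "Charged off"])
train = train.loc[mask]
y = train["loan_status"].map({"Fully Paid": 1, "Charged off": -1})
X = train.drop(columns=["loan_status"]).copy()

# Drop attributes that are mostly missing, then impute the rest
X = X.loc[:, X.isna().mean() < 0.5].copy()
num_cols = X.select_dtypes(include="number").columns
cat_cols = X.columns.difference(num_cols)
X[num_cols] = X[num_cols].fillna(X[num_cols].median())
X[cat_cols] = X[cat_cols].fillna(X[cat_cols].mode().iloc[0])

# One-hot encode the categorical attributes
X = pd.get_dummies(X, columns=list(cat_cols))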
(b) Apply gradient boosting using sklearn.ensemble.GradientBoostingClassifier
to train the model. You will need to import sklearn, sklearn.ensemble, and
numpy. Your effort will be focused on predicting whether or not a loan is likely to
default. (A sketch appears after the list below.)
• Get the best test accuracy you can, and show what hyperparameters led to this
accuracy. Report the precision and recall for each of the models that you built.
• In particular, study the effect of increasing the number of trees in the classifier.
• Compare your final best performance (accuracy, precision, recall) against a simple
decision tree built using information gain. (You can use sklearn’s inbuilt decision
tree function for this.)
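A minimal sketch of the training and comparison loop, assuming X_train, y_train, X_test, y_test come from the pre-processing step above (with the test file transformed by the same steps and its columns aligned to the training features); the hyperparameter values are placeholders to tune.

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score

# X_train, y_train, X_test, y_test are assumed to exist from the pre-processing step
for n_trees in [50, 100, 200, 400]:          # study the effect of the number of trees
    gb = GradientBoostingClassifier(n_estimators=n_trees, learning_rate=0.1,
                                    max_depth=3, random_state=0)
    gb.fit(X_train, y_train)
    pred = gb.predict(X_test)
    print(n_trees,
          accuracy_score(y_test, pred),
          precision_score(y_test, pred),
          recall_score(y_test, pred))

# Baseline for comparison: a single decision tree grown with information gain (entropy)
dt = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X_train, y_train)
pred = dt.predict(X_test)
print("decision tree",
      accuracy_score(y_test, pred),
      precision_score(y_test, pred),
      recall_score(y_test, pred))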
Deliverables:
• Code
• Brief report (PDF) with your solutions for the above questions