Description


Before working on this project, you must go through the R handout on fitting feedforward neural
networks.
1. Consider the MNIST dataset from the keras package. It contains a training set of 60,000 28×28 grayscale
images of handwritten digits (0 to 9), along with a test set of 10,000 images. We would like
to build a feedforward neural network model to identify the digit in each image. This is a multiclass
classification problem with 10 output classes. For fitting the models below, use ReLU activation for
the hidden layers, softmax activation for the output layer, and minibatches of size 128.
(a) Fit a neural network model with 1 hidden layer with 512 hidden units and 5 epochs. Report its
training and test errors.
(b) Repeat (a) with 1 hidden layer with 512 hidden units and 10 epochs.
(c) Repeat (a) with 1 hidden layer with 256 hidden units and 5 epochs.
(d) Repeat (a) with 1 hidden layer with 256 hidden units and 10 epochs.
(e) Repeat (a) with 2 hidden layers, each with 512 hidden units, and 5 epochs.
(f) Repeat (a) with 2 hidden layers, each with 512 hidden units, and 10 epochs.
(g) Repeat (a) with 2 hidden layers, each with 256 hidden units, and 5 epochs.
(h) Repeat (a) with 2 hidden layers, each with 256 hidden units, and 10 epochs.
(i) Repeat (a) with L2 weight regularization with λ = 0.001.
(j) Repeat (a) with 50% dropout.
(k) Make a tabular summary of the results from all the above models and compare them. Which
model would you recommend?
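
As a rough aid for the comparison in (k), the sketch below enumerates the eight architectures from parts (a) to (h) and counts their trainable parameters and gradient updates. It assumes each 28×28 image is flattened to 784 inputs and that every layer is fully connected with a bias term. It is plain Python for illustration only (the project itself is meant to be done in R with keras), and the helper name count_dense_params is hypothetical. It does not fit any model; the training and test errors for the table still have to come from the fitted networks, where the error is one minus the classification accuracy.

```python
import math
from itertools import product

N_INPUTS = 28 * 28      # assumption: each image is flattened to 784 inputs
N_CLASSES = 10
N_TRAIN = 60_000
BATCH_SIZE = 128


def count_dense_params(n_inputs, hidden_units, n_outputs):
    """Count weights plus biases in a fully connected network (hypothetical helper)."""
    sizes = [n_inputs, *hidden_units, n_outputs]
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


# The last minibatch of an epoch is smaller than 128, so round up.
steps_per_epoch = math.ceil(N_TRAIN / BATCH_SIZE)

# product() varies its last factor fastest, which reproduces the order (a)..(h).
configs = []
grid = product([1, 2], [512, 256], [5, 10])
for label, (n_layers, units, epochs) in zip("abcdefgh", grid):
    configs.append({
        "part": label,
        "layers": n_layers,
        "units": units,
        "epochs": epochs,
        "params": count_dense_params(N_INPUTS, [units] * n_layers, N_CLASSES),
        "updates": steps_per_epoch * epochs,
    })

print("part layers units epochs  params updates")
for c in configs:
    print(f"({c['part']})  {c['layers']:>6} {c['units']:>5} {c['epochs']:>6} "
          f"{c['params']:>7} {c['updates']:>7}")
```

Parts (i) and (j) use the same architecture as (a): L2 weight regularization and dropout change how the weights are trained, not how many there are, so their parameter count matches the row for (a).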
2. Consider the Boston Housing Price dataset from the keras package. It contains median home prices
for Boston suburbs in the mid-1970s, together with 13 numerical neighborhood characteristics. This
relatively small dataset has 506 examples, split between a training set of size 404 and a test set of
size 102. We would like to build a feedforward neural network model to predict the median home
price based on the neighborhood features. Since the features are on different scales, they need to
be standardized before fitting any model. Use the mean and standard deviation from the training
data to standardize features in both training and test sets before doing any analysis. For fitting the
models below, use ReLU activation for the hidden layers, no activation for the output layer, and
minibatches of size 16. In addition, as described in the handout, use mean absolute error (MAE)
computed using 4-fold CV as the performance measure.
(a) Fit a neural network model with 2 hidden layers, each with 64 hidden units, and 200 epochs.
Make a plot of validation MAE against epoch. Would you recommend early stopping based on
this plot? How many epochs would you suggest? Fit a model with the suggested number of
epochs. Report its validation MAE. Use this suggested number of epochs for all the models
below.
(b) Fit a neural network model with 1 hidden layer with 128 units. Report its validation MAE.
(c) Add L2 weight regularization to the model with 2 hidden layers, each with 64 hidden units.
Report its validation MAE.
(d) Add L2 weight regularization to the model with 1 hidden layer with 128 hidden units. Report
its validation MAE.
(e) Compare the above models. Which model would you recommend? Compute MAE of the
recommended model from the test data. Comment on the results.
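
The data-handling steps in this problem (standardizing with training statistics, splitting into 4 folds, scoring with MAE, and choosing a number of epochs from the averaged validation curve) can be sketched as follows. This is a minimal illustration in plain Python on made-up toy numbers, not the handout's R code. All function names are hypothetical, the folds are assumed to be contiguous blocks, and the sample standard deviation is used on the assumption that it should match R's sd().

```python
import statistics


def standardize(train_rows, test_rows):
    """Scale every column using the TRAINING mean and standard deviation only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]  # n - 1 denominator, like R's sd()

    def scale(rows):
        return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

    return scale(train_rows), scale(test_rows)


def k_fold_indices(n_samples, k):
    """Return k (train_idx, val_idx) pairs; the last fold absorbs any remainder."""
    fold_size = n_samples // k
    folds = []
    for i in range(k):
        start = i * fold_size
        stop = (i + 1) * fold_size if i < k - 1 else n_samples
        val_idx = list(range(start, stop))
        train_idx = list(range(start)) + list(range(stop, n_samples))
        folds.append((train_idx, val_idx))
    return folds


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def best_epoch(per_fold_mae):
    """Average the per-epoch validation MAE over folds; return (1-based epoch, averages)."""
    n_epochs = len(per_fold_mae[0])
    averages = [statistics.fmean([fold[e] for fold in per_fold_mae])
                for e in range(n_epochs)]
    return min(range(n_epochs), key=averages.__getitem__) + 1, averages


# Toy feature rows (made up) to show that test data reuses the training statistics.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
test = [[2.0, 30.0]]
train_scaled, test_scaled = standardize(train, test)

# 404 training examples split into 4 folds of 101 validation examples each.
folds = k_fold_indices(404, 4)
print([len(val_idx) for _, val_idx in folds])

# Toy validation curves (made up): 2 folds, 3 epochs each.
toy_histories = [[3.0, 2.5, 2.6], [3.2, 2.4, 2.7]]
epoch, averages = best_epoch(toy_histories)
print(epoch, averages)
```

In the actual analysis the per-fold curves would come from the validation MAE recorded at each of the 200 epochs, and the plot in (a) is of the fold-averaged curve. Taking the strict minimum is only one possible rule; a noisy curve may call for smoothing before picking the epoch. The sketch also assumes no feature has zero standard deviation in the training set.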

STAT 6340 (Statistical and Machine Learning) Bonus Project
