Description
1 Overview
In this project, you will experiment with linear regression, overfitting, feature “engineering”, and basic matrix operations.
You are given a dataset of 926 examples. Each example has 8 real-valued
predictor attributes x (“regressors”), and a single real-valued dependent value y to be
predicted. Your goal is to use this data to build a function that will predict values y for
new data x. Specifically, we have more data that was generated from the same source. We
will measure how well your predictor does on this new data.
2 Data
You are given a file, traindata.txt. This file contains 926 rows, each with 9 numbers,
meaning it constitutes a 926×9 matrix. The first 8 columns contain the data that you will
be predicting from, while the last column contains the value that you will predict.
As an example, to read this data in MATLAB, you might do:
traindata = importdata('traindata.txt');  % 926x9 numeric matrix
X = traindata(:,1:8);                     % predictor attributes (926x8)
Y = traindata(:,9);                       % values to predict (926x1)
We also provide a file, testinputs.txt, containing 103 rows of 8 numbers each.
These rows have the same format as the first 8 columns of traindata.txt, but the
true output values are withheld. You might read this in with something like:
Xtest = importdata('testinputs.txt');
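Once both files are loaded, one possible starting point is a plain least-squares fit, checked on rows held out from the training data. The sketch below is a baseline under stated assumptions, not the required method: the 20% hold-out fraction, the intercept column, the variable names (trn, val, w, wfull, valmse, Ypred), and the synthetic fallback data used when the files are missing are all choices made here for illustration.

```matlab
% Load the provided files if present; otherwise build a synthetic stand-in
% with the same shapes (an assumption, only so the sketch runs anywhere).
if exist('traindata.txt', 'file') == 2 && exist('testinputs.txt', 'file') == 2
    traindata = importdata('traindata.txt');
    Xtest = importdata('testinputs.txt');
else
    rng(0);                                   % reproducible fake data
    Xall = randn(926 + 103, 8);
    yall = Xall * (1:8)' + 0.1 * randn(926 + 103, 1);
    traindata = [Xall(1:926, :), yall(1:926)];
    Xtest = Xall(927:end, :);
end
X = traindata(:, 1:8);
Y = traindata(:, 9);

% Hold out roughly 20% of the training rows to estimate error on unseen data.
n = size(X, 1);
rng(1);                                       % reproducible split
idx = randperm(n);
nval = round(0.2 * n);
val = idx(1:nval);
trn = idx(nval+1:end);

% Fit ordinary least squares with an intercept (leading column of ones).
w = [ones(numel(trn), 1), X(trn, :)] \ Y(trn);

% Mean-squared error on the held-out rows.
Yval = [ones(nval, 1), X(val, :)] * w;
valmse = mean((Yval - Y(val)).^2);
fprintf('held-out MSE: %g\n', valmse);

% Refit on all training rows and predict the test inputs.
wfull = [ones(n, 1), X] \ Y;
Ypred = [ones(size(Xtest, 1), 1), Xtest] * wfull;
```

Comparing the held-out error with the error on the rows used for fitting gives a rough indication of overfitting when you later add engineered features.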
3 Task
Your goal in this project is to provide a text file with 103 numbers in it, consisting of your
predictions for each of the 103 test inputs. Specifically, we will measure the mean-squared
error of your predictions,
1