Description
In this homework, you will implement three nonparametric regression algorithms in R, Matlab,
or Python. Here are the steps you need to follow:
1. Read Section 8.8 from the textbook.
2. You are given a univariate regression data set, which contains 133 data points, in the file
named hw04_data_set.csv. Divide the data set into two parts by assigning the first
100 data points to the training set and the remaining 33 data points to the test set.
3. Learn a regressogram by setting the bin width parameter to 3 and the origin parameter to
0. Draw training data points, test data points, and your regressogram in the same figure.
Your figure should be similar to the following figure.
4. Calculate the root mean squared error (RMSE) of your regressogram for test data points.
The formula for RMSE can be written as
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N_{\text{test}}} (y_i - \hat{y}_i)^2}{N_{\text{test}}}}.$$
Your output should be similar to the following sentence.
Regressogram => RMSE is 24.7260 when h is 3
5. Learn a running mean smoother by setting the bin width parameter to 3. Draw training
data points, test data points, and your running mean smoother in the same figure. Your
figure should be similar to the following figure.
[Figure: training and test data points with the fitted curve, h = 3; axes x and y.]
6. Calculate the RMSE of your running mean smoother for test data points. Your output
should be similar to the following sentence.
Running Mean Smoother => RMSE is 23.8403 when h is 3
7. Learn a kernel smoother by setting the bin width parameter to 1. Draw training data
points, test data points, and your kernel smoother in the same figure. Your figure should
be similar to the following figure.
8. Calculate the RMSE of your kernel smoother for test data points. Your output should be
similar to the following sentence.
Kernel Smoother => RMSE is 24.1672 when h is 1
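The three estimators and the RMSE from the steps above can be sketched in a few lines of plain
Python. This is only an illustrative sketch under stated assumptions, not the reference solution:
the bins are taken as half-open intervals, the running mean window is taken to have total width h
centered at x, the kernel is assumed to be Gaussian, and the toy data and all function names are
made up for the example. Check each of these choices against Section 8.8 before relying on them.

```python
import math

NAN = float("nan")


def regressogram(x, x_train, y_train, h, origin=0.0):
    """Average the training targets that fall in the same bin as x."""
    # Assumption: bins are half-open, [origin + m*h, origin + (m+1)*h).
    target_bin = math.floor((x - origin) / h)
    in_bin = [yt for xt, yt in zip(x_train, y_train)
              if math.floor((xt - origin) / h) == target_bin]
    return sum(in_bin) / len(in_bin) if in_bin else NAN


def running_mean_smoother(x, x_train, y_train, h):
    """Average the training targets inside a window centered at x."""
    # Assumption: the window has total width h, i.e. |x - x_t| <= h / 2.
    in_window = [yt for xt, yt in zip(x_train, y_train)
                 if abs(x - xt) <= h / 2]
    return sum(in_window) / len(in_window) if in_window else NAN


def kernel_smoother(x, x_train, y_train, h):
    """Weighted average of all training targets with Gaussian weights."""
    # Assumption: Gaussian kernel. Its 1/sqrt(2*pi) constant cancels in the
    # ratio, so it is left out here.
    weights = [math.exp(-0.5 * ((x - xt) / h) ** 2) for xt in x_train]
    total = sum(weights)
    if total == 0.0:
        return NAN
    return sum(w * yt for w, yt in zip(weights, y_train)) / total


def rmse(y_true, y_pred):
    """Root mean squared error; NaN if any prediction is NaN."""
    squared = [(y - p) ** 2 for y, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data standing in for the training and test sets.
x_train = [0.5, 1.0, 2.5, 3.5, 4.0, 5.5]
y_train = [1.0, 3.0, 2.0, 6.0, 8.0, 4.0]
x_test = [1.5, 4.5]
y_test = [3.0, 5.0]

for name, fit, h in [("Regressogram", regressogram, 3),
                     ("Running Mean Smoother", running_mean_smoother, 3),
                     ("Kernel Smoother", kernel_smoother, 1)]:
    y_pred = [fit(x, x_train, y_train, h) for x in x_test]
    print(f"{name} => RMSE is {rmse(y_test, y_pred):.4f} when h is {h}")
```

With the real data, x_train and y_train would hold the first 100 rows of hw04_data_set.csv and
x_test and y_test the remaining 33. To draw a fitted curve, the same functions can be evaluated
on a fine grid of x values. An estimator returns NaN where no training point contributes, so
such points need separate handling before computing the RMSE.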
[Figure: training and test data points with the fitted curve, h = 3; axes x and y.]
[Figure: training and test data points with the fitted curve, h = 1; axes x and y.]
What to submit: You need to submit your source code in a single file (.R file if you are using R,
.m file if you are using Matlab, or .py file if you are using Python) and a short report explaining
your approach (.doc, .docx, or .pdf file). Put these two files in a single zip file named
STUDENTID.zip, where STUDENTID should be replaced with your 7-digit student number.
How to submit: E-mail the zip file you created to cak14@ku.edu.tr with the subject line
Intro2MachineLearningHW04. Use this subject line exactly, and do not send a zip file literally
named STUDENTID.zip; replace STUDENTID with your own student number. Submissions that do not
follow these guidelines will not be graded.
Late submission policy: Late submissions will not be graded.
Cheating policy: Very similar submissions will not be graded.