Description
1. [8 points] A cubic regression spline with one knot at $\xi$ can be obtained using a basis of the form $1, x, x^2, x^3, (x - \xi)^3_+$, where $(x - \xi)^3_+ = (x - \xi)^3$ if $x > \xi$ and equals $0$ otherwise. Show that a function of the form
\[
f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 (x - \xi)^3_+
\]
is indeed a cubic regression spline, regardless of the values of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$.
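As a reminder of the definition in play here (a standard characterization, stated for convenience rather than as an extra requirement of the question): $f$ is a cubic spline with a single knot at $\xi$ precisely when there exist cubic polynomials $f_1$ and $f_2$ such that
\[
f(x) =
\begin{cases}
f_1(x), & x \le \xi, \\
f_2(x), & x > \xi,
\end{cases}
\qquad
f_1(\xi) = f_2(\xi), \quad f_1'(\xi) = f_2'(\xi), \quad f_1''(\xi) = f_2''(\xi),
\]
i.e., $f$ is piecewise cubic and continuous up to its second derivative at the knot.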
2. It was mentioned that GAMs are generally fit using a backfitting approach. The idea behind
backfitting is actually quite simple. We will now explore backfitting in the context of multiple
linear regression. Suppose that we would like to perform multiple linear regression, but we do
not have software to do so. Instead, we only have software to perform simple linear regression.
Therefore, we take the following iterative approach: we repeatedly hold all but one coefficient
estimate fixed at its current value, and update only that coefficient estimate using a simple linear
regression. The process is continued until convergence–that is, until the coefficient estimates stop
changing. The process flow is sketched next.
1. Download the adv.dat data set (n = 200), with response Y and two predictors X1 and X2, from BlackBoard.
2. Initialize $\hat\beta_1$ (the estimated coefficient of X1) to take on a value of your choice, say 0.
3. Keeping $\hat\beta_1$ fixed, fit the model
\[
Y - \hat\beta_1 X_1 = \beta_0 + \beta_2 X_2 + e .
\]
4. Keeping $\hat\beta_2$ fixed, fit the model
\[
Y - \hat\beta_2 X_2 = \beta_0 + \beta_1 X_1 + e .
\]
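For concreteness, here is a minimal R sketch of a single pass of steps (3) and (4). The column names Y, X1, X2 and the whitespace-delimited file format are assumptions about adv.dat, not something the text specifies:

```r
# Read the data (assumed: whitespace-delimited with header Y, X1, X2)
adv <- read.table("adv.dat", header = TRUE)

beta1 <- 0  # step (2): initialize the coefficient of X1

# Step (3): with beta1 fixed, regress the partial residual on X2
fit2  <- lm(I(Y - beta1 * X1) ~ X2, data = adv)
beta2 <- coef(fit2)["X2"]

# Step (4): with beta2 fixed, regress the partial residual on X1
fit1  <- lm(I(Y - beta2 * X2) ~ X1, data = adv)
beta1 <- coef(fit1)["X1"]
beta0 <- coef(fit1)["(Intercept)"]
```

Part (a) below asks you to wrap exactly this pass in a for loop.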
(a) [6 points] Write a for loop to repeat (3) and (4) 1,000 times. Report the estimates of $\hat\beta_0$, $\hat\beta_1$, and $\hat\beta_2$ at each iteration of the for loop. Create a plot in which each of these values is displayed, with $\hat\beta_0$, $\hat\beta_1$, and $\hat\beta_2$ each shown in a different color.
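One hedged bookkeeping hint (the matrix name est and the colors are illustrative choices, not prescribed): store the three estimates row-wise, one row per iteration, and plot all columns at once with matplot():

```r
# Assumed: `est` is a 1000 x 3 matrix filled inside the loop,
# with columns (beta0, beta1, beta2) recorded at each iteration
matplot(est, type = "l", lty = 1, col = c("black", "red", "blue"),
        xlab = "Iteration", ylab = "Coefficient estimate")
legend("right", legend = expression(hat(beta)[0], hat(beta)[1], hat(beta)[2]),
       lty = 1, col = c("black", "red", "blue"))
```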
(b) [2 points] Compare your answer in (a) to the results of simply performing multiple linear
regression to predict Y using X1 and X2. Use the abline() function to overlay those multiple
linear regression coefficient estimates on the plot obtained in (a).
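A minimal sketch of the overlay (assuming the same data frame adv as above): abline(h = ...) draws a horizontal line at each supplied value, so the full fit's three coefficients can be added in one call:

```r
# Fit the full multiple linear regression and overlay its coefficient
# estimates as horizontal dashed reference lines on the plot from (a)
fit_full <- lm(Y ~ X1 + X2, data = adv)
abline(h = coef(fit_full), lty = 2, col = "gray40")
```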
(c) [1 point] On this data set, how many backfitting iterations were required in order to obtain a
“good” approximation to the multiple regression coefficient estimates? What would be a good
stopping criterion?
3. [5 points] Show that the Nadaraya-Watson estimator is equal to local constant fitting. Hint: start from the local polynomial cost function and adapt it where necessary.
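For reference, and spelling out the hint (the symbols $\hat m$, $Y_i$, $X_i$ are my notation, not fixed by the text): the Nadaraya-Watson estimator at a point $x$ and the local constant (degree-0 local polynomial) criterion are
\[
\hat m(x) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) Y_i}{\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)},
\qquad
\hat\beta_0(x) = \arg\min_{\beta_0} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) \left(Y_i - \beta_0\right)^2 .
\]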
4. [3 points] Show that the kernel density estimate
\[
\hat f(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right),
\]
with kernel K and bandwidth h > 0, is a bona fide density. Did you need any condition(s) on K? If so, which one(s)?
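Recall what "bona fide density" asks for here: nonnegativity everywhere and total mass one,
\[
\hat f(x) \ge 0 \ \text{for all } x \in \mathbb{R},
\qquad
\int_{-\infty}^{\infty} \hat f(x)\, dx = 1 .
\]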