## Description

1. [8 points] A cubic regression spline with one knot at $\xi$ can be obtained using a basis of the form $1, x, x^2, x^3, (x-\xi)^3_+$, where $(x-\xi)^3_+ = (x-\xi)^3$ if $x > \xi$ and equals $0$ otherwise. Show that a function of the form

   $$f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 (x - \xi)^3_+$$

   is indeed a cubic regression spline, regardless of the values of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$.
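A useful way to set up the argument (a sketch of the starting point, not a full solution) is to expand the plus-function and write $f$ piecewise around the knot:

```latex
\[
f(x) =
\begin{cases}
\beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3, & x \le \xi,\\[4pt]
(\beta_0 - \beta_4 \xi^3) + (\beta_1 + 3\beta_4 \xi^2)\,x
  + (\beta_2 - 3\beta_4 \xi)\,x^2 + (\beta_3 + \beta_4)\,x^3, & x > \xi .
\end{cases}
\]
```

Each branch is a cubic polynomial, so what remains is to verify that $f$, $f'$, and $f''$ are continuous at $x = \xi$.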

2. It was mentioned that GAMs are generally fit using a backfitting approach. The idea behind backfitting is actually quite simple. We will now explore backfitting in the context of multiple linear regression. Suppose that we would like to perform multiple linear regression, but we do not have software to do so. Instead, we only have software to perform simple linear regression. Therefore, we take the following iterative approach: we repeatedly hold all but one coefficient estimate fixed at its current value, and update only that coefficient estimate using a simple linear regression. The process is continued until convergence, that is, until the coefficient estimates stop changing. The process flow is sketched next.

   1. Download the adv.dat data set ($n = 200$), with response $Y$ and 2 predictors $X_1$ and $X_2$, from BlackBoard.
   2. Initialize $\hat\beta_1$ (the estimated coefficient of $X_1$) to a value of your choice, say 0.
   3. Keeping $\hat\beta_1$ fixed, fit the model
      $$Y - \hat\beta_1 X_1 = \beta_0 + \beta_2 X_2 + e.$$
   4. Keeping $\hat\beta_2$ fixed, fit the model
      $$Y - \hat\beta_2 X_2 = \beta_0 + \beta_1 X_1 + e.$$

   (a) [6 points] Write a for loop to repeat (3) and (4) 1,000 times. Report the estimates of $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$ at each iteration of the for loop. Create a plot in which each of these values is displayed, with $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$ each shown in a different color.

   (b) [2 points] Compare your answer in (a) to the results of simply performing multiple linear regression to predict $Y$ using $X_1$ and $X_2$. Use the `abline()` function to overlay those multiple linear regression coefficient estimates on the plot obtained in (a).


   (c) [1 point] On this data set, how many backfitting iterations were required in order to obtain a “good” approximation to the multiple regression coefficient estimates? What would be a good stopping criterion?
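The `abline()` hint indicates the assignment expects R; purely as a language-agnostic illustration of steps (2)–(4), here is a self-contained Python sketch of the backfitting loop. Since adv.dat itself lives on BlackBoard, the snippet generates synthetic stand-in data, so the coefficient values are illustrative assumptions only.

```python
import numpy as np

# Synthetic stand-in for adv.dat (n = 200, response Y, predictors X1, X2);
# the true coefficients below are assumptions for illustration only.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=n)

def simple_ols(x, y):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

beta1 = 0.0  # step 2: initialize beta1-hat at an arbitrary value
history = []
for _ in range(1000):
    # step 3: keep beta1-hat fixed, regress the partial residual on X2
    beta0, beta2 = simple_ols(x2, y - beta1 * x1)
    # step 4: keep beta2-hat fixed, regress the partial residual on X1
    beta0, beta1 = simple_ols(x1, y - beta2 * x2)
    history.append((beta0, beta1, beta2))

# Final estimates; after convergence these match the full multiple regression.
print(history[-1])
```

Plotting `history` column by column (one color per coefficient) gives exactly the figure part (a) asks for, with the multiple-regression estimates overlaid as horizontal lines for part (b).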

3. [5 points] Show that the Nadaraya-Watson estimator is equal to local constant fitting. Hint: Use the local polynomial cost function to start and adapt where necessary.
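As a concrete starting point for the hint (written in generic notation, which may differ slightly from the course's), local constant fitting at a point $x$ is the degree-0 weighted least-squares problem

```latex
\[
\hat\beta_0(x) \;=\; \arg\min_{\beta_0}\;
\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)\bigl(Y_i - \beta_0\bigr)^2 ,
\]
```

so the exercise amounts to minimizing this criterion in $\beta_0$ and comparing the minimizer with the Nadaraya-Watson weights.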

4. [3 points] Show that the kernel density estimate

   $$\hat f(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right),$$

   with kernel $K$ and bandwidth $h > 0$, is a bona fide density. Did you need any condition(s) on $K$? If so, which one(s)?
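As a numerical sanity check (not a proof), the following Python sketch hand-rolls the estimator with a Gaussian kernel on an assumed synthetic sample; the estimate should be non-negative and integrate to 1 whenever $K$ does:

```python
import numpy as np

# Assumed synthetic sample; any data would do for this check.
rng = np.random.default_rng(1)
data = rng.normal(size=50)
h = 0.4  # bandwidth, chosen arbitrarily

def kde(x, data, h):
    """Kernel density estimate with a Gaussian kernel K."""
    u = (x[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # K >= 0, integrates to 1
    return k.sum(axis=1) / (len(data) * h)

# Trapezoidal rule over a wide grid; the total mass should be numerically 1.
grid = np.linspace(-10.0, 10.0, 20001)
f = kde(grid, data, h)
total = float(np.sum((f[1:] + f[:-1]) * np.diff(grid)) / 2.0)
print(total)  # close to 1
```

Swapping in a kernel that is not a density (for example one that does not integrate to 1) breaks the check, which is a useful pointer toward the condition(s) the question asks about.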
