STAT 3008: Applied Regression Analysis Assignment #2

Problem 1 [25 points]: Suppose a simple linear regression is fitted to the data {(x1, y1), …, (x20, y20)}, with
$$E(Y \mid X = x) = \beta_0 + \beta_1 x, \qquad \mathrm{Var}(Y \mid X = x) = \sigma^2.$$
The coefficient table and ANOVA table below show some of the estimated values:
(a) [14 points] Replicate the two tables above and fill in ALL the missing values (to 5 significant figures).
(The p-values can be obtained from an R command like “> 1-pf(F0, df1, df2)”, which gives the right-hand tail probability of the $F_{df1,\,df2}$ distribution.)
(b) [3 points] Based on the results in part (a), what is the sample correlation coefficient between x and y? That is,
$$r_{xy} = \widehat{\mathrm{Corr}}(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}.$$
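The link between the correlation coefficient and the regression output can be checked numerically. The sketch below uses simulated toy data (an assumption, not the assignment's data) to illustrate that in simple linear regression the sample correlation has the sign of the fitted slope and its square equals R².

```r
# Hypothetical toy data, used only to check the identity (NOT the assignment's data).
set.seed(1)
x <- rnorm(20)
y <- 2 - 1.5 * x + rnorm(20)

fit <- lm(y ~ x)
b1  <- unname(coef(fit)["x"])      # fitted slope
R2  <- summary(fit)$r.squared      # SSreg / SYY

# Sample correlation computed straight from the definition.
r_def <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Same quantity recovered from the regression output.
r_from_fit <- sign(b1) * sqrt(R2)
```

Comparing `r_def`, `r_from_fit` and `cor(x, y)` should show agreement up to rounding error.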
(c) [8 points] Based on the results in part (a), test the hypothesis H0: β0 = -10.0 at α = 0.05.
Set up the 4 steps of hypothesis testing as on Ch2 page 64.
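A two-sided t-test of this kind can be sketched in R as follows. The function name `test_intercept` and the inputs in the example call are hypothetical placeholders, since the actual estimate and standard error come from the completed coefficient table; the course's own four-step layout is not reproduced here.

```r
# Sketch of a two-sided t-test of H0: beta0 = beta0_null in simple linear regression.
# b0_hat and se_b0 are assumed to come from the completed coefficient table.
test_intercept <- function(b0_hat, se_b0, n, beta0_null = -10, alpha = 0.05) {
  df     <- n - 2                                    # residual df in simple regression
  t_stat <- (b0_hat - beta0_null) / se_b0            # test statistic
  t_crit <- qt(1 - alpha / 2, df)                    # two-sided critical value
  p_val  <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
  list(t_stat = t_stat, df = df, t_crit = t_crit, p_value = p_val,
       reject = abs(t_stat) > t_crit)
}

# Placeholder inputs for illustration only; they are NOT values from the assignment.
res <- test_intercept(b0_hat = -8.5, se_b0 = 1.2, n = 20)
```

With n = 20 the reference distribution has 18 degrees of freedom; `res$reject` reports whether H0 would be rejected for the supplied inputs.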
Problem 2 [17 points]: Consider the multiple linear regression
$$Y_{n \times 1} = X_{n \times (p+1)} \, \beta_{(p+1) \times 1} + e_{n \times 1},$$
with $E(e) = 0_{n \times 1}$ and $\mathrm{Var}(e) = \sigma^2 I_n$.
(a) [10 points] Based on the fact that the OLS estimate is $\hat{\beta} = (X'X)^{-1} X'Y$, show that
$$E(\hat{Y}'\hat{Y}) = \beta' X' X \beta + (p + 1)\sigma^2.$$
(b) [7 points] Based on the fact that $E(RSS) = E(\hat{e}'\hat{e}) = (n - p - 1)\sigma^2$ and the result from (a), show that
$$E\Big(\sum_{i=1}^{n} y_i^2\Big) = E\Big(\sum_{i=1}^{n} \hat{y}_i^2\Big) + E\Big(\sum_{i=1}^{n} \hat{e}_i^2\Big).$$
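As a sanity check on the decomposition in part (b), the sketch below verifies on simulated data that the sums of squares add up exactly within a sample, because the fitted values are orthogonal to the residuals. The design matrix, coefficients and sample size are hypothetical choices made only for illustration; this is a numerical check, not the requested proof.

```r
# Hypothetical simulated data (assumption: n = 30, p = 2, unit error variance).
set.seed(2)
n <- 30; p <- 2
X    <- cbind(1, matrix(rnorm(n * p), n, p))   # n x (p+1) design with intercept
beta <- c(1, -2, 0.5)                          # made-up coefficients
y    <- as.numeric(X %*% beta + rnorm(n))

beta_hat <- solve(t(X) %*% X, t(X) %*% y)      # OLS estimate (X'X)^{-1} X'y
y_hat    <- as.numeric(X %*% beta_hat)         # fitted values
e_hat    <- y - y_hat                          # residuals

lhs   <- sum(y^2)                              # Y'Y
rhs   <- sum(y_hat^2) + sum(e_hat^2)           # Yhat'Yhat + ehat'ehat
cross <- sum(y_hat * e_hat)                    # zero up to rounding error
```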
Problem 3 [27 points]: Let Y = (21, 25, 21, 24, 9, 36, 36, 24, 10)’, X1 = (3, 9, 5, 3, -1, 7, 8, 4, 1)’ and X2 = (3, 9, 5, 3, 0, 7, 9, 4, 1)’. Suppose we want to model the response Y by X1, X2 and the intercept using multiple linear regression.
(a) [12 points] Using matrix operations in R (i.e. A%*%B, t(A), solve(A); see Ch3 page 30),
(i) show that $\hat{\beta} = (11.6819, 0.32316, 2.1527)'$,
(ii) compute the values of $\hat{Y}$, $\hat{e}$, SYY, RSS, SSreg, $\hat{\sigma}^2$, $\widehat{\mathrm{Var}}(\hat{\beta})$ and $R^2$.
(Note: In R, a command like “RSS<-t(y)%*%y-t(y)%*%X%*%solve(t(X)%*%X)%*%t(X)%*%y” will assign RSS as a 1×1 matrix object instead of a numeric object. You may want to use the command “as.numeric(RSS)” to convert it back to a scalar quantity.)
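The matrix operations named above can be combined as in the following sketch. The variable names (`XtX_inv`, `sigma2_hat`, `var_beta_hat`, and so on) are illustrative choices, and the formulas used are the standard OLS ones; treat it as one possible layout rather than the expected solution.

```r
# Data from the problem statement.
y  <- c(21, 25, 21, 24, 9, 36, 36, 24, 10)
x1 <- c(3, 9, 5, 3, -1, 7, 8, 4, 1)
x2 <- c(3, 9, 5, 3, 0, 7, 9, 4, 1)
X  <- cbind(1, x1, x2)            # design matrix with an intercept column
n  <- length(y)
k  <- ncol(X)                     # number of coefficients, p + 1 = 3

XtX_inv  <- solve(t(X) %*% X)     # (X'X)^{-1}
beta_hat <- XtX_inv %*% t(X) %*% y
y_hat    <- X %*% beta_hat        # fitted values
e_hat    <- y - y_hat             # residuals

SYY   <- sum((y - mean(y))^2)                 # total sum of squares
RSS   <- as.numeric(t(e_hat) %*% e_hat)       # scalar, not a 1 x 1 matrix
SSreg <- SYY - RSS
sigma2_hat   <- RSS / (n - k)                 # residual mean square
var_beta_hat <- sigma2_hat * XtX_inv          # estimated covariance of beta_hat
R2 <- SSreg / SYY
```

Printing `beta_hat` allows a direct comparison with the vector stated in part (i).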
(b) [5 points] Consider a new data point (x1, x2) = (-1, 1). What is the best point estimator for the response, and a 95% prediction interval for the response?
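A hand-computed prediction interval can be cross-checked against R's built-in `lm()` and `predict()`. This is a sketch using the data from the problem statement; the data frame names `dat` and `new_pt` are illustrative.

```r
# Data from the problem statement, fitted with lm() instead of explicit matrices.
dat <- data.frame(
  y  = c(21, 25, 21, 24, 9, 36, 36, 24, 10),
  x1 = c(3, 9, 5, 3, -1, 7, 8, 4, 1),
  x2 = c(3, 9, 5, 3, 0, 7, 9, 4, 1)
)
fit <- lm(y ~ x1 + x2, data = dat)

# New data point (x1, x2) = (-1, 1).
new_pt <- data.frame(x1 = -1, x2 = 1)

# Returns a 1 x 3 matrix with columns "fit", "lwr" and "upr".
pred <- predict(fit, newdata = new_pt, interval = "prediction", level = 0.95)
```

The `fit` column is the point prediction and `lwr`/`upr` bound the 95% prediction interval, which should match a matrix-based calculation.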
(c) [10 points] The ANOVA table below compares Model 1: E(Y|X) = β0 and Model 2: E(Y|X) = β0 + β2 x2:
Source       df   SS       MS       F0       p-value
Regression   1    516.44   516.44   18.035   0.0038
Residual     7    200.45   28.636
Total        8    716.89
Suppose we want to test the hypotheses
H0: E(Y|X) = β0 + β2 x2 vs H1: E(Y|X) = β0 + β1 x1 + β2 x2.
Based on the ANOVA table and the results from part (a), construct the appropriate ANOVA table. What decision and conclusion can you make from the table?
(The p-value can be obtained from an R command like “> 1-pf(F0, df1, df2)”, which gives the right-hand tail probability of the $F_{df1,\,df2}$ distribution.)
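As an illustration of the p-value command, the sketch below reproduces the p-value of the Model 1 versus Model 2 table above from its F0 and degrees of freedom. The `lower.tail = FALSE` form is an equivalent alternative; the same call should apply to the partial F statistic once its own degrees of freedom are filled in.

```r
# F statistic and degrees of freedom taken from the ANOVA table above.
F0  <- 18.035
df1 <- 1
df2 <- 7

p1 <- 1 - pf(F0, df1, df2)                   # form suggested in the assignment
p2 <- pf(F0, df1, df2, lower.tail = FALSE)   # same upper-tail probability
```

Both `p1` and `p2` should round to the 0.0038 shown in the table.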
Problem 4 [11 points]: Consider data {(ui, vi, yi), i = 1, 2, …, n} with $\bar{u} = \bar{v} = 0$ and $SUV = \sum_{i=1}^{n} u_i v_i = 0$. The data are fitted by a multiple linear regression with mean function
$$E(Y \mid U = u, V = v) = \beta_0 + \beta_1 u + \beta_2 v.$$
(a) [6 points] Show that the OLS estimates are $\hat{\beta}_1 = SUY / SUU$, $\hat{\beta}_2 = SVY / SVV$ and $\hat{\beta}_0 = \bar{y}$.
(b) [5 points] Suppose a simple linear regression $E(Y \mid U = u) = \beta_0 + \beta_1 u$ is fitted to the data. Are the OLS estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ the same as the corresponding estimates in part (a)?
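The setting of this problem can be explored numerically before attempting the algebra. In the sketch below, `u` and `v` are small hypothetical vectors chosen so that both have mean zero and SUV = 0, and `y` is made up; the code compares `lm()` output with the closed-form expressions from part (a). It is a numerical illustration only and does not replace the derivation.

```r
# Hypothetical data satisfying the problem's conditions: mean(u) = mean(v) = 0, sum(u * v) = 0.
u <- c(-2, -1, 0, 1, 2)
v <- c(2, -1, -2, -1, 2)
y <- c(3.1, 0.4, 1.9, 2.2, 6.0)   # made-up responses

stopifnot(sum(u) == 0, sum(v) == 0, sum(u * v) == 0)

coef_mlr <- coef(lm(y ~ u + v))   # multiple regression on u and v
coef_slr <- coef(lm(y ~ u))       # simple regression on u only

# Closed forms from part (a); with centred u and v, SUY = sum(u * y), SUU = sum(u^2), etc.
closed <- c(mean(y), sum(u * y) / sum(u^2), sum(v * y) / sum(v^2))
```

Comparing `coef_mlr`, `coef_slr` and `closed` side by side shows how the estimates relate for this particular data set.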
Problem 5 [20 points]: The kinetic energy of an object (y) is related to its velocity (x) through
$$y = \beta_0 + \beta_1 x^2 + e, \qquad e \sim N(0, \sigma_0^2).$$
Suppose we fit the data {(xi, yi), i = 1, …, n} based on
$$y = \alpha_0 + \alpha_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2).$$
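(a) [5 points] Show that
$$E(\hat{\alpha}) = (X'X)^{-1} X' X_2 \, \beta,$$
with
$$\hat{\alpha} = \begin{pmatrix} \hat{\alpha}_0 \\ \hat{\alpha}_1 \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \quad \text{and} \quad X_2 = \begin{pmatrix} 1 & x_1^2 \\ 1 & x_2^2 \\ \vdots & \vdots \\ 1 & x_n^2 \end{pmatrix}.$$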
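(b) [11 points] Based on the result from part (a),
(i) show that
$$E(\hat{\alpha}_0) = \beta_0 + \beta_1 \, \frac{(\overline{x^2})^2 - \bar{x}\,\overline{x^3}}{\overline{x^2} - (\bar{x})^2},$$
(ii) express $E(\hat{\alpha}_1)$ in terms of $\bar{x}$, $\overline{x^2}$, $\overline{x^3}$, n, β0 and β1,
where $\overline{x^k} = \frac{1}{n}\sum_{i=1}^{n} x_i^k$.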
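The matrix identity from part (a) can be checked numerically. In the sketch below the velocities `x` and the coefficients `beta` are hypothetical values; fitting the straight-line model to the noise-free mean response should reproduce $(X'X)^{-1} X' X_2 \beta$ exactly, since the OLS estimate is linear in y.

```r
# Hypothetical velocities and true coefficients (assumptions for illustration only).
x    <- c(0.5, 1.0, 1.5, 2.0, 3.0)
beta <- c(2, 0.5)

X  <- cbind(1, x)      # design matrix of the fitted (misspecified) model
X2 <- cbind(1, x^2)    # design matrix of the true model

mean_y <- as.numeric(X2 %*% beta)   # E(y) under the true model

# Right-hand side of the identity in part (a).
E_alpha_hat <- solve(t(X) %*% X) %*% t(X) %*% X2 %*% beta

# OLS fit of the straight-line model to the noise-free mean response.
alpha_noise_free <- coef(lm(mean_y ~ x))
```

The two vectors `E_alpha_hat` and `alpha_noise_free` should agree up to rounding error.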
(c) [4 points] Given that
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i = 0, \qquad \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i^2 = \sigma_x^2 \qquad \text{and} \qquad \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} x_i^3 = \kappa_x \sigma_x^3,$$
with $\sigma_x > 0$, express $E(\hat{\alpha}_0)$ and $E(\hat{\alpha}_1)$ in terms of $\beta_0$, $\beta_1$, $\kappa_x$ and $\sigma_x^2$ as $n \to \infty$. Show that (i) $\hat{\alpha}_0$ is NOT a consistent estimator for β0, and (ii) $\hat{\alpha}_1$ is NOT a consistent estimator for β1.
(That is, show that (i) $\lim_{n \to \infty} E(\hat{\alpha}_0) \neq \beta_0$ and (ii) $\lim_{n \to \infty} E(\hat{\alpha}_1) \neq \beta_1$.)
– End of the Assignment –