Homework # 7. AMS 597

Penalized Regression with the Boston Housing Data

This dataset contains information collected by the U.S. Census Service concerning housing in the area of Boston, Mass. It was obtained from the StatLib archive (http://lib.stat.cmu.edu/datasets/boston) and has been used extensively throughout the literature to benchmark algorithms. The goal is to model the variable MEDV using the other 13 variables.

There are 14 attributes (variables) in the data set:

CRIM – per capita crime rate by town

ZN – proportion of residential land zoned for lots over 25,000 sq.ft.

INDUS – proportion of non-retail business acres per town.

CHAS – Charles River dummy variable (1 if tract bounds river; 0 otherwise)

NOX – nitric oxides concentration (parts per 10 million)

RM – average number of rooms per dwelling

AGE – proportion of owner-occupied units built prior to 1940

DIS – weighted distances to five Boston employment centres

RAD – index of accessibility to radial highways

TAX – full-value property-tax rate per $10,000

PTRATIO – pupil-teacher ratio by town

B – 1000(Bk – 0.63)^2 where Bk is the proportion of blacks by town

LSTAT – % lower status of the population

MEDV – Median value of owner-occupied homes in $1000’s
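
As a rough illustration of how the 14 attributes above map onto predictors and response, the following Python sketch parses whitespace-separated values into named records and separates MEDV from the other 13 variables. The function names `parse_boston` and `split_xy` are hypothetical, and the default of 22 header lines is an assumption about the layout of the StatLib file; adjust it if the downloaded file differs.

```python
# Column names in the order listed above; MEDV (last) is the response.
COLUMNS = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
           "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV"]


def parse_boston(text, header_lines=22):
    """Parse whitespace-separated numbers into a list of dict records.

    header_lines=22 is an assumption about the StatLib file; the parser
    reads tokens in groups of 14, so it does not matter whether a record
    is wrapped across several physical lines.
    """
    body = text.splitlines()[header_lines:]
    tokens = [float(tok) for line in body for tok in line.split()]
    k = len(COLUMNS)
    if len(tokens) % k != 0:
        raise ValueError("token count is not a multiple of 14")
    return [dict(zip(COLUMNS, tokens[i:i + k]))
            for i in range(0, len(tokens), k)]


def split_xy(records):
    """Return (X, y): the 13 predictors per record and the MEDV response."""
    X = [[r[c] for c in COLUMNS[:-1]] for r in records]
    y = [r["MEDV"] for r in records]
    return X, y
```

Reading tokens in groups of 14 rather than line by line is a deliberate choice here, since it keeps the sketch independent of how the file wraps each record.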

  1. Please use the random seed 123 to divide the data into 75% training and 25% testing.

  2. Please find the best Ridge Regression model using the training data. Please (a) find the best λ value through cross-validation and display this value; (b) display the coefficients of the fitted model; and (c) make predictions on the testing data, and report the RMSE and the Coefficient of Determination (R²).

  3. Please find the best LASSO model using the training data. Please (a) find the best λ value through cross-validation and display this value; (b) display the coefficients of the fitted model; and (c) make predictions on the testing data, and report the RMSE and the Coefficient of Determination (R²).

  4. Please find the best Elastic Net model using the training data. Please (a) find the best tuning parameter values through cross-validation and display these values; (b) display the coefficients of the fitted model; and (c) make predictions on the testing data, and report the RMSE and the Coefficient of Determination (R²).
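
The seeded 75/25 split and the two test-set metrics requested in the tasks above can be sketched in plain Python as follows. This is a minimal illustration, not the assignment's solution: the function names are hypothetical, the coefficient of determination is taken as 1 − SS_res/SS_tot on the test data (one common definition), and Python's generator seeded with 123 will not reproduce the split that the same seed gives in R or another library. The penalized fits themselves are left to a library such as glmnet in R or scikit-learn's `RidgeCV`, `LassoCV`, and `ElasticNetCV`.

```python
import math
import random


def train_test_indices(n, test_fraction=0.25, seed=123):
    """Shuffle row indices with a fixed seed and split them train/test."""
    rng = random.Random(seed)          # local generator, reproducible
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(n * test_fraction)    # 25% of rows, rounded down
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def rmse(y_true, y_pred):
    """Root mean squared error of the predictions."""
    sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Calling `train_test_indices(len(y))` once and reusing the same index lists for the Ridge, LASSO, and Elastic Net fits keeps the three test-set comparisons on identical data.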