Description
Linear Regression
1.1 In LinearRegression\linearRegression.py, fill in the missing lines in the Python code to estimate β using (1) the closed-form solution, (2) batch gradient descent, and (3) stochastic gradient descent (see the sketch after this task).
Report the learned weights and MSE (Mean Square Error) on the test dataset for each version. Are they the same, and why?
Apply z-score normalization to each feature x and report whether the normalization affects β and the MSE (Mean Square Error) on the test dataset, for all three versions of the algorithm, and explain why.
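For orientation, here is a minimal sketch of the three estimators plus z-score normalization, assuming a design matrix X of shape (n, p+1) whose first column is all ones and a target vector y of shape (n,). The function names, learning rates, and epoch counts are illustrative assumptions, not the ones used in linearRegression.py.

```python
import numpy as np

def closed_form(X, y):
    # Normal equations: beta = (X^T X)^{-1} X^T y.
    # Solving the linear system is preferable to explicitly inverting X^T X.
    return np.linalg.solve(X.T @ X, X.T @ y)

def batch_gd(X, y, lr=0.01, epochs=1000):
    # Gradient of J(beta) = (1/2n) * sum_i (x_i^T beta - y_i)^2
    # is (1/n) * X^T (X beta - y); one full-batch step per epoch.
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(epochs):
        beta -= lr * X.T @ (X @ beta - y) / n
    return beta

def sgd(X, y, lr=0.01, epochs=50, seed=0):
    # One-sample updates, visiting the data in a fresh random order each epoch.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            beta -= lr * (X[i] @ beta - y[i]) * X[i]
    return beta

def zscore(X_train, X_test):
    # Standardize each feature with training-set statistics only,
    # leaving the intercept column (assumed to be column 0) untouched.
    mu = X_train[:, 1:].mean(axis=0)
    sigma = X_train[:, 1:].std(axis=0)
    Xtr, Xte = X_train.copy(), X_test.copy()
    Xtr[:, 1:] = (Xtr[:, 1:] - mu) / sigma
    Xte[:, 1:] = (Xte[:, 1:] - mu) / sigma
    return Xtr, Xte
```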
1.2 Ridge regression adds an $\ell_2$ regularization term to the original mean square error cost function of linear regression: $J(\beta)=\frac{1}{2n}\sum_i (x_i^T\beta-y_i)^2+\frac{\lambda}{2n}\sum_j \beta_j^2$, where $\lambda \ge 0$ trades off the two terms. Please derive the closed-form solution for β.
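As a cross-check for your own derivation, one standard route writes the cost in matrix form and sets the gradient to zero ($I$ is the identity matrix; for $\lambda > 0$, $X^T X + \lambda I$ is positive definite and hence invertible):

$$\nabla_\beta J(\beta) = \frac{1}{n} X^T (X\beta - y) + \frac{\lambda}{n}\beta = 0 \quad\Longrightarrow\quad \hat{\beta} = (X^T X + \lambda I)^{-1} X^T y$$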
Logistic Regression and Model Selection
2.1 In LogisticRegression\LogisticRegression.py, fill in the missing lines in the Python code to estimate β using (1) batch gradient descent and (2) the Newton-Raphson method (see the sketch after this task).
Report the learned weights and classification accuracy on the test dataset for each version. Are they the same, and why? Discuss the pros and cons of the two methods.
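A minimal sketch of the two update rules, again assuming an (n, p+1) design matrix X with an intercept column and binary labels y in {0, 1}; the step size and iteration counts are illustrative assumptions, not the ones used in LogisticRegression.py:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_gd(X, y, lr=0.1, epochs=1000):
    # Gradient of the average negative log-likelihood:
    # -(1/n) * X^T (y - sigmoid(X beta)).
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(epochs):
        beta -= lr * (-X.T @ (y - sigmoid(X @ beta)) / n)
    return beta

def newton_raphson(X, y, iters=20):
    # Newton step: beta <- beta + (X^T W X)^{-1} X^T (y - p),
    # where p = sigmoid(X beta) and W = diag(p * (1 - p)).
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ beta)
        w = p * (1 - p)                 # diagonal entries of W
        H = X.T @ (X * w[:, None])      # X^T W X without materializing diag(W)
        beta += np.linalg.solve(H, X.T @ (y - p))
    return beta
```

As a starting point for the discussion: Newton-Raphson typically needs far fewer iterations, but each step requires a (p+1)×(p+1) linear solve, whereas gradient descent is cheap per step but needs a tuned learning rate and many more iterations.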
2.2 Similar to linear regression, regularization can be added to logistic regression. Consider the new objective function:
$$J(\beta) = -\frac{1}{n}\sum_i \left( y_i x_i^T \beta - \log\left(1+\exp\{x_i^T \beta\}\right) \right) + \lambda \sum_{j=0}^{p} \beta_j^2$$
where n is the total number of data points in the training dataset and p is the dimensionality of the attributes. Please compute its first derivative $\frac{\partial J(\beta)}{\partial \beta_j}$ and implement a regularized batch gradient descent algorithm accordingly. Discuss how the regularization affects the training loss (in terms of average log-likelihood) and the test accuracy, based on your experimental results.
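To cross-check your derivative: differentiating the objective above gives $\frac{\partial J(\beta)}{\partial \beta_j} = -\frac{1}{n}\sum_i \big(y_i - \sigma(x_i^T\beta)\big)x_{ij} + 2\lambda\beta_j$, where $\sigma$ is the sigmoid function. A minimal sketch of the corresponding update, with illustrative (not prescribed) hyperparameters; note that the penalty sum starts at j = 0, so the intercept is regularized too:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_batch_gd(X, y, lam=0.01, lr=0.1, epochs=1000):
    # Vectorized gradient of J(beta):
    #   -(1/n) * X^T (y - sigmoid(X beta)) + 2 * lam * beta
    # The penalty runs over j = 0..p, so the intercept is regularized as well.
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(epochs):
        grad = -X.T @ (y - sigmoid(X @ beta)) / n + 2 * lam * beta
        beta -= lr * grad
    return beta

def avg_log_likelihood(X, y, beta):
    # Average log-likelihood on a dataset; handy for reporting the
    # training loss as a function of lambda.
    z = X @ beta
    return np.mean(y * z - np.logaddexp(0, z))  # log(1 + e^z), computed stably
```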