CS/ECE/ME532 Activity 18


Category: You will Instantly receive a download link for .zip solution file upon Payment


5/5 - (1 vote)

1. A breast cancer gene database has approximately 8000 genes from 100 subjects. The
label yi
is the disease state of the ith subject (+1 if no cancer, -1 if breast cancer). Suppose we build a linear classifier that combines the 8000 genes, say gi
, i = 1, 2, . . . , 100
to predict whether a subject has cancer ˆyi = sign{g
i w}.

Note that here gi and w are
8000-by-1 vectors. You recall from the previous period that the least-squares problem
for finding classifier weights has no unique solution.

Your hypothesis is that a relatively small number of the 8000 genes are predictive of
the cancer state. Identify a regularization strategy consistent with this hypothesis and
justify your choice.

2. Consider the least-squares problem minw ||y − Xw||2
2 where y = 4 and X =

2 1

a) Does this problem have a unique solution? Why or why not?
b) Sketch the contours of the cost function f(w) = ||y−Xw||2
in the w1−w2 plane.

c) Now consider the LASSO minw ||w||1 subject to ||y − Xw||2
2 < 1. Find the
solution using the following steps
i. Repeat your sketch from part b).
ii. Add a sketch of ||w||1 = c
iii. Find the w that satisfies ||y − Xw||2
2 = 1 with the minimum possible value
of ||w||1.

d) Use your insight from the previous part to sketch the set of solutions to the
problem minw ||y − Xw||2
2 + λ||w||1 for 0 < λ < ∞.

3. The script provided has a function that will compute a specified number of iterations
of the proximal gradient descent algorithm for solving the `1-regularized least-squares
||y − Xw||2
2 + λ||w||1

The script will get you started displaying the path taken by the weights in the proximal
gradient descent iteration superimposed on a contour plot of the squared error surface
for the cost function defined in problem 2. part b) starting from w(0) =

. The
script assumes λ = 4 and τ = 1/4.

Include the plots you generate below with your submission.
a) How many iterations does it take for the algorithm to converge to the solution?
What is the converged value for w?

b) Change to λ = 2. How many iterations does it take for the algorithm to converge
to the solution? What is the converged value for w?

c) Explain what happens to the weights in the regularization step.

4. Use the proximal gradient algorithm to solve minw ||y − Xw||2
2 + 4||w||1 for the parameters defined in problem 2.

a) What is the maximum value for the step size in the negative gradient direction,
b) Suppose τ = 0.1 and you start at w(0) =


Calculate the first two complete
iterations of the proximal gradient algorithm and depict w(0)
, z
, w(1)
, z
(2) and
w(2) on a sketch of the cost function identical to the one you created in problem