## Description

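1. Consider the system of linear equations $Xw = y$ where

$$X = \begin{bmatrix} 1 & 1 \\ -2 & -2 \end{bmatrix}, \quad w = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}, \quad \text{and} \quad y = \begin{bmatrix} 2 \\ -4 \end{bmatrix}.$$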

a) Sketch the set of all $w$ that satisfy $Xw = y$ in the $w_1$-$w_2$ plane. Is the solution unique? What is the value of the squared error $\min_w \|Xw - y\|_2^2$?

b) Use your sketch to find the $w$ of minimum norm that satisfies the system of equations: $\min_w \|w\|_2^2$ subject to $Xw = y$. Is this solution unique? What makes it unique? What is the value of the squared error $\|Xw - y\|_2^2$ at this solution? What is the value of $\|w\|_2^2$? Hint: The equation $\|w\|_2^2 = c$ describes a circle in $\mathbb{R}^2$ with radius $\sqrt{c}$.

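c) Algebraically find the $\hat{w}$ that solves the Tikhonov-regularized (or ridge regression) problem $\hat{w} = \arg\min_w \{\|Xw - y\|_2^2 + \lambda\|w\|_2^2\}$ as a function of $\lambda$. Hint: Recall that

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$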

d) Sketch the solution to the Tikhonov-regularized problem in the $w_1$-$w_2$ plane as a function of $\lambda$ for $0 < \lambda < \infty$. (Consider the solution for different values of $\lambda$ in that range.) Find the squared error $\|Xw - y\|_2^2$ and the norm squared of the solution, $\|w\|_2^2$, for $\lambda = 0$ and $\lambda = 5$. Compare the squared error and norm squared of the solution to those in part b).
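As a numerical cross-check for Problem 1, the following is a minimal sketch, assuming NumPy is available. The helper names `ridge` and `report` are hypothetical and not part of the problem statement. Using the pseudoinverse to obtain the minimum-norm solution is a swapped-in numerical technique, not the sketch-based argument the problem asks for. Because $X^T X$ is singular for this $X$, the code uses a small positive $\lambda$ as a stand-in for $\lambda = 0$.

```python
import numpy as np

# Data from Problem 1.
X = np.array([[1.0, 1.0], [-2.0, -2.0]])
y = np.array([2.0, -4.0])

def ridge(X, y, lam):
    """Hypothetical helper: solve (X^T X + lam*I) w = X^T y, assuming lam > 0."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def report(label, w):
    """Print the squared error and squared norm for a candidate w."""
    sq_error = np.sum((X @ w - y) ** 2)
    norm_sq = np.sum(w ** 2)
    print(f"{label}: w = {w}, squared error = {sq_error:.4f}, ||w||^2 = {norm_sq:.4f}")

# Minimum-norm solution of Xw = y via the Moore-Penrose pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y
report("min-norm", w_min_norm)

# Ridge solutions; lam = 1e-6 approximates the lam -> 0 limit.
for lam in (1e-6, 5.0):
    report(f"ridge lam={lam}", ridge(X, y, lam))
```

Comparing the printed values against the hand-derived answers for parts b) through d) is one way to catch algebra slips.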

2. Let

$$X = \begin{bmatrix} 1 & \gamma \\ 1 & -\gamma \\ 1 & -\gamma \\ 1 & \gamma \end{bmatrix}.$$

a) Show that the columns of X are orthogonal to each other for any γ.

b) Express $X = U\Sigma$ where $U$ is a 4-by-2 matrix with orthonormal columns and $\Sigma$ is a 2-by-2 diagonal matrix (the non-diagonal entries are zero).

c) Express the solution to the least-squares problem $\min_w \|Xw - y\|_2^2$ as a function of $U$, $\Sigma$, and $y$.

d) Let

$$y = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}.$$

Find the weights $w$ as a function of $\gamma$. What happens to $\|w\|_2^2$ as $\gamma \to 0$?

e) The ratio of the largest to the smallest diagonal values in $\Sigma$ is termed the condition number of $X$. Find the condition number if $\gamma = 0.1$ and $\gamma = 10^{-8}$. Also find $\|w\|_2^2$ for these two values of $\gamma$.

f) A system of linear equations with a large condition number is said to be “ill-conditioned”. One consequence of an ill-conditioned system of equations is solutions with large norms, as you found in the previous part of this problem. A second consequence is that the solution is very sensitive to small errors in $y$ such as may result from measurement error or numerical error. Suppose

$$y = \begin{bmatrix} 1 + \epsilon \\ 0 \\ 0 \\ 1 \end{bmatrix}.$$

Write $w = w_o + w_\epsilon$ where $w_o$ is the solution for arbitrary $\gamma$ when $\epsilon = 0$ and $w_\epsilon$ is the perturbation in that solution due to some error $\epsilon \neq 0$. How does the norm of the perturbation due to $\epsilon \neq 0$, $\|w_\epsilon\|_2^2$, depend on the condition number? Find $\|w_\epsilon\|_2^2$ for $\epsilon = 0.01$ and $\gamma = 0.1$ and $\gamma = 10^{-8}$.

g) Now apply ridge regression, i.e., Tikhonov regularization. Solve for $w_o$ and $w_\epsilon$ as a function of $\lambda$. Find $\|w_o\|_2^2$ and $\|w_\epsilon\|_2^2$ for $\lambda = 0.1$, $\epsilon = 0.01$ and $\gamma = 0.1$ and $\gamma = 10^{-8}$. Comment on the impact of regularization.
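The quantities in Problem 2 can also be explored numerically. The following is a rough sketch, assuming NumPy. The function names `design`, `factor`, `least_squares`, `ridge`, and `summary` are hypothetical, and `eps` stands for the small error added to the first entry of $y$ in part f). The factorization relies on the columns of $X$ being orthogonal and nonzero, so it is not a general-purpose SVD. Both estimators are linear in $y$, so the perturbation is computed by applying each one directly to the error vector.

```python
import numpy as np

def design(gamma):
    """Build the 4-by-2 matrix X from Problem 2."""
    return np.array([[1.0, gamma], [1.0, -gamma], [1.0, -gamma], [1.0, gamma]])

def factor(X):
    """Return U, sigma with X = U @ diag(sigma); assumes orthogonal, nonzero columns."""
    sigma = np.linalg.norm(X, axis=0)  # column norms are the diagonal of Sigma
    return X / sigma, sigma

def least_squares(X, y):
    """Least-squares weights w = Sigma^{-1} U^T y."""
    U, sigma = factor(X)
    return (U.T @ y) / sigma

def ridge(X, y, lam):
    """Ridge weights (X^T X + lam*I)^{-1} X^T y, assuming lam > 0."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def summary(gamma, eps, lam):
    """Collect condition number and squared norms for one choice of parameters."""
    X = design(gamma)
    _, sigma = factor(X)
    y0 = np.array([1.0, 0.0, 0.0, 1.0])      # unperturbed y
    dy = np.array([eps, 0.0, 0.0, 0.0])      # error in the first entry of y
    return {
        "cond": sigma.max() / sigma.min(),
        "ls_wo_sq": np.sum(least_squares(X, y0) ** 2),
        "ls_weps_sq": np.sum(least_squares(X, dy) ** 2),
        "ridge_wo_sq": np.sum(ridge(X, y0, lam) ** 2),
        "ridge_weps_sq": np.sum(ridge(X, dy, lam) ** 2),
    }

for gamma in (0.1, 1e-8):
    print(gamma, summary(gamma, eps=0.01, lam=0.1))
```

Running the loop for both values of $\gamma$ shows how the unregularized norms scale with the condition number and how the ridge term changes that scaling.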