## Description

In this assignment, you will implement a Matlab function to decompose a matrix A into the product of two matrices A = QR, where Q has orthonormal columns and R is upper triangular. You will then use your decomposition to solve a shape fitting problem.

You will need the files back_sub.m for Part 4, and generate_data.m and visualize.m for Part 5. These can be found in the zip file provided on the Moodle assignment page.

1 Submission Guidelines

You will submit a zip file that contains the following .m files on Moodle:

• ortho_decomp.m

• my_qr.m

• least_squares.m

• my_pack.m

• my_unpack.m

• design_matrix.m

• affine_fit.m

as well as any helper functions that are needed by your implementation.

As with previous assignments, please note the following rules:

1. Each file must be named as specified above.

2. Each function must return the specified value(s).

3. You may use Matlab's array manipulation syntax, e.g. size(A), A(i,:), and [A,B], and basic operations like addition, multiplication, and transpose of matrices and vectors. High-level linear algebra functions such as inv, qr, and A\b are not allowed except where specified. Please contact the instructor with further questions.


4. This is an individual assignment, and no collaboration is allowed. Any in-person or online discussion should stop before you start discussing or designing a solution.

2 Orthogonal decomposition

Suppose you have an orthonormal basis {u1, . . . , un} for a subspace H ⊆ R^m. Any vector v ∈ R^m can be decomposed into the sum of two orthogonal vectors, v̂ ∈ H and v̂⊥ ∈ H⊥. Furthermore, since v̂ ∈ H, it can be expressed as v̂ = c1u1 + · · · + cnun for some weights c1, . . . , cn. Implement a function to perform this decomposition.

Specification:

function [c, v_perp] = ortho_decomp(U, v)

Input: an m × n matrix U = [u1 · · · un] with orthonormal columns, and a vector v.

Output:

c: a vector c = [c1; . . . ; cn] such that v = c1u1 + · · · + cnun + v̂⊥

v_perp: the residual vector v̂⊥ such that ui · v̂⊥ = 0 for all i.

Implementation notes:

Here v̂ is simply the orthogonal projection of v onto the subspace H. Since we have an orthogonal basis of H, this should be easy to compute. In fact, since the basis is orthonormal, it is possible to compute the projection in just one line of code without any for loops. Hint: What are the entries of U^T v?
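Since your submission must be written in Matlab, the following is only a NumPy sketch of the hint above, not the required code: with orthonormal columns, the projection weights are the entries of U^T v, and the residual is what remains after subtracting the projection.

```python
import numpy as np

def ortho_decomp(U, v):
    """Split v into a component in Col(U) plus an orthogonal residual.
    Assumes the columns of U are orthonormal."""
    c = U.T @ v          # c_i = u_i . v, the projection weights
    v_perp = v - U @ c   # subtract the projection U*c to get the residual
    return c, v_perp

# Test case 1 from the text:
U = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
v = np.array([2.0, 3.0, 4.0, 5.0])
c, v_perp = ortho_decomp(U, v)  # c = [2, 4], v_perp = [0, 3, 0, 5]
```

The Matlab body is analogous: two expressions, one computing U'*v and one subtracting the projection.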

Test cases:

1. U = [1 0; 0 0; 0 1; 0 0], v = [2; 3; 4; 5] → c = [2; 4], v̂⊥ = [0; 3; 0; 5]

2. U = [0.6; 0.8; 0], v = [1; 1; 1] → c = [1.4], v̂⊥ = [0.16; −0.12; 1]

3. Generate a random orthonormal set of 5 vectors in R^10 using the command [U,~] = qr(randn(10,5),0), and generate another random integer vector v = randi([-2,2], 10,1). Compute your orthogonal decomposition, [c, v_perp] = ortho_decomp(U, v). Verify that U*c + v_perp is nearly the same as v, and U'*v_perp is nearly zero (both to within 10^−12).

3 QR decomposition

Suppose you have an m × n matrix A which is “tall and thin”, i.e. with m > n, and the columns of A are linearly independent. The Gram-Schmidt process corresponds to a factorization A = QR, where Q is an m × n matrix with orthonormal columns, and R is an n × n upper triangular matrix.¹

Specification:

function [Q, R] = my_qr(A)

Input: an m × n matrix A with m > n.

Output: an m × n matrix Q with orthonormal columns, and an n × n upper triangular matrix R, such that A = QR.

Implementation notes:

I recommend starting your implementation by first having your function compute Q correctly. After that, you can modify your code to also compute R.

The columns of Q are essentially obtained by performing the Gram-Schmidt process with normalization on the columns of A:

q1 = a1 / ‖a1‖,
â⊥2 = a2 − proj_{H1} a2, where H1 = span{q1},
q2 = â⊥2 / ‖â⊥2‖,
â⊥3 = a3 − proj_{H2} a3, where H2 = span{q1, q2},
q3 = â⊥3 / ‖â⊥3‖,
. . .

You can carry this out using your orthogonal decomposition function. For each column i = 1, . . . , n, call ortho_decomp with an appropriate set of inputs, and use the resulting v̂⊥ to fill in the ith column of Q. In the end, the columns of Q should be an orthonormal set of vectors {q1, . . . , qn} which span Col A.

¹ Technically, what we are computing here is known as the “thin” or “reduced” QR decomposition. In the full QR decomposition, Q is square and orthogonal, and R is an m × n upper triangular matrix whose lower (m − n) rows are all zero. In practice, it is also not computed with the Gram-Schmidt process, but with other methods that are more numerically stable.

Once you can compute Q correctly, modify your function to compute R as well. In principle, since Q^T Q = I, you could obtain R simply via Q^T A = Q^T QR = R. However, this is more work than necessary, because you can actually fill in the entries of R as you compute each column of Q. When you compute the ith column of Q, the ortho_decomp call gives you both c and v̂⊥; can you use these to fill in the i nonzero entries in the ith column of R? Hint: Consider the case when A is already an upper triangular matrix, say A = [1 2 3; 0 4 5; 0 0 6], and work out by hand what c and v̂⊥ will be for each column.
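The loop structure described above can be sketched as follows in NumPy (an illustration only, reusing the ortho_decomp idea from Part 2; your submission must be in Matlab):

```python
import numpy as np

def ortho_decomp(U, v):
    # Part 2: coordinates in Col(U) plus an orthogonal residual
    c = U.T @ v
    return c, v - U @ c

def my_qr(A):
    """Thin QR via Gram-Schmidt, assuming A is m x n with m > n and
    linearly independent columns."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for i in range(n):
        # Decompose a_i against the already-computed columns q_1 .. q_{i-1}
        c, v_perp = ortho_decomp(Q[:, :i], A[:, i])
        R[:i, i] = c                      # the i-1 entries above the diagonal
        R[i, i] = np.linalg.norm(v_perp)  # length of the residual
        Q[:, i] = v_perp / R[i, i]        # normalized residual becomes q_i
    return Q, R

# Test case 1 from the text:
A = np.array([[0.0, 3, 0], [0, 0, -4], [-2, 0, 0]])
Q, R = my_qr(A)
```

Note that the i = 1 step passes an empty matrix to ortho_decomp; both NumPy and Matlab handle the resulting empty products gracefully, yielding a zero projection.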

Test cases:

1. A = [0 3 0; 0 0 −4; −2 0 0] → Q = [0 1 0; 0 0 −1; −1 0 0], R = [2 0 0; 0 3 0; 0 0 4]

2. A = [1 1 0; 1 0 0; 1 1 1; 1 0 1] → Q = [1/2 1/2 −1/2; 1/2 −1/2 −1/2; 1/2 1/2 1/2; 1/2 −1/2 1/2], R = [2 1 1; 0 1 0; 0 0 1]

3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and compute your QR decomposition, [Q, R] = my_qr(A). Verify that istriu(R) is true, Q'*Q is nearly the identity matrix, and Q*R is nearly the same as A (both to within 10^−12).


4 Least-squares problems

Use the QR decomposition to solve the least-squares problem Ax ≈ b.

Specification:

function x = least_squares(A, b)

Input: an m × n matrix A and a vector b ∈ R^m.

Output: a vector x ∈ R^n which is the least-squares solution of the overdetermined system Ax ≈ b, i.e. such that Ax − b ∈ (Col A)⊥.

Implementation notes:

Do not solve the normal equations A^T Ax = A^T b directly, as this is a very numerically unstable approach. Instead, compute the QR factorization of A and use it to solve the least-squares problem with only a matrix-vector multiplication and a back-substitution, as described in the textbook. Use the back_sub function provided with this assignment to perform the back-substitution.

If you cannot get your my_qr function to work, you may use Matlab's built-in qr function here so you can still attempt Part 5. Call qr(A,0) to get a rectangular Q and square R as desired.
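The QR route can be sketched as follows in NumPy. Here np.linalg.qr plays the role of my_qr, and the back_sub helper below is only a stand-in for the provided back_sub.m, whose exact interface may differ:

```python
import numpy as np

def back_sub(R, y):
    """Solve Rx = y for upper triangular R (a hypothetical stand-in for
    the provided back_sub.m, not the distributed file itself)."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the already-solved entries, then divide by the diagonal
        x[i] = (y[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
    return x

def least_squares(A, b):
    # With A = QR and Q orthonormal, the normal equations reduce to Rx = Q^T b
    Q, R = np.linalg.qr(A)   # thin QR, like qr(A,0) in Matlab
    return back_sub(R, Q.T @ b)

# Test case 2 from the text:
A = np.array([[1.0, 0, 0], [1, 1, 1], [1, 2, 4], [1, 3, 9], [1, 4, 16]])
b = np.array([0.0, 3, 4, 3, 0])
x = least_squares(A, b)  # x = [0, 4, -1]
```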

Test cases:

1. A = [1; 1; 0], b = [2; 3; 4] → x = [2.5]

2. A = [1 0 0; 1 1 1; 1 2 4; 1 3 9; 1 4 16], b = [0; 3; 4; 3; 0] → x = [0; 4; −1]

3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and a random integer vector b = randi([-2,2], 10,1). Compute your least-squares solution, x = least_squares(A, b), and obtain the residual r = b - A*x. Verify that A'*r is nearly zero (to within 10^−12).


5 Best-fitting transformations

An affine transformation is a combination of a linear transformation and a translation (i.e. displacement) while retaining parallelism, i.e., parallel lines remain parallel after the transformation. For example, a point [x; y] on A (orange square) in Figure 1 can be transformed to a point [x′; y′] on B (red parallelogram) via a linear transform M = [m11 m12; m21 m22], i.e.,

[x′; y′] = [m11 m12; m21 m22] [x; y].

Note that the parallel lines stay parallel. This transform can be combined with a translation, moving the red parallelogram to the blue parallelogram (C). This composite transformation can be written as:

[x̃; ỹ] = [x′; y′] + [tx; ty] = [m11 m12; m21 m22] [x; y] + [tx; ty] = M [x; y] + t,   (1)

where t = [tx; ty] is the translation vector.

Figure 1: Affine transform.

In sum, Equation (1) can be written as

[x̃; ỹ] = [m11x + m12y + tx; m21x + m22y + ty].   (2)
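As a quick numeric sanity check of Equations (1) and (2) (the parameter values here are arbitrary, chosen only for illustration):

```python
import numpy as np

# Arbitrary affine parameters, illustrative values only
M = np.array([[2.0, 1.0], [0.5, 3.0]])   # [m11 m12; m21 m22]
t = np.array([5.0, 6.0])                 # [tx; ty]
p = np.array([1.0, 2.0])                 # [x; y]

# Matrix form of Equation (1) ...
p_tilde = M @ p + t                      # -> [9, 12.5]

# ... agrees with the componentwise form of Equation (2)
componentwise = np.array([M[0, 0]*p[0] + M[0, 1]*p[1] + t[0],
                          M[1, 0]*p[0] + M[1, 1]*p[1] + t[1]])
```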


5.1 Linear Equation from a Single Correspondence

Your task is, given many correspondences² (xi, yi) ↔ (x̃i, ỹi), where i is the index for the correspondence, to compute the best affine transform parameters m11, m12, m21, m22, tx, ty (the unknowns). We can rewrite Equation (2) by arranging it with respect to the unknowns for, say, the first correspondence:

[x̃1; ỹ1] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1] [m11; m12; m21; m22; tx; ty],   (3)

where the left-hand side is denoted p̃1, the 2 × 6 matrix is denoted A1, and the vector of unknowns is denoted β, or simply,

p̃1 = A1β.   (4)

Note that A1 depends on the original position of the point, p1 = [x1; y1]. Each correspondence produces 2 linear equations while the number of unknowns is 6, so at least 3 correspondences are needed to uniquely determine the affine transform parameters.

Implement a function beta = my_pack(M, t) to obtain β from M = [m11 m12; m21 m22] and t = [tx; ty], and its inverse function [M, t] = my_unpack(beta).

function beta = my_pack(M, t)

Input: a 2 × 2 matrix M and a vector t.

Output: a vector β ∈ R^6 containing the entries of M and t.

function [M, t] = my_unpack(beta)

Input: a vector β ∈ R^6.

Output: a 2 × 2 matrix M and a vector t ∈ R^2 such that my_pack(M, t) = β.

Finally, construct the matrix Ai given pi.

function Ai = design_matrix(pi)

Input: a vector pi = [xi; yi] ∈ R^2.

Output: the matrix Ai given by Equation (3).

² A point (x, y) in A is transformed to a point (x̃, ỹ) in C. This pair of points forms a correspondence.
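One way to realize these three helpers is sketched below in NumPy (an illustration only; the packing order β = [m11; m12; m21; m22; tx; ty] follows Equation (3), and your submission must be in Matlab):

```python
import numpy as np

def my_pack(M, t):
    # beta = [m11, m12, m21, m22, tx, ty], the order used by Equation (3)
    return np.concatenate([M.flatten(), t])

def my_unpack(beta):
    # inverse of my_pack: first four entries rebuild M row by row
    return beta[:4].reshape(2, 2), beta[4:]

def design_matrix(p):
    x, y = p
    # A_i from Equation (3): two rows per correspondence
    return np.array([[x, y, 0, 0, 1, 0],
                     [0, 0, x, y, 0, 1]])

# With these definitions, design_matrix(p) @ my_pack(M, t) equals M p + t:
M = np.array([[1.0, 2.0], [3.0, 4.0]])
t = np.array([5.0, 6.0])
beta = my_pack(M, t)
p = np.array([1.0, 0.0])
```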


5.2 Solving Linear System from n Correspondences

Equation (3) can be extended to include n correspondences as follows:

[x̃1; ỹ1; . . . ; x̃n; ỹn] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1; . . . ; xn yn 0 0 1 0; 0 0 xn yn 0 1] [m11; m12; m21; m22; tx; ty],   (5)

or again simply,

[p̃1; . . . ; p̃n] = [A1; . . . ; An] β.   (6)

If all correspondences are noise-free, Equation (6) will always be satisfied. Due to correspondence noise in practice, it cannot be satisfied exactly, and therefore

[p̃1; . . . ; p̃n] ≈ [A1; . . . ; An] β.   (7)

Now we will compute the best affine transform parameters using Equation (7).

function [M, t] = affine_fit(P, P_tilde)

Input: 2 × k correspondence matrices P = [p1 · · · pk] and P̃ = [p̃1 · · · p̃k] containing the reference points and transformed points as columns.

Output: a 2 × 2 matrix M and a vector t ∈ R^2 such that the affine transformation Mx + t best matches the given data. To compute this, you will have to set up and solve the least-squares problem in Equation (7) to obtain β, then unpack it to get M and t out.
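The fit stacks one design matrix per correspondence and solves the resulting least-squares problem, as Equation (7) prescribes. A NumPy sketch of this (np.linalg.lstsq stands in here for the least_squares function you wrote in Part 4; your submission must be in Matlab):

```python
import numpy as np

def design_matrix(p):
    # A_i from Equation (3)
    x, y = p
    return np.array([[x, y, 0, 0, 1, 0],
                     [0, 0, x, y, 0, 1]])

def affine_fit(P, P_tilde):
    """Stack the per-correspondence design matrices (Equation (7)) and
    solve for beta, then unpack beta into M and t."""
    A = np.vstack([design_matrix(P[:, i]) for i in range(P.shape[1])])
    b = P_tilde.flatten(order='F')   # [x~1, y~1, ..., x~n, y~n]
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return beta[:4].reshape(2, 2), beta[4:]

# Test case 3 from the text:
P = np.array([[0.0, 1, 1, 0], [0, 0, 1, 1]])
P_tilde = np.array([[0.4, 1, 1.8, 1.2], [0.8, 0, 0.6, 1.4]])
M, t = affine_fit(P, P_tilde)
```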

Test cases:

For the first two tests, pick an arbitrary M and t, say M = [1 2; 3 4] and t = [5; 6].

1. Check that [M_new, t_new] = my_unpack(my_pack(M, t)) recovers the original values of M and t.

2. Verify that when x = [0; 0], design_matrix(x)*my_pack(M,t) gives back t. Similarly, x = [1; 0] and x = [0; 1] should give you m1 + t and m2 + t respectively, where m1 and m2 are the columns of M.


3. Applying affine_fit to P = [0 1 1 0; 0 0 1 1], P̃ = [0.4 1 1.8 1.2; 0.8 0 0.6 1.4] should give M = [0.6 0.8; −0.8 0.6] and t = [0.4; 0.8]: a rotation and a translation.

(Figure: the points p1, . . . , p4 and their images p̃1, . . . , p̃4 under this transformation; note that p2 = p̃2.)

4. We have provided a function [P, P_tilde, M, t] = generate_data() that produces some random test data. It does so by filling random values in M and t, choosing random points pi, then setting each p̃i to Mpi + t plus a small amount of random noise. You can visualize the data by calling the provided function visualize(P, P_tilde).

Call affine_fit(P, P_tilde) to obtain your own estimates of M and t. The result should be close to the “ground truth” transformation returned by generate_data, although due to the added noise it will not be exactly identical. Call visualize(P, P_tilde, M, t) to see what your fit looks like.

Note: If you want to test on a smaller data set, just use the first few columns of P and P̃, for example P = P(:, 1:10) and P_tilde = P_tilde(:, 1:10).
