Description
1 Manual calculation of one round of EM for a GMM [30 points]
(Extended version of: Murphy Exercise 11.7) In this question we consider clustering 1D data with a mixture
of 2 Gaussians using the EM algorithm. You are given the 1-D data points x = [1 10 20].
M step
Suppose the output of the E step is the following matrix:
R =
1 0
0.3 0.7
0 1
where entry Ri,c is the probability of observation xi belonging to cluster c (the responsibility of cluster c
for data point i). You just have to compute the M step. You may state the equations for maximum likelihood
estimates of these quantities (which you should know) without proof; you just have to apply the equations to
this data set. You may leave your answer in fractional form. Show your work.
1. [3 points] Write down the likelihood function you are trying to optimize.
2. [6 points] After performing the M step for the mixing weights π1, π2, what are the new values?
3. [6 points] After performing the M step for the means µ1 and µ2, what are the new values?
4. [6 points] After performing the M step for the standard deviations σ1 and σ2, what are the new values?
E step
Now suppose the output of the M step is the answer to the previous section. You will compute the subsequent
E step.
1. [3 points] Write down the formula for the probability of observation xi belonging to cluster c.
2. [6 points] After performing the E step, what is the new value of R?
2 PCA via Successive Deflation [30 points]
(Adapted from Murphy Exercise 12.7)
Suppose we have a set of n data points x1, . . . , xn, where each xi
is a d-dimensional column vector. Let
X = [x1; . . . ; xn] be the (d×n) matrix where the i
th column is xi
. Define C =
1
nXXT
to be the covariance
matrix of X, where cij =
Pn
l=1 xilxjl = covar(i, j).
1
Next, order the eigenvectors of C by their eigenvalues (largest first), and let v1, v2, . . . , vk be the first k
eigenvectors. These satisfy
v
T
i vj =
(
0 if i 6= j
1 if i = j
v1 is the first principal eigenvector of C (the eigenvector with the largest eigenvalue), and as such satisfies
Cv1 = λ1v1. Now define x˜i as the orthogonal projection of xi onto the space orthogonal to v1:
x˜i = (I − v1v
T
1
)xi
Finally, define X˜ = [x˜1; . . . ; x˜n] as the deflated matrix of rank d − 1, which is obtained by removing
from the d-dimensional data the component that lies in the direction of the first principal eigenvector:
X˜ = (I − v1v
T
1
)X
1. [5 points] Show that the covariance of the deflated matrix,
C˜ =
1
n
X˜ X˜ T
is given by
C˜ =
1
n
XXT − λ1v1v
T
1
(Hint: Some useful facts: (I − v1v
T
1
) is symmetric, XXT v1 = nλ1v1, and v
T
1 v1 = 1. Also, for any
matrices A and B, (AB)
T = BT AT
.)
2. [5 points] Show that for j 6= 1, if vj is a principal eigenvector of C with corresponding eigenvalue λj
(that is, Cvj = λjvj ), then vj is also a principal eigenvector of C˜ with the same eigenvalue λj .
3. [5 points] Let u be the first principal eigenvector of C˜ . Explain why u = v2. (You may assume u is
unit norm.)
4. [5 points] Suppose we have a simple method f for finding the leading eigenvector and eigenvalue of
a positive-definite matrix, denoted by [λ, u] = f(C). Write some pseudocode for finding the first k
principal basis vectors of X that only uses the special f function and simple vector arithmetic.
(Hint: This should be a simple iterative routine that takes only a few lines to write. The input is C, k,
and the function f, the output should be vj and λj for j ∈ 1, · · · , k)
3 Programming Question (Generative Adversarial Networks) [40 points]
In this section, you will train generative adversarial networks (GAN) to generate images using PyTorch. We
use the MNIST data which is 60,000 training and 10,000 test images. Refer to the jupyter notebook for
details.
You will first train a GAN for generating new images. Then try to improve the network architecture and
attach your results with the jupyter notebook. Also add the hyper-parameters explored.
The detail instructions and questions are in the jupyter notebook GAN.ipynb. In this file, there are 7
“To-Do” locations for you to fill. The score of each To-Do is specified at the spot.
We recommend using virtual environment for the project. If you choose not to use a virtual environment,
it is up to you to make sure that all dependencies for the code are installed globally on your machine. To set
up a virtual environment, run the following in the command-line interface:
2
cd your_hw4_folder
sudo pip install virtualenv # This may already be installed
virtualenv .env # Create a virtual environment
source .env/bin/activate # Activate the virtual environment
pip install -r requirements.txt # Install dependencies
# Note that this does NOT install TensorFlow or PyTorch,
# which you need to do yourself.
# Work on the assignment for a while …
# … and when you’re done:
deactivate # Exit the virtual environment
Note that every time you want to work on the assignment, you should run ‘source .env/bin/activate’ (from
within your hw4 folder) to re-activate the virtual environment, and deactivate again whenever you are done.
4 What to submit?
4.1 Blackboard submission
For question 1 & 2, please put everything in one single pdf file and submit it on Blackboard, please include
your name and student ID in the first page of the pdf file. For question 3, submit the jupyter notebook files
GAN.ipynb with your answers filled at the ‘To Do’ locations. Put the PDF file and your jupyter notebook
file in a folder named: SUBID FirstName LastName (e.g., 10947XXXX lionel messi). Zip this folder and
submit the zip file on Blackboard. Your submission must be a zip file, i.e, SUBID FirstName LastName.zip.
5 Cheating warnings
Don’t cheat. You must do the homework yourself, otherwise you won’t learn. You must use your SBU ID as
your file name for the competition.
3