Description
High Level Descriptions
0.1 Tasks You will be asked to implement (1) inference procedures for hidden Markov models (HMMs) and
(2) Principal Component Analysis (PCA). Specifically, you will:
• For Problem 1, finish implementing the functions forward, backward, seqprob_forward,
seqprob_backward, and viterbi. Refer to hmm.py for more information.
• Run the script hmm.sh after you finish your implementation. hmm.sh will output hmm.txt.
• Add, commit, and push hmm.py and hmm.txt.
• For Problem 2, finish implementing the functions pca, decompress, and reconstruction_error.
Refer to pca.py for more information.
• Run pca.py (i.e., do python3 pca.py) after you finish your implementation; it will generate
pca_output.txt.
• Add, commit, and push pca.py and pca_output.txt.
0.2 Dataset In Problem 1, we will play with a small hidden Markov model. The parameters of the model
are given in hmm_model.json.
In Problem 2, we will use a subset of the MNIST dataset that you are already familiar with. As a reminder,
MNIST is a dataset of handwritten digits, and each entry is a 28×28 grayscale image of a digit. We will unroll
those digits into one-dimensional arrays of size 784. Please download the data file mnist_subset.json from
Piazza (in the Homework section of Resources) and put it into the directory 'P5/'. Please DO NOT commit
and push mnist_subset.json to the repository.
0.3 Cautions Please DO NOT import packages that are not listed in the provided code. Follow the instructions in each section strictly to code up your solutions. Do not change the output format. Do not
modify the code unless we instruct you to do so. A homework solution that does not match the provided
setup, such as format, name, initializations, etc., will not be graded. It is your responsibility to make sure
that your code runs with the provided commands and scripts on the VM. Finally, make sure that you git
add, commit, and push all the required files, including your code and generated output files.
0.4 Final Submission Add, commit, and push hmm.py, pca.py, and the files hmm.txt and pca_output.txt
that you have generated.
Problem 1 Hidden Markov Models
In this problem, you are given the parameters of a small hidden Markov model and you will implement three
inference procedures discussed in class. In hmm_model.json, you will find the following model parameters (note the slight notation difference from class):
• π: the initial probabilities, π_i = P(Z_1 = s_i);
• A: the transition probabilities, with a_ij = P(Z_t = s_j | Z_{t−1} = s_i);
• B: the observation probabilities, with b_ik = P(X_t = o_k | Z_t = s_i).
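As a quick, optional sanity check of these conventions (not part of the required code), the sketch below builds a toy parameter set with made-up numbers for 2 hidden states and 3 observation symbols and verifies that π and each row of A and B are probability distributions; the actual parameters are read from hmm_model.json.

```python
import numpy as np

# Hypothetical toy parameters (2 hidden states, 3 observation symbols), purely illustrative;
# the real parameters come from hmm_model.json.
pi = np.array([0.6, 0.4])                    # pi[i] = P(Z_1 = s_i)
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])                   # A[i, j] = P(Z_t = s_j | Z_{t-1} = s_i)
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])              # B[i, k] = P(X_t = o_k | Z_t = s_i)

# pi and every row of A and B should sum to 1.
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```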
Now we observe a sequence O of length N. Your task is to write code to compute probabilities and infer
the possible hidden states given this observation. In particular, in 1.1 you need to implement the
forward and backward procedures; in 1.2, you need to compute P(X_{1:N} = O), the probability of observing
the sequence, using the forward or backward messages; finally, in 1.3, you need to infer the most likely
state path using the Viterbi algorithm. Note that your code might be tested against different parameter
sets and observation sequences when graded.
1.1 Please finish the implementation of the functions forward() and backward() in hmm.py (a minimal
sketch of both recursions is given after this list):
• forward(π, A, B, O) takes the model parameters π, A, B and the observation sequence O as input
and outputs a numpy array α, where α[j, t−1] = P(Z_t = s_j, X_{1:t} = x_{1:t}). Note that this is α[j, t−1]
rather than α[j, t] only because array indices start from 0.
• backward(π, A, B, O) takes the model parameters π, A, B and the observation sequence O as input
and outputs a numpy array β, where β[j, t−1] = P(X_{t+1:N} = x_{t+1:N} | Z_t = s_j). Note the same index
shift relative to the lecture slides.
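Below is a minimal sketch of the two recursions, assuming π, A, and B are numpy arrays indexed as above and O is a sequence of observation indices. The function names match hmm.py, but the bodies are only illustrative; any additional conventions in hmm.py take precedence.

```python
import numpy as np

def forward(pi, A, B, O):
    """Forward messages: alpha[j, t] = P(Z_{t+1} = s_j, X_{1:t+1} = O[:t+1])."""
    S, N = len(pi), len(O)
    alpha = np.zeros((S, N))
    alpha[:, 0] = pi * B[:, O[0]]                            # initialization
    for t in range(1, N):
        alpha[:, t] = B[:, O[t]] * (A.T @ alpha[:, t - 1])   # alpha_t(j) = b_j(O_t) * sum_i alpha_{t-1}(i) a_ij
    return alpha

def backward(pi, A, B, O):
    """Backward messages: beta[j, t] = P(X_{t+2:N} = O[t+1:] | Z_{t+1} = s_j)."""
    S, N = len(pi), len(O)
    beta = np.zeros((S, N))
    beta[:, N - 1] = 1.0                                     # initialization
    for t in range(N - 2, -1, -1):
        beta[:, t] = A @ (B[:, O[t + 1]] * beta[:, t + 1])   # beta_t(i) = sum_j a_ij b_j(O_{t+1}) beta_{t+1}(j)
    return beta
```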
1.2 Now we can calculate P(X_{1:N} = O) from the output of your forward and backward algorithms. Please
finish the implementation of the functions seqprob_forward() and seqprob_backward() in hmm.py.
Both of them should return P(X_{1:N} = O). Note that in seqprob_forward() the input is only the forward
messages, while in seqprob_backward() you also have access to the model parameters. Please refer to
Slide 29/60 of Lec 10 for the concrete formulas.
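As a reminder, the standard identities are P(X_{1:N} = O) = Σ_j α[j, N−1] (forward) and P(X_{1:N} = O) = Σ_j π_j b_{j,O_1} β[j, 0] (backward). A hedged sketch follows; the argument lists shown are only illustrative, so follow the actual signatures in hmm.py.

```python
import numpy as np

def seqprob_forward(alpha):
    # P(X_{1:N} = O) = sum_j alpha[j, N-1]: sum the last column of the forward messages.
    return alpha[:, -1].sum()

def seqprob_backward(beta, pi, B, O):
    # P(X_{1:N} = O) = sum_j pi_j * b_{j, O_1} * beta[j, 0]
    # (argument list shown here is illustrative; use the signature given in hmm.py).
    return np.sum(pi * B[:, O[0]] * beta[:, 0])
```

The two values should agree up to floating-point error, which is a useful self-check.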
1.3 Next you need to find the most likely state path based on the observation O, namely,
z*_{1:N} = argmax_{z_{1:N}} P(Z_{1:N} = z_{1:N} | X_{1:N} = O).
Please implement the Viterbi algorithm in viterbi() in hmm.py. The function viterbi(π, A, B, O) takes
the model parameters π, A, B and the observation sequence O as inputs and outputs a list path that contains
the most likely state path z*_{1:N} (in terms of the state indices) you have found.
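The following is a minimal sketch of the Viterbi dynamic program under the same array conventions as above; it returns state indices, and any output formatting required for hmm.txt is determined by hmm.py and hmm.sh.

```python
import numpy as np

def viterbi(pi, A, B, O):
    """Most likely state path (illustrative sketch; follow the exact spec in hmm.py)."""
    S, N = len(pi), len(O)
    delta = np.zeros((S, N))            # delta[j, t]: best score of any path ending in state j at step t+1
    psi = np.zeros((S, N), dtype=int)   # psi[j, t]: best predecessor of state j at step t+1
    delta[:, 0] = pi * B[:, O[0]]
    for t in range(1, N):
        scores = delta[:, t - 1, None] * A      # scores[i, j] = delta[i, t-1] * a_ij
        psi[:, t] = scores.argmax(axis=0)
        delta[:, t] = scores.max(axis=0) * B[:, O[t]]
    path = [int(delta[:, N - 1].argmax())]      # best final state
    for t in range(N - 1, 0, -1):               # backtrack through the stored predecessors
        path.append(int(psi[path[-1], t]))
    return path[::-1]
```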
What to do and submit: After you finish each task in 1.1, 1.2 and 1.3 in hmm.py, run the script hmm.sh.
It will generate hmm.txt. Add, commit, and push both hmm.py and hmm.txt before the due date.
Problem 2 Principal Component Analysis
In this question we will implement Principal Component Analysis (PCA) and use it for image compression.
We will use black-and-white images in this example for illustrative purposes. (However, PCA can also be
used to compress color images: you perform the compression on each channel
separately and then combine the results.)
Figure 1: Eigenvectors together with their eigenvalues
Figure 2: Mean image
Let X be an N × D data matrix, where N is the number of images and D is the number of pixels of
each image. For a given parameter K < D, you need to use PCA to compress X into an N × K matrix. As
discussed in class, this is done by first centering the data, then finding the top K eigenvectors of the
covariance matrix X^T X, and finally projecting the original data onto these eigenvectors.
Since each of these eigenvectors has dimensionality D, it can also be represented as an image of the same
size as the images in the dataset. For example, in Figure 1 we show the top eigenvectors computed from all MNIST
images of the digit 6, together with their eigenvalues. Figure 2 shows the mean of the MNIST
images of the digit 6.
Figure 3: Original image (on the right) and decompressed versions with different compression rates.
Compression
Please implement PCA in the function pca of pca.py, which returns the compressed representation Y ∈
R^{N×K} and a matrix M ∈ R^{D×K} that consists of the top K eigenvectors, represented as column vectors. Note
that we have already subtracted the mean values from the data before passing it to pca, so you do not need to do
this step.
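Assuming the input X is already centered (as stated above), one standard way to obtain the top K eigenvectors is numpy's symmetric eigendecomposition. This is only a sketch; the exact signature, ordering, and sign conventions are defined by pca.py.

```python
import numpy as np

def pca(X, K):
    """Compress centered data X (N x D) to Y (N x K); illustrative sketch only."""
    cov = X.T @ X                              # D x D covariance matrix (up to a constant factor)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues, orthonormal columns
    order = np.argsort(eigvals)[::-1][:K]      # indices of the K largest eigenvalues
    M = eigvecs[:, order]                      # D x K matrix of top-K eigenvectors as columns
    Y = X @ M                                  # project the data onto the eigenvectors
    return Y, M
```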
Decompression
With the eigenvector matrix M, we can approximately reconstruct the original dataset as X̂ = Y M^T.
Implement this step in the function decompress of pca.py.
In Figure 3 we show several decompressed images with different compression rates (K), together with the original
image. It is easy to see that a larger K leads to a better reconstruction. To quantify the reconstruction error
between the original image x and the reconstructed one x̂, we calculate the pixel-wise mean squared error
(1/D) ||x − x̂||_2^2. Implement this step in the function reconstruction_error of pca.py.
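A minimal sketch of both steps under the conventions above follows; the exact signatures, and whether the error is averaged over several images, are defined in pca.py.

```python
import numpy as np

def decompress(Y, M):
    """Reconstruct X_hat (N x D) from the compressed data Y (N x K) and eigenvectors M (D x K)."""
    return Y @ M.T

def reconstruction_error(x, x_hat):
    """Pixel-wise mean squared error (1/D) * ||x - x_hat||_2^2 between an image and its reconstruction."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return np.sum((x - x_hat) ** 2) / x.size
```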
What to do and submit: After implementation, run python3 pca.py to generate pca_output.txt. Submit
both pca.py and pca_output.txt.