
685.621 Algorithms for Data Science Homework 4

1. Problem 1 Note this is a Collaborative Problem
35 Points Total
In this problem, develop pseudocode and code for the Expectation Maximization (EM) method. The
implementation should handle a generic number of clusters; at a minimum, it should handle 3
clusters so that it can build a three-class classifier. Using the following data
\[
x = \begin{bmatrix} 1 & 2 \\ 4 & 2 \\ 1 & 3 \\ 4 & 3 \end{bmatrix}
\]
for 5 iterations, show the values of $p^{(i)}(k \mid n)$, $\mu^{(i+1)}$, $\sigma^{(i+1)}$, and $p^{(i+1)}$ using your code. You can
use either a built-in EM algorithm or your own implementation to show how well the clusters form
the two separations shown on slide 15 of Expectation Maximization.pdf over the 5 iterations. In
this example, are the clusters starting to converge? If not, why not? If so, why?
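As a starting point, the E-step and M-step can be sketched as below. This is a minimal sketch under stated assumptions, not a reference solution: it assumes diagonal covariances (one variance per dimension and cluster), two clusters for the 4-point data set, hand-picked initial values, and a variance floor `var_floor` to keep a cluster from collapsing when all of its points share a coordinate. The function names `gaussian_diag`, `em_step`, and `run_em` are hypothetical.

```python
import math


def gaussian_diag(x, mean, var):
    """Density of a Gaussian with diagonal covariance evaluated at point x."""
    dens = 1.0
    for xd, md, vd in zip(x, mean, var):
        dens *= math.exp(-0.5 * (xd - md) ** 2 / vd) / math.sqrt(2.0 * math.pi * vd)
    return dens


def em_step(data, means, variances, priors, var_floor=1e-2):
    """One EM iteration: returns p(k|n) and the updated means, variances, priors."""
    n, k, d = len(data), len(means), len(data[0])
    # E-step: responsibilities p(k|n) under the current parameters.
    resp = []
    for x in data:
        joint = [priors[j] * gaussian_diag(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        if total == 0.0:  # numerical underflow: fall back to uniform (assumption)
            resp.append([1.0 / k] * k)
        else:
            resp.append([w / total for w in joint])
    # M-step: re-estimate the parameters from the responsibilities.
    new_means, new_vars, new_priors = [], [], []
    for j in range(k):
        nk = sum(r[j] for r in resp)
        mean = [sum(r[j] * x[c] for r, x in zip(resp, data)) / nk for c in range(d)]
        var = [
            max(sum(r[j] * (x[c] - mean[c]) ** 2 for r, x in zip(resp, data)) / nk,
                var_floor)  # floor is an assumption, not part of the assignment
            for c in range(d)
        ]
        new_means.append(mean)
        new_vars.append(var)
        new_priors.append(nk / n)
    return resp, new_means, new_vars, new_priors


def run_em(data, means, variances, priors, n_iter=5):
    """Run n_iter EM iterations and record the values from each one."""
    history = []
    for _ in range(n_iter):
        resp, means, variances, priors = em_step(data, means, variances, priors)
        history.append((resp, means, variances, priors))
    return history


data = [[1.0, 2.0], [4.0, 2.0], [1.0, 3.0], [4.0, 3.0]]
history = run_em(
    data,
    means=[[2.0, 2.0], [3.0, 3.0]],        # assumed initial means
    variances=[[1.0, 1.0], [1.0, 1.0]],    # assumed initial variances
    priors=[0.5, 0.5],                     # assumed initial priors
    n_iter=5,
)
for i, (resp, means, variances, priors) in enumerate(history, start=1):
    print(f"iteration {i}: means={means} variances={variances} priors={priors}")
```

Passing three initial means, variances, and priors instead of two runs the same code with 3 clusters. Because the initial values are assumptions, different choices may lead to different cluster assignments.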
2. Problem 2 Note this is a Collaborative Problem
30 Points Total
Using the EM algorithm from Problem 1 and the IRIS data set, estimate the unknown parameters
µk, σk, pk.
3. Problem 3
35 Points Total 15 Points Each
Consider three mean values of µ = [µ1, µ2, µ3] = [4.5, 2.2, 3.3] with a corresponding covariance
matrix as follows:
\[
\Sigma = \begin{bmatrix} 0.5 & 0.1 & 0.05 \\ 0.1 & 0.25 & 0.1 \\ 0.05 & 0.1 & 0.4 \end{bmatrix} \tag{2}
\]
The respective minimums are min = [3.5, 1.7, 2.5] and maximums are max = [5.5, 2.7, 4.1].
Generate 300 observations.
Using the EM algorithm from Problem 1 and the generated data, estimate the unknown
parameters µk, σk, pk.
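One possible way to generate the observations is sketched below. It assumes that the minimums and maximums are hard bounds and enforces them by rejection sampling (draw from the multivariate normal, keep only draws inside the box); clipping out-of-range values would be a different interpretation. The helper names `cholesky` and `sample_truncated_mvn` and the seed are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_truncated_mvn(mean, cov, lo, hi, n, seed=0):
    """Draw n samples from N(mean, cov), rejecting any draw outside [lo, hi]."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatable output
    low = cholesky(cov)
    d = len(mean)
    samples = []
    while len(samples) < n:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # x = mean + L z gives a draw with covariance L L^T = cov.
        x = [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        if all(lo[i] <= x[i] <= hi[i] for i in range(d)):
            samples.append(x)
    return samples


mu = [4.5, 2.2, 3.3]
sigma = [[0.5, 0.1, 0.05],
         [0.1, 0.25, 0.1],
         [0.05, 0.1, 0.4]]
lower = [3.5, 1.7, 2.5]
upper = [5.5, 2.7, 4.1]

observations = sample_truncated_mvn(mu, sigma, lower, upper, n=300)
print(len(observations), observations[0])
```

Note that rejecting draws outside the bounds shrinks the sample variances relative to Σ, so estimates obtained from these observations need not match the stated covariance.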
