Question 1 (30%)
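Design a classifier that achieves minimum probability of error for a three-class problem where
the class priors are respectively P(L = 1) = 0.15, P(L = 2) = 0.35, P(L = 3) = 0.5 and the class-conditional data distributions are all Gaussians for two-dimensional data vectors:
$$\mathcal{N}\left(\begin{bmatrix} -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & -0.4 \\ -0.4 & 0.5 \end{bmatrix}\right), \quad \mathcal{N}\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0.5 & 0 \\ 0 & 0.2 \end{bmatrix}\right), \quad \mathcal{N}\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0.1 & 0 \\ 0 & 0.1 \end{bmatrix}\right).$$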
Generate 10000 samples according to this data distribution, keeping track of the true class labels
for each sample. Apply your optimal classifier designed as described above to this dataset and
obtain decision labels for each sample. Report the following:
• actual number of samples that were generated from each class;
• the confusion matrix for your classifier consisting of number of samples decided as class
r ∈ {1,2,3} when their true labels were class c ∈ {1,2,3}, using r, c as row/column indices;
• the total number of samples misclassified by your classifier;
• an estimate of the probability of error your classifier achieves, based on these samples;
• a visualization of the data as a 2-dimensional scatter plot, with true labels and decision labels
indicated using two separate visualization cues, such as marker shape and marker color;
• a clear but brief description of the results presented as described above.
Note: See the attached generateData Exam1Question1.m Matlab script for data generation.
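One possible way to set up this experiment is sketched below in Python with NumPy, not the attached Matlab script. It assumes the minimum-probability-of-error rule is implemented as the MAP decision (pick the class with the largest log prior plus log likelihood). The seed, the variable names, and the 0-based class indices (0, 1, 2 standing for classes 1, 2, 3) are choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

priors = np.array([0.15, 0.35, 0.5])
means = [np.array([-1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
covs = [np.array([[1.0, -0.4], [-0.4, 0.5]]),
        np.array([[0.5, 0.0], [0.0, 0.2]]),
        np.array([[0.1, 0.0], [0.0, 0.1]])]


def log_gaussian(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of x."""
    diff = x - mean
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2 * np.pi))


# Draw true labels from the priors, then draw each sample from its class Gaussian.
n_samples = 10000
labels = rng.choice(3, size=n_samples, p=priors)
x = np.empty((n_samples, 2))
for c in range(3):
    idx = labels == c
    x[idx] = rng.multivariate_normal(means[c], covs[c], size=idx.sum())

# MAP decision: argmax over classes of log prior + log class-conditional density.
scores = np.stack(
    [np.log(priors[c]) + log_gaussian(x, means[c], covs[c]) for c in range(3)],
    axis=1,
)
decisions = scores.argmax(axis=1)

# confusion[r, c] = number of samples decided as class r whose true class is c.
confusion = np.zeros((3, 3), dtype=int)
np.add.at(confusion, (decisions, labels), 1)

class_counts = np.bincount(labels, minlength=3)
n_errors = int((decisions != labels).sum())
p_error_estimate = n_errors / n_samples

print("samples per class:", class_counts)
print("confusion matrix (rows: decision, columns: true label):")
print(confusion)
print("misclassified:", n_errors, "estimated P(error):", p_error_estimate)
```

The column sums of `confusion` reproduce the per-class sample counts, and the off-diagonal entries sum to the number of misclassified samples, which gives a quick consistency check on the reported numbers.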
Question 2 (35%)
An object at true position $[x_T, y_T]^T$ in 2-dimensional space is to be localized using distance
(range) measurements to K reference (landmark) coordinates $\{[x_1, y_1]^T, \ldots, [x_i, y_i]^T, \ldots, [x_K, y_K]^T\}$.
These range measurements are $r_i = d_{Ti} + n_i$ for $i \in \{1, \ldots, K\}$, where $d_{Ti} = \|[x_T, y_T]^T - [x_i, y_i]^T\|$
is the true distance between the object and the $i$th reference point, and $n_i$
is zero-mean Gaussian distributed measurement noise with known variance $\sigma_i^2$. The noise in each measurement is
independent from the others.
Assume that we have the following prior knowledge regarding the position of the object:
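$$p\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = (2\pi\sigma_x\sigma_y)^{-1} \exp\left(-\frac{1}{2} \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} \sigma_x^2 & 0 \\ 0 & \sigma_y^2 \end{bmatrix}^{-1} \begin{bmatrix} x \\ y \end{bmatrix}\right) \tag{1}$$
where $[x, y]^T$ indicates a candidate position under consideration.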
Express the optimization problem that needs to be solved to determine the MAP estimate of
the object position. Simplify the objective function so that the exponentials and additive/multiplicative
terms that do not impact the determination of the MAP estimate $[x_{MAP}, y_{MAP}]^T$
are removed appropriately from the objective function for computational savings when evaluating the objective.
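As a sketch of where this simplification can lead, assume the likelihood of each measurement is taken to be the plain Gaussian $\mathcal{N}(r_i; d_i, \sigma_i^2)$, with $d_i$ denoting the distance from the candidate position to the $i$th landmark (the symbol $d_i$ is introduced here for illustration). Taking the negative logarithm of the posterior, dropping constants, and multiplying by 2 gives:

```latex
\begin{aligned}
\begin{bmatrix} x_{MAP} \\ y_{MAP} \end{bmatrix}
  &= \arg\max_{x,y} \; p\!\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)
     \prod_{i=1}^{K} \mathcal{N}\!\left(r_i;\, d_i,\, \sigma_i^2\right),
     \qquad d_i = \left\| [x, y]^T - [x_i, y_i]^T \right\| \\
  &= \arg\min_{x,y} \; \sum_{i=1}^{K} \frac{(r_i - d_i)^2}{\sigma_i^2}
     + \frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}
\end{aligned}
```

The normalization constants of the prior and of each Gaussian likelihood do not depend on $[x, y]^T$, so they drop out, and the common factor $\frac{1}{2}$ does not change the minimizer.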
Implement the following as computer code: Set the true object location to be inside the
circle with unit radius centered at the origin. For each K ∈ {1,2,3,4} repeat the following.
Place K evenly spaced landmarks on a circle with unit radius centered at the origin. Set the measurement noise standard deviation to 0.3 for all range measurements. Generate K range measurements according to the model specified above (if a range measurement turns out to be negative,
reject it and resample; all range measurements need to be nonnegative).
Plot the equilevel contours of the MAP estimation objective for the range of horizontal and
vertical coordinates from −2 to 2; superimpose the true location of the object on these equilevel
contours (e.g. use a + mark), as well as the landmark locations (e.g. use an o mark for each one).
Provide plots of the MAP objective function contours for each value of K. When preparing
your final contour plots for different K values, make sure to plot contours at the same function
value across each of the different contour plots for easy visual comparison of the MAP objective
landscapes.
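The steps above could be prototyped along the following lines in Python with NumPy. This is a minimal sketch under stated assumptions: the true position `true_pos`, the seed, the grid resolution, and the prior spreads `sigma_x = sigma_y = 0.25` are illustrative picks, and the objective is the simplified negative log posterior with constants and the factor 1/2 dropped. Plotting is left out so the block stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)      # arbitrary seed
sigma_x = sigma_y = 0.25            # assumed prior standard deviations
sigma_n = 0.3                       # measurement noise standard deviation
true_pos = np.array([0.3, -0.4])    # hypothetical true position inside the unit circle


def landmarks_on_circle(k):
    """K evenly spaced landmarks on the unit circle, one per row."""
    angles = 2 * np.pi * np.arange(k) / k
    return np.column_stack([np.cos(angles), np.sin(angles)])


def measure_ranges(pos, landmarks, sigma, rng):
    """Noisy ranges r_i = d_i + n_i; negative draws are rejected and resampled."""
    d = np.linalg.norm(landmarks - pos, axis=1)
    r = d + sigma * rng.standard_normal(d.shape)
    while np.any(r < 0):
        bad = r < 0
        r[bad] = d[bad] + sigma * rng.standard_normal(bad.sum())
    return r


def map_objective(xx, yy, landmarks, r, sigma):
    """Simplified negative log posterior evaluated on a grid (smaller is better)."""
    obj = xx**2 / sigma_x**2 + yy**2 / sigma_y**2
    for (lx, ly), ri in zip(landmarks, r):
        dist = np.hypot(xx - lx, yy - ly)
        obj = obj + (ri - dist) ** 2 / sigma**2
    return obj


grid = np.linspace(-2, 2, 201)
xx, yy = np.meshgrid(grid, grid)

objectives = {}
for k in (1, 2, 3, 4):
    lm = landmarks_on_circle(k)
    r = measure_ranges(true_pos, lm, sigma_n, rng)
    objectives[k] = map_objective(xx, yy, lm, r, sigma_n)
    i, j = np.unravel_index(objectives[k].argmin(), xx.shape)
    print(f"K={k}: grid minimizer at ({xx[i, j]:.2f}, {yy[i, j]:.2f})")
```

To draw contours at the same function values for every K, one option is to build a single array of levels once and pass it as the `levels` argument of matplotlib's `contour` for each of the four grids in `objectives`.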
Supplement your plots with a brief description of how your code works. Comment on the
behavior of the MAP estimate of position (visually assessed from the contour plots; roughly center
of the innermost contour) relative to the true position. Does the MAP estimate get closer to the
true position as K increases? Does it get more certain? Explain how your contours justify your
conclusions.
Suggestion: For $\sigma_x$ and $\sigma_y$ consider values around 0.25, and for the noise variance values $\sigma_i^2$
consider values around 0.1 for posterior functions that are illustrative; you may choose different
values than what is suggested here, so make sure to specify what your values are in the numerical
results presented.
Note: The additive Gaussian distributed noise used in this question is actually not appropriate,
since it could lead to negative measurements, which are not legitimate for a proper distance sensor.
However, in this question, we will ignore this issue and proceed with this noise model for
the sake of illustration. In practice, a multiplicative log-normal distributed noise may be more
appropriate than an additive normal distributed noise.
Question 3 (35%)
We have two dimensional real-valued data (x, y) that is generated by the following procedure,
where all polynomial coefficients are real-valued and $v \sim \mathcal{N}(0, \sigma^2)$:
$$y = ax^3 + bx^2 + cx + d + v \tag{2}$$
Let $w = [a, b, c, d]^T$ be the parameter vector for this polynomial relationship. Given the knowledge of $\sigma$ and that the relationship between x and y is a cubic polynomial corrupted by additive
noise as shown above, iid samples $D = \{(x_1, y_1), \ldots, (x_N, y_N)\}$ generated by the procedure using the
true value of the parameters (say $w_{true}$), and a Gaussian prior $w \sim \mathcal{N}(0, \gamma^2 I)$, where I is the 4×4
identity matrix, determine the MAP estimate for the parameter vector.
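One way to arrive at a closed form is sketched below. The design matrix $\Phi$ (with $n$th row $[x_n^3, x_n^2, x_n, 1]$) and the stacked output vector $\mathbf{y} = [y_1, \ldots, y_N]^T$ are notation introduced here for illustration, not symbols from the question.

```latex
\begin{aligned}
w_{MAP} &= \arg\max_{w} \; \ln p(w) + \sum_{n=1}^{N} \ln p(y_n \mid x_n, w) \\
        &= \arg\min_{w} \; \frac{1}{2\sigma^2} \left\| \mathbf{y} - \Phi w \right\|_2^2
           + \frac{1}{2\gamma^2} \left\| w \right\|_2^2 \\
        &= \left( \Phi^T \Phi + \frac{\sigma^2}{\gamma^2} I \right)^{-1} \Phi^T \mathbf{y}
\end{aligned}
```

The last line follows from setting the gradient $\frac{1}{\sigma^2}\Phi^T(\Phi w - \mathbf{y}) + \frac{1}{\gamma^2} w$ to zero. As $\gamma \to \infty$ the regularization term vanishes and the expression reduces to the least-squares (ML) solution, provided $\Phi^T\Phi$ is invertible.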
Write code to generate N = 10 samples according to the model: draw iid x ∼ Uniform[−1,1]
and choose the true parameters to place the real roots (for simplicity) for the polynomial in the
interval [−1,1]. Pick a value for σ (that makes the noise level sufficiently large), and keep it
constant for the experiments. Repeat the following for different values of γ (note that as γ increases
the MAP estimates approach the ML estimate).
Generate samples of x and v, then determine the corresponding values of y. Given this particular
realization of the dataset D, for each value of γ, find the MAP estimate of the parameter vector and
calculate the squared L2 distance between the true parameter vector and this estimate.
For each value of γ perform at least 100 experiments, where the data is independently generated
according to the procedure, while keeping the true parameters fixed. Report the minimum, 25%,
median, 75%, and maximum values of these squared-error values, $\|w_{true} - w_{MAP}\|_2^2$, for the MAP
estimator for each value of γ in a single plot. How do these curves behave as this parameter for the
prior changes?
Note: Make sure to change gamma to cover a sufficiently broad range to see its effects at
multiple scales. To achieve this, you might want to select values for this hyperparameter as powers
of 10 with exponents linearly spaced from −B to +B, so that you cover the interval $[10^{-B}, 10^{B}]$ logarithmically.
Choose B > 0 well.
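The experiment described above might be organized as in the following Python/NumPy sketch. Everything specific here is an assumption made for illustration: the roots (−0.5, 0, 0.5) that define `w_true`, the noise level `sigma = 0.5`, the choice B = 3 with 13 values of γ, and the seed. The MAP estimate uses the ridge-type closed form that results from a Gaussian likelihood and the Gaussian prior on w.

```python
import numpy as np

rng = np.random.default_rng(2)           # arbitrary seed
sigma = 0.5                              # assumed noise standard deviation
w_true = np.poly([-0.5, 0.0, 0.5])       # [a, b, c, d] for roots -0.5, 0, 0.5


def design_matrix(x):
    """Rows [x^3, x^2, x, 1], matching the ordering w = [a, b, c, d]."""
    return np.vander(x, 4)


def map_estimate(x, y, sigma, gamma):
    """Closed-form MAP estimate under the prior w ~ N(0, gamma^2 I)."""
    phi = design_matrix(x)
    a = phi.T @ phi + (sigma**2 / gamma**2) * np.eye(4)
    return np.linalg.solve(a, phi.T @ y)


gammas = 10.0 ** np.linspace(-3, 3, 13)  # B = 3, exponents linearly spaced
n_trials, n_points = 100, 10

sq_errors = np.empty((len(gammas), n_trials))
for t in range(n_trials):
    # One independent dataset per trial, reused across all gamma values.
    x = rng.uniform(-1, 1, n_points)
    y = design_matrix(x) @ w_true + sigma * rng.standard_normal(n_points)
    for g, gamma in enumerate(gammas):
        w_map = map_estimate(x, y, sigma, gamma)
        sq_errors[g, t] = np.sum((w_true - w_map) ** 2)

# Rows: minimum, 25%, median, 75%, maximum; columns: one per gamma value.
summary = np.percentile(sq_errors, [0, 25, 50, 75, 100], axis=1)
for gamma, column in zip(gammas, summary.T):
    print(f"gamma={gamma:9.3g}  " + "  ".join(f"{v:10.4g}" for v in column))
```

Plotting the five rows of `summary` against `gammas` on logarithmic axes gives the single plot the question asks for. For very small γ the estimate is pulled to zero, so the squared error approaches the squared norm of `w_true`.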

EECE5644 – Exam 1
