CS 215 : Data Analysis and Interpretation Assignment : Bayesian Estimation

1. (10 points) Use the Matlab function randn() to generate a data sample of N points drawn from
a Gaussian distribution with mean $\mu_{\text{true}} = 10$ and standard deviation $\sigma_{\text{true}} = 4$. Consider the
problem of using the data to get an estimate $\hat{\mu}$ of this Gaussian mean, assuming it is unknown,
when the standard deviation $\sigma_{\text{true}}$ is known.
Consider using one of two prior distributions on the mean: (i) a Gaussian prior with mean
$\mu_{\text{prior}} = 10.5$ and standard deviation $\sigma_{\text{prior}} = 1$, and (ii) a uniform prior over [9.5, 11.5].
Consider various sample sizes $N = 5, 10, 20, 40, 60, 80, 100, 500, 10^3, 10^4$. For each sample
size N, repeat the following experiment $M \geq 100$ times: generate the data, get the maximum
likelihood estimate $\hat{\mu}_{\text{ML}}$, get the maximum-a-posteriori estimates $\hat{\mu}_{\text{MAP1}}$ and $\hat{\mu}_{\text{MAP2}}$, and measure
the relative errors $|\hat{\mu} - \mu_{\text{true}}| / \mu_{\text{true}}$ for all three estimates.
• Plot a single graph that shows the relative errors for each value of N as a box plot (use the
Matlab boxplot() function), for each of the three estimates.
• Interpret what you see in the graph. (i) What happens to the error as N increases? (ii) Which
of the three estimates would you prefer, and why?
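
The experiment in problem 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: it is written in Python rather than Matlab, with `random.gauss` standing in for `randn()` and a printed median standing in for the `boxplot()` figure. The function names `estimates` and `relative_errors` are hypothetical. The sketch relies on two standard results: with a Gaussian prior and known $\sigma_{\text{true}}$, the posterior is Gaussian, so its mode is the precision-weighted average of the sample mean and the prior mean; with a uniform prior, the posterior is the likelihood restricted to [9.5, 11.5], so its mode is the sample mean clipped to that interval.

```python
import random
import statistics

MU_TRUE, SIGMA_TRUE = 10.0, 4.0      # values from the problem statement
MU_PRIOR, SIGMA_PRIOR = 10.5, 1.0    # Gaussian prior (i)
LO, HI = 9.5, 11.5                   # support of the uniform prior (ii)


def estimates(data):
    """Return (ML, MAP under the Gaussian prior, MAP under the uniform prior)."""
    n = len(data)
    sample_mean = sum(data) / n
    # The ML estimate of a Gaussian mean is the sample mean.
    mu_ml = sample_mean
    # Gaussian prior times Gaussian likelihood: the posterior mode is a
    # precision-weighted average of the sample mean and the prior mean.
    var_of_mean = SIGMA_TRUE ** 2 / n
    mu_map1 = (sample_mean * SIGMA_PRIOR ** 2 + MU_PRIOR * var_of_mean) / (
        SIGMA_PRIOR ** 2 + var_of_mean
    )
    # Uniform prior: maximise the likelihood inside [LO, HI], i.e. clip.
    mu_map2 = min(max(sample_mean, LO), HI)
    return mu_ml, mu_map1, mu_map2


def relative_errors(n, m=100, rng=None):
    """Repeat the experiment m times for sample size n; collect relative errors."""
    if rng is None:
        rng = random.Random(0)  # fixed seed, chosen only for reproducibility
    errs = {"ML": [], "MAP1": [], "MAP2": []}
    for _ in range(m):
        data = [rng.gauss(MU_TRUE, SIGMA_TRUE) for _ in range(n)]
        for name, est in zip(errs, estimates(data)):
            errs[name].append(abs(est - MU_TRUE) / MU_TRUE)
    return errs


if __name__ == "__main__":
    for n in (5, 10, 20, 40, 60, 80, 100, 500, 10 ** 3, 10 ** 4):
        errs = relative_errors(n)
        # Medians only; the assignment asks for a box plot of the full lists.
        print(n, {k: round(statistics.median(v), 4) for k, v in errs.items()})
```

The per-estimate error lists returned by `relative_errors` are what a box plot would be drawn from, one box per estimate and per value of N.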
2. (10 points) Use the Matlab function rand() to generate a data sample of N points from the uniform
distribution on [0, 1]. Transform the resulting data x to generate a transformed data sample where
each datum $y := (-1/\lambda) \log(x)$ with $\lambda = 5$. The transformed data y will have some distribution
with parameter $\lambda$; what is its analytical form? Use a Gamma prior on the parameter $\lambda$, where the
Gamma distribution has parameters $\alpha = 5.5$ and $\beta = 1$.
Consider various sample sizes $N = 5, 10, 20, 40, 60, 80, 100, 500, 10^3, 10^4$. For each sample
size N, repeat the following experiment $M \geq 100$ times: generate the data, get the maximum
likelihood estimate $\hat{\lambda}_{\text{ML}}$, get the Bayesian estimate as the posterior mean $\hat{\lambda}_{\text{PosteriorMean}}$, and measure
the relative errors $|\hat{\lambda} - \lambda_{\text{true}}| / \lambda_{\text{true}}$ for both estimates.
• Derive a formula for the posterior mean.
• Plot a single graph that shows the relative errors for each value of N as a box plot (use the
Matlab boxplot() function), for both the estimates.
• Interpret what you see in the graph. (i) What happens to the error as N increases? (ii) Which
of the two estimates would you prefer, and why?
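
One way to approach problem 2 is sketched below, as an illustration under stated assumptions rather than a reference solution. For x uniform on [0, 1] and y = (−1/λ) log(x) with the natural logarithm, P(Y ≤ y) = P(X ≥ exp(−λy)) = 1 − exp(−λy), so y is exponentially distributed with rate λ. The sketch assumes the Gamma prior is parameterised by shape α and rate β; since β = 1 here, the rate and scale conventions give the same prior. Under that assumption the posterior is Gamma with shape α + N and rate β + Σy, so its mean is (α + N) / (β + Σy), while the ML estimate is N / Σy. The code is Python rather than Matlab, with `random.random` standing in for `rand()`; the function names are hypothetical.

```python
import math
import random
import statistics

LAMBDA_TRUE = 5.0        # value from the problem statement
ALPHA, BETA = 5.5, 1.0   # Gamma prior; BETA is treated as a rate (assumption)


def draw_sample(n, rng):
    """Transform uniform draws into exponential(LAMBDA_TRUE) draws."""
    # 1 - rng.random() lies in (0, 1], which avoids log(0) and is still uniform.
    return [-math.log(1.0 - rng.random()) / LAMBDA_TRUE for _ in range(n)]


def estimates(y):
    """Return (ML estimate, posterior-mean estimate) of the rate."""
    n, total = len(y), sum(y)
    lam_ml = n / total                          # maximises lambda^n * exp(-lambda * total)
    lam_post_mean = (ALPHA + n) / (BETA + total)  # mean of Gamma(ALPHA + n, BETA + total)
    return lam_ml, lam_post_mean


def relative_errors(n, m=100, rng=None):
    """Repeat the experiment m times for sample size n; collect relative errors."""
    if rng is None:
        rng = random.Random(0)  # fixed seed, chosen only for reproducibility
    errs = {"ML": [], "PosteriorMean": []}
    for _ in range(m):
        for name, est in zip(errs, estimates(draw_sample(n, rng))):
            errs[name].append(abs(est - LAMBDA_TRUE) / LAMBDA_TRUE)
    return errs


if __name__ == "__main__":
    for n in (5, 10, 20, 40, 60, 80, 100, 500, 10 ** 3, 10 ** 4):
        errs = relative_errors(n)
        # Medians only; the assignment asks for a box plot of the full lists.
        print(n, {k: round(statistics.median(v), 4) for k, v in errs.items()})
```

As N grows, N / Σy and (α + N) / (β + Σy) differ by less and less, because the fixed prior terms α and β are swamped by N and Σy.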
3. (10 points) Suppose random variable X has a uniform distribution over $[0, \theta]$, where the parameter
$\theta$ is unknown. Consider a Pareto distribution prior on $\theta$, with a scale parameter $\theta_m > 0$ and a
shape parameter $\alpha > 1$, as $P(\theta) \propto (\theta_m / \theta)^{\alpha}$ for $\theta \geq \theta_m$ and $P(\theta) = 0$ otherwise.
• Find the maximum-likelihood estimate $\hat{\theta}_{\text{ML}}$ and the maximum-a-posteriori estimate $\hat{\theta}_{\text{MAP}}$.
• Does $\hat{\theta}_{\text{MAP}}$ tend to $\hat{\theta}_{\text{ML}}$ as the sample size tends to infinity? Is this desirable or not?
• Find an estimator of the mean of the posterior distribution, $\hat{\theta}_{\text{PosteriorMean}}$.
• Does $\hat{\theta}_{\text{PosteriorMean}}$ tend to $\hat{\theta}_{\text{ML}}$ as the sample size tends to infinity? Is this desirable or not?
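
The estimators in problem 3 can be worked out along the following lines. This is a sketch under stated assumptions, not a reference solution: it assumes N independent samples $x_1, \dots, x_N$, and the symbols $x_{\max}$ and $c$ are shorthand introduced here, not notation from the assignment. The prior is taken exactly as stated, with density proportional to $\theta^{-\alpha}$ on $\theta \geq \theta_m$.

```latex
\begin{align*}
% Likelihood of N i.i.d. samples from U[0, \theta]; x_max is shorthand introduced here
L(\theta) &= \prod_{i=1}^{N} \frac{1}{\theta}\,\mathbb{1}[x_i \le \theta]
           = \theta^{-N}\,\mathbb{1}[\theta \ge x_{\max}],
           \qquad x_{\max} := \max_i x_i \\
% The likelihood decreases in \theta on the feasible set, so the smallest feasible value wins
\hat{\theta}_{\text{ML}} &= x_{\max} \\
% Posterior = likelihood times prior; c is shorthand introduced here
P(\theta \mid x_1, \dots, x_N) &\propto \theta^{-(N+\alpha)}\,\mathbb{1}[\theta \ge c],
           \qquad c := \max(\theta_m,\, x_{\max}) \\
% The posterior also decreases in \theta on its support
\hat{\theta}_{\text{MAP}} &= c = \max(\theta_m,\, x_{\max}) \\
% Posterior mean, finite when N + \alpha > 2
\hat{\theta}_{\text{PosteriorMean}}
  &= \frac{\int_{c}^{\infty} \theta^{\,1-(N+\alpha)}\, d\theta}
          {\int_{c}^{\infty} \theta^{-(N+\alpha)}\, d\theta}
   = \frac{c^{\,2-(N+\alpha)} / (N+\alpha-2)}{c^{\,1-(N+\alpha)} / (N+\alpha-1)}
   = \frac{N+\alpha-1}{N+\alpha-2}\; c
\end{align*}
```

The factor $(N+\alpha-1)/(N+\alpha-2)$ tends to 1 as N grows, so the limiting behaviour of both Bayesian estimates is governed by $c$. Whether $c$ eventually equals $x_{\max}$ depends on whether the samples ever exceed $\theta_m$, which is the case to examine when answering the two limit questions.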