CSE 544, Probability and Statistics for Data Science Assignment 4: Parametric Inference & Hypothesis Testing


1. Practice with MME (Total 10 points)
(a) The Gamma(x, y) distribution has mean $x \cdot y$ and variance $x \cdot y^2$. Find the MME $\hat{x}$ and $\hat{y}$. (4 points)
(b) Find the MME $\hat{a}$ and $\hat{b}$ for the Uniform(a, b) distribution. Express your final answer in terms of the
sample mean, $\bar{X} = (\sum X_i)/n$, and sample variance, $\overline{S^2} = ((\sum X_i^2)/n) - \bar{X}^2$. (6 points)
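A closed-form MME can be sanity-checked numerically. The sketch below is only an illustration, not the required derivation: it matches the first two sample moments for a Gamma and a Uniform sample, using the standard facts that Uniform(a, b) has mean (a+b)/2 and variance (b-a)^2/12. The true parameter values (2, 3) and (1, 5), the sample size, and the function names are all assumptions chosen for the demo.

```python
import math
import random

def sample_moments(xs):
    """Return the sample mean and the (1/n) sample variance from Q1(b)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum(x * x for x in xs) / n - mean * mean
    return mean, var

def gamma_mme(xs):
    """Solve mean = x*y and variance = x*y^2 for (x, y)."""
    mean, var = sample_moments(xs)
    return mean * mean / var, var / mean

def uniform_mme(xs):
    """Solve mean = (a+b)/2 and variance = (b-a)^2/12 for (a, b)."""
    mean, var = sample_moments(xs)
    half_width = math.sqrt(3.0 * var)
    return mean - half_width, mean + half_width

rng = random.Random(0)
# random.gammavariate(alpha, beta) has mean alpha*beta and variance alpha*beta^2.
gamma_sample = [rng.gammavariate(2.0, 3.0) for _ in range(100_000)]
uniform_sample = [rng.uniform(1.0, 5.0) for _ in range(100_000)]

x_hat, y_hat = gamma_mme(gamma_sample)
a_hat, b_hat = uniform_mme(uniform_sample)
print(f"Gamma MME:   x_hat={x_hat:.3f}, y_hat={y_hat:.3f}")
print(f"Uniform MME: a_hat={a_hat:.3f}, b_hat={b_hat:.3f}")
```

With a large sample, the estimates should land close to the parameters used to generate the data, which is a quick way to catch an algebra slip in a hand derivation.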
2. Consistency of MLE (Total 5 points)
Let $X_1, X_2, \ldots, X_n$ be distributed as Exponential(1/λ), all i.i.d. Show that the MLE $\hat{\lambda}$ will converge to the
unknown parameter λ. Prove this by showing that bias($\hat{\lambda}$) and se($\hat{\lambda}$) tend to 0 as n tends to ∞.
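A simulation can illustrate what the proof should establish, although it does not replace the proof. The sketch below assumes that Exponential(1/λ) means rate 1/λ, so that the mean is λ and the MLE of λ is the sample mean; if the course uses a different parameterization, the estimator changes. The value λ = 2, the sample sizes, and the number of repetitions are arbitrary choices.

```python
import random
import statistics

def mle_mean_param(xs):
    """MLE of the mean of an exponential sample (assumed parameterization): the sample mean."""
    return sum(xs) / len(xs)

def simulate(lam, n, reps, rng):
    """Empirical bias and standard error of the MLE over `reps` samples of size n."""
    estimates = [
        # random.expovariate takes the rate, so rate 1/lam gives mean lam.
        mle_mean_param([rng.expovariate(1.0 / lam) for _ in range(n)])
        for _ in range(reps)
    ]
    bias = statistics.fmean(estimates) - lam
    se = statistics.stdev(estimates)
    return bias, se

rng = random.Random(1)
results = {n: simulate(lam=2.0, n=n, reps=500, rng=rng) for n in (10, 100, 1000)}
for n, (bias, se) in results.items():
    print(f"n={n:5d}  bias={bias:+.4f}  se={se:.4f}")
```

The empirical standard error should shrink roughly like 1/sqrt(n), while the empirical bias stays near zero up to simulation noise.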
3. Practice with MLE (Total 11 points)
(a) Let $X_1, X_2, \ldots, X_n$ be distributed as Poisson(λ). Find the MLE of λ. (3 points)
(b) Let $X_1, X_2, \ldots, X_n$ be a sample from the distribution whose density function is given by
$f(x) = \frac{1}{2} e^{-|x-\theta|}$, $-\infty < x < \infty$. Determine the MLE of $\theta$ and comment on it. (4 points)
(c) Let $X_1, X_2, \ldots, X_n \sim$ Normal(θ, 1). Let $\delta = E[I_{X_1 > 0}]$. Use the Equivariance property to show that the
MLE of δ is $\varphi\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)$, where $\varphi()$ is the CDF of the standard Normal. You can use the MLE of the
Normal as derived in class. (4 points)
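The claim in part (c) can be checked numerically. For Normal(θ, 1), δ = P(X_1 > 0) = φ(θ), so the plug-in estimate φ(sample mean) should agree with both the true δ and the observed fraction of positive samples. The sketch below assumes a hypothetical true mean θ = 0.5 and a synthetic sample; neither comes from the assignment.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(2)
theta = 0.5  # hypothetical true mean, chosen only for this demo
xs = [rng.gauss(theta, 1.0) for _ in range(50_000)]

theta_hat = fmean(xs)                    # MLE of theta for Normal(theta, 1)
delta_hat = NormalDist().cdf(theta_hat)  # plug-in estimate via equivariance
delta_true = NormalDist().cdf(theta)     # P(X_1 > 0) = Phi(theta)
frac_positive = fmean(1.0 if x > 0 else 0.0 for x in xs)

print(f"delta_hat={delta_hat:.4f}  delta_true={delta_true:.4f}  "
      f"fraction positive={frac_positive:.4f}")
```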
4. Parametric Inference with Data Samples (Total 13 points)
Let $X = \begin{cases} 2 & \text{with prob } \theta \\ 3 & \text{otherwise} \end{cases}$, where $\theta$ is unknown. Let D = {2, 3, 2} be drawn i.i.d. from X.
(a) Derive $\hat{\theta}_{MME}$ using D as the sample data. Clearly show all your steps. (3 points)
(b) Derive $\widehat{se}(\hat{\theta})$ using estimates from part (a). Specifically, first derive $se(\hat{\theta})$ in terms of $\theta$, and then
estimate $\widehat{se}(\hat{\theta})$, as in class. Show all your steps clearly. (5 points)
(c) Derive $\hat{\theta}_{MLE}$ using D as the sample data. Clearly show all your steps. (5 points)
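A brute-force grid search over θ is one way to cross-check a hand-derived MLE for part (c). The sketch below evaluates the likelihood of D on a grid of step 0.001 and reports the maximizer. The grid resolution and the helper name `likelihood` are assumptions for illustration, and the result is only accurate to the grid spacing.

```python
D = [2, 3, 2]

def likelihood(theta, data):
    """P(data | theta) for i.i.d. draws with P(X=2) = theta and P(X=3) = 1 - theta."""
    value = 1.0
    for x in data:
        value *= theta if x == 2 else 1.0 - theta
    return value

# Candidate values 0.000, 0.001, ..., 1.000.
grid = [i / 1000 for i in range(1001)]
theta_grid = max(grid, key=lambda t: likelihood(t, D))
print(f"grid maximizer of the likelihood: {theta_grid:.3f}")
```

The analytic answer from setting the derivative of the log-likelihood to zero should match this value up to the grid spacing.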
5. MME versus MLE using real data (Total 14 points)
For this question, we will use the acceleration, model, and mpg data from the Auto-mpg dataset
(https://www.kaggle.com/uciml/autompg-dataset). Please use the data files on the class website. We
will assume that acceleration is Normal(μ, $\sigma^2$) distributed, model year is Uniform(a, b) distributed, and
mpg is Exponential(λ) distributed. You are to find the MME and MLE estimates of the parameters of the
distributions for all 3 datasets. For the Normal MME and Normal and Uniform MLE, you can directly use
the results from class. For the MME of Uniform, you can use the result from Q1. For the remaining cases
(Exponential MME and MLE), we will first derive the estimates.
(a) For the Exp(λ) distribution, find $\hat{\lambda}_{MME}$. (2 points)
(b) For the Exp(λ) distribution, find $\hat{\lambda}_{MLE}$. (2 points)
(c) For the 3 datasets, find the MME estimates. That is, find the MME for μ and $\sigma^2$ for the acceleration
dataset, a and b for the model dataset, and λ for the mpg dataset. Provide your answer as a number
with 3 significant digits. (4 points)
(d) Same as part (c), but this time find the MLE estimates. (4 points)
(e) Based on your answers for (c) and (d), can you comment on which is more accurate among MME
and MLE? This is an open‐ended question, so subjective arguments will suffice. (2 points)
Report the required results for (c) and (d) in the hardcopy, but submit the Python code for these parts via
the Google Form link for A4 (link on Piazza).
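The Python code for parts (c) and (d) could be organized along the following lines. This is a minimal sketch, not the expected submission: it covers only the Normal estimates and the Uniform MLE, reads no files because the data file names are not given here, and runs on small made-up lists instead of the real columns. The function names and the `sig3` formatter are assumptions.

```python
def normal_estimates(xs):
    """Sample mean and (1/n) sample variance: the Normal MME, which has
    the same form as the Normal MLE."""
    n = len(xs)
    mu = sum(xs) / n
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma2

def uniform_mle(xs):
    """MLE of Uniform(a, b): the sample minimum and maximum."""
    return min(xs), max(xs)

def sig3(value):
    """Format a number with 3 significant digits (large values use exponent form)."""
    return f"{value:.3g}"

# Made-up stand-ins for the acceleration and model-year columns.
acceleration = [12.0, 11.5, 15.5, 19.0, 14.5]
model_year = [70, 71, 72, 76, 82]

mu_hat, sigma2_hat = normal_estimates(acceleration)
a_hat, b_hat = uniform_mle(model_year)
print("Normal:", sig3(mu_hat), sig3(sigma2_hat))
print("Uniform MLE:", sig3(a_hat), sig3(b_hat))
```

The Exponential estimates are left out on purpose, since parts (a) and (b) ask for them to be derived first.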
6. More on Hypothesis testing (Total 14 points)
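(a) Suppose the null hypothesis is H0: θ = θ0, but the true value of θ is θ*. Show that, under Wald's test,
the probability of a Type II error is $\varphi\left(\frac{\theta_0 - \theta^*}{\widehat{se}} + z_{\alpha/2}\right) - \varphi\left(\frac{\theta_0 - \theta^*}{\widehat{se}} - z_{\alpha/2}\right)$.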
(Hints: (i) might help to draw a figure; (ii) think about the distribution of the estimate.) (6 points)
(b) You will need q6_X.dat and q6_Y.dat available at the class website for this question. Each contains
1000 samples for X and Y drawn from two independent Normal distributions. In the following, test
whether the population means of X and Y are the same (null) or not (alternative). Use Wald's 2-population
test with α = 0.05. Is this test applicable here? (4 points)
(c) Assume X and Y are dependent. Repeat part (b), but use the paired t-test with the α = 0.05 threshold of
1.962. Is this test applicable here? (4 points)
Report the required results for (b) and (c) in the hardcopy, but submit the Python code for these parts via
the Google Form link for A4 (link on Piazza).
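The two test statistics in parts (b) and (c) can be sketched as below. Several things here are assumptions: the format of q6_X.dat and q6_Y.dat is not stated, so synthetic Normal samples stand in for them; the variances use the (n-1) denominator from `statistics.variance`, which may differ slightly from the estimator used in class; and 1.96 is taken as $z_{\alpha/2}$ for α = 0.05.

```python
import math
import random
from statistics import fmean, variance

def wald_two_sample(xs, ys):
    """W = (mean(X) - mean(Y)) / sqrt(s_X^2/n + s_Y^2/m)."""
    se = math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    return (fmean(xs) - fmean(ys)) / se

def paired_t(xs, ys):
    """T = mean(D) / (s_D / sqrt(n)), where D_i = X_i - Y_i."""
    if len(xs) != len(ys):
        raise ValueError("paired test needs equal-length samples")
    diffs = [x - y for x, y in zip(xs, ys)]
    return fmean(diffs) / math.sqrt(variance(diffs) / len(diffs))

rng = random.Random(3)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # synthetic stand-in for q6_X.dat
ys = [rng.gauss(0.5, 1.0) for _ in range(1000)]  # synthetic stand-in for q6_Y.dat

w = wald_two_sample(xs, ys)
t = paired_t(xs, ys)
print("Wald:", "reject" if abs(w) > 1.96 else "fail to reject", round(w, 3))
print("paired t:", "reject" if abs(t) > 1.962 else "fail to reject", round(t, 3))
```

The two statistics differ only in how the standard error is formed: the Wald version treats the samples as independent, while the paired version works on the per-index differences.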
7. Hypothesis Testing for a single population (Total 8 points)
(a) Consider the following 10 samples: {1.87, 1.29, 2.01, 0.93, 1.02, 2.78, 2.33, 1.65, 0.50, 0.99}.
Assuming that the 10 samples are normally distributed, use the t-test to decide whether to reject the null hypothesis
that the population mean is 1.5. Use the α = 0.05 threshold of 2.228 to Reject/Accept. (3 points)
(b) You observe 46 successes in 100 trials of a coin. If the null hypothesis is that the coin is unbiased,
use Wald's test with the MLE/MME with α = 0.05 to Reject/Accept the null. What if the null
hypothesis is that the coin has p=0.7? (5 points)
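Both statistics in this question reduce to a few lines of arithmetic. The sketch below assumes the usual forms: the t statistic uses the (n-1) sample standard deviation, and the Wald statistic for the coin plugs the estimate of p into the standard error, with 1.96 as $z_{\alpha/2}$. The function names are illustrative only.

```python
import math
from statistics import fmean, stdev

def t_statistic(xs, mu0):
    """T = (mean - mu0) / (s / sqrt(n)), with s the (n-1) sample standard deviation."""
    return (fmean(xs) - mu0) / (stdev(xs) / math.sqrt(len(xs)))

def wald_bernoulli(successes, trials, p0):
    """W = (p_hat - p0) / se_hat, with se_hat = sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / trials
    se_hat = math.sqrt(p_hat * (1.0 - p_hat) / trials)
    return (p_hat - p0) / se_hat

samples = [1.87, 1.29, 2.01, 0.93, 1.02, 2.78, 2.33, 1.65, 0.50, 0.99]
t = t_statistic(samples, mu0=1.5)
print("t-test:", "reject" if abs(t) > 2.228 else "fail to reject", round(t, 3))

for p0 in (0.5, 0.7):
    w = wald_bernoulli(46, 100, p0)
    print(f"Wald, p0={p0}:", "reject" if abs(w) > 1.96 else "fail to reject", round(w, 3))
```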