DDA 4010 – Bayesian Statistics Exercise Sheet 2

Assignment A2.1 (3.9 in Textbook):

Galenshore distribution: An unknown quantity $Y$ has a Galenshore$(a, \theta)$ distribution if its density
is given by
$$p(y) = \frac{2}{\Gamma(a)} \theta^{2a} y^{2a-1} e^{-\theta^2 y^2}$$
for $y > 0$, $\theta > 0$ and $a > 0$. Assume for now that $a$ is known. For this density,
$$E[Y] = \frac{\Gamma(a + 1/2)}{\theta \, \Gamma(a)}, \qquad E[Y^2] = \frac{a}{\theta^2}.$$
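
Before working with this density it can help to check it numerically. The following is a minimal Python sketch (the sheet itself asks for R only in Question 4.8) that evaluates the Galenshore density and compares numerical integrals against the stated moments. The function names and the values a = 2, theta = 1.5 are illustrative assumptions, not part of the exercise.

```python
import math

def galenshore_pdf(y, a, theta):
    """Galenshore(a, theta) density at y, computed on the log scale."""
    if y <= 0:
        return 0.0
    log_p = (math.log(2.0) - math.lgamma(a)
             + 2.0 * a * math.log(theta)
             + (2.0 * a - 1.0) * math.log(y)
             - (theta * y) ** 2)
    return math.exp(log_p)

def galenshore_mean(a, theta):
    """Closed form E[Y] = Gamma(a + 1/2) / (theta * Gamma(a))."""
    return math.exp(math.lgamma(a + 0.5) - math.lgamma(a)) / theta

def integrate(f, lo, hi, steps=20000):
    """Plain midpoint rule, adequate for a sanity check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a, theta = 2.0, 1.5  # illustrative values, not from the exercise
total = integrate(lambda y: galenshore_pdf(y, a, theta), 0.0, 10.0)
mean = integrate(lambda y: y * galenshore_pdf(y, a, theta), 0.0, 10.0)
second = integrate(lambda y: y * y * galenshore_pdf(y, a, theta), 0.0, 10.0)

print(total, mean, galenshore_mean(a, theta), second, a / theta ** 2)
```

Evaluating `galenshore_pdf` on a grid of points for several parameter pairs gives the curves needed for a plot.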

• Identify a class of conjugate prior densities for $\theta$. Plot a few members of this class of
densities.
• Let $Y_1, \ldots, Y_n \sim$ i.i.d. Galenshore$(a, \theta)$. Find the posterior distribution of $\theta$ given
$Y_1, \ldots, Y_n$, using a prior from your conjugate class.
• Write down $p(\theta_a \mid Y_1, \ldots, Y_n) / p(\theta_b \mid Y_1, \ldots, Y_n)$ and simplify. Identify a sufficient statistic.
• Determine $E[\theta \mid Y_1, \ldots, Y_n]$.
• Determine the form of the posterior predictive density $p(\tilde{Y} \mid Y_1, \ldots, Y_n)$.
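
A possible starting point for these parts is to write the likelihood as a function of $\theta$ alone. This is only a sketch; the symbols $c$ and $d$ below are hypothetical names for prior parameters and do not appear in the exercise.

$$p(y_1, \ldots, y_n \mid \theta) = \prod_{i=1}^{n} \frac{2}{\Gamma(a)} \theta^{2a} y_i^{2a-1} e^{-\theta^2 y_i^2} \propto \theta^{2an} \exp\left(-\theta^2 \sum_{i=1}^{n} y_i^2\right)$$

Viewed as a function of $\theta$, the right-hand side has the form $\theta^{2c-1} e^{-d^2 \theta^2}$, which is the kernel of a Galenshore$(c, d)$ density in $\theta$. The data enter only through $\sum_{i} y_i^2$.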

Assignment A2.2 (3.13 in Textbook):

Improper Jeffreys' prior: Let $Y \sim$ Poisson$(\theta)$.
• Apply Jeffreys' procedure to this model, and compare the result to the family of gamma
densities. Does Jeffreys' procedure produce an actual probability density for $\theta$? In other
words, can $\sqrt{I(\theta)}$ be proportional to an actual probability density for $\theta \in (0, \infty)$?

• Obtain the form of the function $f(\theta, y) = \sqrt{I(\theta)} \times p(y \mid \theta)$. What probability density for $\theta$
is $f(\theta, y)$ proportional to? Can we think of $f(\theta, y) / \int f(\theta, y) \, d\theta$ as a posterior density of $\theta$
given $Y = y$?
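
The first step of Jeffreys' procedure, computing the Fisher information of the Poisson model, can be sketched as follows. This uses the standard definition $I(\theta) = -E[\partial^2 \log p(Y \mid \theta) / \partial \theta^2 \mid \theta]$, which is assumed here rather than quoted from the sheet.

$$\log p(y \mid \theta) = y \log \theta - \theta - \log y!, \qquad \frac{\partial^2}{\partial \theta^2} \log p(y \mid \theta) = -\frac{y}{\theta^2}$$

$$I(\theta) = \frac{E[Y \mid \theta]}{\theta^2} = \frac{1}{\theta}, \qquad \sqrt{I(\theta)} = \theta^{-1/2}$$

The result can then be set beside the gamma kernel $\theta^{a-1} e^{-b\theta}$ to answer the questions above.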

Assignment A2.3 (4.2 in Textbook):

Tumor counts: A cancer laboratory is estimating the rate of tumorigenesis in two strains of mice,
A and B. They have tumor count data for 10 mice in strain A and 13 mice in strain B. Type A
mice have been well studied, and information from other laboratories suggests that type A mice
have tumor counts that are approximately Poisson-distributed with a mean of 12. Tumor count
rates for type B mice are unknown, but type B mice are related to type A mice. The observed
tumor counts for the two populations are
$y_A = (12, 9, 12, 14, 13, 13, 15, 8, 15, 6)$
$y_B = (11, 11, 10, 9, 9, 8, 7, 10, 6, 8, 8, 9, 7)$

• For the prior distribution given in part a) of that exercise (Exercise 3.3 in the textbook),
obtain $\Pr(\theta_B < \theta_A \mid y_A, y_B)$ via Monte Carlo sampling.
• For a range of values of $n_0$, obtain $\Pr(\theta_B < \theta_A \mid y_A, y_B)$ for $\theta_A \sim$ gamma$(120, 10)$ and
$\theta_B \sim$ gamma$(12 \times n_0, n_0)$. Describe how sensitive the conclusions about the event $\{\theta_B < \theta_A\}$
are to the prior distribution on $\theta_B$.

• Repeat parts a) and b), replacing the event $\{\theta_B < \theta_A\}$ with the event $\{\tilde{Y}_B < \tilde{Y}_A\}$, where
$\tilde{Y}_A$ and $\tilde{Y}_B$ are samples from the posterior predictive distribution.
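
The Monte Carlo steps in this assignment can be sketched as below. This is an illustrative Python outline under stated assumptions, not a reference solution: it relies on the gamma-Poisson conjugate update (a gamma(shape, rate) prior with Poisson counts gives a gamma(shape + sum of y, rate + n) posterior), it treats n0 = 1 as the part a) prior for strain B, and the helper names, seed, and choice of n0 values are all hypothetical.

```python
import math
import random

y_a = [12, 9, 12, 14, 13, 13, 15, 8, 15, 6]
y_b = [11, 11, 10, 9, 9, 8, 7, 10, 6, 8, 8, 9, 7]

def posterior_params(shape, rate, y):
    """Gamma(shape, rate) prior plus Poisson data gives a gamma posterior."""
    return shape + sum(y), rate + len(y)

def draw_gamma(rng, shape, rate):
    # random.gammavariate expects a scale parameter, so pass 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate)

def draw_poisson(rng, mean):
    """Knuth's multiplication method; fine for the moderate means seen here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def compare(n0, draws=20000, seed=1):
    """Estimate Pr(theta_B < theta_A | y) and Pr(Ytilde_B < Ytilde_A | y)."""
    rng = random.Random(seed)
    sa, ra = posterior_params(120, 10, y_a)
    sb, rb = posterior_params(12 * n0, n0, y_b)
    theta_hits = pred_hits = 0
    for _ in range(draws):
        ta, tb = draw_gamma(rng, sa, ra), draw_gamma(rng, sb, rb)
        theta_hits += tb < ta
        # One predictive draw per posterior draw of each theta.
        pred_hits += draw_poisson(rng, tb) < draw_poisson(rng, ta)
    return theta_hits / draws, pred_hits / draws

# n0 values chosen only for illustration.
results = {n0: compare(n0) for n0 in (1, 10, 50)}
for n0, (p_theta, p_pred) in results.items():
    print(n0, round(p_theta, 3), round(p_pred, 3))
```

Printing the two estimates side by side for each n0 makes the sensitivity to the prior on theta_B directly visible.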

Assignment A2.4 (4.8 in Textbook):

More posterior predictive checks: Let $\theta_A$ and $\theta_B$ be the average number of children of men in
their 30s with and without bachelor's degrees, respectively.

• Using a Poisson sampling model, a gamma$(2, 1)$ prior for each $\theta$ and the data in the
files menchild30bach.dat and menchild30nobach.dat, obtain 5,000 samples of $\tilde{Y}_A$ and
$\tilde{Y}_B$ from the posterior predictive distribution of the two samples. Plot the Monte Carlo
approximations to these two posterior predictive distributions.

• For the moment, suppose you believed that $\theta \in \{0.0, 0.1, \ldots, 0.9, 1.0\}$. Given that the results
of the survey were $\sum_{i=1}^{100} Y_i = 57$, compute $\Pr(\sum Y_i = 57 \mid \theta)$ for each of these 11 values of $\theta$
and plot these probabilities as a function of $\theta$.

• Find 95% quantile-based posterior confidence intervals for $\theta_B - \theta_A$ and $\tilde{Y}_B - \tilde{Y}_A$. Describe
in words the differences between the two populations using these quantities and the plots in
a), along with any other results that may be of interest to you.

• Obtain the empirical distribution of the data in group B. Compare this to the Poisson
distribution with mean $\hat{\theta} = 1.4$. Do you think the Poisson model is a good fit? Why or why
not?

• For each of the 5,000 $\theta_B$-values you sampled, sample $n_B = 218$ Poisson random variables
and count the number of 0s and the number of 1s in each of the 5,000 simulated datasets.

You should now have two sequences of length 5,000 each, one sequence counting the number
of people having zero children for each of the 5,000 posterior predictive datasets, the other
counting the number of people with one child. Plot the two sequences against one another
(one on the x-axis, one on the y-axis). Add to the plot a point marking how many people in
the observed dataset had zero children and one child. Using this plot, describe the adequacy
of the Poisson model.
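
The simulation steps of this assignment can be outlined as below. The sheet asks for R code, so this Python version is only a sketch of the logic (in R, `rgamma` and `rpois` would play the role of the two samplers). It assumes the gamma-Poisson conjugate update, and the list `y_toy` is a made-up placeholder used solely to make the example run; it is NOT the content of menchild30nobach.dat. All function names are hypothetical.

```python
import math
import random

def draw_poisson(rng, mean):
    """Knuth's multiplication method for a Poisson draw with small mean."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def predictive_check(y, prior_shape=2.0, prior_rate=1.0, draws=5000, seed=1):
    """For each posterior draw of theta, simulate a dataset of len(y) counts
    and record how many simulated counts equal 0 and how many equal 1."""
    rng = random.Random(seed)
    shape, rate = prior_shape + sum(y), prior_rate + len(y)
    thetas, zeros, ones = [], [], []
    for _ in range(draws):
        theta = rng.gammavariate(shape, 1.0 / rate)  # second argument is a scale
        sim = [draw_poisson(rng, theta) for _ in range(len(y))]
        thetas.append(theta)
        zeros.append(sim.count(0))
        ones.append(sim.count(1))
    return thetas, zeros, ones

def quantile_interval(samples, level=0.95):
    """Quantile-based interval taken from sorted Monte Carlo samples."""
    s = sorted(samples)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi

# Placeholder counts for illustration only; NOT the real group B data.
y_toy = [0, 0, 1, 2, 0, 1, 3, 2, 0, 1, 4, 2]
thetas, zeros, ones = predictive_check(y_toy, draws=2000)
interval = quantile_interval(thetas)
observed = (y_toy.count(0), y_toy.count(1))
print(interval, observed)
```

With the real data loaded in place of `y_toy`, `zeros` and `ones` are the two sequences to plot against one another, and `observed` is the point to overlay on that plot.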

Please also submit the R code for Question 4.8. The data can be found in the attachment.
Sheet 2 is due on Oct. 21st; submit your solutions before 5:00 pm that day.