STAT 5701 Homework 1


1. In this question, you will create a rejection sampling algorithm to generate a realization of a random variable with density $f$ defined by

$$f(x) = \begin{cases} \frac{3}{4}(1 - x^2) & \text{if } x \in (-1, 1), \\ 0 & \text{otherwise.} \end{cases}$$

(a) Create this rejection sampling algorithm using Unif(−1, 1) as the trial/proposal distribution. Simplify and state the steps of this algorithm.

(b) Write an R function called rquad that generates a realization of n independent copies of the random variable $X$ with density $f$. This function must use the algorithm created in part 1a. This function has one argument n, which is the sample size. This function returns a list with two elements:
• x.list, a vector of n entries, where the ith entry is the realization of the ith independent copy of $X$.
• k.list, a vector of n entries, where the ith entry is the number of iterations of the rejection sampling algorithm required to produce the realization of the ith independent copy of $X$.

(c) Test rquad by generating a realization of $X_1, \dots, X_{1000}$ iid with density $f$. Create a histogram of these measurements. How many iterations, on average, were required to produce each realization?

2. (a) Write an R function called myrtnorm that generates a realization of the sequence of independent random variables $T_1, \dots, T_n$, where each $T_i$ has the truncated Normal distribution: $T_i \sim (X \mid a < X < b)$, where $X \sim N(\mu, \sigma^2)$ and $a$, $b$ are non-random. A rejection sampling algorithm should be used in combination with the Box–Muller method. Only the standard uniform random generator runif is allowed. This function should have five arguments:
• n, the random sample size
• mu, the value of $\mu$
• sigma, the value of $\sigma$
• a, the value of $a$ (the left endpoint of the interval or -Inf)
• b, the value of $b$ (the right endpoint of the interval or Inf).
This function should return a vector of n entries with the generated realization of $T_1, \dots, T_n$.
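One possible shape for the sampler described in part 2(a) is sketched below. It is an illustration under stated assumptions, not a reference solution: the helper name `box_muller`, the default values `a = -Inf` and `b = Inf`, and the input checks are all choices made here, not requirements from the assignment. The sketch proposes one Normal candidate at a time and keeps it only if it falls strictly inside (a, b).

```r
# Hypothetical helper: m standard Normal draws built only from runif (Box-Muller).
box_muller <- function(m) {
  half <- ceiling(m / 2)
  r <- sqrt(-2 * log(runif(half)))   # radius from one uniform
  theta <- 2 * pi * runif(half)      # angle from an independent uniform
  z <- c(r * cos(theta), r * sin(theta))
  z[seq_len(m)]                      # drop the spare value when m is odd
}

# Sketch of myrtnorm: rejection sampling with N(mu, sigma^2) proposals.
# The defaults for a and b are an assumption made for this example.
myrtnorm <- function(n, mu, sigma, a = -Inf, b = Inf) {
  stopifnot(n >= 1, sigma > 0, a < b)
  out <- numeric(n)
  for (i in seq_len(n)) {
    repeat {
      x <- mu + sigma * box_muller(1)  # one candidate per iteration
      if (x > a && x < b) break        # accept only inside (a, b)
    }
    out[i] <- x
  }
  out
}

set.seed(5701)
t_draws <- myrtnorm(500, mu = 0, sigma = 1, a = -1, b = 2)
print(range(t_draws))
```

Generating a single candidate per iteration discards the second Box–Muller value each time, so this version favors readability over speed; a vectorized variant that proposes candidates in batches would waste fewer uniforms.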
(b) Derive an expression for the expected number of iterations of the rejection sampling algorithm required to generate a realization of $T_1$ in terms of $\mu$, $\sigma$, $a$ and $b$.

(c) Fix $n = 500$ and pick values of $\mu$, $\sigma$, $a$ and $b$ so that the expected number of iterations of the rejection sampling algorithm required to generate a realization of $T_1$ is less than 10. Using these values, use myrtnorm to generate a realization of $T_1, \dots, T_{500}$ and produce a histogram.

3. (a) Write an R function called myrexp that generates a realization of n independent random variables $X_1, \dots, X_n$, where $X_i$ has the exponential distribution with mean $\mu > 0$ for $i = 1, \dots, n$. In other words, generate a realization of a random sample of size n from Exp($\mu$). Only calls to R's standard uniform generator runif are permitted; for example, calling rexp is not allowed. This function should have two arguments:
• n, the random sample size
• mu, the user-specified mean of the exponential distribution
This function should return a vector of n entries with the generated realization of $X_1, \dots, X_n$.

(b) Test myrexp by generating a realization of a random sample of size 1000 from the Exponential distribution with some mean $\mu$ that you pick. Create a QQ–plot to compare the data percentiles (of the realization of the random sample) to the percentiles of Exp($\mu$). Calling the function qexp is not allowed here.

(c) Let $Y_1, \dots, Y_n$ be independent copies of $Y \sim$ Exp($\mu$) and define $\bar{Y} = n^{-1} \sum_{i=1}^{n} Y_i$. Write an R function called run.exp.sim that generates a realization of reps independent copies of $\bar{Y}$ with sample size n. This function should have three arguments:
• n, the sample size
• mu, the mean of the exponential distribution
• reps, the number of realizations of $\bar{Y}$
The function should return a vector of reps realizations of $\bar{Y}$ and display a Normal QQ–plot of the entries in this vector.

(d) A civil engineer measured the times between vehicle arrivals at a rural bridge on a Sunday afternoon.
For a simple model, she assumes that her measured inter-arrival times (in minutes) $x_1, \dots, x_{30}$ are a realization of a random sample from the exponential distribution with unknown mean $\mu$. She computes the observed sample mean $\bar{x} = (1/30) \sum_{i=1}^{30} x_i$ to estimate $\mu$. Is this sample size of 30 large enough for $\bar{x}$ to be a realization of a random variable with a distribution well approximated by the Normal distribution? To respond, pretend that $\mu = 3.4$ minutes and perform a simulation study using the function run.exp.sim. Comment on the result.

4. (a) Write an R function called mymvrnorm that generates a realization of the sequence of independent random vectors $Y_1, \dots, Y_n$, where $Y_i \sim N_p(\mu, \Sigma)$ for $i = 1, \dots, n$. Only calls to R's standard uniform generator runif are permitted. This function should have three arguments:
• n, the random sample size
• mu, the mean vector with p entries
• Sigma, the covariance matrix $\Sigma \in \mathbb{S}^p_0$.
This function should return a matrix with n rows and p columns, where the ith row has the realization of $Y_i$. The eigen function should be called in your definition of mymvrnorm.

(b) Suppose that we plan to measure the heights of n individuals.
i. Suppose that the yet-to-be measured heights $X_1, \dots, X_n$ are a random sample from $N(\mu, \sigma^2)$. Compute the mean and variance of $\bar{X} = n^{-1} \sum_{i=1}^{n} X_i$ (the standard estimator of $\mu$). Express these in terms of $\mu$, $\sigma$ and $n$.
ii. Suppose that the yet-to-be measured heights will be of individuals in the same family. Let $(H_1, \dots, H_n)'$ be these yet-to-be measured heights and suppose they have an n-variate normal distribution with mean vector $(\mu, \dots, \mu)' \in \mathbb{R}^n$ and covariance matrix $\Sigma \in \mathbb{S}^n_0$ with (i, j)th entry $\Sigma_{ij} = \sigma^2 \cdot 0.7^{|i-j|}$ for $(i, j) \in \{1, \dots, n\} \times \{1, \dots, n\}$. Compute the mean and variance of $\bar{H} = n^{-1} \sum_{i=1}^{n} H_i$ (the standard estimator of $\mu$). Express these in terms of $\mu$, $\sigma$ and $n$.
iii. Is $\bar{H}$ better or worse than $\bar{X}$ as an estimator of $\mu$? Explain.
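As a hedged outline of the calculation behind part 4(b), the moments of a sample mean follow from linearity of expectation and from the fact that the variance of a sum equals the sum of all entries of the covariance matrix. The index $k = |i - j|$ below is introduced only for this sketch; there are $2(n - k)$ pairs $(i, j)$ at each lag $k \geq 1$.

```latex
\begin{align*}
E(\bar{X}) &= \mu, &
\operatorname{Var}(\bar{X}) &= \frac{\sigma^2}{n}, \\
E(\bar{H}) &= \frac{1}{n}\sum_{i=1}^{n} E(H_i) = \mu, &
\operatorname{Var}(\bar{H}) &= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\Sigma_{ij}
  = \frac{\sigma^2}{n^2}\Bigl[\, n + 2\sum_{k=1}^{n-1}(n-k)\,0.7^{k} \Bigr].
\end{align*}
```

Both estimators are unbiased under these assumptions. Every term in the lag sum is positive, so for $n \geq 2$ the bracket exceeds $n$ and $\operatorname{Var}(\bar{H}) > \sigma^2/n$.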
(c) Design and perform a simulation study that compares the formulas for the mean and variance of $\bar{X}$ and $\bar{H}$ derived in part 4b to their corresponding simulated estimates. You should use mymvrnorm. Set $n = 10$, $\mu = 68$, and $\sigma = 3$.
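A simulation along the lines of part 4(c) might look like the sketch below. It is one possible design, not the expected answer: the helper `rnorm_bm`, the replication count `reps = 20000`, the seed, and the use of the symmetric square root of Σ are assumptions made for illustration. Note that inside the study each row returned by mymvrnorm is one simulated family of n = 10 heights, so the function's first argument is the number of replications and the vector length p equals n.

```r
# Hypothetical helper: m standard Normal draws from runif only (Box-Muller).
rnorm_bm <- function(m) {
  half <- ceiling(m / 2)
  r <- sqrt(-2 * log(runif(half)))
  theta <- 2 * pi * runif(half)
  c(r * cos(theta), r * sin(theta))[seq_len(m)]
}

# Sketch of mymvrnorm: rows are draws from N_p(mu, Sigma).
mymvrnorm <- function(n, mu, Sigma) {
  p <- length(mu)
  e <- eigen(Sigma, symmetric = TRUE)
  # Symmetric square root V diag(sqrt(lambda)) V'; pmax guards tiny negative eigenvalues.
  root <- e$vectors %*% diag(sqrt(pmax(e$values, 0)), p) %*% t(e$vectors)
  z <- matrix(rnorm_bm(n * p), nrow = n, ncol = p)
  sweep(z %*% root, 2, mu, "+")      # add mu to every row
}

set.seed(5701)
n <- 10; mu <- 68; sigma <- 3
reps <- 20000                        # assumed number of replications
Sigma_X <- sigma^2 * diag(n)                            # independent heights
Sigma_H <- sigma^2 * 0.7^abs(outer(1:n, 1:n, "-"))      # same-family heights

xbar <- rowMeans(mymvrnorm(reps, rep(mu, n), Sigma_X))
hbar <- rowMeans(mymvrnorm(reps, rep(mu, n), Sigma_H))

# Var(sample mean) = (sum of all covariance entries) / n^2.
var_theory <- c(xbar = sum(Sigma_X), hbar = sum(Sigma_H)) / n^2
var_sim <- c(xbar = var(xbar), hbar = var(hbar))
mean_sim <- c(xbar = mean(xbar), hbar = mean(hbar))
print(rbind(var_theory, var_sim))
print(mean_sim)
```

Comparing the two rows of the printed variance table, and the simulated means against µ = 68, gives the check the question asks for; increasing `reps` tightens the agreement at the cost of run time.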