## Description

1. Suppose $X_1, X_2, \ldots, X_n$ ($n \ge 2$) are independent and identically distributed (iid) with a Uniform$[\theta - \tfrac{1}{2}, \theta + \tfrac{1}{2}]$ distribution for some unknown $-\infty < \theta < \infty$, i.e., the $X_i$'s have density

$$f_\theta(x) = \begin{cases} 1, & \text{if } \theta - \tfrac{1}{2} \le x \le \theta + \tfrac{1}{2}; \\ 0, & \text{otherwise.} \end{cases}$$

It is desired to guess the value of $\theta$ under the loss function $L(\theta, d) = (\theta - d)^2$ based on the observed data $X = (X_1, \ldots, X_n)$. The purpose of this question is to show that the sample mean is inadmissible.

(a) Specify $S$, $\Omega$, $D$, and $L$ (i.e., the sample space, the set of all possible distribution functions, the decision space, and the loss function).

(b) Find the risk function of the procedure $\delta_0(X) = \bar{X}_n = (X_1 + \cdots + X_n)/n$, which is the so-called method-of-moments estimator.

(c) Prove that $T = (X_{(1)}, X_{(n)})$ is a sufficient statistic for $\theta$, where $X_{(1)} = \min(X_1, \ldots, X_n)$ and $X_{(n)} = \max(X_1, \ldots, X_n)$ are the sample minimum and maximum.

(d) While $T = (X_{(1)}, X_{(n)})$ gives all the information about $\theta$, $T$ itself is not a statistical procedure for estimating $\theta$, since a point estimator of $\theta$ must take on real values. To produce point estimators from the sufficient statistic $T$, let us consider a family of procedures of the form

$$\delta_{a,b}(X) = a X_{(1)} + (1 - a) X_{(n)} + b$$

for some real-valued constants $a, b$. Show that the risk function of $\delta_{a,b}(X)$ is given by

$$R_{\delta_{a,b}}(\theta) = \left[ a \, \frac{1}{n+1} + (1 - a) \, \frac{n}{n+1} + b - \frac{1}{2} \right]^2 + \frac{a^2 n + (1 - a)^2 n + 2a(1 - a)}{(n+1)^2 (n+2)},$$

which is minimized at $b = \frac{1}{2} - \frac{a + n(1 - a)}{n+1}$ for any given constant $a$.

(e) Among all procedures $\delta_{a,b}(X)$ in part (d), show that the choice $a = \frac{1}{2}$ and $b = \frac{1}{2} - \frac{a + n(1 - a)}{n+1} = 0$, i.e., $\delta^*(X) = (X_{(1)} + X_{(n)})/2$, gives uniformly smallest risk function.

(f) Prove that when $n \ge 3$, the procedure $\delta^*(X) = (X_{(1)} + X_{(n)})/2$ in part (e) is better than $\delta_0(X) = \bar{X}_n$, and conclude that $\delta_0(X) = \bar{X}_n$ is inadmissible when $n \ge 3$.
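As a numerical sanity check on parts (d) through (f), the following minimal sketch evaluates the risk formula stated in part (d) in exact rational arithmetic and compares it with the risk of the sample mean. The function names are illustrative, and the value $1/(12n)$ used for the sample mean is an assumption based on the standard fact that a uniform distribution on an interval of length 1 has variance $1/12$; it is not a substitute for the derivation asked for in part (b).

```python
from fractions import Fraction


def risk_delta_ab(a, b, n):
    """Risk of a*X_(1) + (1-a)*X_(n) + b, using the formula stated in part (d)."""
    a, b = Fraction(a), Fraction(b)
    # Squared-bias term: [a/(n+1) + (1-a)n/(n+1) + b - 1/2]^2
    bias = a * Fraction(1, n + 1) + (1 - a) * Fraction(n, n + 1) + b - Fraction(1, 2)
    # Variance term: [a^2 n + (1-a)^2 n + 2a(1-a)] / [(n+1)^2 (n+2)]
    variance = (a**2 * n + (1 - a) ** 2 * n + 2 * a * (1 - a)) / Fraction((n + 1) ** 2 * (n + 2))
    return bias**2 + variance


def best_b(a, n):
    """The minimizing b from part (d): 1/2 - (a + n(1-a))/(n+1)."""
    a = Fraction(a)
    return Fraction(1, 2) - (a + n * (1 - a)) / (n + 1)


def risk_sample_mean(n):
    """Assumed risk of the sample mean: Var(X_1)/n with Var(X_1) = 1/12."""
    return Fraction(1, 12 * n)


if __name__ == "__main__":
    half = Fraction(1, 2)
    for n in range(2, 8):
        midrange_risk = risk_delta_ab(half, best_b(half, n), n)
        print(n, midrange_risk, risk_sample_mean(n), midrange_risk < risk_sample_mean(n))
```

Under these assumptions the two risks coincide at $n = 2$ and the midrange has strictly smaller risk for every larger $n$ tried, which matches the restriction $n \ge 3$ in part (f).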

2. (Modified from 7.19(a)) Suppose that the random variables $Y_1, \ldots, Y_n$ ($n \ge 2$) satisfy

$$Y_i = \beta x_i + \epsilon_i, \quad i = 1, 2, \ldots, n,$$

where $\epsilon_1, \ldots, \epsilon_n$ are iid $N(0, \sigma^2)$, and both $\beta$ and $\sigma^2$ are unknown.

(a) Assume $x_1, \ldots, x_n$ are fixed known constants, and we observe $Y_1 = y_1, \cdots, Y_n = y_n$, i.e., the observed data are $Y = (y_1, \ldots, y_n)$. Find a two-dimensional sufficient statistic of $Y = (Y_1, \cdots, Y_n)$ for $(\beta, \sigma^2)$.

(b) Assume now that $x_1, \ldots, x_n$ are random variables with a known joint distribution $m(x_1, \ldots, x_n)$, and the $x_i$'s are independent of the $\epsilon_i$'s (it is traditional in linear regression to use lower case for the independent variables $x_i$). In this case, the observed data are $(Y, x) = \{(Y_i, x_i)\}_{i=1,\ldots,n}$. Find a three-dimensional sufficient statistic of $(Y, x)$ for $(\beta, \sigma^2)$.

3. (Modified from 6.5). Let $X_1, \cdots, X_n$ ($n \ge 2$) be independent random variables with pdfs

$$f(x_i|\theta) = \begin{cases} \dfrac{1}{3i\theta}, & \text{if } -i(\theta - 1) < x_i < i(2\theta + 1); \\ 0, & \text{otherwise,} \end{cases}$$

for $i = 1, 2, \cdots, n$, where $\theta > 0$.

(a) Show that $T_a(X) = \left( \min_{1 \le i \le n} (X_i/i), \ \max_{1 \le i \le n} (X_i/i) \right)$ is a two-dimensional sufficient statistic for $\theta$.

(b) Find a minimal sufficient statistic for $\theta$. Hint: the minimal sufficient statistic is one-dimensional.

4. (6.25) (b) and (d). We have seen a number of theorems concerning sufficiency and related concepts for exponential families. Let $X_1, \ldots, X_n$ ($n \ge 2$) be a random sample from each of the following distribution families, and establish the following results.

(b) The statistic $T(X) = \sum_{i=1}^{n} X_i^2$ is minimal sufficient in the $N(\mu, \mu)$ family.

(d) The statistic $T(X) = \left( \sum_{i=1}^{n} X_i, \ \sum_{i=1}^{n} X_i^2 \right)$ is minimal sufficient for $\theta = (\mu, \sigma^2)$ in the $N(\mu, \sigma^2)$ family.
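One way to build intuition for part (b) is to evaluate the log-likelihood ratio numerically. The sketch below computes $\log f(x|\mu) - \log f(y|\mu)$ in the $N(\mu, \mu)$ family (mean $\mu$, variance $\mu$, assuming $\mu > 0$) for made-up samples: `x` and `y` share $\sum x_i^2$ but not $\sum x_i$, while `z` has a different sum of squares. The function names and data are illustrative assumptions, and a numerical check of this kind is not a proof.

```python
import math


def log_density_n_mu_mu(sample, mu):
    """Log joint density of an iid N(mu, mu) sample (mean mu, variance mu, mu > 0)."""
    n = len(sample)
    squared_dev = sum((v - mu) ** 2 for v in sample)
    return -0.5 * n * math.log(2 * math.pi * mu) - squared_dev / (2 * mu)


def log_ratio(x, y, mu):
    """log f(x|mu) - log f(y|mu); the normalizing constants cancel for equal sample sizes."""
    return log_density_n_mu_mu(x, mu) - log_density_n_mu_mu(y, mu)


x = [3.0, 4.0]  # sum of squares 25, sum 7
y = [5.0, 0.0]  # sum of squares 25, sum 5
z = [1.0, 2.0]  # sum of squares 5,  sum 3

for mu in (0.5, 1.0, 2.0, 5.0):
    # First ratio stays the same across mu; second ratio changes with mu.
    print(mu, log_ratio(x, y, mu), log_ratio(x, z, mu))
```

For `x` and `y` the ratio is the same at every $\mu$ tried, whereas for `x` and `z` it changes with $\mu$, which is the behavior the minimal sufficiency criterion is concerned with.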

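5. (6.9)(a)(b)(d)(e). For each of the following distributions, let $X_1, \ldots, X_n$ ($n \ge 2$) be a random sample. Find a minimal sufficient statistic for $\theta$.

(a) $f(x|\theta) = \frac{1}{\sqrt{2\pi}} e^{-(x-\theta)^2/2}$, $-\infty < x < \infty$, $-\infty < \theta < \infty$ (normal)

(b) $f(x|\theta) = e^{-(x-\theta)}$, $\theta < x < \infty$, $-\infty < \theta < \infty$ (location exponential)

(d) $f(x|\theta) = \frac{1}{\pi[1 + (x-\theta)^2]}$, $-\infty < x < \infty$, $-\infty < \theta < \infty$ (Cauchy)

(e) $f(x|\theta) = \frac{1}{2} e^{-|x-\theta|}$, $-\infty < x < \infty$, $-\infty < \theta < \infty$ (double exponential)

[In class we will discuss part (c) $f(x|\theta) = \frac{e^{-(x-\theta)}}{(1 + e^{-(x-\theta)})^2}$, $-\infty < x < \infty$, $-\infty < \theta < \infty$ (logistic).]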

6. (6.12). A natural ancillary statistic in most problems is the sample size. For example, let $N$ be an integer-valued random variable taking values $1, 2, \cdots$ with known probabilities $p_1, p_2, \cdots$, where $\sum_{i=1}^{\infty} p_i = 1$. Having observed $N = n$, perform $n$ Bernoulli trials with success probability $\theta$, getting $X$ successes.

(a) Prove that the pair $(X, N)$ is minimal sufficient and $N$ is ancillary for $\theta$. (Note the similarity to some of the hierarchical models discussed in Section 4.4.)

(b) Prove that the estimator $X/N$ is unbiased for $\theta$ and has variance $\theta(1 - \theta)E(1/N)$. In other words, prove that $E_\theta(X/N) = \theta$ and $\mathrm{Var}_\theta(X/N) = \theta(1 - \theta)E(1/N)$.

Hints for Problem 1(d): To compute the risk function, it is useful to proceed in the following steps.

(i) Note that if we let $U_i = X_i - \theta + 1/2$, then $X_{(1)} = U_{(1)} + \theta - 1/2$ and $X_{(n)} = U_{(n)} + \theta - 1/2$. Hence we first need to investigate the properties of $U_{(1)} = \min(U_1, \ldots, U_n)$ and $U_{(n)} = \max(U_1, \ldots, U_n)$ when $U_1, \ldots, U_n$ are iid Uniform$[0, 1]$. Using the fact that $P(u \le U_{(1)} \le U_{(n)} \le v) = P(u \le U_i \le v \text{ for all } i = 1, \ldots, n) = \prod_{i=1}^{n} P(u \le U_i \le v)$ for any $u$ and $v$, show that the joint density of $U_{(1)}$ and $U_{(n)}$ is

$$f_{U_{(1)}, U_{(n)}}(u, v) = \begin{cases} n(n-1)(v - u)^{n-2}, & \text{if } 0 \le u \le v \le 1; \\ 0, & \text{otherwise,} \end{cases}$$

whereas the respective (marginal) densities of $U_{(1)}$ and $U_{(n)}$ are

$$f_{U_{(1)}}(u) = \begin{cases} n(1 - u)^{n-1}, & \text{if } 0 \le u \le 1; \\ 0, & \text{otherwise,} \end{cases} \quad \text{and} \quad f_{U_{(n)}}(v) = \begin{cases} n v^{n-1}, & \text{if } 0 \le v \le 1; \\ 0, & \text{otherwise.} \end{cases}$$

(ii) Show that $E(U_{(1)}) = \frac{1}{n+1}$, $E(U_{(n)}) = \frac{n}{n+1}$, $\mathrm{Var}(U_{(1)}) = \mathrm{Var}(U_{(n)}) = \frac{n}{(n+1)^2(n+2)}$ and $\mathrm{Cov}(U_{(1)}, U_{(n)}) = \frac{1}{(n+1)^2(n+2)}$.

(iii) Use the fact that $E(Y^2) = [E(Y)]^2 + \mathrm{Var}(Y)$ to evaluate the risk function of $\delta_{a,b}(X)$, which is

$$R_{\delta_{a,b}}(\theta) = E\left[ \left( a U_{(1)} + (1 - a) U_{(n)} + b - 1/2 \right)^2 \right].$$
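The moments listed in step (ii) can be checked by simulation before attempting the integrals. The sketch below is a rough Monte Carlo check, not a derivation: the sample size `n = 5`, the number of trials, the seed, and all function names are arbitrary choices made for illustration.

```python
import random


def simulate_min_max(n, trials, seed=0):
    """Draw `trials` samples of n iid Uniform[0, 1] values; return their minima and maxima."""
    rng = random.Random(seed)
    mins, maxs = [], []
    for _ in range(trials):
        u = [rng.random() for _ in range(n)]
        mins.append(min(u))
        maxs.append(max(u))
    return mins, maxs


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Empirical covariance (dividing by the number of trials)."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)


n = 5
mins, maxs = simulate_min_max(n, trials=50_000)
scale = (n + 1) ** 2 * (n + 2)

print("E U(1):  ", mean(mins), "vs", 1 / (n + 1))
print("E U(n):  ", mean(maxs), "vs", n / (n + 1))
print("Var U(1):", covariance(mins, mins), "vs", n / scale)
print("Var U(n):", covariance(maxs, maxs), "vs", n / scale)
print("Cov:     ", covariance(mins, maxs), "vs", 1 / scale)
```

The simulated values should agree with the stated formulas only up to Monte Carlo error, so small discrepancies in the third decimal place are expected.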

Hints for Problem 2: Let $\theta = (\beta, \sigma^2)$.

(a) The sample is $Y = (Y_1, \ldots, Y_n)$, and the joint density function of $Y$ is

$$f_\theta(Y) = \prod_{i=1}^{n} f(Y_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(Y_i - \beta x_i)^2}{2\sigma^2} \right).$$

How can this joint pdf be factored into two parts? The part that depends on $\theta = (\beta, \sigma^2)$ depends on the sample $Y = (Y_1, \ldots, Y_n)$ only through which kind of two-dimensional function $T(Y)$? Note that the $x_i$'s are treated as known constants here.

(b) When $x_1, \ldots, x_n$ are random variables with a known joint distribution $m(x_1, \ldots, x_n)$, and the $x_i$'s are independent of the $\epsilon_i$'s, the joint density of the data $(Y, X) = \{(Y_i, x_i)\}_{i=1,\ldots,n}$ is

$$f_\theta(Y, X) = m(x) f_\theta(Y|X) = m(x_1, \ldots, x_n) \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(Y_i - \beta x_i)^2}{2\sigma^2} \right).$$

Can you factor this joint pdf into two parts? The part that depends on $\theta = (\beta, \sigma^2)$ depends on the sample $(Y, X) = \{(Y_i, x_i)\}_{i=1,\ldots,n}$ only through which kind of three-dimensional function $T(Y, X)$?
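A useful first step toward the factorization is to expand the square in the exponent, $\sum_i (Y_i - \beta x_i)^2 = \sum_i Y_i^2 - 2\beta \sum_i x_i Y_i + \beta^2 \sum_i x_i^2$. The sketch below checks numerically that the log joint density computed observation by observation agrees with the one computed from these three sums. The function names and the data values are made up for illustration, and the marginal factor $m(x)$ from part (b) is left out because it does not involve $\theta$.

```python
import math


def log_density_direct(y, x, beta, sigma2):
    """Log of prod_i N(y_i; beta * x_i, sigma2), computed term by term."""
    n = len(y)
    sse = sum((yi - beta * xi) ** 2 for yi, xi in zip(y, x))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)


def log_density_from_sums(n, sum_y2, sum_xy, sum_x2, beta, sigma2):
    """The same quantity, using only n and three sums over the data."""
    sse = sum_y2 - 2 * beta * sum_xy + beta**2 * sum_x2
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)


# Hypothetical data, chosen only to exercise the two functions.
x = [0.5, 1.0, 1.5, 2.0]
y = [0.9, 2.1, 2.8, 4.2]
sums = (
    sum(yi * yi for yi in y),
    sum(xi * yi for xi, yi in zip(x, y)),
    sum(xi * xi for xi in x),
)

for beta, sigma2 in [(1.0, 1.0), (2.0, 0.5), (-0.3, 4.0)]:
    direct = log_density_direct(y, x, beta, sigma2)
    via_sums = log_density_from_sums(len(y), *sums, beta, sigma2)
    print(beta, sigma2, direct, via_sums)
```

Which of the three sums count as part of the statistic depends on whether the $x_i$'s are treated as known constants, as in part (a), or as observed data, as in part (b).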

Hints for Problem 3: It is important to focus on the domain of $\theta$ in the joint pdf of $X = (X_1, \ldots, X_n)$. You can split $a(\theta) < x_i < b(\theta)$ for $i = 1, \cdots, n$ into two separate inequalities: $a(\theta) < x_i$ for all $i$ and $x_i < b(\theta)$ for all $i$. From this, we can conclude that $a(\theta) < \min_i x_i$ and $\max_i x_i < b(\theta)$, and then solve each inequality for $\theta$. To be more specific, the joint density is

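$$\begin{aligned}
f_\theta(x) &= \prod_{i=1}^{n} f_{X_i}(x_i|\theta) = \prod_{i=1}^{n} \left[ \frac{1}{3i\theta} \, I\big( -i(\theta - 1) < x_i < i(2\theta + 1) \big) \right] \\
&= \frac{1}{3^n \, n! \, \theta^n} \, I\left( -(\theta - 1) < \frac{x_i}{i} < 2\theta + 1 \text{ for all } i = 1, \ldots, n \right) \\
&= \frac{1}{3^n \, n! \, \theta^n} \times I\left( -(\theta - 1) < \frac{x_i}{i} \text{ for all } i = 1, \ldots, n \right) \times I\left( \frac{x_i}{i} < 2\theta + 1 \text{ for all } i = 1, \ldots, n \right) \\
&= \frac{1}{3^n \, n! \, \theta^n} \times I\left( -(\theta - 1) < \min_{1 \le i \le n} \frac{x_i}{i} \right) \times I\left( \max_{1 \le i \le n} \frac{x_i}{i} < 2\theta + 1 \right) \\
&= \frac{1}{3^n \, n! \, \theta^n} \times I\left( \theta > 1 - \min_{1 \le i \le n} \frac{x_i}{i} \right) \times I\left( \theta > \frac{1}{2} \left[ \max_{1 \le i \le n} \frac{x_i}{i} - 1 \right] \right).
\end{aligned}$$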

Part (a) follows from this immediately. To find the minimal sufficient statistic in part (b), use the fact that $I(\theta > u) I(\theta > v) = I(\theta > \max(u, v))$ to further simplify the above density into a function of a one-dimensional statistic. Hint: consider defining

$$T(X) = \max\left\{ 1 - \min_{1 \le i \le n} \frac{x_i}{i}, \ \frac{1}{2} \left( \max_{1 \le i \le n} \frac{x_i}{i} - 1 \right) \right\}.$$

There is no need to simplify $T$ any further; it is fine to leave it as is.
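The reduction above can be checked numerically. The following sketch multiplies the individual densities from the problem statement and compares the product with the closed form $\frac{1}{3^n n! \theta^n} I(\theta > T)$, where $T$ is the statistic suggested in the hint. The function names and the sample `xs` are illustrative assumptions, and only values $\theta > 0$ are considered.

```python
import math


def density_i(x_i, i, theta):
    """Density of X_i: 1/(3*i*theta) on (-i(theta - 1), i(2*theta + 1)), zero elsewhere."""
    inside = -i * (theta - 1) < x_i < i * (2 * theta + 1)
    return 1.0 / (3 * i * theta) if inside else 0.0


def joint_density(xs, theta):
    """Product of the individual densities; xs[0] is x_1, xs[1] is x_2, and so on."""
    return math.prod(density_i(x, i, theta) for i, x in enumerate(xs, start=1))


def t_statistic(xs):
    """T = max{1 - min_i x_i/i, (max_i x_i/i - 1)/2}, as suggested in the hint."""
    ratios = [x / i for i, x in enumerate(xs, start=1)]
    return max(1 - min(ratios), 0.5 * (max(ratios) - 1))


def closed_form(xs, theta):
    """1/(3^n n! theta^n) when theta > T, zero otherwise (theta > 0 assumed)."""
    n = len(xs)
    if theta <= t_statistic(xs):
        return 0.0
    return 1.0 / (3**n * math.factorial(n) * theta**n)


xs = [0.5, 3.0, -1.5]  # hypothetical observations; x_i / i = 0.5, 1.5, -0.5
print("T =", t_statistic(xs))
for theta in (1.0, 1.4, 1.6, 2.0, 3.0):
    print(theta, joint_density(xs, theta), closed_form(xs, theta))
```

For this sample the joint density is zero for every $\theta$ at or below $T$ and positive above it, so the data enter the likelihood only through $T$.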

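Hints for Problem 5(d): The key observation is that

$$\begin{aligned}
\frac{f(x|\theta)}{f(y|\theta)} \text{ is constant in } \theta &\iff \frac{f(x|\theta)}{f(x|\theta = 0)} = \frac{f(y|\theta)}{f(y|\theta = 0)} \text{ for all } \theta \\
&\iff \prod_{k=1}^{n} \frac{1 + (x_k - \theta)^2}{1 + x_k^2} = \prod_{k=1}^{n} \frac{1 + (y_k - \theta)^2}{1 + y_k^2} \text{ for all } \theta.
\end{aligned}$$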

Now both sides are polynomials in $\theta$ of degree $2n$. Comparing the coefficients of $\theta^{2n}$ yields $\prod_{k=1}^{n} (1 + x_k^2) = \prod_{k=1}^{n} (1 + y_k^2)$, and thus

$$\prod_{k=1}^{n} \left[ 1 + (x_k - \theta)^2 \right] = \prod_{k=1}^{n} \left[ 1 + (y_k - \theta)^2 \right].$$

Setting these two polynomials to 0 and solving for the complex roots in $\theta$, the left-hand polynomial has $2n$ complex roots, $\hat{\theta} = x_k \pm \sqrt{-1}$, for $k = 1, \ldots, n$, whereas the right-hand polynomial leads to another set of $2n$ complex roots, $\hat{\theta} = y_k \pm \sqrt{-1}$, for $k = 1, \ldots, n$. Since the two polynomials in $\theta$ are equal, they have the same (complex) roots, and thus $x_{(k)} = y_{(k)}$ for $k = 1, \ldots, n$. What does this mean?
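To see the polynomial argument in action, the sketch below expands $\prod_k [1 + (x_k - \theta)^2]$ into its coefficients in $\theta$ and compares the result for a few samples. Integer data are used so that the arithmetic is exact; the helper names `poly_mul` and `cauchy_poly` are hypothetical and chosen only for this illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def cauchy_poly(xs):
    """Coefficients in theta of prod_k [1 + (x_k - theta)^2], lowest degree first."""
    coeffs = [1]
    for x in xs:
        # 1 + (x - theta)^2 = (1 + x^2) - 2x * theta + theta^2
        coeffs = poly_mul(coeffs, [1 + x * x, -2 * x, 1])
    return coeffs


print(cauchy_poly([1, 2, 3]))
print(cauchy_poly([3, 1, 2]))  # a reordering of the first sample
print(cauchy_poly([1, 2, 4]))  # a sample with different order statistics
```

Reordering the sample leaves the coefficients unchanged, while changing one of the values changes them, in line with the claim that the polynomial determines the sample exactly up to ordering.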

Hints for Problem 5(e): In this case, the order statistic is also a minimal sufficient statistic. The main difficulty is to show that if $\frac{f(x|\theta)}{f(y|\theta)}$ does not depend on $\theta$, then $x_{(i)} = y_{(i)}$ for all $i = 1, \ldots, n$.

First, let us prove $x_{(1)} = y_{(1)}$. Assume $x_{(1)} \ne y_{(1)}$, and without loss of generality, assume $x_{(1)} < y_{(1)}$. For convenience of notation, define $x_{(0)} = y_{(0)} = -\infty$ and $x_{(n+1)} = y_{(n+1)} = \infty$. Now let $r$ be the largest $i \ge 1$ such that $x_{(i)} < y_{(1)}$. In other words, $x_{(1)} \le x_{(r)} < y_{(1)} \le x_{(r+1)}$ for some $1 \le r \le n$. Consider the interval $x_{(r)} < \theta < y_{(1)}$, and show that $\frac{f(x|\theta)}{f(y|\theta)}$ depends on $\theta \in (x_{(r)}, y_{(1)})$ since $1 \le r \le n$. This contradicts the assumption that $\frac{f(x|\theta)}{f(y|\theta)}$ is constant in $\theta$. Thus the assumption that $x_{(1)} \ne y_{(1)}$ is wrong, and hence we must have $x_{(1)} = y_{(1)}$.

The above argument can be easily extended to show that $x_{(i)} = y_{(i)}$ for all $i = 1, \ldots, n$. Assume this is not true, and let $k$ be the smallest $i$ such that $x_{(i)} \ne y_{(i)}$, say $x_{(k)} < y_{(k)}$. As above, let $r$ be the largest $i \ge k$ such that $x_{(i)} < y_{(k)}$. Then

$$x_{(1)} = y_{(1)} \le x_{(2)} = y_{(2)} \le \cdots \le x_{(k-1)} = y_{(k-1)} \le x_{(k)} \le x_{(r)} < y_{(k)}$$

for some $k \le r \le n$. Then consider the interval $x_{(r)} < \theta < y_{(k)}$, and see what happens.
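The behavior on the interval $(x_{(r)}, y_{(1)})$ can be explored numerically. For the double exponential density, $\log f(x|\theta) - \log f(y|\theta) = \sum_i |y_i - \theta| - \sum_i |x_i - \theta|$ when the two samples have the same size. The sketch below evaluates this difference for made-up samples with $x_{(1)} = 0 < y_{(1)} = 1$, so that $r = 1$ and the interval of interest is $(0, 1)$; the function names and data are illustrative assumptions.

```python
def abs_dev_sum(sample, theta):
    """Sum of absolute deviations of the sample from theta."""
    return sum(abs(v - theta) for v in sample)


def log_ratio(x, y, theta):
    """log f(x|theta) - log f(y|theta) for the density (1/2) exp(-|x - theta|)."""
    return abs_dev_sum(y, theta) - abs_dev_sum(x, theta)


x = [0.0, 2.0, 5.0]
y = [1.0, 2.0, 5.0]           # differs from x in the smallest order statistic
x_reordered = [5.0, 0.0, 2.0]  # same order statistics as x

for theta in (0.25, 0.5, 0.75):
    # First value changes with theta on (0, 1); second stays at zero.
    print(theta, log_ratio(x, y, theta), log_ratio(x, x_reordered, theta))
```

On this interval the log ratio for `x` and `y` changes linearly in $\theta$, while the ratio for two samples with the same order statistics stays at zero.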

Hints for Problem 6(b): Use the facts that $E(U) = E(E(U|V))$ and $\mathrm{Var}(U) = E(\mathrm{Var}(U|V)) + \mathrm{Var}(E(U|V))$ for $U = X/N$ and $V = N$. See Theorems 4.4.3 and 4.4.7 on pages 164-167 of our text for the proofs of these two useful facts, which will be used later.
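The two identities in Problem 6(b) can be confirmed exactly for a small example by summing over the joint distribution of $(X, N)$. In the sketch below, the distribution `pmf` of $N$ and the value of `theta` are hypothetical choices, and the function name is illustrative; the conditional model $X \mid N = n \sim \text{Binomial}(n, \theta)$ is the one described in the problem.

```python
from fractions import Fraction
from math import comb


def moments_of_x_over_n(pmf, theta):
    """Exact mean and variance of X/N when N ~ pmf and X | N = n ~ Binomial(n, theta)."""
    theta = Fraction(theta)
    first = Fraction(0)   # accumulates E(X/N)
    second = Fraction(0)  # accumulates E((X/N)^2)
    for n, p_n in pmf.items():
        for x in range(n + 1):
            prob = p_n * comb(n, x) * theta**x * (1 - theta) ** (n - x)
            first += prob * Fraction(x, n)
            second += prob * Fraction(x, n) ** 2
    return first, second - first**2


# Hypothetical known distribution of N on {1, 2, 3}.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
theta = Fraction(1, 4)

mean_ratio, var_ratio = moments_of_x_over_n(pmf, theta)
expected_inverse_n = sum(p / n for n, p in pmf.items())

print("E(X/N)   =", mean_ratio, "; theta =", theta)
print("Var(X/N) =", var_ratio, "; theta(1-theta)E(1/N) =", theta * (1 - theta) * expected_inverse_n)
```

Because the arithmetic uses `Fraction`, the comparison is exact for this example; it illustrates the identities but does not replace the conditioning argument the hint asks for.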