## Description

Here, $\gamma(m)$ denotes the covariance at lag $m$ for a block of data. There are different ways to compute $\gamma(m)$, specifically the scaling factor, so let us use the following convention: assume the data is 0 mean; if not, subtract the sample mean first. Then (with data indexed $1 \le n \le N$):

$$\gamma(m) = \frac{1}{N} \sum_{n=m+1}^{N} x_n x_{n-m}$$

The (auto-)correlation coefficients are:

$$\rho(m) = \gamma(m) / \gamma(0)$$

For example, the MATLAB function autocorr(x) will compute and plot $\rho(m)$ for $0 \le m \le 20$, with significance bounds (in the graph, if a displayed value is below the bounds, then it can be considered as 0). For our purposes, it is fine if we use $|\rho(m)| > 0.2$ as a test of "significance".

Our formula for $\gamma(m)$ is technically biased, and can be interpreted as a Bartlett (i.e., similar to triangular) windowed form of unbiased estimates. In any case, for our purposes the lags of interest are $m \le M$, where $M \ll N$, so this distinction is not of interest for us here.

For pairs of signals, we can define:

$$\gamma_{xy}(m) = \frac{1}{N} \sum_{n=m+1}^{N} x_n y_{n-m}$$

for $m \ge 0$, and similarly for $m < 0$. Then the cross-correlation coefficient is:

$$\rho_{xy}(m) = \frac{\gamma_{xy}(m)}{\sqrt{\gamma_{xx}(0)\, \gamma_{yy}(0)}}$$

and in MATLAB, crosscorr(x, y) will compute and graph this, by default for $-20 \le m \le 20$.
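As a concrete check of the convention, here is a minimal Python/NumPy sketch of $\gamma(m)$ and $\rho(m)$ as defined above (the function names are ours, not from any library):

```python
import numpy as np

def autocov(x, m):
    """Lag-m covariance with the 1/N convention used above (x assumed 0-mean)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    return np.dot(x[m:], x[:N - m]) / N

def autocorr_coeffs(x, M):
    """rho(m) = gamma(m)/gamma(0) for 0 <= m <= M."""
    x = np.asarray(x, dtype=float) - np.mean(x)  # subtract sample mean first
    g0 = autocov(x, 0)
    return np.array([autocov(x, m) / g0 for m in range(M + 1)])

# White noise: rho(0) = 1, and |rho(m)| should stay below the 0.2 threshold
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
rho = autocorr_coeffs(x, 20)
```

MATLAB's autocorr uses the same biased (1/N) normalization, so the two should agree lag by lag.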

1. Heavy Tail Distributions

First we are going to explore some distributions. Generate $N = 10^6$ iid samples of $\mathcal{N}(0,1)$, Cauchy with $\gamma = 1$, and Student's t-distribution with $\nu = 5$ and $\nu = 10$ degrees of freedom. The variance of Student's t-distribution is finite for $\nu \ge 3$ (well, technically $\nu > 2$, but here we consider only integer values for $\nu$) and is given by:

$$\frac{\nu}{\nu - 2}$$

For comparison, you should normalize your Student's t-distribution data sets so they have variance 1. Obviously don't normalize Cauchy since it has no variance! However, $\gamma = 0.544$ is a reasonable "match" to $\mathcal{N}(0,1)$ in the sense that both yield approximately the same $P(|X| < 1) \approx 68\%$ (which is the probability of being within one standard deviation for the case of $\mathcal{N}(0,1)$).

In MATLAB, trnd generates the Student's t-distribution. You can generate Cauchy with $\gamma = 1$ as a special case of Student's t by setting $\nu = 1$; multiply the result by $\gamma$ to create Cauchy with general $\gamma$. Another approach is that if $U$ is uniform over $(0,1)$, then $X = \gamma \tan(\pi U)$ is Cauchy. Note in general Cauchy has pdf:

$$f(x) = \frac{\gamma/\pi}{\gamma^2 + x^2}$$

In any case, at this point you have $N$ samples of "comparably scaled" Gaussian, Cauchy, and Student's data. For each, compute the fraction of the time the absolute value exceeds 4. Also, graph the Cauchy data: you will see something interesting (that you will not see in the others).
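A possible Python sketch of this experiment (the seed is illustrative; NumPy's `standard_t` and the $\gamma\tan(\pi U)$ trick stand in for MATLAB's trnd):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 10**6

gauss = rng.standard_normal(N)

# Student's t with nu = 5, normalized to variance 1 (Var = nu/(nu-2) for nu > 2)
nu = 5
t5 = rng.standard_t(nu, N) / np.sqrt(nu / (nu - 2))

# Cauchy with gamma = 0.544 via the tan(pi*U) trick
gamma = 0.544
cauchy = gamma * np.tan(np.pi * rng.uniform(size=N))

# Fraction of samples whose absolute value exceeds 4
fracs = {}
for name, data in [("gauss", gauss), ("t5", t5), ("cauchy", cauchy)]:
    fracs[name] = np.mean(np.abs(data) > 4)
```

The tail fractions should be ordered Gaussian < Student's t < Cauchy, which is the point of the exercise.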

2. AR and ARMA

Let us start by synthesizing data using an ARMA(2,2) model:

$$r_t = \frac{(1 + 0.2z^{-1})(1 + 0.5z^{-1})}{(1 - 0.8z^{-1})(1 - 0.7z^{-1})}\, v_t$$

where $v_t$ is iid $\mathcal{N}(0,1)$. Interpret the above as an operator notation for the appropriate difference equation. Generate $N = 250$ samples. Consider the AR(p) model:

$$r_t = v_t + w_{p1} r_{t-1} + \cdots + w_{pp} r_{t-p}$$

The AR coefficients as given in the notes would be $a_{p0} = 1$ and $a_{pk} = -w_{pk}$, $1 \le k \le p$. As usual we take $a_{p0} = 1$. We can interpret this as a linear regression model with model error $v_t$. In this way, we can compute the AR coefficients $\{a_{pk}\}_{k=1}^{p}$ using a least-squares fit. Recall $a_{pp}$ is the reflection coefficient $\Gamma_p$, and $E|v_t|^2$ is $P_p$, the power in the $p$-th order prediction. The set $\Gamma_p$ and $P_p$ can be found using a Levinson-Durbin recursion instead.

Take maximum order $M = 10$.
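One way to synthesize the ARMA(2,2) data, sketched in Python with `scipy.signal.lfilter` (in MATLAB, `filter(b, a, v)` plays the same role; the seed is illustrative):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(2)
N = 250

# Numerator (MA) and denominator (AR) polynomials of the transfer function
# (1 + 0.2 z^-1)(1 + 0.5 z^-1) / ((1 - 0.8 z^-1)(1 - 0.7 z^-1))
b = np.convolve([1, 0.2], [1, 0.5])    # [1, 0.7, 0.1]
a = np.convolve([1, -0.8], [1, -0.7])  # [1, -1.5, 0.56]

v = rng.standard_normal(N)
r = lfilter(b, a, v)  # implements the corresponding difference equation
```

The filter call realizes the difference equation $r_t = 1.5\,r_{t-1} - 0.56\,r_{t-2} + v_t + 0.7\,v_{t-1} + 0.1\,v_{t-2}$.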

(a) Estimate the covariances $\gamma(m)$ for $0 \le m \le M$. Obtain a stem plot of the values $\gamma(m)/\gamma(0)$ for $0 \le m \le M$. Those above, say, about $0.2$ are significant. The number above that level gives an indication that an AR model of that length would be useful; higher order models would give marginal improvement. Note: In MATLAB, autocorr does all this for you.

(b) Set up the corresponding $(M+1) \times (M+1)$ Toeplitz matrix $C$, and compute its eigenvalues. They are (hopefully) strictly positive real.

(c) Recall the Cholesky factorization of a pd matrix is $C = L_{\mathrm{chol}} L_{\mathrm{chol}}^T$ where $L_{\mathrm{chol}}$ is lower triangular with positive entries on the diagonal. Closely related to this is the so-called LDL decomposition $C = LDL^T$ where $L$ is lower triangular with 1's on the diagonal, and $D$ is a diagonal matrix with positive entries. If $\{\ell_i\}$ are the diagonal elements of $L_{\mathrm{chol}}$, then $D = \mathrm{diag}\{\ell_i^2\}$. You should be able to find an LDL decomposition in your function library whether you use MATLAB, Python, or really any good scientific computing software; if not, and all you have is Cholesky, you can derive $D$ from $L_{\mathrm{chol}}$ as suggested, and then $L = L_{\mathrm{chol}} D^{-1/2}$. If you can't find either LDL or Cholesky decomposition, look harder. If it's not there, get a better library! In any case, find the $L$ and $D$ for your covariance matrix $C$.
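A sketch of deriving $L$ and $D$ from a Cholesky factor as described above (the small Toeplitz matrix is a stand-in for your estimated $C$):

```python
import numpy as np
from scipy.linalg import toeplitz

# A small positive-definite Toeplitz covariance matrix as a stand-in for C
C = toeplitz([1.0, 0.6, 0.3, 0.1])

# Derive the LDL factors from the Cholesky factor: C = Lchol Lchol^T
Lchol = np.linalg.cholesky(C)
d = np.diag(Lchol)             # positive diagonal entries l_i
D = np.diag(d**2)              # D = diag{l_i^2}
L = Lchol @ np.diag(1.0 / d)   # unit lower triangular: L = Lchol D^{-1/2}
```

(`scipy.linalg.ldl` computes an LDL factorization directly, if you prefer the library route.)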

(d) Use the Levinson-Durbin recursion (you can use a function from your library; if not, then hand code it) to compute the reflection coefficients and prediction error powers $P_m$ up to $M$. Also compute the FPEF coefficients of all orders up to order $M$. Pack these FPEFs into a triangular matrix of the form:

$$F = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ a_{11} & 1 & 0 & \cdots & 0 \\ a_{22} & a_{21} & 1 & \cdots & 0 \\ \vdots & & & \ddots & \\ a_{MM} & a_{M,M-1} & \cdots & a_{M1} & 1 \end{bmatrix}$$
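A hand-coded Levinson-Durbin sketch in Python, packing the FPEFs into $F$ and checking $FCF^T = \mathrm{diag}\{P_m\}$ on exact AR(1) covariances (our own helper, not a library routine):

```python
import numpy as np
from scipy.linalg import toeplitz

def levinson(gamma):
    """Levinson-Durbin on covariances gamma[0..M]; returns FPEF rows
    a_p = [1, a_p1, ..., a_pp], reflection coefficients, and powers P_p."""
    M = len(gamma) - 1
    a = [np.array([1.0])]
    P = [gamma[0]]
    refl = []
    for p in range(1, M + 1):
        # delta_p = gamma(p) + sum_{k=1}^{p-1} a_{p-1,k} gamma(p-k)
        delta = gamma[p] + np.dot(a[-1][1:], gamma[p - 1:0:-1])
        k = -delta / P[-1]                      # reflection coefficient Gamma_p
        a.append(np.concatenate([a[-1], [0.0]])
                 + k * np.concatenate([[0.0], a[-1][::-1]]))
        refl.append(k)
        P.append(P[-1] * (1.0 - k * k))         # P_p = P_{p-1}(1 - Gamma_p^2)
    return a, refl, P

# Exact AR(1) covariances gamma(m) = 0.5^m as a small test case
gamma = 0.5 ** np.arange(4)
a, refl, P = levinson(gamma)

# Pack FPEFs into the triangular matrix F and verify F C F^T = diag{P_m}
M = len(gamma) - 1
F = np.zeros((M + 1, M + 1))
for p in range(M + 1):
    F[p, :p + 1] = a[p][::-1]
C = toeplitz(gamma)
```

For exact AR(1) covariances, only $\Gamma_1$ is nonzero; the higher reflection coefficients vanish, illustrating part (g)'s hint.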

(e) Compute $FCF^T$. Compare the diagonal elements of $D$ with the $P_m$'s. Compare $F$ with $L^{-1}$. In summary, the Levinson-Durbin recursion is tied to the LDL decomposition (though it doesn't give you $L$ and $D$ directly, it gives what turns out to be the more useful $L^{-1}$ and $D$).

(f) Now use a LS fit to compute the AR coefficients of order $M$. Compare them with the $M$-th order model you found via Levinson-Durbin.

(g) Comment on the reflection coefficients you found. Consider the following general issue: if a reflection coefficient is close to 1 in magnitude, what does that tell you? If it is close to 0, then what? Hint: Recall the formula that relates the $P_p$'s to the $\Gamma_p$'s. Also, recall what holds for the $\Gamma_p$'s if the data is exactly AR($M_0$) for some order $M_0$.
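For reference, the standard recursion the hint points to is:

```latex
P_p = P_{p-1}\left(1 - |\Gamma_p|^2\right), \qquad P_0 = \gamma(0)
```

so $|\Gamma_p|$ near 1 means the $p$-th stage reduces the prediction error power dramatically, while $\Gamma_p \approx 0$ means it adds essentially nothing; and for data that is exactly AR($M_0$), $\Gamma_p = 0$ for all $p > M_0$.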

3. Repeat the above experiment with the modification that you have something close to an ARIMA(2,1,2) model: $r = Hv$ where:

$$H(z) = \frac{1}{1 - 0.99z^{-1}}\, H_0(z)$$

where $H_0(z)$ is the ARMA model in Problem 2. In other words, your $r_t$ will now exhibit behavior that would suggest a unit-root nonstationarity. Plot the original $r_t$ (one simulated block of data), and now the $r_t$ with this (close to) unit root nonstationarity. After you generate the AR models and do the other analysis, now compute $s_t = r_t - r_{t-1}$, the first difference, and repeat all for $s_t$. The model should work better for $s_t$. The question is, can you detect the unit root nonstationarity early on? Go back to where you graphed the correlation coefficients, i.e., $\gamma(m)/\gamma(0)$, for $r_t$ and comment.
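A sketch of generating the near-unit-root data and its first difference in Python, cascading the Problem 2 filter with the $0.99$ pole (seed illustrative):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(3)
N = 250
v = rng.standard_normal(N)

# H0(z): the ARMA(2,2) filter from Problem 2
b0 = np.convolve([1, 0.2], [1, 0.5])
a0 = np.convolve([1, -0.8], [1, -0.7])

# Cascade with the near-unit-root pole 1/(1 - 0.99 z^-1)
r = lfilter(b0, np.convolve(a0, [1, -0.99]), v)

# First difference s_t = r_t - r_{t-1} approximately undoes the near-unit root
s = np.diff(r)
```

Plotting `r` against `s` should make the wandering (integrated) character of `r` obvious.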

4. Now repeat both Problems 2 and 3 using a Student's t-distribution with $\nu = 5$ for $v_t$ (normalized so $E(v_t^2) = 1$ still). The parts to repeat, and the issue of concern, is the numerical stability and apparent fit of the model. In theory, this should change NOTHING! The theoretical values of $\gamma(m)$ are IDENTICAL. There are various ways of examining the changes; see if you can detect anything noticeable (or, in particular, disturbing) happening. If not, good for you!

5. Cointegration

Here we look for an example of cointegration, among other effects in time series. Let us start with two underlying signals $a_t$, $b_t$, with the first having a unit root nonstationarity and the other stationary:

$$\begin{bmatrix} a_t \\ b_t \end{bmatrix} = \mu + G \begin{bmatrix} a_{t-1} \\ b_{t-1} \end{bmatrix} + \begin{bmatrix} u_t \\ v_t \end{bmatrix}$$

where:

$$\mu = \begin{bmatrix} 0.3 \\ 0.5 \end{bmatrix} \qquad G = \begin{bmatrix} 0.99 & 0 \\ 0.3 & 0.3 \end{bmatrix}$$

where $u_t$ is ARMA(2,2) driven by an input white noise that is iid $\mathcal{N}(0,1)$, with innovations filter:

$$H = \frac{(1 + 0.2z^{-1})(1 + 0.3z^{-1})}{(1 - 0.2z^{-1})(1 - 0.5z^{-1})}$$

and $v_t$ are iid samples with $E(v_t) = 0$, $E(v_t^2) = \sigma_0^2$ (a constant to be prescribed later), and two possible distributions you will test: $\mathcal{N}(0, \sigma_0^2)$, or Student's t-distribution with $\nu = 5$ scaled to achieve the desired $E(v_t^2)$.

Next, let:

$$A = \begin{bmatrix} 3 & 4 \\ 2 & 3 \end{bmatrix}$$

and then:

$$r_t = \begin{bmatrix} r_{1t} \\ r_{2t} \end{bmatrix} = A \begin{bmatrix} a_t \\ b_t \end{bmatrix}$$

(a) Compute the $\mu'$ vector and $G'$ matrix such that we can write:

$$\begin{bmatrix} r_{1t} \\ r_{2t} \end{bmatrix} = \mu' + G' \begin{bmatrix} r_{1,t-1} \\ r_{2,t-1} \end{bmatrix} + \begin{bmatrix} \epsilon_{1t} \\ \epsilon_{2t} \end{bmatrix}$$

where $\epsilon_{1t}, \epsilon_{2t}$ are correlated disturbances, computed via $\epsilon_t = A \begin{bmatrix} u_t \\ v_t \end{bmatrix}$.

(b) We want the effect of $b_t$ to be noticeable but not dominant, so do the following. After you generate $a_t$, take:

$$\sigma_0^2 = \frac{1}{16} E\left(a^2\right) = \frac{1}{16} \left( \frac{1}{N} \sum_{n=1}^{N} a_n^2 \right)$$

Then use that to generate $b_t$. Finally compute $r_{1t}, r_{2t}$. [In reality, you should do this twice, using the two different distributions for $v_t$. Do the rest of this exercise, as well, for each case.]
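A possible simulation sketch in Python for the Gaussian case (zero initial conditions, illustrative seed and $N$; repeat with the Student's t innovations for the second case):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(4)
N = 1000

# u_t: ARMA(2,2) with the innovations filter given above
b = np.convolve([1, 0.2], [1, 0.3])
a = np.convolve([1, -0.2], [1, -0.5])
u = lfilter(b, a, rng.standard_normal(N))

mu = np.array([0.3, 0.5])

# First pass: generate a_t alone (its recursion does not involve b_t),
# then set sigma0^2 = E[a^2]/16 and generate b_t (zero initial conditions)
aa = np.zeros(N)
for t in range(1, N):
    aa[t] = mu[0] + 0.99 * aa[t - 1] + u[t]
sigma0_sq = np.mean(aa**2) / 16.0
v = np.sqrt(sigma0_sq) * rng.standard_normal(N)   # Gaussian case
bb = np.zeros(N)
for t in range(1, N):
    bb[t] = mu[1] + 0.3 * aa[t - 1] + 0.3 * bb[t - 1] + v[t]

# Observed series: rows of r are r1t and r2t
A = np.array([[3.0, 4.0], [2.0, 3.0]])
r = A @ np.vstack([aa, bb])
Ainv = np.linalg.inv(A)   # recovers (a_t, b_t) from r_t, as used in part (d)
```

Since $\det A = 1$, the inverse has exact integer entries, which makes the recovery in part (d) numerically clean.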

(c) Graph $u_t$ and $v_t$ separately. Superimpose graphs of $a_t, b_t$. Superimpose graphs of $r_{1t}, r_{2t}$. Repeat this, say, three times, to see "typical" behavior.

(d) We now want to detect the unit root nonstationarity in $r_{1t}, r_{2t}$, and determine if there is indeed cointegration. Indeed, we can recover $a_t, b_t$ via $A^{-1} r_t$, which shows that a linear combination of $r_{1t}, r_{2t}$ is stationary! Determine the coefficients of cointegration (i.e., the linear combination of these two that yields a stationary result).

(e) Compute and graph the auto-correlation coefficients of $r_{1t}, r_{2t}$ separately, i.e., $\rho_{11}(m)$, $\rho_{22}(m)$ for $0 \le m \le 20$, and compute the cross-correlations $\rho_{12}(m)$ for $-20 \le m \le 20$. In MATLAB, autocorr and crosscorr will do this for you.

(f) With $s_{it} = (1 - z^{-1}) r_{it}$, $i = 1, 2$, repeat the above (i.e., look at the two autocorrelations and the cross-correlation coefficients between $s_{1t}, s_{2t}$). You should see a steeper decay, which indicates both $r_{1t}, r_{2t}$ suffer from unit root nonstationarity.

(g) You will note the cross-correlation between $r_{1t}, r_{2t}$ decays slowly. To check this is indicative of co-integration, try creating another signal that is INDEPENDENT of $r_{1t}$ but also has a unit root nonstationarity. For example, let $\epsilon_t$ be iid $\mathcal{N}(0,1)$, and $c_t = 0.99 c_{t-1} + \epsilon_t$. Checking the $\rho_c(m)$ coefficients of $c$ should of course confirm it has the unit root nonstationarity. But now also look at the cross-correlation coefficients between $c_t$ and $r_{1t}$, and between $c_t$ and $r_{2t}$. You should see significantly less cross-correlation between them.
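A quick sketch of generating $c_t$ and confirming its slow autocorrelation decay (the cross-correlations against $r_{1t}, r_{2t}$ are computed with the same $1/N$ convention; seed and $N$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 2000

# c_t = 0.99 c_{t-1} + eps_t: an independent (near) unit-root AR(1)
eps = rng.standard_normal(N)
c = np.zeros(N)
for t in range(1, N):
    c[t] = 0.99 * c[t - 1] + eps[t]

# Autocorrelation coefficients rho_c(m) with the 1/N convention
c0 = c - c.mean()
g = np.array([np.dot(c0[m:], c0[:N - m]) / N for m in range(21)])
rho_c = g / g[0]
# Slow decay of rho_c(m) over the 20 lags is the unit-root signature
```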

6. ARCH/GARCH

Now we try ARCH/GARCH modeling. Take two years each of daily data for S&P500 and, say, two stocks that interest you, let's say some time within the last 20 years. (You can get data off Yahoo Finance.) Specifically, we are looking at adjusted closing prices, and as a first step compute the daily log-returns for the period. Each year will produce about 250 data points, but realize the exact number may vary.

First graph the returns and squared returns. The returns should appear to be close to constant mean (if not zero mean). Since we do not want to deal with the means here, compute and subtract off the sample means, so from this point we will assume the returns $r_t$ have 0 mean (constant). You may also try multiyear models, say 2007-2009, which captures the 2008 financial crisis, or 2018-2020 (to date), which captures... hmm, that which will not be mentioned!

Compute and graph the auto-correlation coefficients up to lag 20 for the returns AND the squared returns. If there are not significant values for the returns, that suggests our mean model can be simple, i.e., constant mean. If the values for the squared returns, however, show some significance, then it is indicative that a conditional heteroskedasticity is in play. Ideally, for this simple experiment, your data should exhibit this (conditional heteroskedasticity and constant mean); pick different data sets if not. If we assume the conditional expectation is 0 (let's say subtract off the sample mean first, if necessary), our model is simply:

$$r_t = \epsilon_t \qquad \epsilon_t = \sigma_t z_t \qquad \sigma_t^2 = \mathrm{GARCH}(p, q)$$

where $z_t$ are iid with $E(z_t) = 0$, $E(z_t^2) = 1$.

(a) First some simulation. For a GARCH(1,1) model we can write:

$$\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$

Synthesize $r_t$ according to such a model, with $z_t \sim \mathcal{N}(0,1)$, with $\omega = 0.5$, $\alpha = 0.6$, $\beta = 0.4$. Superimpose graphs of $r_t$ and $\sigma_t$. You should see variability in the volatility of $r_t$, which $\sigma_t$ seems to "track". Now, repeat this with $z_t$ having a Student's t-distribution with $\nu = 5$ degrees of freedom.

For each case, fit a GARCH(1,1) model, and an ARCH(2) model, assuming a Gaussian distribution for $z_t$ (this is called QML, quasi-maximum likelihood, since you are assuming a Gaussian likelihood function even when it is not the true one). For the GARCH(1,1) model, do you get back something close to your coefficients? For the ARCH(2) model, superimpose plots of $r_t$ and $\sigma_t$ as generated by the model, and see if you have a good match for the volatility. What impact, if any, does the Student's t-distribution with $\nu = 5$ degrees of freedom have?
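A minimal GARCH(1,1) simulation sketch in Python (note that as given, $\alpha + \beta = 1$, the integrated-GARCH boundary; the start-up value $\sigma_1^2 = \omega$ is an arbitrary choice of ours). For the fitting step, a library such as the `arch` package (`arch_model`) can be used, if available:

```python
import numpy as np

rng = np.random.default_rng(6)
N = 1000
omega, alpha, beta = 0.5, 0.6, 0.4

z = rng.standard_normal(N)        # swap in normalized Student's t for case 2
sigma2 = np.zeros(N)
r = np.zeros(N)
sigma2[0] = omega                  # arbitrary start-up value (our choice)
r[0] = np.sqrt(sigma2[0]) * z[0]
for t in range(1, N):
    # sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2
    sigma2[t] = omega + alpha * r[t - 1]**2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * z[t]
```

Superimposing `r` and `np.sqrt(sigma2)` shows the volatility clustering that $\sigma_t$ "tracks".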

(b) Your GARCH fitting library should provide some information regarding the significance of the model coefficients. Take your actual financial data, and fit $r_t$ with a GARCH(1,1) and an ARCH(2) and see if either provides good fits, and also plot $r_t$ and $\sigma_t$ superimposed. If the volatility model is a good fit, the graph of $\sigma_t$ should appear to be like an "envelope" for the return data.