Description
Questions:
1. Given n values $\{x_i\}_{i=1}^{n}$ having mean µ, median ν and standard deviation σ, prove that |µ − ν| ≤ σ. Assume
n is even. [10 points]
2. Consider four sets of n values each: $\{x_i\}_{i=1}^{n}$, $\{y_i\}_{i=1}^{n}$, $\{z_i\}_{i=1}^{n}$, $\{w_i\}_{i=1}^{n}$. Suppose that for all i, 1 ≤ i ≤ n, we have
$z_i = a x_i + b$ and $w_i = c y_i + d$, where $a \neq 0$, $c \neq 0$, and a, b, c, d are constants. Prove that $r(z, w) = \pm r(x, y)$,
where r is the correlation coefficient. Comment on when we would have r(z, w) = r(x, y) and
when we would have r(x, y) = −r(z, w). [10 points]
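As a numerical sanity check (not a proof), the claim can be tried on a small made-up data set. The values of x, y, a, b, c and d below are arbitrary assumptions chosen for illustration. `corrcoef` returns a 2×2 matrix whose off-diagonal entry is the correlation coefficient.

```matlab
% Arbitrary illustrative data (assumed, not part of the assignment).
x = [1 2 3 4 5];
y = [2 1 4 3 5];
a = -2; b = 3;
c = 4;  d = -1;

% Affine transformations of x and y.
z = a*x + b;
w = c*y + d;

% corrcoef returns a 2x2 matrix; entry (1,2) is the correlation coefficient.
Rxy = corrcoef(x, y);
Rzw = corrcoef(z, w);
r_xy = Rxy(1, 2);
r_zw = Rzw(1, 2);

fprintf('r(x,y) = %.4f, r(z,w) = %.4f\n', r_xy, r_zw);
```

Changing the signs of a and c and rerunning is one way to explore the question before attempting the proof.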
3. Given n distinct values $\{x_i\}_{i=1}^{n}$ with mean µ and standard deviation σ, prove that for all i, we have
$|x_i - \mu| \leq \sigma \sqrt{n - 1}$. [10 points]
In the following problems, you can use the mean, median and standard deviation functions from MATLAB.
4. Generate a sine wave in MATLAB of the form y = 5 sin(2x + π/3) where x ranges from -10 to 10 in steps of
0.02. Now randomly select 40 values in the array y (using MATLAB function ‘randperm’) and corrupt them
by adding random values from 5 to 10 using the MATLAB function ‘rand’. This will generate a corrupted
sine wave which we will denote as z. Now your job is to filter z using the following steps.
• Create a new array ymedian to store the filtered sine wave.
• For a value at index i in z, consider a neighborhood N(i) consisting of z(i), 8 values to its right and 8
values to its left. For indices near the left or right end of the array, you may not have 8 neighbors in
one of the directions. In such a case, the neighborhood will contain fewer values.
• Set ymedian(i) to the median of all the values in N(i). Repeat this for every i.
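The shrinking neighborhood near the ends of the array can be sketched as follows. This is only an illustration on a made-up 7-element signal with a half-width of 2 instead of 8; the variable names `k`, `lo`, `hi`, `N_edge` and `N_mid` are assumptions, not names required by the assignment.

```matlab
% Toy signal and half-width (assumed values; the assignment uses 8).
z = [10 20 30 40 50 60 70];
k = 2;
n = numel(z);

% Index at the left end: fewer than k neighbors exist on the left.
i = 1;
lo = max(1, i - k);     % clip at the start of the array
hi = min(n, i + k);     % clip at the end of the array
N_edge = z(lo:hi);      % contains only 3 values

% Interior index: the full window of 2*k + 1 values is available.
i = 4;
N_mid = z(max(1, i - k):min(n, i + k));
```

Clipping the index range with `max` and `min` is one simple way to get a neighborhood with fewer values at the boundaries; other boundary treatments exist but are not what the steps above describe.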
This process is called ‘moving median filtering’, and will produce a filtered signal in the end. Repeat the
entire procedure described here using the arithmetic mean instead of the median. This is called ‘moving
average filtering’. Plot the original (i.e. clean) sine wave y, the corrupted sine wave z and the filtered sine
wave using mean and median on the same figure in different colors. Introduce a legend on the plot (find
out how to do this in MATLAB). Include an image of the plot in your report. Now compute and print the
relative mean squared error between each result and the original clean sine wave. The relative mean squared
error between y and its estimate $\hat{y}$ is defined as
$$\frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i y_i^2}.$$
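The relative mean squared error defined above might be computed along these lines. The helper name `relmse` and the two short vectors are hypothetical and serve only to show the arithmetic.

```matlab
% Hypothetical helper: relative mean squared error between y and yhat.
relmse = @(y, yhat) sum((y - yhat).^2) / sum(y.^2);

% Tiny made-up example: numerator = 0^2 + 2^2 = 4, denominator = 9 + 16 = 25.
y    = [3 4];
yhat = [3 2];
err  = relmse(y, yhat);

fprintf('relative MSE = %.4f\n', err);
```

Because the error is normalized by the energy of the clean signal, values from different experiments can be compared directly.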
Now repeat all the steps above when the random values used to corrupt the sine wave lie in the range from 100 to
120. Include the plot of the sine waves in your report, and write down the relative mean squared error
values.
Which method (median or arithmetic mean) produced better relative mean squared error? Why? Explain
in your report. [6+3+3+3=15 points]
5. Suppose that you have computed the mean, median and standard deviation of a set of n numbers stored in
array A where n is very large. Now, you decide to add another number to A. Write a MATLAB function
to update the previously computed mean, another MATLAB function to update the previously computed
median, and yet another MATLAB function to update the previously computed standard deviation. Note
that you are not allowed to simply recompute the mean and standard deviation by looping through all the
data. You may need to derive a formula for this. Include the formula and its derivation in your report. Your
MATLAB functions should be of the form function newMean = UpdateMean (OldMean, NewDataValue, A,
N), function newMedian = UpdateMedian (oldMedian, A, N) and function newStd = UpdateStd (OldMean,
OldStd, NewMean, NewDataValue, A, N). [5+5+5 = 15 points]
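One possible way to check the update functions is to compare their outputs against brute-force values on a small array. The sketch below only computes those reference values; the array, the new value and the names `B`, `refMean`, `refMedian` and `refStd` are assumptions for illustration. Full recomputation is used here purely for testing, not as the update method.

```matlab
% Small made-up array and new value (assumptions for illustration only).
A = [2 4 4 5 7];
newValue = 9;

% Brute-force reference values, for checking only.
B = [A newValue];
refMean   = mean(B);
refMedian = median(B);
refStd    = std(B);     % std normalizes by N-1 by default

% Compare your incremental results against these with a small tolerance,
% e.g. abs(myNewMean - refMean) < 1e-12, where myNewMean is the value
% returned by your own update function.
```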