## Description

Part 1) Probability – 20 points

a) From the Bayes’ rule example given in Section 3.10, compute the probabilities that a randomly selected non-smoker i) has lung disease and ii) does not have lung disease. Show the calculations without using R. Then, verify with the bayes function provided in the code samples.

b) Suppose that in a particular state, among the registered voters, 40% are Democrats, 50% are Republicans, and the rest are independents. Suppose that a ballot question asks whether to impose a sales tax on internet purchases. Suppose that 70% of Democrats, 40% of Republicans, and 20% of independents favor the sales tax. If a person chosen at random favors the sales tax, what is the probability that the person is i) a Democrat, ii) a Republican, iii) an independent? Show the calculations without using R. Then, verify with the bayes function provided in the code samples.
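The posterior calculation in part b) can be cross-checked with a few lines of R. This is only a minimal sketch: `bayes_posterior` is a hypothetical helper written for illustration, not the bayes function from the code samples (whose signature is not given here), and the 10% share for independents is inferred from "the rest".

```r
# Hypothetical helper (an assumption, not the course's bayes function):
# posterior = prior * likelihood, normalized by the total probability.
bayes_posterior <- function(prior, likelihood) {
  joint <- prior * likelihood
  joint / sum(joint)
}

# Priors from part b); independents are "the rest", taken here as 10%.
prior <- c(democrat = 0.40, republican = 0.50, independent = 0.10)

# P(favors the sales tax | party), as stated in part b).
favor <- c(democrat = 0.70, republican = 0.40, independent = 0.20)

posterior <- bayes_posterior(prior, favor)
print(posterior)
```

The hand calculation should agree with these values: each posterior is the party's prior times its "favor" rate, divided by the overall probability of favoring the tax.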

Part 2) Random Variables – 30 points

a) Consider the experiment of rolling a pair of dice. Using R, show how you would define a random variable for the absolute value of the difference of the two rolls, using a user-defined function.

b) Using the above result, what is the probability that the two rolls differ by exactly 2? What is the probability that the two rolls differ by at most 2? What is the probability that the two rolls differ by at least 3? Use the Prob function as shown in the code samples.

c) Show the marginal distribution of the above random variable (using R).
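One way to sanity-check parts a) to c) is to build the sample space by hand in base R. The sketch below deliberately avoids the Prob function from the code samples, since its exact interface is not shown here; the names `S`, `abs_diff`, `D`, and `marginal_D` are illustrative assumptions, and two fair dice are assumed.

```r
# Sample space for two fair dice: 36 equally likely outcomes (assumption).
S <- expand.grid(X1 = 1:6, X2 = 1:6)
S$probs <- rep(1 / nrow(S), nrow(S))

# User-defined function: absolute difference of the two rolls.
abs_diff <- function(x) abs(x[1] - x[2])

# Random variable D, evaluated on every outcome of the sample space.
S$D <- apply(S[, c("X1", "X2")], 1, abs_diff)

# Probabilities of the three events in part b).
p_exactly_2  <- sum(S$probs[S$D == 2])
p_at_most_2  <- sum(S$probs[S$D <= 2])
p_at_least_3 <- sum(S$probs[S$D >= 3])

# Marginal distribution of D: total probability for each value of D.
marginal_D <- aggregate(probs ~ D, data = S, FUN = sum)
print(marginal_D)
```

The "at most 2" and "at least 3" events are complements, so their probabilities should add to 1, which is a quick check on any solution.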

d) Using R, add another random variable to the above probability space using a user-defined function. The random variable is TRUE if the sum of the two rolls is even, and FALSE otherwise. What is the probability that the sum of the two rolls is even? Show also the marginal distribution for this random variable.
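The even-sum random variable in part d) can be sketched the same way in base R. Again, this is an illustration under assumptions (fair dice, hand-built sample space, hypothetical names `sum_is_even` and `EVEN`), not the approach from the code samples.

```r
# Sample space for two fair dice, rebuilt here so the sketch is self-contained.
S <- expand.grid(X1 = 1:6, X2 = 1:6)
S$probs <- rep(1 / nrow(S), nrow(S))

# User-defined function: TRUE when the sum of the two rolls is even.
sum_is_even <- function(x) (x[1] + x[2]) %% 2 == 0

# Logical random variable EVEN, evaluated on every outcome.
S$EVEN <- apply(S[, c("X1", "X2")], 1, sum_is_even)

# Probability that the sum is even.
p_even <- sum(S$probs[S$EVEN])

# Marginal distribution of EVEN (one row for FALSE, one for TRUE).
marginal_EVEN <- aggregate(probs ~ EVEN, data = S, FUN = sum)
print(marginal_EVEN)
```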

Part 3) Functions – 20 points

Using a for loop, write your own R function, evensum(data), that returns the sum of all the even values in the given numeric data vector.

Now, without using any loop, write your own R function, evensum2(data), that returns the sum of all the even values in the given numeric data vector.

Test both functions with sample data.
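A minimal sketch of the two functions follows. It assumes the input contains no missing values (NA handling is left out), and `sample_data` is a made-up test vector rather than data from the assignment.

```r
# Loop version: add each even value to a running total.
evensum <- function(data) {
  total <- 0
  for (value in data) {
    if (value %% 2 == 0) {
      total <- total + value
    }
  }
  total
}

# Vectorized version: keep the even values with a logical index, then sum.
evensum2 <- function(data) {
  sum(data[data %% 2 == 0])
}

# Made-up sample data; the even values are 2, 4, 6, and 10.
sample_data <- c(1, 2, 3, 4, 5, 6, 10)
print(evensum(sample_data))
print(evensum2(sample_data))
```

Both calls should return the same total for any numeric vector, and 0 for a vector with no even values.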


Part 4) R – 30 points

Initialize the Dow Jones Industrials daily closing data as shown below:

dow <- read.csv('http://kalathur.com/dow.csv', stringsAsFactors = FALSE)

Provide the simplest R code and output for all of the following. The code should work for any given data.

a) Use the diff function to calculate the differences between consecutive values. Insert the value 0 at the beginning of these differences. Add this result as the DIFFS column of the data frame.
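The step in part a) can be illustrated on a small made-up data frame. The column names `DATE` and `VALUE` and all closing values below are assumptions for the sake of the example; the actual columns of dow.csv should be checked before reusing this.

```r
# Toy stand-in for the Dow data (hypothetical column names and values).
dow <- data.frame(
  DATE = c("d1", "d2", "d3", "d4", "d5"),
  VALUE = c(25000, 25120, 25090, 25090, 25400),
  stringsAsFactors = FALSE
)

# diff() returns one fewer element than its input, so a leading 0
# is prepended to line the differences up with the rows.
dow$DIFFS <- c(0, diff(dow$VALUE))
print(dow)
```

Because the code only refers to the closing-value column, it does not depend on the number of rows in the data.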

b) How many days did the Dow close higher than its previous day's value? How many days did the Dow close lower than its previous day's value?

c) Show the subset of the data where there was a gain of at least 400 points from the previous day's value.
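Parts b) and c) reduce to logical comparisons on the DIFFS column. The sketch below uses made-up data with hypothetical `DATE` and `VALUE` columns, and recomputes DIFFS so that it runs on its own.

```r
# Toy stand-in for the Dow data (hypothetical column names and values).
dow <- data.frame(
  DATE = c("d1", "d2", "d3", "d4", "d5", "d6"),
  VALUE = c(25000, 25450, 25300, 25300, 25800, 25750),
  stringsAsFactors = FALSE
)
dow$DIFFS <- c(0, diff(dow$VALUE))

# Summing a logical vector counts its TRUE entries.
days_up   <- sum(dow$DIFFS > 0)
days_down <- sum(dow$DIFFS < 0)

# Rows with a gain of at least 400 points over the previous day.
big_gains <- dow[dow$DIFFS >= 400, ]
print(big_gains)
```

Days with no change (a difference of 0, including the first row) count as neither higher nor lower.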

d) Provide the solution to compute the longest gaining streak of at least 100 points in the data. Show the data for that longest gaining streak. Hint: Use the rle function provided by R.
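To show how rle can locate such a streak, here is a rough sketch on made-up data. It reads "gaining streak of at least 100 points" as consecutive days that each gained at least 100 points, which is an interpretation; the column names and values are hypothetical, and the code assumes at least one qualifying day exists.

```r
# Toy stand-in for the Dow data (hypothetical column names and values).
dow <- data.frame(
  DATE = sprintf("d%02d", 1:10),
  VALUE = c(25000, 25120, 25230, 25210, 25330,
            25450, 25600, 25650, 25500, 25620),
  stringsAsFactors = FALSE
)
dow$DIFFS <- c(0, diff(dow$VALUE))

# rle() compresses the TRUE/FALSE vector into runs of equal values.
runs <- rle(dow$DIFFS >= 100)

# Only runs of TRUE are streaks; runs of FALSE get length 0.
gain_lengths <- ifelse(runs$values, runs$lengths, 0)
best <- which.max(gain_lengths)

# Convert the run index back to row positions in the data frame.
end_row <- cumsum(runs$lengths)[best]
start_row <- end_row - runs$lengths[best] + 1

streak <- dow[start_row:end_row, ]
print(streak)
```

If two streaks tie for the longest, which.max picks the first one.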

Submission:

Create a folder, CS544_HW2_lastName, and place the following files in this folder.

Provide the text and code part of the solutions and the corresponding output in a single Word document, HW2_lastName.doc.

For the code portions, provide the R file, HW2_lastName.R, with each portion of the code identified by comments.

Archive the folder (CS544_HW2_lastName.zip). Upload the zip file to the Assignments section of Blackboard.