Statistical Natural Language Processing Exercise Sheet I

1) Mathematical Basics (2 points)
Use set theory and the definition of probability functions to show that:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
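A numerical sanity check can help before writing the proof. The sketch below is not a proof; it only confirms the identity on one small example. The sample space (a single fair die) and the events `A` and `B` are assumptions chosen for illustration and are not part of the exercise.

```python
from fractions import Fraction

# Illustrative sample space: one roll of a fair six-sided die.
omega = set(range(1, 7))

def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))

# Two overlapping example events (chosen only for this check).
A = {o for o in omega if o % 2 == 0}   # even outcome
B = {o for o in omega if o > 3}        # outcome greater than 3

lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)
```

Subtracting P(A ∩ B) removes the outcomes that P(A) + P(B) counts twice, which is the step the set-theoretic proof has to justify.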
2) Independence of Events (8 points)
Consider a fair 6-sided die whose sides are numbered from 1 to 6 and each die roll
is independent of the other rolls. In an experiment that consists of rolling the die
twice, the following events can be defined:
A : The sum of the two outcomes is at least 10
B : At least one of the two rolls resulted in 6
C : At least one of the two rolls resulted in 1
D : The outcome of the 2nd roll was higher than the 1st roll
E : The difference between the two roll outcomes is exactly 1
(a) Compute the probabilities P(A), P(C), and P(E).
(b) Is event A independent of event B?
(c) Is event A independent of event C?
(d) Are events D and E independent?
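Hand-computed answers can be cross-checked by enumerating all 36 equally likely outcomes. The following is a minimal sketch of such a check, not a substitute for the written justification. The helper names `prob` and `independent` are hypothetical, and event E is read here as an absolute difference of exactly 1.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs.
outcomes = list(product(range(1, 7), repeat=2))

def prob(predicate):
    """Fraction of outcomes for which the predicate holds."""
    favourable = sum(1 for o in outcomes if predicate(o))
    return Fraction(favourable, len(outcomes))

events = {
    "A": lambda o: o[0] + o[1] >= 10,        # sum is at least 10
    "B": lambda o: 6 in o,                   # at least one 6
    "C": lambda o: 1 in o,                   # at least one 1
    "D": lambda o: o[1] > o[0],              # 2nd roll higher than 1st
    "E": lambda o: abs(o[0] - o[1]) == 1,    # assumed: |difference| == 1
}

def independent(x, y):
    """True if P(X and Y) equals P(X) * P(Y) exactly."""
    joint = prob(lambda o: events[x](o) and events[y](o))
    return joint == prob(events[x]) * prob(events[y])

for name in ("A", "C", "E"):
    print(name, prob(events[name]))
for pair in (("A", "B"), ("A", "C"), ("D", "E")):
    print(pair, independent(*pair))
```

Using `Fraction` keeps the comparison P(X ∩ Y) = P(X) P(Y) exact, so the independence test does not depend on floating-point rounding.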
3) Bayes' Theorem (4 points)
Suppose we are interested in a test to detect a disease which affects one in 100,000
people on average. A lab has developed a test which works but is not perfect. If a
person has the disease, it will give a positive result with probability 0.97; if they do
not, the test will be positive with probability 0.007. You took the test, and it gave
a positive result. What is the probability that you actually have the disease?
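The computation can be sketched in a few lines as a check on the hand calculation. The variable names below are assumptions introduced for illustration; the numbers are taken from the problem statement.

```python
from fractions import Fraction

# Quantities from the problem statement.
prior = Fraction(1, 100_000)              # P(disease)
sensitivity = Fraction(97, 100)           # P(positive | disease)
false_positive_rate = Fraction(7, 1000)   # P(positive | no disease)

# Law of total probability: P(positive).
evidence = sensitivity * prior + false_positive_rate * (1 - prior)

# Bayes' theorem: P(disease | positive).
posterior = sensitivity * prior / evidence
print(posterior, float(posterior))
```

The posterior comes out well below 1%, because the disease is so rare that false positives far outnumber true positives.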
4) Random Variables (6 points)
Are X and Y, as defined in the following table, independently distributed? How did
you check?
x                  0     0     1     1
y                  0     1     0     1
p(X = x, Y = y)    0.32  0.08  0.48  0.12
Justify your answers using the laws of probability and the definition of probabilistic
independence.
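One way to check is to compute both marginals and compare every joint entry with the product of its marginals. The sketch below does this with exact fractions; the names `joint`, `p_x`, `p_y`, and `is_independent` are hypothetical, and the written answer still needs the justification from the definition.

```python
from fractions import Fraction

# Joint distribution from the table, keyed by (x, y).
joint = {
    (0, 0): Fraction("0.32"),
    (0, 1): Fraction("0.08"),
    (1, 0): Fraction("0.48"),
    (1, 1): Fraction("0.12"),
}

# Marginals: sum the joint over the other variable.
p_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yi), p in joint.items() if yi == y) for y in (0, 1)}

# X and Y are independent iff p(x, y) = p(x) * p(y) for every cell.
is_independent = all(joint[x, y] == p_x[x] * p_y[y] for x, y in joint)
print(p_x, p_y, is_independent)
```

All four cells must pass the test; a single mismatch would be enough to rule out independence.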
Submission Instructions
The following instructions are mandatory. Please read them carefully. If you do not follow
these instructions, the tutors can decide not to correct your exercise solutions.
• You have to submit the solutions of this exercise sheet as a team of 2 students.
• If you submit source code along with your assignment, please use Python unless otherwise
agreed upon with your tutor.
• NLTK modules are not allowed, and not necessary, for the assignments unless otherwise
specified.
• Make a single ZIP archive file of your solution with the following structure
– A source_code directory that contains your well-documented source code and a
README file with instructions to run the code and reproduce the results.
– A PDF report with your solutions, figures, and discussions on the questions that you
would like to include. You may also upload scans or photos of high quality.
– A README file with group member names, matriculation numbers and emails.
• Rename your ZIP submission file in the format
exercise02_id#1_id#2.zip
where id#n is the matriculation number of every member in the team.
• Your exercise solution must be uploaded by only one of your team members under Assignments in the General channel on Microsoft Teams.
• If you have any problems with the submission, contact your tutor before the deadline.
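The archive described above could be assembled with a short script such as the sketch below. The function name `build_submission`, the directory layout of `solution_dir`, and the example matriculation numbers are assumptions for illustration only; the sheet number is passed in explicitly rather than guessed.

```python
import zipfile
from pathlib import Path

def build_submission(solution_dir, matriculation_ids, sheet_number):
    """Zip everything under solution_dir as exercise<NN>_<id1>_<id2>.zip.

    The archive is written next to solution_dir, so it never ends up
    inside itself.
    """
    solution_dir = Path(solution_dir)
    name = "exercise{:02d}_{}.zip".format(sheet_number, "_".join(matriculation_ids))
    archive_path = solution_dir.parent / name
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(solution_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the solution root.
                archive.write(path, path.relative_to(solution_dir).as_posix())
    return archive_path
```

For example, `build_submission("solution", ["1234567", "7654321"], 2)` (placeholder IDs) would write `exercise02_1234567_7654321.zip` beside the `solution` directory.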