CS/ECE/ME532 Activity 2


1) Let
$$X = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & -2 & -2 \\ 3 & -3 & 3 & -3 \end{bmatrix}
\quad \text{and} \quad
w = \begin{bmatrix} 1 \\ b \\ 1 \\ c \end{bmatrix}.$$

a) Write out and evaluate the vector $y = Xw$.
b) Find $b$ and $c$ so that $y = \begin{bmatrix} 4 \\ 0 \\ 0 \end{bmatrix}$.

c) Find $b$ and $c$ so that $y = \begin{bmatrix} 0 \\ 0 \\ 12 \end{bmatrix}$.

2) Recall from the last activity that food involves fats, proteins, and carbohydrates. There are 9 calories for every gram of fat, 4 calories for every gram of protein, and 4 calories for every gram of carbohydrate. If we define a vector $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$, where $x_1$ is the number of grams of fat, $x_2$ is the number of grams of protein, and $x_3$ is the number of grams of carbohydrate, then the number of calories is $y = x^T w$, where $w = \begin{bmatrix} 9 \\ 4 \\ 4 \end{bmatrix}$.

Your nutrition expert has a way of defining a food as “low carb” based on the ratio of carbohydrate calories to total calories. Let
$$z = \frac{\text{carbohydrate calories}}{\text{total calories}}.$$
A food is classified as low carb if $z < 1/4$.
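As a sketch of these two computations, assuming Python with NumPy (not specified in the activity) and a made-up food for illustration:

```python
import numpy as np

w = np.array([9.0, 4.0, 4.0])  # calories per gram of fat, protein, carbohydrate

def total_calories(x):
    """Total calories y = x^T w, where x = [grams fat, grams protein, grams carb]."""
    return x @ w

def carb_ratio(x):
    """z = carbohydrate calories / total calories."""
    return 4.0 * x[2] / total_calories(x)

# Hypothetical food: 2 g fat, 5 g protein, 20 g carbohydrate.
x = np.array([2.0, 5.0, 20.0])
print(total_calories(x), carb_ratio(x), carb_ratio(x) < 1 / 4)
```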

a) Express the rule for classifying foods given by your nutritionist as the sign of an inner product between $x$ and a vector of weights $\tilde{w}$. In other words, specify $\tilde{w}_1$, $\tilde{w}_2$, and $\tilde{w}_3$ so that when $\mathrm{sign}(x^T \tilde{w}) = 1$ the food is low carb, and when $\mathrm{sign}(x^T \tilde{w}) = -1$ the food is not low carb.

b) Nutritionists like to look at the ratios of the types of calories. Consider the features $r_f = x_1/x_3$, the ratio of grams of fat to grams of carbohydrate, and $r_p = x_2/x_3$, the ratio of grams of protein to grams of carbohydrate. Express the low-carb criterion as a function of the features $r_f$ and $r_p$.

c) Define the decision boundary as the line where $z = 1/4$, since a food with $z < 1/4$ is classified as low carb while a food with $z \geq 1/4$ is not low carb. Graph the decision boundary assuming feature $r_p$ is on the vertical axis and $r_f$ is on the horizontal axis. Shade the portion of the $r_f$-$r_p$ plane that corresponds to low-carb foods. Note that $r_p$ and $r_f$ cannot be negative. Hint: Recall that the equation $y = mx + b$ describes a line with slope $m$ and $y$-intercept $b$.

d) Consider the four cereals:
Cereal 1: 1 gram fat, 8 grams protein, 44 grams carbohydrate
Cereal 2: 0.5 grams fat, 2 grams protein, 25 grams carbohydrate
Cereal 3: 1.3 grams fat, 2.7 grams protein, 29.3 grams carbohydrate
Cereal 4: 9 grams fat, 4 grams protein, 16 grams carbohydrate

Plot the features $r_f$, $r_p$ for each cereal in the $r_f$-$r_p$ plane and label each pair of features with the corresponding cereal number (see the plotting sketch after part f). Are any of these classified as low carb?

e) Almond butter has 9 grams fat, 3.4 grams protein, and 3 grams carbohydrate per serving. Plot the features $r_f$, $r_p$ for almond butter in the $r_f$-$r_p$ plane. Is almond butter classified as a low-carb food?

f) Marinated grilled salmon has 19 grams fat, 23 grams protein, and 1 gram carbohydrate per serving. Plot the features $r_f$, $r_p$ in the $r_f$-$r_p$ plane. Is this salmon classified as a low-carb food?
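A plotting sketch for parts d)–f), assuming Python with Matplotlib (again, not required by the activity); the decision boundary and shading from part c) are left for you to add:

```python
import matplotlib.pyplot as plt

# Foods from parts d)-f) as (grams fat, grams protein, grams carbohydrate).
foods = {
    "Cereal 1":      (1.0,  8.0, 44.0),
    "Cereal 2":      (0.5,  2.0, 25.0),
    "Cereal 3":      (1.3,  2.7, 29.3),
    "Cereal 4":      (9.0,  4.0, 16.0),
    "Almond butter": (9.0,  3.4,  3.0),
    "Salmon":        (19.0, 23.0, 1.0),
}

for name, (x1, x2, x3) in foods.items():
    rf, rp = x1 / x3, x2 / x3  # fat/carb and protein/carb ratios
    plt.scatter(rf, rp)
    plt.annotate(name, (rf, rp))

# TODO: add the decision boundary line and low-carb shading derived in part c).
plt.xlabel("r_f (grams fat / grams carbohydrate)")
plt.ylabel("r_p (grams protein / grams carbohydrate)")
plt.show()
```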

Fairness in machine learning. Various forms of discrimination can be built into machine learning algorithms. You can find more details in these articles: 1) “Artificial Intelligence’s White Guy Problem” in The New York Times and 2) “Machine Bias” by ProPublica. This has led researchers to introduce the notion of “fair machine learning,” which we will touch upon in the following problem. Ensuring fairness in machine learning is a current research topic, so please refer to “Fairness and Machine Learning: Limitations and Opportunities” by Barocas, Hardt, and Narayanan for an in-depth survey.

3) Decision boundaries that lead to the best overall performance may result in poor
performance for subgroups of the data that have slightly different characteristics.
Let $x^T = [x_1, x_2]$. Consider the following dataset.
[Figure: a scatter plot in the $x_1$-$x_2$ plane (axis ticks at (3, 0) and (0, 3)) showing twelve data points, six blue and six red, each labeled + or −, together with the decision boundaries of six candidate weight vectors: $w = [0, -1]^T$, $w = [2, -3]^T$, $w = [3, -2]^T$, $w = [1, 0]^T$, $w = [3, 2]^T$, $w = [2, 3]^T$.]
a) Consider the following linear classifier:
$$x^T w \geq 0: \text{ predict } +, \qquad x^T w < 0: \text{ predict } -,$$
where $w$ is one of the six vectors with corresponding decision boundaries shown above.

For example, $w = [1, 0]^T$ results in the classifier
$$x_1 \geq 0: \text{ predict } +, \qquad x_1 < 0: \text{ predict } -,$$
so values to the right of the $x_2$-axis are classified as + and those to the left as −. Which decision boundaries in the picture (i.e., which $w$'s) minimize the total number of misclassifications? (A counting sketch follows part c.)

b) A classifier is called fair if it performs equally well on specified subgroups of the
dataset, i.e., the percentage of misclassifications for each subgroup is the same.

The above dataset consists of two subgroups: six blue data points and six red
data points. Of the six candidate classifiers, find the ones that are fair.

c) Among all the fair classifiers, find the one that minimizes the total number of misclassifications. Are all the classifiers that you found in part a) fair?
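To tally errors overall and per subgroup, here is a hedged sketch assuming Python with NumPy; the points, labels, and group assignments below are placeholders, since the actual twelve points are given only in the figure and must be read off it:

```python
import numpy as np

# Placeholder data: replace with the twelve points, +/- labels, and
# blue/red group assignments read off the figure above.
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, 1.0], [1.0, -1.0]])
labels = np.array([1, 1, -1, -1])                  # +1 for '+', -1 for '-'
groups = np.array(["blue", "blue", "red", "red"])

candidates = [np.array(w) for w in
              ([0, -1], [2, -3], [3, -2], [1, 0], [3, 2], [2, 3])]

for w in candidates:
    preds = np.where(X @ w >= 0, 1, -1)            # x^T w >= 0 -> +, else -
    errors = preds != labels
    per_group = {g: errors[groups == g].mean() for g in ("blue", "red")}
    print(f"w = {w}: total errors = {errors.sum()}, subgroup error rates = {per_group}")
```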