CS 5350/6350: Machine Learining Homework 2

$30.00

Category: Tags: , , , , , , , You will Instantly receive a download link for .zip solution file upon Payment || To Order Original Work Click Custom Order?

Description

5/5 - (3 votes)

1 Boolean Functions
In this problem, you will be asked to write Boolean functions and linear threshold functions
based on given labeled data.
1. [3 points] Table 1 shows several data points (the x’s) along with corresponding labels
(y). (That is, each row is an example with a label.) Write down three different Boolean
functions all of which can produce the label y when given the inputs x.
y x1 x2 x3 x4
0 1 0 0 0
0 1 1 0 0
1 1 0 1 1
Table 1: Original Table
2. [5 points] Next, we expand Table 1 to Table 2 by adding more data points. How many
errors will each of your functions from the previous questions make on the full data
set.
3. [7 points] Write down the linear threshold function for the data in Table 2.
1
y x1 x2 x3 x4
0 1 0 0 0
0 1 1 0 0
1 1 0 1 1
0 0 1 0 0
0 0 1 1 0
0 1 1 1 0
1 0 1 1 1
Table 2: Expanded Table
2 Mistake Bound Model of Learning
Consider an instance space consisting of integer points on the two dimensional plane
(x1, x2) with −128 ≤ x1, x2 ≤ 128. Let C be a concept class defined on this instance space.
Each function fr in C is defined by an integer radius r (with 1 ≤ r ≤ 128) as follows:
fr(x1, x2) =
+1 x
2
1 + x
2
2 ≤ r
2
;
−1 otherwise (1)
Our goal is to come up with a error-driven algorithm that will learn the correct function
f ∈ C that correctly classifies a dataset.
Side notes
1. Recall that a concept class is the set of functions from which the true target function
is drawn and the hypothesis space is the set of functions that the learning algorithm
searches over. In this question, both these are the same set.
2. Assume that there is no noise. That is, assume that the data is separable using the
hypothesis class.
Questions
1. [5 points] Determine |C|, the size of concept class.
2. [5 points] To design an error driven learning algorithm, we should be able to first write
down what it means to make a mistake. Suppose our current guess for the function is
fr defined as in Equation 1 above. Say we get an input point (x
t
1
, xt
2
) along with its
label y
t
. Write down an expression (an equality or an inequality) in terms of x
t
1
, x
t
2
, y
t
and r that checks whether the current hypothesis fr has made a mistake.
3. [10 points] Next, we need to specify how we will update a hypothesis if there is an
error. Since fr is completely defined in terms of r, we only need to update r. How
will you update r if there is an error? Consider errors for both positive and negative
examples.
2
4. [20 points] Use the answers from the previous two steps to write a mistake-driven
learning algorithm to learn the function. Please write the algorithm concisely in the
form of pseudocode. What is the maximum number of mistakes that this algorithm
can make on any dataset?
5. (For 6350 students)[15 points total] We have seen the Halving algorithm in class. The
Halving algorithm will maintain a set of hypotheses consistent with all the examples
seen so far and predict using the most frequent label among this set. Upon making a
mistake, the algorithm prune at least half of this set. In this question, you will design
and analyze a Halving algorithm for this particular concept space.
a. [5 points] The set of hypotheses consistent with all examples seen so far can be
defined storing only two integers. How would you do this?
b. [5 points] How would you check if there is an error for an example (x
t
1
, xt
2
) that
has the label y
t
?
c. [5 points] Write the full Halving algorithm for this specific concept space. (Do
not write the same Halving algorithm we saw in class. You need to tailor it to
this problem.) What is its mistake bound?
3 The Perceptron Algorithm and Its Variants
3.1 The Task and Data
Imagine you have access to information about people such as age, gender and level of education. Now, you want to predict whether a person makes over $50K a year or not using
these features.
We will use Adult data set from the UCI Machine Learning repository1
. The original
Adult data set has 14 features, among which 6 are continuous and 8 are categorical. In order
to make it easier to use, we will use a pre-processed version (and subset) of the original Adult
data set, created by the makers of the popular LIBSVM tool. From the LIBSVM website:
“In this data set, the continuous features are discretized into quantiles, and each quantile is
represented by a binary feature. Also, a categorical feature with m categories is converted
to m binary features.”
Use the training/test files called ‘a1a.train’ and ‘a1a.test’, available on the assignments page of the class website.2 This data is in the LIBSVM format, where each row is a
single training example. The format of the each row in the data is