# ECE521 Assignment 1: k Nearest Neighbours (k-NN)

\$30.00

5/5 - (1 vote)

## Introduction

k-NN is one of the simplest nonparametric machine learning models. In this assignment, you will
first implement the basic k-NN algorithm in TensorFlow. You will then use it for both regression

In this assignment, when k > 1, we will aggregate the k nearest neighbours’ predictions by averaging their target values for regression tasks. For classification tasks, a majority voting scheme is
assumed.

## 2 MAKING PREDICTIONS FOR REGRESSION [6 POINTS]

### 1 Euclidean distance function [4 points]

The squared Euclidean distance between two D-dimensional vectors x and z is given by kx−zk
2
2 =
(x − z)
>(x − z) = PD
d=1(xd − zd)
2
. Assume we have two groups of D-dimensional data points
represented respectively by a N1 × D matrix X =

x
(1)T
.
.
.
x
(N1)
T

and a N2 × D matrix Z =

z
(1)T
.
.
.
z
(N2)
T

.
Let the function Deuc(X,Z) returns a N1 × N2 matrix

kx
(1) − z
(1)k
2
2
. . . kx
(1) − z
(N2)k
2
2
.
.
.
.
.
.
.
.
.
kx
(N1) − z
(1)k
2
2
. . . kx
(N1) − z
(N2)k
2
2

containing the pairwise Euclidean distances.

Pairwise distance [4 pt.]:
Write a vectorized Tensorflow Python function that implements the pairwise squared Euclidean
distance function for two input matrices. It should not contain loops and you should make use of
Tensorflow broadcasting. Include the snippets of the Python code.

### 2 Making Predictions for Regression [6 points]

Suppose a particular training dataset consists of N training examples with input features X = 

x
(1)T
.
.
.
x
(N)
T

 and targets Y =

y
(1)T
.
.
.
y
(N)
T

 represented by two matrices. In k-NN modelling for regression,
a standard way to obtain a prediction for a new test point x

is to average the targets/labels of the
k training points nearest to the test point x

. Let Nk
x∗ ⊆ {x
(1)
, . . . , x
(N)} denote the neighbourhood
set of the k training examples closest to x

. A k-NN prediction function yˆ(x

) can therefore be
written concisely as follows:
yˆ(x

) = YT
r

, where r
∗ = [r1, . . . , rN ], rn =
(
1
k
, x
(n) ∈ Nk
x∗
0, otherwise.
r

can be understood as a “responsibility” vector that encodes the contribution of each training
example to the final prediction. In a standard k-NN model, the k nearest training points will have
equal contribution. To measure the performance of the k-NN predictions, we will consider the
mean squared error (MSE) loss:
L =
1
2N
X
N
n=1
kyˆ(x
(n)
) − y
(n)
k
2

The first dataset for this assignment is an 1-D toy dataset, data1D, with 100 data points. We
randomly partition 80 data points as training examples and split the rest into 10 validation points and 10 test points.

You can generate the training and test dataset using the following python
commands:
import numpy as np
np.random.seed(521)
Data = np.linspace(1.0 , 10.0 , num =100) [:, np. newaxis]
Target = np.sin( Data ) + 0.1 * np.power( Data , 2) \
+ 0.5 * np.random.randn(100 , 1)
randIdx = np.arange(100)
np.random.shuffle(randIdx)
trainData, trainTarget = Data[randIdx[:80]], Target[randIdx[:80]]
validData, validTarget = Data[randIdx[80:90]], Target[randIdx[80:90]]
testData, testTarget = Data[randIdx[90:100]], Target[randIdx[90:100]]

1. Choosing the nearest neighbours [2 pt.]:
Write a vectorized Tensorflow Python function that takes a pairwise distance matrix and
returns the responsibilities of the training examples to a new test data point. It should not
contain loops. You may find the tf.nn.top k function useful. Include the relevant snippets of

2. Prediction [4 pt.]:
For the dataset data1D, compute the above k-NN prediction function with k = {1, 3, 5, 50}.
For each of these values of k, compute and report the training MSE loss, validation MSE
loss and test MSE loss. Choose the best k using the validation error.

For k = {1, 3, 5, 50}, plot the prediction function for x ∈ [0, 11] along with the true targets
similar to the figure shown above. (You may create an input matrix X = np.linspace(0.0, 11.0,
num = 1000)[:, np.newaxis] and use Matplotlib to plot the predictions.) Comment on this
plot and how you would pick the best k from it.

### 3 Making Predictions for Classification [10 points]

In this section, k-NN will be used for classification task. You will modify the prediction part of the
k-NN algorithm and apply it for face and gender classification tasks. You will use a tiny version of
the FaceScrub dataset1

. The FaceScrub dataset consists of 100,000+ images of 500+ celebrities.
The tiny version consists only of six celebrities (three actors and three actress) with average of 150
image per subject. The images were downloaded and converted to grey using the available code in
http://www.cs.toronto.edu/~guerzhoy/321/proj1/, then, the faces were cropped and resized
to 32×32 images and stored as .npy file.

You are provided two .npy files; the data.npy file is an
array of size 936×32×32 which contains 936 images of the six celebrities, and the target.npy file
that encodes the class of each image and of size 936×2. The first column encodes the name (ID)
of the actor and the second column encodes the gender as follows:
 The name (ID) of the actors: ‘Lorraine Bracco’, ‘Gerard Butler’, ‘Peri Gilpin’, ‘Angie Harmon’, ‘Daniel Radcliffe’, and ‘Michael Vartan’ are encoded as ‘0’, ‘1’, ‘2’, ‘3’, ‘4’, and ‘5’,
respectively.

 The gender of the actors: ‘Male’ and ‘Female’ are encoded as ‘0’ and ‘1’, respectively.

You should use the following code to load the dataset.
# task = 0 >> select the name ID targets for face recognition task
# task = 1 >> select the gender ID targets for gender recognition task
data = np.reshape(data, [-1, 32*32])
1http://vintage.winklerbros.net/facescrub.html

np.random.seed(45689)
rnd_idx = np.arange(np.shape(data)[0])
np.random.shuffle(rnd_idx)
trBatch = int(0.8*len(rnd_idx))
validBatch = int(0.1*len(rnd_idx))
trainData, validData, testData = data[rnd_idx[1:trBatch],:], \
data[rnd_idx[trBatch+1:trBatch + validBatch],:],\
data[rnd_idx[trBatch + validBatch+1:-1],:]
trainTarget, validTarget, testTarget = target[rnd_idx[1:trBatch], task], \
target[rnd_idx[trBatch + validBatch + 1:-1], task]
return trainData, validData, testData, trainTarget, validTarget, testTarget

1. Predicting class label [4 pt.]:
Modify the prediction function for regression in section 1 and use majority voting over k
nearest neighbors to predict the final class. You may find tf.unique with counts helpful for