Computer Vision CAP 5415 Programming Assignment-III





Question 1: Image Classification [5 pts]

This is an extension of problem 2 from programming assignment 1. In this question, your goal is to develop a CNN
classification network to recognize RGB color images.

You will design your own variants of a CNN architecture, each of which
should have more than 2 convolutional layers and more than 1 fully connected layer. You will use the CIFAR-10
dataset, which is available from PyTorch (torchvision.datasets.CIFAR10).

Your tasks:

• Design a CNN architecture which has more than 2 conv layers and more than 1 fully connected layer. It
should make 10 predictions for the 10 classes of CIFAR-10. Train this network on CIFAR-10 for 30 epochs
using cross-entropy loss and an SGD optimizer. Report the training/testing loss for each epoch in the form of plots and
the accuracy scores after 30 epochs. Remember you will need a softmax activation after the final fully connected
layer.

• Increase the number of conv layers in the above network and train again. Report the same numbers and plots,
comparing them with the first network.
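
The two tasks above can be sketched with a single model class whose depth is a parameter. This is a minimal sketch, not a reference solution: the class name `SmallCNN`, the channel widths, the hidden size of 128, and the helper `train_one_epoch` are all illustrative assumptions. The model returns raw logits, because `torch.nn.CrossEntropyLoss` applies log-softmax internally; apply `torch.softmax(logits, dim=1)` to the outputs when explicit class probabilities are needed.

```python
import torch
import torch.nn as nn


class SmallCNN(nn.Module):
    """Hypothetical CIFAR-10 classifier with a configurable number of conv layers."""

    def __init__(self, num_conv_layers=3, num_classes=10):
        super().__init__()
        layers = []
        in_channels = 3    # RGB input
        out_channels = 32  # assumed starting width
        for i in range(num_conv_layers):
            layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
            layers.append(nn.ReLU())
            # Downsample after the first three conv layers only: 32 -> 16 -> 8 -> 4
            if i < 3:
                layers.append(nn.MaxPool2d(2))
            in_channels = out_channels
            out_channels = min(out_channels * 2, 128)
        self.features = nn.Sequential(*layers)
        spatial = 32 // (2 ** min(num_conv_layers, 3))
        # Two fully connected layers (more than 1, as the task requires)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_channels * spatial * spatial, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),  # raw logits, one per class
        )

    def forward(self, x):
        return self.classifier(self.features(x))


def train_one_epoch(model, loader, optimizer, criterion):
    """Run one pass over `loader` and return the mean training loss."""
    model.train()
    total_loss, total_samples = 0.0, 0
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * images.size(0)
        total_samples += images.size(0)
    return total_loss / total_samples
```

Under these assumptions, `SmallCNN(num_conv_layers=3)` would serve as the first network and `SmallCNN(num_conv_layers=5)` as the deeper variant. Calling `train_one_epoch` 30 times with `torch.optim.SGD` and `nn.CrossEntropyLoss()` and storing each returned value gives the per-epoch training loss to plot.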

NOTE: You can use the code provided as a solution for programming assignment 1 and extend it.

What to submit:

• Code
• A short write-up about your implementation with results: 1) Accuracy scores for all the variations, 2) Compare
all the variations using accuracy scores. Comment on how the accuracy changes when you increase the number
of conv layers.
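
One possible way to obtain the testing loss and accuracy scores for the write-up is sketched below. The function name `evaluate` is an assumption, and it expects a model that returns one logit per class and a loader that yields `(images, labels)` batches.

```python
import torch


@torch.no_grad()
def evaluate(model, loader, criterion):
    """Return (mean loss, accuracy) of `model` over all batches in `loader`."""
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    for images, labels in loader:
        logits = model(images)
        total_loss += criterion(logits, labels).item() * labels.size(0)
        # The predicted class is the index of the largest logit
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return total_loss / total, correct / total
```

Running this on the test loader after every epoch yields the testing-loss curve, and the accuracy returned after the final epoch is the score to report for each variation.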

Question 2: Image segmentation [5 pts]

In this question, your goal is to implement Otsu thresholding to perform image segmentation. The algorithm will be
discussed during a class lecture next week.

Your tasks:

• First implement a simple thresholding-based image binarization algorithm. Plot the histogram for three
different input images. Then, based on each plot, perform binarization at three different threshold levels.
• Implement Otsu thresholding. Use the determined threshold to perform segmentation on the three input
images.
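
Both tasks can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the version presented in lecture: the function names `binarize` and `otsu_threshold` are hypothetical, the input is assumed to be a 2-D `uint8` greyscale array, and the Otsu step uses the common formulation that picks the threshold maximizing the between-class variance.

```python
import numpy as np


def binarize(gray, threshold):
    """Return a 0/255 uint8 mask: pixels strictly above `threshold` become 255."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)


def otsu_threshold(gray):
    """Return the threshold t in [0, 255] that maximizes between-class variance.

    Class 0 holds pixels <= t and class 1 holds pixels > t.
    `gray` is assumed to be a uint8 array.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()               # normalized histogram
    levels = np.arange(256)
    omega0 = np.cumsum(prob)               # class-0 probability for each t
    mu = np.cumsum(prob * levels)          # cumulative mean up to each t
    mu_total = mu[-1]                      # global mean intensity
    denom = omega0 * (1.0 - omega0)
    sigma_b = np.zeros(256)
    valid = denom > 1e-12                  # skip thresholds that leave a class empty
    sigma_b[valid] = (mu_total * omega0[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))
```

The `hist` array computed inside `otsu_threshold` is the same 256-bin histogram the first task asks to plot. For the second task, `binarize(gray, otsu_threshold(gray))` would give the segmented image.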

NOTE: You are free to choose any 3 images. If the images are in color, you can convert them to greyscale by
averaging the RGB values at each pixel. You can also use any library function to convert them to greyscale.
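
The channel-averaging conversion mentioned in the note can be sketched as below. The function name `to_grayscale` is an assumption, and the input is assumed to be an (H, W, 3) `uint8` array (an alpha channel, if present, is ignored).

```python
import numpy as np


def to_grayscale(rgb):
    """Average the R, G, B channels of an (H, W, 3+) uint8 array into (H, W) uint8."""
    # Cast to float first so the sum of three channels cannot overflow uint8
    return rgb[..., :3].astype(np.float64).mean(axis=-1).round().astype(np.uint8)
```

Note that library conversions often use a weighted sum of the channels rather than a plain average, so their output may differ slightly from this sketch.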

What to submit:

• Code
• A short write-up about your implementation with results (as indicated for each variation) and your observations
from each result. For each image, you will have to show the corresponding histogram and the resultant segmented
image.