Description
1. Objective
Boosting is a general method for improving the accuracy of any given learning algorithm. Specifically, one can use it to combine weak learners, each performing only
slightly better than random guessing, to form an arbitrarily good hypothesis. In this
project, you are required to implement the AdaBoost and RealBoost algorithms for
frontal human face detection.
2. Data
• Training data: Face and non-face images of size 16×16 pixels are given.
• Testing data: Three photos taken in class are used for testing.
• Hard negatives: Three background images containing no faces are given.
3. Tasks
Perform the following tasks. Report your results in the given order and write a concise
interpretation for each result:
1. Construct weak classifiers $\{h_j\}$: Load the predefined set of Haar filters. Compute the features $\{f_j\}$ by applying each Haar filter to the integral images of the
positive and negative populations. Determine the polarity $s_j$ and threshold $\theta_j$
for each weak classifier $h_j$:
$$h_j(x) = \begin{cases} -s_j & \text{if } f_j < \theta_j, \\ s_j & \text{otherwise.} \end{cases}$$
Write a function that returns the weak classifier with the lowest weighted error.
Note that, because the sample weights change at every boosting step, the weighted feature histograms and the threshold θ of each weak classifier change as well and must be recomputed.
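The selection step described above can be sketched as follows. This is a minimal illustration, not the required implementation: it assumes the feature responses are already stored in a NumPy array `F` of shape (number of features, number of samples), that labels are in {-1, +1}, and that the weights sum to 1. The function names `fit_stump`, `best_weak_classifier` and `stump_predict` are hypothetical.

```python
import numpy as np

def fit_stump(f, y, w):
    """Best (error, theta, s) for one feature, with h(x) = -s if f < theta else s."""
    order = np.argsort(f)
    fs, ys, ws = f[order], y[order], w[order]
    # Weighted mass of positives / negatives among the first k sorted samples.
    pos_below = np.concatenate(([0.0], np.cumsum(ws * (ys == 1))))
    neg_below = np.concatenate(([0.0], np.cumsum(ws * (ys == -1))))
    # s = +1 predicts -1 below theta: errors are positives below, negatives above.
    err_plus = pos_below + (neg_below[-1] - neg_below)
    err_minus = ws.sum() - err_plus          # s = -1 makes the complementary errors
    # A split between two equal feature values cannot be realised by a threshold.
    valid = np.ones(fs.size + 1, dtype=bool)
    valid[1:-1] = fs[1:] > fs[:-1]
    err_plus[~valid] = np.inf
    err_minus[~valid] = np.inf
    # theta = fs[k] puts exactly the first k samples below; inf puts all below.
    thetas = np.concatenate((fs, [np.inf]))
    k_plus, k_minus = np.argmin(err_plus), np.argmin(err_minus)
    if err_plus[k_plus] <= err_minus[k_minus]:
        return err_plus[k_plus], thetas[k_plus], 1
    return err_minus[k_minus], thetas[k_minus], -1

def best_weak_classifier(F, y, w):
    """Return (error, j, theta, s) of the stump with the lowest weighted error."""
    best = None
    for j in range(F.shape[0]):
        err, theta, s = fit_stump(F[j], y, w)
        if best is None or err < best[0]:
            best = (err, j, theta, s)
    return best

def stump_predict(f, theta, s):
    """Evaluate h(x) = -s if f < theta else s on an array of feature values."""
    return np.where(f < theta, -s, s)
```

Because the search depends on `w`, it has to be rerun at every boosting step; sorting each feature once and reusing the sort order is a common way to keep this affordable.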
2. Implement AdaBoost: Implement the AdaBoost algorithm to boost the weak
classifiers. Construct the strong classifier $H(x)$ as a weighted ensemble of $T$
weak classifiers:
$$H(x) = \operatorname{sign}(F(x)) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right).$$
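As a rough sketch of how the weighted vote above might be trained, the following uses the standard discrete AdaBoost update with α_t = ½ ln((1 − ε_t)/ε_t), where ε_t is the weighted error of the chosen weak classifier. This choice of α_t, the brute-force stump search, and the names `brute_force_stump`, `adaboost`, `score` and `predict` are assumptions made for illustration; the handout does not prescribe them. `F` is assumed to hold precomputed feature responses with shape (number of features, number of samples).

```python
import numpy as np

def brute_force_stump(F, y, w):
    """Try every feature, threshold and polarity; return the lowest-error stump."""
    best = (np.inf, 0, 0.0, 1)
    for j, f in enumerate(F):
        for theta in np.append(np.unique(f), np.inf):
            pred = np.where(f < theta, -1, 1)          # prediction for s = +1
            err = w[pred != y].sum()
            for s, e in ((1, err), (-1, w.sum() - err)):
                if e < best[0]:
                    best = (e, j, theta, s)
    return best

def adaboost(F, y, T):
    """Run T boosting rounds; return a list of (alpha, j, theta, s)."""
    n = y.size
    w = np.full(n, 1.0 / n)                            # start from uniform weights
    model = []
    for _ in range(T):
        err, j, theta, s = brute_force_stump(F, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)           # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        h = np.where(F[j] < theta, -s, s)
        w = w * np.exp(-alpha * y * h)                 # up-weight mistakes
        w /= w.sum()
        model.append((alpha, j, theta, s))
    return model

def score(model, F):
    """F(x) = sum_t alpha_t * h_t(x) for every sample (column) of F."""
    return sum(a * np.where(F[j] < th, -s, s) for a, j, th, s in model)

def predict(model, F):
    """H(x) = sign(F(x)); a score of exactly zero is mapped to +1 here."""
    return np.where(score(model, F) >= 0, 1, -1)
```

The brute-force search is quadratic in the number of samples per feature, so it is only suitable for checking the logic on small data before switching to a faster threshold search.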
Two class photos are given. Note that you need to rescale each image to several scales so
that the faces at both the front and the back are 16×16 pixels in one of the scaled images.
Run your classifier on these images. Perform non-maximum suppression, i.e.,
when two positive detections overlap significantly, keep the one that has the higher
score. Perform hard negative mining. You are given background images without
faces. Run your strong classifier on these images. Any “faces” detected by your
classifier are called “hard negatives”. Add them to the negative population in
the training set and retrain your model. Include the following in your report:
(a) Haar filters: Display the top 20 Haar filters after boosting. Report the
corresponding voting weights $\{\alpha_t : t = 1, \ldots, 20\}$.
(b) Training error of strong classifier: Plot the training error of the strong
classifier over the number of steps T.
(c) Training errors of weak classifiers: At steps T = 0, 10, 50, 100, plot the
training errors of the 1,000 weak classifiers with the lowest error in the pool,
sorted in increasing order. Compare these four curves.
(d) Histograms: Plot the histograms of the positive and negative populations
over F(x), for T = 10, 50, 100, respectively.
(e) ROC: Based on the histograms, plot their corresponding ROC curves for
T = 10, 50, 100, respectively.
(f) Detections: Display the detected faces in both of the provided images
without hard negative mining.
(g) Hard negative mining: Display the detected faces in both of the provided
images with hard negative mining.
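The non-maximum suppression step in task 2 can be sketched as a greedy procedure over detection boxes. The handout only says that detections which "overlap significantly" should be merged; measuring overlap by intersection-over-union with a threshold of 0.3, the box layout `(x1, y1, x2, y2)`, and the name `non_max_suppression` are all assumptions made for this illustration.

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Greedy NMS. boxes: rows of (x1, y1, x2, y2); returns indices of kept boxes."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    scores = np.asarray(scores, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]                   # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection rectangle between box i and every remaining box.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop every lower-scoring box that overlaps box i too much.
        order = rest[iou <= iou_threshold]
    return keep
```

When detections come from several image scales, the boxes would first need to be mapped back to the coordinates of the original image so that their overlaps are comparable.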
3. Implement RealBoost: Implement the RealBoost algorithm using the top
T = 10, 50, 100 features you chose in step (c). Compute the histograms of negative and positive populations and the corresponding ROC curves. Include the
following in your report:
(h) Histograms: Plot the histograms of the positive and negative populations
over F(x), for T = 10, 50, 100, respectively.
(i) ROC: Based on the histograms, plot their corresponding ROC curves. Compare them with the ROC curves in (e).
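For task 3, one possible shape of a single RealBoost round and of the ROC computation is sketched below. It follows the common binned formulation, in which each feature's response range is split into bins, the weak classifier outputs h_b = ½ ln(p_b / q_b) in bin b (p_b and q_b being the weighted mass of positives and negatives there), and the feature minimising Z = 2 Σ_b √(p_b q_b) is selected. The number of bins, the smoothing constant `eps`, and the names `realboost_round` and `roc_points` are assumptions; the handout does not specify them.

```python
import numpy as np

def realboost_round(F, y, w, n_bins=32, eps=1e-6):
    """One RealBoost round. Returns ((j, edges, h_bins), updated weights)."""
    best = None
    for j, f in enumerate(F):
        edges = np.linspace(f.min(), f.max(), n_bins + 1)
        bins = np.digitize(f, edges[1:-1])             # bin index in 0..n_bins-1
        p = np.bincount(bins, weights=w * (y == 1), minlength=n_bins)
        q = np.bincount(bins, weights=w * (y == -1), minlength=n_bins)
        z = 2.0 * np.sqrt(p * q).sum()
        if best is None or z < best[0]:
            h_bins = 0.5 * np.log((p + eps) / (q + eps))   # real-valued output per bin
            best = (z, j, edges, h_bins, h_bins[bins])
    _, j, edges, h_bins, h = best
    w = w * np.exp(-y * h)                             # re-weight with the real-valued h
    return (j, edges, h_bins), w / w.sum()

def roc_points(scores, y):
    """Sweep a threshold over F(x) from high to low; return (FPR, TPR) arrays.

    Tied scores are not grouped, so ties are resolved in arbitrary order.
    """
    order = np.argsort(-scores)
    y_sorted = y[order]
    tpr = np.concatenate(([0.0], np.cumsum(y_sorted == 1) / (y == 1).sum()))
    fpr = np.concatenate(([0.0], np.cumsum(y_sorted == -1) / (y == -1).sum()))
    return fpr, tpr
```

To score new samples with a stored weak classifier, the same bin lookup `np.digitize(f_new, edges[1:-1])` can be applied before indexing `h_bins`; summing these outputs over rounds gives F(x), which `roc_points` can then turn into a curve for comparison with the AdaBoost result.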