COL380 Assignment 4: Template search in Image using CUDA

Problem Statement: In this assignment, you are required to use CUDA for parallel computation. You are not allowed to use any framework. Instead, work directly with CUDA (version CUDA 11) in C++.

You will be given a data RGB image (call it L) and a small query RGB image (call it Q). Your task is to locate the query image Q approximately in the data image L. Note that Q need not be upright with respect to L: there may be a rotated copy of Q in L. A match is therefore specified by the X, Y row and column numbers of the lower-left corner of Q in L, together with the counter-clockwise rotation, in degrees, of the base of Q. To simplify the problem, we will only rotate the query image from -45° to +45° in steps of 45°, that is, by -45°, 0° or +45°. Image coordinates are (0,0) at the lower left. The required output is a series of <X, Y, degree> triplets, where (X, Y) is the row number and column number counted from the bottom left of the data image L.

The images will be passed as text files. The first two space-separated integers give the number of rows (m) and the number of columns (n). They are followed by m*n*3 space-separated integers giving the channel values Image[i,j,k], written as R G B triplets (m*n of them) and read row by row from the top row to the bottom row. So, for an array A of m*n*3 integers, X[m-i-1,j,k] = A[i*n*3 + j*3 + k] (the row index is flipped because we take the bottom left of the image as X[0,0,:]). Each pixel value X[i,j,k] is an integer in the range [0,255].

A perfect match is found if the pixels of the query image Q match the pixel values of the data image L exactly. Note that a pixel <X, Y> of the query image may not land on integer coordinates after rotation by d degrees. The colour at a non-integer pixel location of an image is computed using bilinear interpolation. Read up on bilinear interpolation.
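The input layout and the interpolation step can be sketched on the host side as follows. This is a minimal illustration in plain C++, not a CUDA kernel and not the reference solution: the names `Image`, `readImage` and `bilinear` are hypothetical, and clamping out-of-range coordinates to the image border is an assumption the assignment does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for feeding small test inputs
#include <vector>

// Hypothetical container: pixels are stored so that row 0 is the BOTTOM row.
struct Image {
    int rows = 0, cols = 0;
    std::vector<float> data;  // rows * cols * 3 channel values
    float at(int i, int j, int k) const {
        return data[(static_cast<std::size_t>(i) * cols + j) * 3 + k];
    }
};

// Parse "m n" followed by m*n*3 integers listed from the top row down,
// storing them flipped: X[m-i-1, j, k] = A[i*n*3 + j*3 + k].
inline Image readImage(std::istream& in) {
    Image img;
    in >> img.rows >> img.cols;
    assert(in && img.rows > 0 && img.cols > 0);
    img.data.resize(static_cast<std::size_t>(img.rows) * img.cols * 3);
    for (int i = 0; i < img.rows; ++i)
        for (int j = 0; j < img.cols; ++j)
            for (int k = 0; k < 3; ++k) {
                int v = 0;
                in >> v;
                const std::size_t flippedRow = static_cast<std::size_t>(img.rows - 1 - i);
                img.data[(flippedRow * img.cols + j) * 3 + k] = static_cast<float>(v);
            }
    return img;
}

// Bilinear interpolation of channel k at a real-valued (row, col) location.
// Assumption: coordinates outside the image are clamped to the border.
inline float bilinear(const Image& img, float row, float col, int k) {
    row = std::min(std::max(row, 0.0f), static_cast<float>(img.rows - 1));
    col = std::min(std::max(col, 0.0f), static_cast<float>(img.cols - 1));
    const int r0 = static_cast<int>(std::floor(row));
    const int c0 = static_cast<int>(std::floor(col));
    const int r1 = std::min(r0 + 1, img.rows - 1);
    const int c1 = std::min(c0 + 1, img.cols - 1);
    const float fr = row - static_cast<float>(r0);  // fractional part along rows
    const float fc = col - static_cast<float>(c0);  // fractional part along columns
    const float lower = img.at(r0, c0, k) * (1.0f - fc) + img.at(r0, c1, k) * fc;
    const float upper = img.at(r1, c0, k) * (1.0f - fc) + img.at(r1, c1, k) * fc;
    return lower * (1.0f - fr) + upper * fr;
}
```

For a 2x2 file whose top row is listed first, `readImage` places the last listed row at row index 0, and `bilinear(img, 0.5f, 0.5f, k)` returns the average of the four pixels. In a CUDA version the same arithmetic would typically run per thread over a flat device array.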
We use the interpolated data pixel to match against each query pixel in the case of a rotated query image. We are looking for similarity and not necessarily a perfect match. Similarity is measured by the RMSD score: we check whether the RMSD (root mean square of the differences) between two images is less than a given threshold, TH1. One can read in detail about RMSD. RMSD is the square root of the mean of the squared differences between corresponding pixel values over every channel (R, G, B). If the image size is m*n*3, where 3 represents the RGB channels, then the RMSD between the query image Q and the corresponding data pixels D is given by:

$$\mathrm{RMSD}(Q, D) = \sqrt{\frac{1}{3mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \sum_{k=0}^{2} \left(Q[i,j,k] - D[i,j,k]\right)^{2}}$$

The brute-force method may be too time-consuming. One can always filter out candidate positions on the basis of some basic condition. There are several filtering techniques. We will use a very simple one. Convert each image to grayscale by taking the grey value V to be (R+G+B)/3. For filtering, we compare against the upright bounding rectangle of the query image (shown as a red square in the figure). In the case of a rotated query image, we use its axis-aligned bounding box in the data image to filter. Only if the average of all grey values of the bounding-box image is within TH2 of that of the query image (rotated and interpolated) do we check whether the RMSD is within TH1.

An image summary can be computed to represent areas of an image. If two images do not have a similar summary, they may be considered different enough that the detailed RMSD computation is not necessary. In our case, the image summary is the average of all pixel values, which means it is a single integer.

Filtering method: In the figure, each grid location stands for a pixel. The green rectangle represents the query image rotated by 40°. Its bottom-left corner (coordinate 0,0) is aligned with some pixel of the data image (represented by the upright grid). The filled green circle of the query image is compared with the value interpolated from the four dark-red data pixels.
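The filter-then-compare logic described above can be sketched as below. This is a hedged host-side illustration in plain C++ rather than a CUDA kernel: the function names `rmsd`, `greyMean` and `isMatch` are hypothetical, images are assumed to be flat R G B arrays, and treating "within TH2" as an inclusive bound while the RMSD test is a strict "less than" is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// RMSD over all channel values of two equally sized images (flat R G B arrays).
inline double rmsd(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t t = 0; t < a.size(); ++t) {
        const double d = static_cast<double>(a[t]) - static_cast<double>(b[t]);
        sum += d * d;
    }
    return std::sqrt(sum / static_cast<double>(a.size()));
}

// Image summary: mean over all pixels of the grey value V = (R + G + B) / 3.
inline double greyMean(const std::vector<float>& rgb) {
    assert(!rgb.empty() && rgb.size() % 3 == 0);
    double sum = 0.0;
    for (std::size_t p = 0; p < rgb.size(); p += 3)
        sum += (static_cast<double>(rgb[p]) + rgb[p + 1] + rgb[p + 2]) / 3.0;
    return sum / static_cast<double>(rgb.size() / 3);
}

// Two-stage test for one candidate position.
//   boxPixels: data pixels inside the axis-aligned bounding box (used only for filtering)
//   sampled:   data values interpolated at each query pixel location
//   query:     the query image pixels
inline bool isMatch(const std::vector<float>& boxPixels,
                    const std::vector<float>& sampled,
                    const std::vector<float>& query,
                    double th1, double th2) {
    // Cheap filter first: skip the RMSD when the grey summaries differ by more than TH2.
    if (std::fabs(greyMean(boxPixels) - greyMean(query)) > th2) return false;
    // Detailed check only for candidates that survive the filter.
    return rmsd(sampled, query) < th1;
}
```

For an upright (0°) query, `boxPixels` and `sampled` cover the same data pixels. The point of the ordering is that the grey-value summary is far cheaper to compare than the full per-pixel RMSD.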
The average of the data grey-scale values in the bounding box marked in dark red is used for filtering. Any data pixel on the boundary of the bounding box is included in the average. Note: in the example above the angle is 40 degrees, but for our testing we will use only three angles: -45, 0 and 45 degrees.

Example: [Figures: data image (L) and query image (Q)]

For the above images, for n=1, the output triplet would be (290,330,-45), given that the data image size is 600*600 and the query image size is 80*80. One may notice that rotated pixels may not fall on integer coordinates, so bilinear interpolation is used to get approximate values.

Evaluation Scheme: We will check the triplet outputs for each given data image, query image, threshold 1, threshold 2 and ‘n’. We will check whether the RMSD is within the threshold and whether the match is in the top n. You will get full marks if you give the topmost match (n=1). If you give the top ‘n’, there is a 15% bonus.

Deliverables
● A zip archive with the filename _.zip. On unzipping, it should produce a directory named after your _ (all in caps).
● The directory should contain the makefile, a bash file, and any other files needed to run your code. Do not deviate from this format.
● First, we will run the makefile. It should produce an executable. You are required to write a run.sh bash file which takes the executable and arguments and writes the top ‘n’ triplets to output.txt.
● We will run the bash file as follows: ./run.sh
● We will be using CUDA 11 on our HPC testing system. Please ensure your code runs on our testing system.
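Selecting the top ‘n’ triplets from the candidates that passed both thresholds might look like the following sketch. The names `Match` and `topMatches` are hypothetical, and the output layout (one space-separated "X Y degree" triplet per line, best match first) is an assumption, since the handout does not spell out the exact format of output.txt.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one candidate position and rotation.
struct Match {
    int x, y, degree;  // row, column (from the bottom left) and rotation in degrees
    double rmsd;       // RMSD score of this candidate
};

// Keep candidates with RMSD below th1, order them by ascending RMSD,
// and format the best n as text lines.
inline std::string topMatches(std::vector<Match> found, double th1, std::size_t n) {
    assert(th1 >= 0.0);
    found.erase(std::remove_if(found.begin(), found.end(),
                               [th1](const Match& m) { return !(m.rmsd < th1); }),
                found.end());
    std::stable_sort(found.begin(), found.end(),
                     [](const Match& a, const Match& b) { return a.rmsd < b.rmsd; });
    if (found.size() > n) found.resize(n);
    std::ostringstream out;
    for (const Match& m : found)
        out << m.x << ' ' << m.y << ' ' << m.degree << '\n';
    return out.str();
}
```

The returned string could then be written to output.txt by the executable that run.sh invokes. `std::stable_sort` keeps candidates with equal RMSD in their discovery order, which makes the output deterministic.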