CSC209 A4

In this assignment, you will build a small webserver that allows users to run the image filters you created in Assignment 3 on custom images using only their web browser.

Learning Objectives

At the end of this assignment, students will be able to:

- write a program that communicates with other processes across a network using sockets
- parse parts of HTTP requests
- write a program that creates new processes
- read, write, and manipulate plaintext and binary data
- read and add to a medium-sized C program

Introduction

A web server is a program that uses sockets to listen for HTTP requests and constructs an appropriate HTTP response for each one. The response often contains an HTML document that the browser renders on the screen, although you will also see other types of responses in this assignment.

Take a look at main.html in the starter code to see an example of data that the server will send back to the client when the client sends a request. You can see how your web browser will render this data by using the Open File option in your web browser (e.g., Chrome or Firefox) to open main.html.

Note that when you open the file like this, the dropdown menu beside “Select an image” is empty. That’s because the server fills in this image list based on the current contents of the images directory using a bit of JavaScript. For this assignment, we are simplifying (and hard-coding) some aspects of a web server and the HTTP protocol. In particular, the server only considers the first line of the HTTP request, which contains the type of request (GET or POST) and the resource requested.

NOTE 1: We strongly recommend that you only use Chrome or Firefox to test your work on this assignment. We tested the starter code on these two browsers.

NOTE 2: It is important that you kill the image_server program (especially on the lab machines) when you are not working on it. Although we have tried to make the code safe, we cannot guarantee that there are no security holes in this application.

Setup and starter code

Do a git pull in your repository; you will see the starter code under the new a4/ directory. Like previous assignments, there’s quite a lot of provided code to read through and understand. Here is an overview of the starter code to help you get started.

Source code:

- image_server.c is the main program. It contains the central handle_client function, and a main to actually run the program.
- request.h, request.c: functions to handle the parsing of requests. request.h also defines the key structs used to store data for these requests.
- response.h, response.c: functions to handle how the server should respond to requests (i.e., what data the server should write back to the client).
- socket.h and socket.c contain the basic server socket code from lecture. You shouldn’t need to change these files.
- Makefile: a sample makefile. Running make will create and populate the images/ and filters/ directories required by the server, in addition to compiling the image_server executable.

Other files:

- copy: this is an executable in case you did not complete Assignment 3 successfully. You’ll be able to use this executable to test your work on the Teaching Lab servers, but it likely won’t run on your machine. Of course, you can ignore this file completely and simply use your own version from Assignment 3.
- dog.bmp: a sample bitmap file.
- main.html: an HTML file that acts as the “welcome” page for the server. Users can use this page to submit other types of requests to the server.

NOTE: Be careful which files you commit and push to your repo. In particular, you should not commit any executable files, image files, or .o files to your repo.

Lab 10 and port number setup

We strongly recommend working on this assignment after completing Lab 10 (http://www.teach.cs.toronto.edu/~csc209h/fall/labs/lab10-sockets.html). You can use any code you wrote on Lab 10 for this assignment. The first thing you should do is change the PORT number in the Makefile in the same way as Lab 10; this will help you avoid port conflicts with other students on the Teaching Lab machines.

Part 1: Parsing the first line of an HTTP request

When you type a URL into your browser, it will send a request to the server. If you are running the web browser on the same machine as the server, the URL to request that the main.html page be displayed is http://localhost:<port>/main.html, where you replace <port> with the port number that your server is listening on. If you want to run the server on wolf.teach.cs.toronto.edu and the web browser on your own laptop or desktop, you can use the URL http://wolf.teach.cs.toronto.edu:<port>/main.html.

This causes your browser to send a request to the server that looks something like the following. (Note that all lines use the CRLF line endings, "\r\n".)

    GET /main.html HTTP/1.1
    Host: localhost:3000
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:56.0) Gecko/20100101 Firefox/56.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1
    Cache-Control: max-age=0

For most of this assignment, we only care about the first line of the HTTP request and will ignore all of the other data in the HTTP request.

Understanding the start line of an HTTP request

The image server we are building can respond to different kinds of requests: an initial one to load the main HTML page, a request to upload a bitmap file, and a request to run a filter. This data is encoded in the first line (called the start line) of an HTTP request, in the format

    <method> <target> HTTP/1.1

where:

- <method> is either GET or POST.
- <target> consists of a forward slash ( / ) and a resource name (e.g., main.html). If the method is GET, the resource name can optionally be followed immediately by a ? and then zero or more name-value pairs (called query parameters), separated from each other by & . Each name-value pair is written in the form <name>=<value>; names and values are always strings.
- <target> does not contain any spaces. Moreover, the resource name and the query parameter names and values never contain ? , & , or = .

Some examples of HTTP request start lines are:

    GET / HTTP/1.1
    GET /main.html HTTP/1.1
    GET /my-resource HTTP/1.1
    POST /david HTTP/1.1
    GET /my-resource?name1=value1&name2=value2 HTTP/1.1

(Note: this is a simplified version of the full format of this line. Read more about the full format here: https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html.)
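To make the start-line format concrete, here is a minimal sketch of one way to split such a line into its method, resource name, and query parameters. The struct start_line type, the size limits, and the function names are hypothetical and chosen only for illustration; the starter code's ReqData struct and parse_req_start_line have their own definitions, which you should follow instead.

```c
#include <assert.h>
#include <string.h>

#define MAX_PARAMS 8
#define MAX_FIELD 128

/* Hypothetical container; the starter code's ReqData struct may differ. */
struct start_line {
    char method[MAX_FIELD];
    char resource[MAX_FIELD];
    char names[MAX_PARAMS][MAX_FIELD];
    char values[MAX_PARAMS][MAX_FIELD];
    int num_params;
};

/* Copy at most MAX_FIELD - 1 bytes of [src, src + len) and null-terminate. */
static void copy_field(char *dst, const char *src, size_t len) {
    if (len >= MAX_FIELD) len = MAX_FIELD - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse "<method> <target> HTTP/1.1" from a null-terminated line.
 * Returns 0 on success, -1 if the line is malformed. */
int parse_start_line(const char *line, struct start_line *out) {
    assert(line != NULL && out != NULL);
    out->num_params = 0;

    const char *sp1 = strchr(line, ' ');
    if (sp1 == NULL) return -1;
    const char *target = sp1 + 1;
    const char *sp2 = strchr(target, ' ');
    if (sp2 == NULL) return -1;
    copy_field(out->method, line, (size_t)(sp1 - line));

    /* The resource name ends at '?' if one appears before the second space. */
    const char *q = memchr(target, '?', (size_t)(sp2 - target));
    const char *res_end = (q != NULL) ? q : sp2;
    copy_field(out->resource, target, (size_t)(res_end - target));
    if (q == NULL) return 0;

    /* Walk the name=value pairs, which are separated by '&'. */
    const char *p = q + 1;
    while (p < sp2 && out->num_params < MAX_PARAMS) {
        const char *amp = memchr(p, '&', (size_t)(sp2 - p));
        const char *pair_end = (amp != NULL) ? amp : sp2;
        const char *eq = memchr(p, '=', (size_t)(pair_end - p));
        if (eq == NULL) return -1;
        int i = out->num_params++;
        copy_field(out->names[i], p, (size_t)(eq - p));
        copy_field(out->values[i], eq + 1, (size_t)(pair_end - eq - 1));
        p = pair_end + 1;
    }
    return 0;
}
```

The sketch relies on the guarantees stated above: the target contains no spaces, and names and values never contain ? , & , or = , so a simple left-to-right scan is enough.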

Your first task is to complete the implementation of handle_client up to and including the parsing of the first line of the HTTP request according to the format above, including the implementation of parse_req_start_line and some helper functions in request.c . Note that even though the client buffer is not null-terminated automatically, all values stored in the ReqData struct must be null-terminated strings. We’ve included a function log_request that you can use to check the values of these strings.
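Because the client buffer is not null-terminated, functions such as strstr or strlen cannot safely be applied to it directly. The following is a small sketch, under the assumption that you track the number of valid bytes in the buffer yourself, of locating the first network newline and copying the start line out as a proper C string. The function names are illustrative and are not part of the starter code.

```c
#include <assert.h>
#include <string.h>

/* Find the first "\r\n" in buf[0..len). Returns its index, or -1 if absent.
 * buf is NOT assumed to be null-terminated, so strstr cannot be used. */
int find_network_newline(const char *buf, int len) {
    assert(buf != NULL);
    for (int i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') return i;
    }
    return -1;
}

/* Copy the first line (without its "\r\n") into out as a null-terminated
 * string. Returns the line length, or -1 if there is no complete line yet
 * or out is too small to hold it. */
int copy_first_line(const char *buf, int len, char *out, int out_size) {
    int end = find_network_newline(buf, len);
    if (end < 0 || end >= out_size) return -1;
    memcpy(out, buf, (size_t)end);
    out[end] = '\0';
    return end;
}
```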

To debug your work, you can run your server and then visit the URL localhost:<port>/main.html?name1=value1&name2=value2 in a web browser of your choice (where you should replace <port> with your port number, configured above). Even though you won’t see anything in the web browser, if you go to your running server, you should see the following output from log_request:

    Request parsed: [GET] [/main.html]
      name1 -> value1
      name2 -> value2

Try out different URLs to ensure your parsing is working properly.

Part 2: String responses

Now that you have parsed the request start line, you can begin writing the code that will allow the server to respond to different requests. Your next task is to add to the implementation of handle_client according to its comments so that it does the following:

- When given a GET request with resource name MAIN_HTML (this is a defined constant), it renders the provided main.html page. Any query parameters and the rest of the HTTP request are ignored.
- On any other request, it renders the provided “Not Found” string.

Note that you aren’t responsible for writing any of the actual response text yourself; instead, read the provided starter code carefully to understand the functions we’ve provided, and call them appropriately to complete this part. To check your work, you should try visiting localhost:<port>/main.html and localhost:<port>/randompage.html in your web browser (replacing <port> with your port number). In the first case, you should see a plain-looking webpage with the title “CSC209: Image Filter Server”; in the second, you should see the message “Page not found”.

Note: the provided code illustrates two types of responses: main.html is rendered as an HTML response, while the “Page not found” message is rendered as a plain text response. Your web browser probably displays them quite differently; cool, eh?
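What makes the browser treat the two responses differently is the Content-Type header. The sketch below is purely illustrative: format_response is a hypothetical helper, and the exact status lines and headers written by the provided functions in response.c may differ. It only shows the general shape of an HTTP response, which is a status line, header lines, a blank line, and then the body.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a complete HTTP response into out. Returns the number of bytes
 * written (excluding the terminator), or -1 if out is too small.
 * Hypothetical helper; not part of the starter code. */
int format_response(char *out, size_t out_size, const char *status,
                    const char *content_type, const char *body) {
    assert(out != NULL && status != NULL);
    assert(content_type != NULL && body != NULL);
    int n = snprintf(out, out_size,
                     "HTTP/1.1 %s\r\n"
                     "Content-Type: %s\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"
                     "%s",
                     status, content_type, strlen(body), body);
    if (n < 0 || (size_t)n >= out_size) return -1;
    return n;
}
```

Calling it with "text/html" versus "text/plain" as the content type is the only difference between a response the browser renders as a page and one it shows as raw text.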

Part 3: Running an image filter

The main.html page has two different forms the user may use to interact with the server. The first form allows the user to select an image filter executable and a file name. When the user presses the “Run filter” button, a GET request is sent whose target has the resource name /image-filter and two query parameters: filter, whose value is the name of the filter to apply, and image, whose value is the filename of the bitmap image to process. For example:

    GET /image-filter?filter=copy&image=dog.bmp

Your task here is to implement the image_filter_response function and call it appropriately in handle_client when given such a request. Once you have this working, you should be able to submit a request by pressing the “Run filter” button and download the filtered bitmap image!

Input validation

The “filter” dropdown in the form uses preset options for the different filters from Assignment 3 (except scale, which we avoided because it takes a command-line argument).

To make them all accessible, you can simply copy-and-paste your executables from Assignment 3 into the a4/filters/ directory. (But even if you didn’t complete Assignment 3, you can test your work here by using the provided copy executable.) The “image” dropdown is populated dynamically with the names of the files in the a4/images/ directory, which the provided Makefile populates with dog.bmp. If users always stick to using the form, then the server always receives valid requests to image-filter; the problem lies in the fact that query parameters in a GET request are very easy to change. For example, there’s nothing stopping the user from making the following request (typing it directly into their browser’s address bar):

    GET /image-filter?filter=hahaha&image=nodog.bmp

To get you thinking about this, we’re asking you to make the following checks for the query parameters in your image_filter_response implementation:

1. Both query parameters “filter” and “image” must be present.
2. Neither value can contain a slash character ‘/’.
3. The filter value must refer to an executable file under a4/filters/.
4. The image value must refer to a readable file under a4/images/.

If any of these conditions are violated, you should send a “bad request error” response back to the client.
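The four checks above could be sketched roughly as follows, using the POSIX access call with X_OK for filters and R_OK for images. The helper name is_valid_name and the 512-byte path limit are assumptions made for this example. The regular-file test via stat goes slightly beyond the four listed checks; it is included here because a directory name such as ".." contains no slash yet would otherwise pass an X_OK test.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if `name` is present, contains no '/', and dir/name is a regular
 * file that passes the access(2) check for `mode` (X_OK for filters, R_OK
 * for images). Returns 0 otherwise. Hypothetical helper for illustration. */
int is_valid_name(const char *dir, const char *name, int mode) {
    assert(dir != NULL);
    if (name == NULL || name[0] == '\0') return 0;  /* check 1: present */
    if (strchr(name, '/') != NULL) return 0;         /* check 2: no slash */

    char path[512];
    int n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path)) return 0;

    /* Extra safeguard (an assumption, not one of the four listed checks):
     * reject directories and other non-regular files. */
    struct stat st;
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode)) return 0;

    return access(path, mode) == 0;                  /* checks 3 and 4 */
}
```

In image_filter_response, a call such as is_valid_name("filters", filter_value, X_OK) paired with is_valid_name("images", image_value, R_OK) would cover both parameters; if either returns 0, the server sends the bad request response. The directory strings here are placeholders for however your code refers to a4/filters/ and a4/images/.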

Part 4: Uploading files

The second form on the main.html page enables users to upload their own bitmap files directly to the server. The previous two types of requests could be completed just by looking at their start lines and ignoring the rest of the request data. However, for uploading files the entire request must be read in, because the actual bitmap file data sent by the client is contained in the body of the request, not in its first line. Here is an example of the format of this request if we upload toronto.bmp.

Remember that all line endings use network newlines, \r\n .

    POST /image-upload HTTP/1.1
    Content-Type: multipart/form-data; boundary=---7353

    -----7353
    Content-Disposition: form-data; name="bitmap"; filename="toronto.bmp"
    Content-Type: image/bmp

    <bitmap file data>
    -----7353--

The important components are:

- The Content-Type: multipart/form-data header line ends with the boundary string ("---7353" in the above example), which immediately follows "boundary=". This is used to separate sections of the request.
- The characters "-----7353" (composed of "--" plus the boundary string), which always occur on their own line, mark the start of the form data storing the uploaded bitmap file. We’ll call this the bitmap data section.
- The first line of the bitmap data section contains name-value pairs. The last name-value pair is always of the form filename="<filename>", where <filename> is the filename of the uploaded file.
- The actual bitmap file data begins on the fourth line of the bitmap data section.

Note: because the file data itself may contain bytes corresponding to \r\n , we can’t just search for the next "\r\n" to find the end boundary line. The sequence of characters "\r\n-----7353--\r\n" marks the end of the bitmap data (and in fact the end of the request data).
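Since bitmap data can contain zero bytes as well as \r\n pairs, string functions that stop at '\0' are not suitable for locating the end marker. Here is a minimal sketch of building the end marker from a boundary string and searching for it byte by byte. Both function names are hypothetical and are not part of the starter code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "\r\n--<boundary>--\r\n" into out. Returns its length, or -1 if
 * out is too small. */
int build_end_marker(const char *boundary, char *out, size_t out_size) {
    assert(boundary != NULL && out != NULL);
    int n = snprintf(out, out_size, "\r\n--%s--\r\n", boundary);
    if (n < 0 || (size_t)n >= out_size) return -1;
    return n;
}

/* Search data[0..len) for marker[0..mlen). Unlike strstr, this does not
 * stop at zero bytes. Returns the index of the first match, or -1. */
long find_bytes(const char *data, size_t len, const char *marker, size_t mlen) {
    if (mlen == 0 || mlen > len) return -1;
    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(data + i, marker, mlen) == 0) return (long)i;
    }
    return -1;
}
```

Keep in mind that when data arrives in chunks, the marker may be split across two reads, so a search over a single chunk can miss it unless the tail of the previous chunk is carried over.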

This is quite a bit, and in fact we’ve given you code that does a lot of this parsing already. First, read through image_upload_response in response.c, which is the main function that handles this type of request. It makes use of three main helpers in request.c to parse the request and actually save the file:

- get_boundary and get_bitmap_filename extract the boundary string and the filename of the uploaded bitmap file, respectively.
- save_file_upload is responsible for actually reading in the image data from the request body and saving it into a file on the server. Your task is to implement this function.

More concretely, when save_file_upload is called, the bitmap filename has just been parsed, and the line containing it has been removed from the buffer. The remaining part of the request has the following structure (don’t forget about the network newlines \r\n ):

    Content-Type: image/bmp

    <bitmap file data>
    -----7353--

Your task is to extract just the bitmap data from this request and write it to a file.

Buffering the data

The tricky part here is that the entire bitmap data does not fit into the buffer! This means that you’ll need to use a loop that continually reads new data from the socket into the buffer, and then writes that data to the file. (This is similar to the copy filter from A3, except that you should read the data in chunks rather than one byte at a time.)
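The read-then-write loop described above might look something like this sketch, which copies a known number of bytes from one file descriptor to another in fixed-size chunks. The function names and the 4096-byte chunk size are assumptions for illustration; in save_file_upload you would also need to handle the bytes already sitting in the client buffer before the first read.

```c
#include <assert.h>
#include <unistd.h>

#define CHUNK 4096

/* Write all n bytes, retrying after short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t n) {
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w <= 0) return -1;
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Copy exactly `total` bytes from in_fd to out_fd in chunks.
 * Returns 0 on success, -1 on error or if the input ends early. */
int copy_bytes(int in_fd, int out_fd, long total) {
    assert(total >= 0);
    char buf[CHUNK];
    while (total > 0) {
        size_t want = total < CHUNK ? (size_t)total : CHUNK;
        ssize_t r = read(in_fd, buf, want);
        if (r <= 0) return -1;
        if (write_all(out_fd, buf, (size_t)r) < 0) return -1;
        total -= r;
    }
    return 0;
}
```

Note that read on a socket may return fewer bytes than requested, which is why the loop subtracts the actual count r rather than assuming a full chunk each time.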

The challenge is that you need to determine where the bitmap data stops. You can take either of the following approaches for this assignment:

1. You may assume that the uploaded bitmap file is valid, and so extract the file size from the data to determine the exact number of bytes to read.
2. You may assume that the HTTP request is valid, and so keep reading until the sequence of characters representing the final boundary string ("\r\n-----7353--\r\n" in our above example) is detected.

Submission

Commit all your code to your repository. Running make image_server should produce the single executable image_server. Our tests will be responsible for providing correct image filter executables and setting up the images/ and filters/ directories.

Note: it is generally not good practice to commit the .o or executable files to your repository, as these files should be automatically generated from your source code. If you accidentally committed such files, please remove them from your repo ( git rm ) before your final submission.

Finally, give yourself a pat on the back for completing the last assignment in CSC209! And show your non-CS friend or your parents what you accomplished!