ECE/CS 438: Communication Networks Machine Problem 1

Abstract
This machine problem introduces you to a bare-bones HTTP client that can get
data from any web server. This is the kind of code that is running in your browser.
You will also create an HTTP server that can serve data to other clients, much as
a real server would.
1 Introduction
In this assignment, you will implement a simple HTTP client and server. The client will be
able to GET correctly from standard web servers, and browsers will be able to GET correctly
from your server. The test setup will be two VMs, one server and one client. Each test will
use your client or wget, and your server or thttpd. Your client doesn’t have to support caching
or recursively retrieving embedded objects. HTTP uses TCP – you can use Beej’s client.c and
server.c as a base. Your server must support concurrent connections: if one client is downloading
a 10MB object, another client that comes looking for a 10KB object shouldn’t have to wait for
the first to finish.
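One common way to meet the concurrency requirement is to fork a child process for each accepted connection, so the parent can return to accept() immediately. The following is a minimal sketch under that assumption, not a required design; threads would work as well. The names serve_in_child, reap_children, conn_handler, and demo_handler are hypothetical and do not come from the starter code.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler type: does all the I/O for one client connection. */
typedef void (*conn_handler)(int conn_fd);

/* Handle one accepted connection in a child process.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t serve_in_child(int conn_fd, conn_handler handler)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: serve this client, then exit without returning to the caller. */
        handler(conn_fd);
        close(conn_fd);
        _exit(0);
    }
    /* Parent (or fork failure): the parent no longer needs this descriptor. */
    close(conn_fd);
    return pid;
}

/* Reap any finished children without blocking, so they do not linger as zombies. */
void reap_children(void)
{
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
}

/* Stand-in handler for illustration only: a real one would read the
 * request and send the requested file. */
void demo_handler(int conn_fd)
{
    const char *msg = "HTTP/1.1 200 OK\r\n\r\n";
    ssize_t n = write(conn_fd, msg, strlen(msg));
    (void)n;
}
```

In an accept loop, the parent would call serve_in_child(new_fd, handler) after each accept() and reap_children() periodically. A large download then occupies only its own child process while new clients are accepted.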
2 What is expected in this MP?
2.1 HTTP Client
Your client should run as
./http_client http://hostname[:port]/path/to/file
e.g.
./http_client http://127.0.0.1/index.html
./http_client http://illinois.edu/index.html
./http_client http://12.34.56.78:8888/somefile.txt
./http_client http://localhost:5678/somedir/anotherfile.html
If there is no :port, assume port 80 – the standard HTTP port. You should write
the file that you receive to a file called “output” (no file extension, like txt or html).
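The URL format above can be split into host, port, and path with a few string operations. This is a sketch only; parse_http_url is a hypothetical helper name, and the port is kept as a string on the assumption that it will be passed to getaddrinfo() as the service argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the starter code): splits
 * "http://hostname[:port]/path" into host, port, and path.
 * The port defaults to "80" when the URL has no ":port".
 * Returns 0 on success, -1 if the URL is malformed or a buffer is too small. */
int parse_http_url(const char *url,
                   char *host, size_t host_len,
                   char *port, size_t port_len,
                   char *path, size_t path_len)
{
    const char *prefix = "http://";
    size_t prefix_len = strlen(prefix);
    if (strncmp(url, prefix, prefix_len) != 0)
        return -1;

    const char *authority = url + prefix_len;        /* "hostname[:port]/path" */
    size_t authority_len = strcspn(authority, "/");  /* stops at '/' or at the end */
    const char *colon = memchr(authority, ':', authority_len);

    size_t hlen = colon ? (size_t)(colon - authority) : authority_len;
    if (hlen == 0 || hlen >= host_len)
        return -1;
    memcpy(host, authority, hlen);
    host[hlen] = '\0';

    if (colon) {
        size_t plen = authority_len - hlen - 1;       /* digits after the ':' */
        if (plen == 0 || plen >= port_len)
            return -1;
        memcpy(port, colon + 1, plen);
        port[plen] = '\0';
    } else {
        if (port_len < 3)
            return -1;
        strcpy(port, "80");                           /* standard HTTP port */
    }

    const char *p = authority + authority_len;        /* at '/' or at '\0' */
    if (*p == '\0')
        p = "/";                                      /* no path given: request the root */
    if (strlen(p) >= path_len)
        return -1;
    strcpy(path, p);
    return 0;
}
```

For http://localhost:5678/somedir/anotherfile.html this yields host "localhost", port "5678", and path "/somedir/anotherfile.html".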
Here’s the very simple HTTP GET that wget uses:
GET /test.txt HTTP/1.1
User-Agent: Wget/1.12 (linux-gnu)
Host: localhost:3490
Connection: Keep-Alive
The GET /test.txt instructs the server to return the file called test.txt in the server’s top-level
web directory. User-Agent identifies the type of client. Host is the URL that the client was
originally told to get from – exactly what the user typed. This is useful in case a single server has
multiple domain names resolving to it (maybe www.cs.illinois.edu and www.math.illinois.edu),
and each domain name actually refers to different content. This could be a bare IP address, if
that’s what the user had typed. The 3490 is the port – this server was listening on 3490, so I
called “wget localhost:3490/test.txt”. Finally, Connection: Keep-Alive refers to TCP connection
reuse, which will be discussed in class.
Note that the newlines are technically supposed to be CRLF – so, “\r\n” on a Unix
machine.
Only the first line is essential for a server to know what file to give back, so your HTTP GETs
can be just that first line. HTTP specifies that the end of a request should be marked by a
blank line, so be sure to end the request with two newlines (that is, “\r\n\r\n”). This demarcation
is necessary because TCP presents you with a stream of bytes, rather than packets.
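As an illustration of the request format, the sketch below builds a minimal GET with CRLF line endings and the terminating blank line. build_get_request is a hypothetical helper name. The Host header is included because wget sends it, although the handout says the first line alone is enough.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a minimal GET request into buf.
 * Returns the request length in bytes, or -1 if buf is too small. */
int build_get_request(char *buf, size_t buf_len,
                      const char *path, const char *host)
{
    int n = snprintf(buf, buf_len,
                     "GET %s HTTP/1.1\r\n"   /* request line */
                     "Host: %s\r\n"          /* optional for this assignment */
                     "\r\n",                 /* blank line ends the request */
                     path, host);
    if (n < 0 || (size_t)n >= buf_len)
        return -1;
    return n;
}
```

The returned length is the byte count to hand to send(). Because send() may transmit fewer bytes than requested, a real client would loop until the whole request has been sent.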
2.2 HTTP Server
Now for the HTTP response. Here’s what Google returns for a simple GET of /index.html:
Your server’s headers will be much simpler (but still correct and complete): only include the
response code. When correctly returning the requested document, use HTTP/1.1 200 OK, like
this example. When the client requests a non-existent file, return HTTP/1.1 404 Not Found.
Note that you can still have document text on a 404 – allowing for nicely formatted / more
informative “whoops, file not found!” messages. For any other errors, you may simply return
400 Bad Request. An important note: see how there’s a blank line between the header and
document text in the Google response? That’s a well defined part of the protocol, marking the
end of the header. Your server must include this blank line. Again, HTTP newlines are CRLF.
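A sketch of the three response headers described here, each already terminated by the CRLF pair that produces the required blank line. status_header is a hypothetical name, and treating every code other than 200 and 404 as 400 is a simplification that follows the "any other errors" rule above.

```c
/* Hypothetical helper: returns the complete header for a response code.
 * Each string ends with "\r\n\r\n": one CRLF closes the status line,
 * the second is the blank line that marks the end of the header. */
const char *status_header(int code)
{
    switch (code) {
    case 200: return "HTTP/1.1 200 OK\r\n\r\n";
    case 404: return "HTTP/1.1 404 Not Found\r\n\r\n";
    default:  return "HTTP/1.1 400 Bad Request\r\n\r\n";
    }
}
```

The server would send this string first and then the file contents (or an optional error page for a 404) on the same connection.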
Your server should take the port to run on as a command line argument, and should treat all
filepaths it’s asked for as being relative to its current working directory. (That is, drop the
leading slash and pass the rest of the requested path to fopen: if the client asks for
GET /somedir/somefile.txt, the correct argument to fopen is somedir/somefile.txt.) Your server
executable should be called
http_server, e.g.:
sudo ./http_server 80
./http_server 8888
(The sudo is there because using any port <1024 requires root access.)
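The mapping from a request line to an fopen argument might look like the sketch below. request_to_relpath is a hypothetical helper, and the 1023-byte limit on the request target is an arbitrary assumption. Anything that is not a well-formed GET returns -1, which the server could answer with 400 Bad Request.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the path from "GET /some/path HTTP/1.1"
 * and strips the leading '/', so the result is relative to the server's
 * current working directory.
 * Returns 0 on success, -1 if the request line is not a usable GET. */
int request_to_relpath(const char *request, char *rel_path, size_t len)
{
    char method[8], target[1024];

    /* Field widths keep sscanf from overrunning the local buffers. */
    if (sscanf(request, "%7s %1023s", method, target) != 2)
        return -1;
    if (strcmp(method, "GET") != 0 || target[0] != '/')
        return -1;

    const char *rel = target + 1;                 /* drop the leading '/' */
    if (*rel == '\0' || strlen(rel) >= len)
        return -1;
    strcpy(rel_path, rel);
    return 0;
}
```

The handout does not say how to treat a request for "/" alone, so this sketch rejects it. It also does not guard against ".." components in the path.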
3 VM Setup
You’ll need 2 VMs to test your client and server together. Unfortunately, VirtualBox’s default
setup does not allow its VMs to talk to the host or each other. There is a simple fix, but
then that prevents them from talking to the internet. So, be sure you have done all of your
apt-get installs before doing the following! (To be sure, just run: sudo apt-get install gcc
make gdb valgrind iperf tcpdump ) Make sure the VMs are fully shut down. Go to each of
their Settings menus, and go to the Network section. Switch the Adapter Type from NAT to
“host-only”, and click OK (alternatively, you can add a second, host-only network adapter). When you
start them, you should be able to ssh to them from the host, and each VM should be able to ping the
other VM. You can use ifconfig to find out the VMs’ IP addresses. If they both get the same
address, sudo ifconfig eth0 newipaddr will change it. (If you make the 2nd VM by cloning
the first + choosing reinitialize MAC address, that should give different addresses.)
4 Autograder and Submission
As in MP0, check out your mp1 directory from the class repository:
git pull
git fetch release
git merge release/main -m "Merging release repository"
The contents of the checked-out mp1 folder will be very similar to those of mp0 when you first
checked it out. Use those programs as a starting point and make the modifications required for this
assignment.
Modify the Makefile such that a simple make command in your mp1 folder creates the required
executables. The autograder does just that. Be careful about the executable filenames and
output filenames.
A new script (inc_and_push.sh) has been added for your convenience. It increments the version,
commits the modified tracked files in git, and then pushes everything.
You have also been provided with a text file, partners.txt. It should have the following format:
NetID1
NetID2
where the NetIDs belong to you and your partner. One student’s submission suffices for both
teammates to get their grade. If both partners submit, their grade will be the maximum of the
two grades.
Follow the git instructions from MP0 to submit your code. Once the autograder is enabled, you
will be able to run the ./see_results.sh script to get the results.
Tests generally take 1-4 minutes, and there may be a queue of students. You can see where you
are in the queue at
http://cs438fa22.csl.illinois.edu:8080/queue/queue_mp1.html. This address is reachable only
from within the UIUC network. If your device is not on the campus network, please connect through
the Illinois VPN first.
Caution: During the hours leading up to the submission deadline the queues could be multiple
hours long. So it is advisable to get your work done early.
PLEASE do not fall into the trap of “debugging on the autograder”. If you submit a new version
every time you make some change that might help pass an extra test, you are going to waste a
lot of time waiting for results. Rather, only submit when you have made major progress or have
definitively figured out what you were previously doing wrong. If you aren’t genuinely surprised
that your most recent submission didn’t increase your score, you are submitting too often.
Your grade is the highest score that the auto-grader ever gives you.
5 Grade Breakdown
25%: you submitted your assignment correctly and it compiles correctly on the autograder (You
must have at least successfully committed your files into git to benefit from this.)
25%: wget can retrieve files from your HTTP server
25%: your client can retrieve files from your HTTP server
25%: your server does concurrency correctly: 1 very long download does not block many smaller
downloads from starting immediately.
(We will use diff to compare the server’s copy with the downloaded copy, and you should do
the same. If diff produces any output, you aren’t transferring the file correctly.)
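For the diff to come out clean, the client's output file must contain exactly the bytes that follow the blank line and nothing from the header. A sketch of locating that boundary in a receive buffer follows. find_body_offset is a hypothetical helper, and it uses memcmp rather than strstr because the buffer may contain zero bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scans the first len bytes of buf for the
 * "\r\n\r\n" that ends the HTTP header.
 * Returns the offset of the first body byte, or -1 if the end of the
 * header has not been received yet. */
long find_body_offset(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}
```

A return of -1 means the client should recv() more data and search again, since TCP may deliver the header in several pieces. Once the offset is known, the bytes from that offset onward and all later recv() data are written to the output file.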
6 Notes
• You must use C or C++.
  A common mistake: libraries must go at the end of the compile command.
• Your program must have a Makefile; running “make” should build all executables.
• Do not put compiled binary files (*.o, the final executable) into git: 5% penalty.
• Do not use a public github repo. You will be held partially responsible for any resultant
plagiarism.
• Your code must be your own. You can discuss very general concepts with others, but if
you find yourself looking at a screenful/whiteboard of pseudocode (let alone real code),
you’re going too far.
• Refer to the class slides and official student handbook for academic integrity policy. In
summary, the standard for guilt is “more probable than not probable”, and penalties range
from warnings to recommending suspension/expulsion, based entirely on the instructor’s
impression of the situation.
• The College of Engineering has some guidelines for penalties that we think are reasonable,
but we reserve the right to ignore them when appropriate.
• You can use libraries from wherever for data structures. You MUST acknowledge the
source in a README. Algorithms (e.g. Dijkstra’s) should be your own.
• Your code must run on the test setup, which is just some Ubuntu 16.04 LTS Server VMs,
running on VirtualBox.
• We will not look at your program on your laptop or EWS.
• Input files on the grader are READ-ONLY. Do not use the “rb+” mode to read them; the
“+” asks for write permission. (In general, you shouldn’t use “rb+” unless you need it,
which should be rare.)
• Input files on the grader are general binary data, NOT text.
• If you run the see_results.sh file on mp1 before the autograder has been activated, your
directory will move to the _grades branch. You will have to manually execute git checkout
main to get back to your working branch. DO NOT work on the _grades branch!
• All of your source files will be checked for plagiarism, so do not use any code other than
what has been provided in _release.
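The notes on read-only, binary input files suggest opening with "rb" and moving data with fread and fwrite, which carry explicit lengths, rather than with string functions that stop at a zero byte. A minimal sketch under that assumption follows; copy_bytes is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: copies every remaining byte from in to out.
 * The byte counts returned by fread are used throughout, so embedded
 * zero bytes and newlines are copied unchanged.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_bytes(FILE *in, FILE *out)
{
    char buf[4096];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}
```

A server would open the requested file with fopen(path, "rb"), which asks for read access only. The same read loop applies when the destination is a socket, with send() in place of fwrite.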