Programming Assignment 3: UNIX Threads

$35.00

Category: Tags: , , , , You will Instantly receive a download link for .zip solution file upon Payment || To Order Original Work Click Custom Order?

Description

5/5 - (2 votes)

1 Introduction
For many computing tasks it is sufficient that one instruction be executed immediately after
another, sequentially, until the program completes. However, in the case that many tasks
need to be executed at once, or that a single task can be appropriately split into sub-tasks,
a program’s performance can be greatly improved by working on tasks concurrently or in
parallel through the use of threads. Sometimes it works just as well to use multiple processes,
or multitasking, instead of multithreading, but there are many benefits that result from
multithreading.
2 Description
You are given a starter code directory. If you build the code using make, 2 executable files are
created: dataserver and client. Here, the client runs both executables as parent-child
processes using fork(). The client process then communicates with the server process using
“named pipe” IPC mechanism encapsulated in class RequestChannel, which you can use
without modification. Now, the client sends a few requests through RequestChannel to the
server, which responds with some data. There are 3 persons in this system: John Smith, Jane
Smith and John Doe. Every request has the format “data ” (e.g., “data John
Smith”). The server responds to each such request with a number in range [0, 99] generated
randomly. This emulates collecting data from a real system that keeps patient vitals or some
other important piece of information, and the client is a small device that shows those data
after collecting them from the main data base.
You can make the project and run ./client (with no arguments) to observe execution
and behavior of the system. The client process calls fork() and runs the dataserver program.
Then, it populates a SafeBuffer object with 300 requests, 100 each for three hypothetical
“users”: John Smith, Jane Smith, and Joe Smith. This is done using one single-threaded
process. The command-line argument, -n, (the getopt loop is provided for you in this
assignment) allows you to specify the number of requests per person. For example, typing
./client -n 10000 would cause 30000 requests to be generated (10000 per person). Since
the program is single threaded, you should observe that the program will run much slower
with 100 times more work to do.
1
CSCE 313
In large systems, it is quite possible that the number of requests could grow exponentially.
Websites like Amazon receive around 80 million requests daily. Other companies such as
Google and Youtube, receive requests numbering in the hundreds of millions. No single
process will ever be able to this volume of traffic. So, how do the worlds largest businesses
handle it in stride? Instead of making one process do all of the work, multiple processes
handle data requests concurrently. In doing so, it is possible to handle large quantities of data
since one can continue to add more cores/process threads to the system as load increases.
3 Assignment
1) In this assignment, you are to increase the efficiency of the client-server program given to
you by using multithreading. The command line option, -w, (look at the getopt loop in the
starter code) takes a number for an argument, which defines how many worker threads your
program will utilize. Your program must spawn that many worker threads successfully (see
requirements below), and these threads must work together to handle each of the 3n requests.
2) Multithreading often requires the use of specialized “thread-safe” data structures.
In order to facilitate multithreading, you will modify a few classes (e.g., SafeBuffer and
Histogram). SafeBuffer can simply wrap around an STL data structure (we recommend
std::queue), but must have at least a constructor, a destructor, and the operations push()
and pop(). The data going through the SafeBuffer must be in FIFO order (see requirements
below). Both functions must use locks to ensure that the buffer can be modified concurrently
at either end (concurrent modification at both ends requires semaphores, which we will tackle
in the next assignment). The Histogram object may be updated simulatenously as well.
Therefore, you must make that class thread-safe as well.
In addition to processing the requests using multithreading, and coding a specialized
data structure for that purpose, the request buffer must also be initially populated using
multithreading. However, writing the code for request threads will be much easier than for
the worker threads: you will only ever have 3 request threads, one for each user (John, Jane,
and Joe from earlier; note we said 3 request threads, not 3 request thread functions), and
each one simply pushes n requests (corresponding to its respective user) to request_buffer.
Furthermore, the request threads do not have to run concurrently with the worker threads
(and in fact must not), which means that the only lines of request-thread-related code in main
(apart from setting up the thread parameters) will be three pthread_create calls followed
by thread pthread_join function calls.
To complement and clarify what is said above, your program should have the following
characteristics and satisfy the following requirements:
• The program should function correctly for any number of threads. This does not mean
that you have to handle all the errors resulting from large numbers of threads, but only
that high numbers of threads do not in and of themselves affect program correctness.
Practically speaking this means that your synchronization is provably correct, but
not necessarily that your error-handling is robust. You will not have to test your
program with more threads than the minimum of (f d_max − 2)/2 (f d_max is the
maximum number of file descriptors that a single process can have open at the same
time) and however many threads causes pthread_create to fail. The value of f d_max
2
CSCE 313
on your system can be checked (on Unix systems) with “ulimit -n” at the command
line, while the number of threads that causes pthread_create to fail may need to be
determined experimentally. Note that both values differ greatly between operating
systems (sometimes even between different users on the same operating system…).
• No fifo special files (the named pipes used by the RequestChannel class to communicate
between the client and server) should remain after running the client program with any
number of threads.
• The client program should not take an unreasonably long time to execute. You can use
the sample code as a performance baseline.
• All requests for the program must be handled correctly, which includes being processed
in FIFO order. The histogram should report the correct number of requests and should
NOT drop any requests.
• Your code may not make use of any global variables, as they (in most circumstances)
exemplify poor programming practice. The easiest way to avoid this is by using storage
structs (more on those later).
4 Deliverables
• Code:
– You are to turn in one file ZIP file which contains the files provided in the sample
code, modified to fulfill the requirements of the assignment. Along with this, turn
in the report (described below) and any other code that your program requires to
run.
– If your program requires specific steps to compile and run, please provide a
README file that describes these steps.
• Report:
– Once you’ve finished all programming tasks, author a report that answers the
following questions about your code:
1. Describe what your code does and how it differs from the code that was
initially given to you.
2. Make a graph that shows how your client program’s running time for n =
10000 varies with the value for w. Include at least 20 data points (more or
less evenly spaced) starting at 5 and going to the highest number that will
run (without reporting some kind of error) on your OS. After making the
graph, describe the behavior of the client program as w increases. Is there a
point at which the overhead of managing threads in the kernel outweighs the
benefits of multithreading? Also compare (quantitatively and qualitatively)
your client program’s performance to the code you were originally given.
3
CSCE 313
∗ Note: Timing must be done internally by your program, rather than by the
shell it’s running in. A couple of good options for this are clock_gettime()
and gettimeofday() (see corresponding manpages: man 3 clock_gettime
and man 2 gettimeofday).
∗ Note: You may find that the provided test.sh script is a good starting
point for gathering data. Feel free to modify it as you find necessary.
3. Describe the platform that you gathered your timing data on (I.e. CSCE
Linux server, Raspberry Pi, personal computer, etc…). For any device other
than the departmental servers, briefly describe the operating system that it
runs (and if you used one of the departmental servers, just say which one you
used). Also answer the following sub-questions:
∗ What is the maximum number of threads that the host machine will allow
your client program to create before reporting an error? What error is
reported?
∗ What does the operating system do when your client program tries to
create more threads than allowed?
∗ How does your client program behave in response?
Notes Concerning Global Variables and Storage Structs
Global variables are generally regarded as poor coding practice but we realize that some
students haven’t yet been taught how to avoid using them in a multithreaded program, where
they seem to be a very easy and attractive option. Here, then, is the way we have found
easiest and most effective to avoid using global variables: storage structs. (we use “struct”
here for our explanation but note that, in C++, “class” works just as well)
A storage struct is an old-fashioned C struct, which serves mainly to store and organize
different variables. An instance of such a struct can be passed to a thread function as the last
argument given to pthread_create. The thread function then accesses its arg parameter,
a void pointer, and casts it back to the type of the storage struct that is being used, and
is then able access all the struct’s fields and methods. If a programmer cleverly defines the
struct’s fields based on the needs of his program, he has a very clean and efficient way to
pass parameters to a thread function.
Instances of storage structs can be declared and initialized in main, and any memory
allocated for them is valid so long as it remains in scope. This means that pointers to storage
structs that have been allocated on the stack in main, and to their data fields, can be passed
to thread functions and will be valid until they go out of scope. If heap memory is used, it
will be valid until explicitly deleted. Pointers and storage structs can thus be used to solve
the scoping problems previously solved by global variables, making the latter unnecessary.
You may find it helpful to go over the example code in the man page for pthread_create,
since it includes an example of storage struct usage (look for the thread_info struct.
Notes Concerning the Graph for your Report
Figure #2: Sample Timing Data
4
CSCE 313
0 200 400 600 800 1,000 1,200 1,400 1,600 1,800 2,000
0
0.5
1
1.5
2
·107
Number of Worker Threads (w)
Execution Time (in
µ-secs)
This graph was created using data gathered on a MacBook Pro 2011 running OS X Sierra,
with n held constant at 10000. You will not be expected to gather this many data points
(there are more than 60 in this graph), or run anywhere near as many as 1900 threads,
for your report: gathering this much data required increasing the number of available file
descriptors as root, as well as more error-handling code that you will be expected to write.
It does, however, demonstrate the expected behavior as w increases, and is intended to
encourage you to test your program with larger numbers of threads than you might initially
be inclined to. After all, the report requires you to test right up to the default limitations of
your OS (but no further).
If, in gathering data for your report, you find that the default limitations of your OS (namely,
available file descriptors and thread creation resources) prevent you from building a graph
that demonstrates the full range of expected behavior (i.e. performance increases, then
flattens out, then suddenly becomes horrible), not to worry: you are only required to run as
many threads as your OS will allow by default. But you will not be able to complete Part 3
of the report by doing any less.
5