# CS451 “Introduction to Parallel and Distributed Computing” Homework 2

\$30.00

## Description

1. Shared memory programming (Pthreads & openMP)
In this problem you are asked to write two parallel algorithms (one in Pthreads and the
other in openMP) for solving a dense linear equations of the form A*x=b, where A is an n*n
matrix and b is a vector. You will use Gaussian elimination without pivoting.
The algorithm has two parts:
(a) Gaussian Elimination: the original system of equation is reduced to an upper triangular form
Ux=y, where U is a matrix of size n*n in which all elements below the diagonal are zeros
which are the modified values of the A matrix, and the diagonal elements have the value 1.
The vector y is the modified version of the b vector when you do the updates on the matrix
and the vector in the Gaussian elimination stage.
(b) Back Substitution: the new system of equations is solved to obtain the values of x.
The Gaussian elimination stage of the algorithm comprises (n-1) steps. In the algorithm, the ith
step eliminates nonzero subdiagonal elements in column i by subtracting the ith row from row j in
the range [i+1,n], in each case scaling the ith row by the factor Aji/ Aii so as to make the element
Aji zero. See the figure.
Hence the algorithm sweeps down the matrix from the top left corner to the bottom right corner,
leaving subdiagonal elements behind it.
Normalization
0

Update
Values
2
The whole point of parallel programming is performance, so you will be graded partially on
the efficiency of your algorithm. Therefore, once you achieve a correct solution, you should
perform some experimentation with your programs to try to optimize its performance. You should
hand in two working and documented programs. You should provide a basic correctness
argument for your solution (arguing that the relevant dependencies and other issues are taken care
of by proper synchronization). In addition, you should describe some of the different versions of
your program that you tried, what performance results you got out of them, and why you think
your current version is reasonably efficient (or what more you could do to make it more
efficient).
Suggestions:
• A sequential C code is provided on Blackboard.
• Gaussian elimination involves O(n3
) operations. The back substitution stage requires only
O(n2
) operations, so you are not expected to parallelize back substitution.
• The algorithm should scale, assuming n is much larger than the number of processors.
• There should be clear comments at the beginning of the code explaining your algorithm.
Note on cheating: This is an individual assignment. Copying problem solutions or code is
cheating. Both the person copying and the person giving the answers will be equally
penalized. Make sure you do your own work. Also, make sure your code can’t be copied by
other students. To do that, you can change every file permission with chmod 700 [file], or