CS69011: Computing Lab 1 Assignment 2

$30.00

Category: You will Instantly receive a download link for .zip solution file upon Payment || To Order Original Work Click Custom Order?

Description

5/5 - (3 votes)

Consider a social network like Facebook, which has users � = {�%, … , �(}, and shared content � = {�%, … , �,}.
The shared content may include images, videos, third party URLs etc., which signify user activity and interest on
the social networks. Additionally, the social network has two more data sources: friendship links (undirected
edges) between users � ⊆ � × �, and sharing links (directed edges) � ⊆ � × �, where each edge 2�3, �45 ∈ �
denotes that user �3 has shared content �4. Together, they form the user-content graph for the social network. For
the purpose of this assignment, you can assume that this graph is a connected graph.
Input File format:
The user-content graph is input as follows:
n (number of users)
:
: m (number of contents)
:
: Task 1: Marks: 10
For the first task, consider the problem of content recommendation, a content �4 is shown to user �3 based on the
distance (length of the shortest path) between �3 and �4. Note that, there are no paths through the content nodes
since all share edges are directed. Consider �% and �7 are not friends (say connected through the social network
with distance 5), but both have shared content �%, and �7 has also shared content �7. Then, the distance
between �% and �7 is not 3 due to path (�% -> �% -> �7 -> �7).
Your task is to write a program which will list the first � recommended contents, �39, … , �3:, in increasing order
of their distances from user �3. The content which is already shared by �3 should not be displayed. Here the user
�3 and � should be taken as input, and name of the file containing user content graph should be taken as command
line input.
Hint: Use a modification of Dijkstra’s single source shortest path algorithm.
Input: filename for the user content graph in the above format, the user �3 for whom the content will be
recommended and � the maximum number of recommended contents. These should be taken as command line
input.
Output: The output for this task is a list of recommended content �39, … , �3: along with the corresponding
distances �39, … , �3: from user �3.
Algorithm description:
Dijkstra’s algorithm maintains, at each stage:
1. Distances to all nodes in the graph, some of them infinity.
2. A set Q of all vertices which can be explored next.
In every step, the minimum distance node in set Q is finalized (it’s distance is frozen as the shortest distance and
the node is removed from Q), and its neighbors are processed or relaxed (added to the set Q and their distance are
updated). You have to modify the above algorithm so that you keep a counter which counts the number of content
nodes which are finalized. Once this number reaches �, you can stop the algorithm and output the � content nodes
that you have discovered along with their distances.
Data structures: Both the distance and the Q set can be maintained as arrays of size the number of vertices. Mindistance node can be found using a linear scan or a min-heap. The min-heap implementation is more efficient and
will get more marks.
You can call this function int topKMinDist(Graph *G, int start, int * nodes, int *
distances), where G is an input graph pointer (can be taken as an adjacency list), and nodes and distances are
arrays for return top k node ids and distances. The return value from the function can be the actual number of
content nodes (which could be less than k).
Task 2: Marks: 10
In the second task, consider the problem of friend recommendation, using the number of common contents in the
�-neighborhood. Specifically, given a user �3, you will rank users �4 using the count of contents �<, such that distances between �3 and �<, �(�3, �<), and �(�4, �<) are both less than �. Let �(�3, �4) be the count of k-neighbors which are content nodes in the user-content graph. Write a program which given a user content graph, a user �3 and �, prints the list of top � users �4 in decreasing order of �(�3, �4). Input: the user content graph as above, a user �3, neighborhood size �, list length �. The filename for the user content graph can be taken as first command line input, followed by other arguments in the order mentioned above. Output: The list of top � users along with their corresponding distances. Algorithm description: In this problem, you have to find all content node whose distance is less than � from the starting user node �3. Let that list be �. Next, find all user nodes �4 which are at distance at most � from each of these content nodes. Since, there are not paths starting with content nodes, you have consider all paths from all users nodes which are directly connected to the content nodes. The algorithm for finding distance are same as described in task 1. You need two functions: int distanceLContentNodes(Graph *G, int start, int * nodes, int * distances) and int connectedUserNodes(Graph *G, int start, int * nodes, int * distances) Task 3: Marks: 10 Now, consider that there are privacy settings for each share link (�3, �4) where each link is marked as: • Friends: only friends can see the post. • Friends of friends: friends of friends can also see the post. • Public: everyone can see the post. The privacy setting induces additional and implicit “view” edges, which connect each shared content �4 to friends of �3 (1-hop neighbors), friends of friends of �3 (both 1-hop and 2-hop neighbors), and public (all users of the system). The third task is to recommend content on the basis of distance in the “view” graph, which consists of the original edge as well as view edges introduced due to privacy settings. However, you are not allowed to generate the full view graph and then use the algorithm developed in task 1. Your algorithm should only “expand” the relevant share edges. For example, if the top list ends with content at distance 3 in the view graph, there is no need to expand shares of content beyond distance 5 in the user-content graph, since distance can reduce by at most 2 in the view graph, unless the share is a public one. Note that, here all content can be recommended to all users. Only post information is private. Algorithm hint: You should treat public shares separately, since they will be at distance 1 from all users. While updating the distances in the shortest path algorithm, you should take care of privacy setting of the edge. Input File format: The modified input format of privacy aware user-content graph is as follows: n (number of users) :
: m (number of contents)
: ,; ,; …



: ,; ,; …
Input: filename for the privacy-aware user-content graph in the above format, the user �3 for whom the content
will be recommended and � the maximum number of recommended content. These should be taken as command
line input.
Output: The output for this task is a list of recommended content �39, … , �3: along with the corresponding
distances �39, … , �3: from user �3.
Algorithm Description:
There are two cases:
1. Content �4 has been posted publicly by some user. Hence, it is at distance 1 from all other users in the
view graph. Hence it should be included in the list.
2. If the number of such content is less than �. Then, note that all distance � content nodes are:
a. Either “friends” shares of distance � − 1 user nodes from �3
b. Or “friends of friends” shares of distance � − 2 user nodes from �3.
Use the above logic to progressively enumerate all distance � content nodes for � = 1, … ,�, where � is the
diameter of the graph. If all content node have been listed, or �-content nodes have been listed (whichever is
earlier), exit.
Submission:
Submit 3 c source files for: task 1, task 2, and task 3, respectively. Write your name and roll number in a comment
at the top of the file along with compilation instructions. The program should read the graph as input from a file
and the name of the file should be first argument. Other inputs should be read from command line arguments as
mentioned in the task. You can assume that maximum number of users and contents to be 1000 each.
Marks will be given for more understandable (logical order of functions, naming of variables, etc.), maintainable
(proper comments) and concise (no code duplication) programs.