DSCI551 Homework 4 (Indexing and Query Execution)


1. [40 points] Consider the following B+-tree for the search key “age”. Suppose the degree d of the
tree is 2, that is, each node (except for the root) must have at least two keys and at most four keys.

Note that sibling nodes are nodes with the same parent.
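
The occupancy rule above can be sketched as a small check. This is only an illustration: the function names are hypothetical, the sample keys are made up, and the minimum of one key for the root is an assumption, since the assignment states only that the root is exempt from the two-key minimum.

```python
def key_bounds(d, is_root=False):
    """Return (min_keys, max_keys) for a node of a B+-tree of degree d.

    Assumption: the root may hold as few as one key; the assignment only
    says the root is exempt from the minimum of d keys.
    """
    min_keys = 1 if is_root else d
    return min_keys, 2 * d


def is_valid_node(keys, d, is_root=False):
    """Check that a node's keys are sorted and within the occupancy bounds."""
    lo, hi = key_bounds(d, is_root)
    return lo <= len(keys) <= hi and list(keys) == sorted(keys)


# Hypothetical nodes for d = 2 (not taken from the assignment's tree).
print(key_bounds(2))                              # (2, 4)
print(is_valid_node([10, 20], 2))                 # True
print(is_valid_node([10], 2))                     # False: underflow
print(is_valid_node([10, 20, 30, 40, 50], 2))     # False: overflow
```

A node that fails this check after a deletion is the case where borrowing from or merging with a sibling comes into play; a node that fails it after an insertion must be split.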
a. [10 points] Describe the process of finding keys for the query condition “age >= 15 and
age <= 50”. How many block I/Os are needed for the process?

b. [15 points] Show the B+-tree after deleting 20, 30, and 43 (in the shown order). Show
the updated tree after EACH deletion.

c. [15 points] Show the tree after inserting 14, 15, and 45 (in the shown order) into the
tree produced in sub-question b. Show the updated tree after EACH insertion.
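
As a rough illustration of how a range query such as the one in sub-question a is usually evaluated, the sketch below descends to the leaf where the lower bound would be, then follows the leaf chain until a key exceeds the upper bound. Everything here is an assumption for illustration: the leaf contents and tree height are invented (they are not the assignment's tree), the separator keys are taken to be the first key of each leaf, and one block I/O is charged per node read.

```python
from bisect import bisect_right


def range_scan(leaves, height, low, high):
    """Return (matching_keys, block_ios) for low <= key <= high.

    leaves: non-empty sorted key lists in left-to-right sibling order.
    height: number of levels in the tree, counting the leaf level.
    Assumption: separators in the index equal the first key of each
    leaf except the leftmost one.
    """
    separators = [leaf[0] for leaf in leaves[1:]]
    start = bisect_right(separators, low)   # leaf reached by the descent
    ios = height                            # one block read per level
    matches = []
    for i in range(start, len(leaves)):
        if i > start:
            ios += 1                        # follow pointer to the next leaf
        for key in leaves[i]:
            if key > high:
                return matches, ios         # passed the upper bound: stop
            if key >= low:
                matches.append(key)
    return matches, ios


# Hypothetical leaf level of a three-level tree.
leaves = [[5, 10], [15, 20, 25], [30, 40], [45, 50, 55], [60, 70]]
print(range_scan(leaves, 3, 15, 50))
# ([15, 20, 25, 30, 40, 45, 50], 5)
```

With these invented leaves the scan reads three blocks on the way down and two further leaves along the chain. The count for the assignment's tree depends on its actual height and leaf contents.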

2. [60 points] Consider natural-joining tables R(a, b) and S(a,c). Suppose we have the following
scenario.
i. R is a clustered relation with 20,000 blocks.
ii. S is a clustered relation with 50,000 blocks.

iii. There are 102 pages available in main memory for the join.
iv. Assume the output of the join is given to the next operator in the query execution plan
(instead of being written to disk), and thus the cost of writing the output is ignored.

Describe the steps for each of the following join algorithms. For sorting- and hashing-based
algorithms, also indicate the size of the output from each step. What is the total number of block
I/Os needed for each algorithm? Which algorithm is the most efficient in terms of block I/Os?

a. [10 points] (Block-based) nested-loop join with R as the outer relation.
b. [10 points] (Block-based) nested-loop join with S as the outer relation.

c. [20 points] Sort-merge join (assume only 100 pages are used for sorting and 101 pages for
merging). Note that if the join cannot be done in a single merging pass, runs from
one or both relations need to be merged further in order to reduce the number of runs.
If both relations have too many runs, first select the relation with the larger number of runs
for further merging.

d. [20 points] Partitioned-hash join (assume 101 pages are used in partitioning the relations and no
hash table is used for lookup when joining tuples).
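
The first quantities each algorithm needs can be sketched as follows. This is a sketch under stated assumptions, not a worked solution. For the nested-loop join it assumes M - 2 pages hold a chunk of the outer relation, leaving one page for the inner relation and one for output. For partitioning it assumes one input page, with the remaining pages used as buckets that the hash function fills uniformly. Textbooks and courses differ on these conventions, and all function names are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b."""
    return -(-a // b)


def block_nested_loop_cost(outer_blocks, inner_blocks, memory_pages):
    """Block I/Os for a block-based nested-loop join.

    Assumption: memory_pages - 2 pages buffer the outer relation; the
    inner relation is scanned once per outer chunk.
    """
    chunk = memory_pages - 2
    return outer_blocks + ceil_div(outer_blocks, chunk) * inner_blocks


def initial_sorted_runs(blocks, sort_pages):
    """Number of sorted runs after the first pass of an external sort."""
    return ceil_div(blocks, sort_pages)


def hash_partition_blocks(blocks, partition_pages):
    """Return (bucket_count, blocks_per_bucket) after one partitioning pass.

    Assumptions: one page is the input buffer, and tuples hash uniformly.
    """
    buckets = partition_pages - 1
    return buckets, ceil_div(blocks, buckets)


B_R, B_S, M = 20_000, 50_000, 102
print(block_nested_loop_cost(B_R, B_S, M))   # 10020000 (R as outer)
print(block_nested_loop_cost(B_S, B_R, M))   # 10050000 (S as outer)
print(initial_sorted_runs(B_R, 100))         # 200 runs of R
print(initial_sorted_runs(B_S, 100))         # 500 runs of S
print(hash_partition_blocks(B_R, 101))       # (100, 200)
print(hash_partition_blocks(B_S, 101))       # (100, 500)
```

Under these assumptions R and S together start with 700 sorted runs, more than a 101-page merge can consume at once, which is the situation the note in sub-question c addresses.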

Submission Requirements:
Please read carefully before submitting your work:
Please submit one .pdf file containing answers for both Question 1 and Question 2. The answers can be either
typed or handwritten. Name the submission file firstname_lastname_HW4.pdf.