## Description

Objectives

This assignment has two parts.

In Part 1, you will practice drawing an E/R diagram (Task 1) and transforming it into relational schemas (Task 2).

In Part 2, you will practice important techniques related to database normalization (Tasks 3-5).

Download A4.zip. Answer the questions in A4.ipynb.

Part 1. Entity-Relationship Model (10 points)

You will design a database for SFU. This database will include information about departments, students, courses (and their offerings):

Information about students includes their SID, name and age. The SID of a student is assumed to be unique, not shared by any other student. Each student is either a graduate or an undergraduate.

Each student must be in one category or the other, and cannot be in both categories simultaneously.

For graduate students, we record what their research field is.

For undergraduate students, we record their concentration.

Information about departments includes their name and address. The name of a department is assumed to be unique, not shared by any other department.

We need to be able to associate students with the departments with which they are affiliated. Each student has to be affiliated with exactly one department.

Information about a course includes its number (e.g., “354”), name (e.g., “Introduction to Databases”), and capacity (e.g., 110). We also need to be able to know the unique department that owns each course: no cross-listing of courses across departments is allowed, and every course is owned by exactly one department.

Note: you cannot assume that course number uniquely identifies a course; in fact, you cannot assume even that course number together with course name uniquely identify a course. However, course number uniquely identifies courses within a department.

Finally, we need to record all terms — identified as semester (e.g., “fall”) and year (e.g., “2018”) — in which each course has been offered in the history of the university.

Assume that for a course to be offered during a term, it has at least one student enrolled. Also, a course is offered at most once during each term. In other words, a course cannot have multiple sections during one term.

Assume also that a student can take courses “owned” by departments with which the student is not affiliated, and that every student is enrolled in at least one course.

Task 1: E/R Diagram (5 points)

Render the SFU database in the version of the E/R model that we studied in class, with exactly the constraints and requirements specified above.

Drawing

Task 2: From E/R Diagram to Relational Schemas (5 points).

Following the E/R diagram above, write SQL statements to create the required tables in sfu.db.

%load_ext sql

%sql sqlite:///sfu.db

u'Connected: @sfu.db'

#REPLACE WITH YOUR CODE

Part 2. Normalization (10 points)

Task 3. Decompose a relational schema into BCNF

Consider a relational schema and a set of functional dependencies:

R(A,B,C,D,E) with functional dependencies A→E , BC→A , DE→B

Decompose R(A,B,C,D,E) into BCNF. Show all of your work and explain, at each step, which dependency violations you are correcting. You have to write down a description of your decomposition steps. (2 points)
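As a rough aid for checking a decomposition by hand, the sketch below looks for a BCNF violation in a relation by searching for an attribute set X whose closure (restricted to the relation) is neither X itself nor the whole schema. The toy relation S(P, Q, T) with the single dependency P → Q is a hypothetical example, not part of the assignment, and the helper names `closure` and `bcnf_violation` are illustrative only.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD if its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violation(schema, fds):
    """Return (X, X+) for some X violating BCNF in schema, or None."""
    schema = set(schema)
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            # Only attributes that belong to this relation matter.
            x_plus = closure(combo, fds) & schema
            if x_plus != set(combo) and x_plus != schema:
                return set(combo), x_plus
    return None

# Hypothetical toy input: S(P, Q, T) with P -> Q.
toy_fds = [({"P"}, {"Q"})]
violation = bcnf_violation({"P", "Q", "T"}, toy_fds)
# Splitting on the violation gives S1(P, Q) and S2(P, T); both are in BCNF.
s1_check = bcnf_violation({"P", "Q"}, toy_fds)
s2_check = bcnf_violation({"P", "T"}, toy_fds)
```

Here `violation` reports X = {P} with closure {P, Q}, which suggests splitting S into X+ and X together with the remaining attributes. The search is exponential in the number of attributes, which is acceptable only for small schemas like the ones in this assignment.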

REPLACE WITH YOUR ANSWER

Task 4. Find a set of FDs that is consistent with a closed attribute set

A set of attributes X is called closed (with respect to a given set of functional dependencies) if X+ = X, that is, if the closure of X contains no attributes beyond X itself. Consider a relation with schema R(A,B,C,D) and an unknown set of functional dependencies. For each closed attribute set below, give a set of functional dependencies that is consistent with it.
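To make the definition concrete, here is a minimal sketch that enumerates the closed sets for a given set of functional dependencies. It uses a hypothetical two-attribute relation T(P, Q) with P → Q, chosen so that it does not overlap with the questions below; the function names are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def closed_sets(schema, fds):
    """List every subset X of schema whose closure equals X."""
    attrs = sorted(schema)
    found = []
    for size in range(len(attrs) + 1):
        for combo in combinations(attrs, size):
            if closure(combo, fds) == set(combo):
                found.append(set(combo))
    return found

# Hypothetical toy input: T(P, Q) with P -> Q.
example_fds = [({"P"}, {"Q"})]
result = closed_sets({"P", "Q"}, example_fds)
```

For this toy input, {P} is not closed because its closure also contains Q, while {}, {Q} and {P, Q} are closed. A candidate answer to the questions below can be checked the same way by passing its dependencies and the schema {A, B, C, D}.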

a. All sets of attributes are closed (1 point)

REPLACE WITH YOUR ANSWER

b. The only closed sets are {} and {A,B,C,D} (1 point)

REPLACE WITH YOUR ANSWER

c. The only closed sets are {} , {A,B} , and {A,B,C,D} (1 point)

REPLACE WITH YOUR ANSWER

Task 5. Normalize a database

Suppose Mike is the owner of a small store. He uses the following database (mike.db) to store monthly sales of his store.

Sales(name, discount, month, price)

%load_ext sql

%sql sqlite:///mike.db

The sql extension is already loaded. To reload it, use:

%reload_ext sql

u'Connected: @mike.db'

%sql select * from Sales limit 5

* sqlite:///mike.db

sqlite:///sfu.db

Done.

| name | discount | month | price |
|------|----------|-------|-------|
| bar1 | 0.15 | apr | 19 |
| bar8 | 0.15 | apr | 19 |
| gizmo3 | 0.15 | apr | 19 |
| gizmo7 | 0.15 | apr | 19 |
| mouse1 | 0.15 | apr | 19 |

However, Mike finds that the database is difficult to update (i.e., when inserting new data into the database). Your job is to help Mike normalize his database. You should do the following steps (a-d):

a. Find all nontrivial functional dependencies in the database. This is a reverse engineering task, so expect to proceed in a trial and error fashion. Search first for the simple dependencies, say name→discount, then try the more complex ones, like name,discount→month, as needed. To check each functional dependency you have to write a SQL query.

Your challenge is to write this SQL query for every candidate functional dependency that you check, such that:

- the query’s answer is always short (say, no more than ten lines; remember that 0 results can be instructive as well), and
- you can determine whether the FD holds or not by looking at the query’s answer.

Try to be clever in order not to check too many dependencies, but don’t miss potentially relevant dependencies. For example, if you have A → B and C → D, you do not need to derive AC → BD as well.
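One possible shape for such a query, sketched on a hypothetical table Toy(a, b, c) rather than on Sales, is to group by the left-hand side and keep only the groups that map to more than one right-hand value. An empty answer then means the dependency holds on the current data. The table, its rows, and the helper `fd_counterexamples` are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Toy (a TEXT, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO Toy VALUES (?, ?, ?)",
    [("x", "1", "p"), ("x", "1", "q"), ("y", "2", "p"), ("y", "2", "r")],
)

def fd_counterexamples(conn, table, lhs, rhs):
    """Return up to ten lhs values that map to more than one rhs value.

    Table and column names are pasted into the SQL text, so they must come
    from a trusted source. COUNT(DISTINCT ...) ignores NULLs.
    """
    lhs_cols = ", ".join(lhs)
    query = (
        f"SELECT {lhs_cols} FROM {table} "
        f"GROUP BY {lhs_cols} "
        f"HAVING COUNT(DISTINCT {rhs}) > 1 LIMIT 10"
    )
    return conn.execute(query).fetchall()

holds = fd_counterexamples(conn, "Toy", ["a"], "b")   # a -> b: no counterexamples
fails = fd_counterexamples(conn, "Toy", ["a"], "c")   # a -> c: violated by x and y
```

In the notebook, the same SELECT ... GROUP BY ... HAVING text can be run directly with the `%sql` magic. Note that a query like this can only show that a dependency holds on the rows currently stored, not that it must hold for all future data.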

Write down all FDs that you found. (1 point)

REPLACE WITH YOUR ANSWER

For each FD above, write down the SQL query that discovered it (remember short queries are preferred) (1 point)

# REPLACE WITH YOUR CODE

b. Decompose the Sales table into BCNF. As in Task 3, show a description of your decomposition steps. (1 point)

REPLACE WITH YOUR ANSWER

c. Write SQL statements to create the BCNF tables in mike.db. Create keys and foreign keys where appropriate. (1 point)

# REPLACE WITH YOUR CODE

d. Populate the BCNF tables using the data from the Sales table. (1 point)

Hint: see SQL INSERT INTO SELECT Statement
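As an illustration of the hint, the sketch below splits a hypothetical flat table Flat(emp, dept, floor) into two tables and fills them with INSERT INTO ... SELECT DISTINCT. The table names, columns, and rows are invented for this example and are unrelated to mike.db.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- Hypothetical unnormalized table in which dept determines floor.
CREATE TABLE Flat (emp TEXT, dept TEXT, floor INTEGER);
INSERT INTO Flat VALUES ('ann', 'ops', 2), ('bob', 'ops', 2), ('cy', 'dev', 3);

CREATE TABLE Dept (dept TEXT PRIMARY KEY, floor INTEGER);
CREATE TABLE Emp (
    emp  TEXT PRIMARY KEY,
    dept TEXT REFERENCES Dept(dept)
);

-- Fill the referenced table first so the foreign key is satisfied.
INSERT INTO Dept (dept, floor) SELECT DISTINCT dept, floor FROM Flat;
INSERT INTO Emp (emp, dept) SELECT DISTINCT emp, dept FROM Flat;
""")

dept_rows = conn.execute("SELECT dept, floor FROM Dept ORDER BY dept").fetchall()
emp_count = conn.execute("SELECT COUNT(*) FROM Emp").fetchone()[0]
```

SELECT DISTINCT matters here because the flat table repeats each department once per employee, and inserting the duplicates would violate the primary key on Dept.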

# REPLACE WITH YOUR CODE

Submission

Download A4.zip. Answer the questions in A4.ipynb. Put A4.ipynb, ER-diagram.png, sfu.db, and mike.db (with populated BCNF tables) into A4-submission.zip.

Submit A4-submission.zip to the CourSys activity Assignment 4.