CSC 455: Database Processing for Large-Scale Analytics Assignment 3

$30.00

Category: You will Instantly receive a download link for .zip solution file upon Payment || To Order Original Work Click Custom Order?

Description

5/5 - (4 votes)

Part 1

In this and the next part we will use an extended version of the schema from Assignment 2. You can find it in a file ZooDatabase.sql posted with this assignment on D2L.

Once again, it is up to you to write the SQL queries to answer the following questions:

1. List the animals (animal names) and the ID of the zoo keeper assigned to them.

2. Now repeat the previous query and make sure that the animals without a handler also appear in the answer.

3. Report, for every zoo keeper name, the total number of hours they spend feeding all animals in their care.

4. Report every handling assignment (as a list of assignment date, zoo keeper name and animal name). Sort the result of the query by the assignment date in an ascending order.

5. Find the names of animals that have at least 1 zoo keeper assigned to them.

6. Find the names of animals that have 0 or 1 (i.e., less than 2) zoo keepers assigned to them.

• Optional query:

List all combination of animals where the difference between feeding time requirement is within 0.25 hours (e.g., Grizzly bear, 3, Bengal tiger, 2.75). Hint: this will require a self-join. Avoid listing identical pairs such as (Grizzly bear, 3, Grizzly bear, 3)

Part 2

A. Write a python script that is going to read the queries that you have created in Part-1 from a SQL file, execute each SQL query against SQLite3 database and print the output of that query. You must read your SQL queries from a file, please do not copy SQL directly into python code. The code that would run commands from the ZooDatabase.sql file is provided (runSQL.py), so all you have to do is to change it so that it reads your queries from a SQL file and prints the output of your queries. You can refer to example code from the previous assignment that prints query results using fetchall(). You do not have to format the output in any particular fashion – however, you must print every row individually using a loop.

B. Repeat the work you did in Part-2 of the previous homework using the data file
Public_Chauffeurs_Short_hw3.csv attached in D2L in this assignment dropbox.
It contains roughly the same data, with two changes: NULL may now be represented by NULL or an empty string (,NULL, or ,,) and some of the names have the following form “Last, First” instead of “First Last”, which is problematic because when you split the string on a comma, you end up with too many values to insert.

Part 3

Using the company.sql database (posted in with this assignment), write the following SQL queries.

1. Find the names of all employees who are directly supervised by ‘Franklin T Wong’.

2. For each project, list the project name, project number, and the total hours per week (by all employees) spent on that project.

3. For each department, retrieve the department name and the average salary of all employees working in that department. Order the output by department number in ascending order.

4. Retrieve the average salary of all female employees.

5. For each department whose average salary is greater than $43,000, retrieve the department name and the number of employees in that department.

6. Retrieve the names of employees whose salary is within $22,000 of the salary of the employee who is paid the most in the company (e.g., if the highest salary in the company is $82,000, retrieve the names of all employees that make at least $60,000.).

Be sure that your name and “Assignment 3” appear at the top of your submitted file.