Corso di laurea magistrale in Data Science
Facoltà di Ingegneria dell'Informazione, Informatica e Statistica, Sapienza Università di Roma

Data Management for Data Science - Homework Assignments

2023/2024

Prof. Domenico Lembo
Prof. Riccardo Rosati


Students can present their homework during the lectures. There will be three homework assignments, which will be announced on this web page.


Assignment 1 - SQL

Choose an application domain and, using a relational DBMS, build a database. This can be done in two ways:

Students can use publicly available DBMSs like MySQL or PostgreSQL (see below), or other, commercial DBMSs.

The queries defined by the students should cover all the aspects of SQL queries analyzed during the course lectures and exercises (joins, aggregations, nested queries, queries with negated subqueries). The complexity of the queries should be at least comparable to that of the specification appearing in this exercise on SQL.
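As an illustration of the four query categories named above, here is a minimal sketch. It assumes a hypothetical toy schema (student, course, exam) that does not come from the course material, and it uses SQLite through Python's standard sqlite3 module only so that the example is self-contained; the same SQL should run with little or no change on MySQL or PostgreSQL.

```python
import sqlite3

# Hypothetical toy schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE exam (
    student_id INTEGER REFERENCES student(id),
    course_id  INTEGER REFERENCES course(id),
    grade      INTEGER NOT NULL,
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Ada'), (2, 'Bruno'), (3, 'Carla');
INSERT INTO course  VALUES (10, 'Databases'), (20, 'Statistics');
INSERT INTO exam VALUES (1, 10, 30), (1, 20, 28), (2, 10, 24);
""")

# Join + aggregation: average grade of each student with at least one exam.
avg_grades = conn.execute("""
    SELECT s.name, AVG(e.grade) AS avg_grade
    FROM student s JOIN exam e ON e.student_id = s.id
    GROUP BY s.id, s.name
    ORDER BY s.name
""").fetchall()

# Nested query: students whose best grade exceeds the overall average grade.
above_avg = conn.execute("""
    SELECT s.name
    FROM student s
    WHERE (SELECT MAX(e.grade) FROM exam e WHERE e.student_id = s.id)
          > (SELECT AVG(grade) FROM exam)
    ORDER BY s.name
""").fetchall()

# Negated subquery: students who have not taken any exam.
no_exams = conn.execute("""
    SELECT s.name
    FROM student s
    WHERE NOT EXISTS (SELECT 1 FROM exam e WHERE e.student_id = s.id)
""").fetchall()

print(avg_grades, above_avg, no_exams)
```

A real homework database would of course be larger and the queries more elaborate; the sketch only shows what each category looks like in its simplest form.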


Assignment 2 - SQL evaluation and optimization

Starting from the database developed in the first homework, every group has to identify at least 4 SQL queries that pose performance problems to the DBMS. The students have to show both the "slow" and the "fast" execution of the queries, where the fast version is obtained by:

Ideally, these queries should be picked from the queries created for the first homework; however, new queries can be considered if none of the previous queries poses performance problems to the DBMS.
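To make the "slow" versus "fast" comparison concrete, the following sketch shows one common way of speeding up a query, namely adding an index on the filtered column. This is an assumption chosen for illustration and is not necessarily one of the techniques required by the assignment. The orders table, its columns, and the index name are hypothetical, and SQLite is used only to keep the example self-contained; on MySQL or PostgreSQL the equivalent inspection tool would be EXPLAIN.

```python
import sqlite3
import time

# Hypothetical table filled with synthetic rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    ((i % 5000, float(i % 97)) for i in range(200_000)),
)

query = "SELECT COUNT(*), SUM(amount) FROM orders WHERE customer_id = ?"

def plan(sql, params):
    """Return the textual query plan reported by SQLite."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail

def timed(sql, params, repeat=20):
    """Run the query `repeat` times; return its result and the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeat):
        result = conn.execute(sql, params).fetchall()
    return result, time.perf_counter() - start

# "Slow" execution: without an index the DBMS scans the whole table.
slow_plan = plan(query, (42,))
slow_result, slow_time = timed(query, (42,))

# "Fast" execution: same query after adding an index on the filter column.
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
fast_plan = plan(query, (42,))
fast_result, fast_time = timed(query, (42,))

print("slow:", slow_plan, f"{slow_time:.4f}s")
print("fast:", fast_plan, f"{fast_time:.4f}s")
```

Showing the query plan next to the measured time is useful because it documents why the second execution is faster, and the unchanged result confirms that the optimization did not alter the query's meaning.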


Rules

GROUPS: The homework must be done by groups of two students.

GROUP REGISTRATION: Every group must send an email to both prof. Lembo and prof. Rosati no later than March 27, 2024 (strict deadline), with subject: "DMDS homework group" and containing:

The teachers will create and maintain a list of the officially registered projects. The list will be accessible on Classroom, so students can check there whether their registration has been successful.

PRESENTATION OF 1ST AND 2ND HOMEWORK: The presentation will consist of a short (15-minute) session in which the students show their work by interacting directly with the relational DBMS on their own laptop.

PRESENTATION DATES: The first homework, together with the second one, will be presented online during the week of April 15-19, 2024 (the exact schedule will be published a few days before April 15).

EVALUATION: For every homework, every student will get a score ranging from -4 to +1. The final exam score will be computed as follows:

final_score = hw_1 + hw_2 + hw_3 + 30

where hw_n is the score of homework n (if final_score > 30, then the final score is 30 cum laude).
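The rule above can be written as a small function. This is only a sketch that takes the formula literally as stated on this page; the function name and the returned pair (score, cum laude flag) are choices made for illustration.

```python
def final_result(hw_scores, base=30):
    """Apply final_score = hw_1 + hw_2 + hw_3 + 30 as stated on this page.

    Returns (score, cum_laude): the score is capped at 30 and cum_laude
    is True when the uncapped total exceeds 30.
    """
    if len(hw_scores) != 3:
        raise ValueError("exactly three homework scores are expected")
    for score in hw_scores:
        if not -4 <= score <= 1:
            raise ValueError("each homework score must be between -4 and +1")
    total = base + sum(hw_scores)
    return min(total, 30), total > 30

print(final_result([1, 0, 0]))    # total 31: capped at 30, with cum laude
print(final_result([-1, -2, 1]))  # total 28: no cum laude
```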

USEFUL LINKS: