Corso di laurea magistrale in Data Science
Facoltà di Ingegneria dell'Informazione, Informatica e Statistica, Sapienza Università di Roma

Data Management for Data Science -
Homework Assignments

2015/2016

Prof. Riccardo Rosati


Assignment 1 - SQL

Choose an application domain and, using a relational DBMS, build a database. This can be done in two ways:

The work can be done by a single student or by a group of two students.

Students can use publicly available DBMSs like MySQL or PostgreSQL (see below), or other, commercial DBMSs.

The complexity of the relational schema and of the queries produced should be comparable to the specification appearing in the following exercise on SQL.

The presentation of the work done will consist of a short (10-15 minutes) session in which the student(s) will show the work done by directly interacting with the relational DBMS on her/his/their own laptop.

Such presentations will take place during the lecture of March 23, 2016.

Useful links:


Assignment 2 - SQL: indexing and query evaluation

  1. Consider the database of Assignment 1, and define a second database containing the same data but a different schema obtained from the original schema by adding (or deleting) primary keys and indices, in order to solve item 2 below.
  2. Write at least 3 or 4 SQL queries such that the evaluation of such queries on the new database is significantly faster than the evaluation of such queries on the old database, and for every such query, provide an explanation of why the execution times are different in the two databases.
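As a rough illustration of the two items above, the following sketch builds the same data twice, once without and once with an index, and compares the query plans. It is only an example under stated assumptions: it uses SQLite through Python's standard `sqlite3` module rather than MySQL or PostgreSQL, and the table `orders`, its columns, and the index name `idx_orders_customer` are hypothetical and not part of the assignment.

```python
import sqlite3

# Hypothetical query used for the comparison (not taken from the assignment).
QUERY = "SELECT order_id, amount FROM orders WHERE customer_id = 42"


def build_db(with_index: bool) -> sqlite3.Connection:
    """Create an in-memory database; the data is identical in both variants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, i % 1000, float(i)) for i in range(50_000)],
    )
    if with_index:
        # The only schema difference between the "old" and "new" database.
        conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    conn.commit()
    return conn


def query_plan(conn: sqlite3.Connection) -> str:
    """Return the human-readable plan chosen by SQLite for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    # The fourth column of each row holds the textual description of the step.
    return " | ".join(row[3] for row in rows)


old_db = build_db(with_index=False)
new_db = build_db(with_index=True)

old_plan = query_plan(old_db)
new_plan = query_plan(new_db)
old_rows = sorted(old_db.execute(QUERY).fetchall())
new_rows = sorted(new_db.execute(QUERY).fetchall())

print("old database plan:", old_plan)
print("new database plan:", new_plan)
```

Without the index the plan is a full table scan, while with the index the DBMS can look up only the matching rows; this is the kind of explanation item 2 asks for. Other DBMSs expose similar information through their own `EXPLAIN` commands, with different output formats.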

The work can be done by the same student groups who presented the first homework assignment on March 23.

The presentation of the work done will consist of a very short (at most 5 minutes) session in which the student(s) will show the work done by directly interacting with the relational DBMS on her/his/their own laptop.

Such presentations will take place during the lecture of April 27, 2016.


Assignment 3 - NoSQL

Develop a small data management project using a NoSQL system (see the slides on graph databases and the slides on aggregated databases for more details).
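To give a feel for how an aggregate-oriented model differs from a relational one, the sketch below reshapes two flat "tables" into one nested document per customer. It uses plain Python data structures only, not an actual NoSQL system, and all names (`customers`, `orders`, `to_aggregates`) and data values are hypothetical.

```python
import json
from collections import defaultdict

# Relational view: two flat "tables" linked by customer_id (hypothetical data).
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.0},
]


def to_aggregates(customers, orders):
    """Embed each customer's orders in a single document keyed by customer_id."""
    by_customer = defaultdict(list)
    for order in orders:
        by_customer[order["customer_id"]].append(
            {"order_id": order["order_id"], "amount": order["amount"]}
        )
    return {
        c["customer_id"]: {"name": c["name"], "orders": by_customer[c["customer_id"]]}
        for c in customers
    }


aggregates = to_aggregates(customers, orders)

# One lookup returns the customer together with all of its orders, no join needed.
print(json.dumps(aggregates[1], indent=2))
```

In the relational form, retrieving a customer with its orders requires a join on `customer_id`; in the aggregate form the same information is stored and read as one unit, at the cost of making queries that cut across aggregates (for example, all orders above a given amount) less direct.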

The students can either re-use the domain and the data of the first two assignments, or create a totally new project.

The work must be done by the same student groups who presented the first and second homework assignments.

The presentation of the work done will consist of a short (at most 15 minutes) session in which the student(s) will show the work done by directly interacting with the NoSQL system on her/his/their own laptop, highlighting the differences with respect to a standard (SQL) relational database system.

Such presentations will take place on June 3, 2016, 15:00, via Ariosto 25, room A5.