Can We Trust SQL as a Data Analytics Tool?
Multiple surveys show that SQL and relational databases remain the most common tools used by data scientists. But can we fully trust them? We give a few examples of unexpected and counterintuitive behavior in even simple SQL queries, behavior that calls into question analytics results obtained from relational DBMSs. The talk then gives a quick overview of two lines of work that attempt to overcome these problems. The first develops a formal semantics of SQL, to at least eliminate the element of surprise in query results. The second presents a revised evaluation scheme that restores correctness to the notoriously unpredictable behavior of SQL queries over databases with incomplete information.
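The abstract does not say which examples the talk uses. As one commonly cited illustration of the kind of surprise it describes, the sketch below computes the difference of two one-column tables in three ways that look equivalent and yet disagree once a NULL is present. The table names `r` and `s`, the column `a`, and the function `difference_three_ways` are hypothetical, and SQLite (via Python's standard `sqlite3` module) is assumed only because it is easy to run.

```python
import sqlite3


def difference_three_ways():
    """Compute "rows of r not in s" with three seemingly equivalent queries."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical data: r holds the value 1, s holds a single NULL.
    conn.executescript(
        """
        CREATE TABLE r (a INTEGER);
        CREATE TABLE s (a INTEGER);
        INSERT INTO r VALUES (1);
        INSERT INTO s VALUES (NULL);
        """
    )
    queries = {
        # 1 NOT IN (NULL) evaluates to unknown, so the row is filtered out.
        "not_in": "SELECT a FROM r WHERE a NOT IN (SELECT a FROM s)",
        # s.a = r.a is unknown for the NULL row, so the subquery is empty
        # and NOT EXISTS holds.
        "not_exists": (
            "SELECT a FROM r "
            "WHERE NOT EXISTS (SELECT 1 FROM s WHERE s.a = r.a)"
        ),
        # EXCEPT compares rows syntactically; 1 differs from NULL, so 1 stays.
        "except": "SELECT a FROM r EXCEPT SELECT a FROM s",
    }
    results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
    conn.close()
    return results


if __name__ == "__main__":
    for name, rows in difference_three_ways().items():
        print(f"{name}: {rows}")
```

Under these assumptions the `NOT IN` query returns no rows, while the `NOT EXISTS` and `EXCEPT` queries both return the row containing 1, even though all three are often read as "r minus s".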