Making SQL Queries Correct on Incomplete Databases
Multiple issues with SQL's handling of nulls have been well documented. To fix them, a
modification of SQL's query evaluation that eliminates false query answers was proposed
in my ICDT 2015 (also TODS 2016) paper, and shown to retain a good theoretical AC0
complexity. Two notable questions remained though: do false answer occur in real-life
SQL queries? And can the new evaluation scheme be made practical? We now answer these.
Using the standard TPC-H benchmark, we show that false answers are very common for
some typical queries. On the other hand, the modified evaluation from ICDT 2015 does
not work well as it computes very large intermediate results. So we designed a new way
of avoiding wrong answers, based on a different idea, and showed that it works well with
benchmark queries. For half of them, the overhead for correctness was very small, under
4%. For others, we saw either a dramatic improvement of several orders of magnitude, or
a slowdown that was still tolerable when correctness is important. I'll explain how the
new query modification works, present experimental results, and offer comments on the
observed behavior and on deficiencies of query processing in commercial databases that
one needs to overcome to make correct evaluation even more efficient.
(Joint work with Paolo Guagliardo)
Bio sketch. Leonid Libkin is Professor of Foundations of Data Management in the School
of Informatics at the University of Edinburgh. He was previously a Professor at the
University of Toronto and a member of research staff at Bell Laboratories in Murray Hill.
He received his PhD from the University of Pennsylvania in 1994. His main research
interests are in the areas of data management and applications of logic in computer
science. He has written five books and about 200 technical papers. His awards include
a Marie Curie Chair Award and five Best Paper Awards. He has chaired programme
committees of major database conferences (ACM PODS, ICDT) and was the conference
chair of the 2010 Federated Logic Conference. He has given many invited conference
talks and has served on multiple program committees and editorial boards. He is an ACM
fellow, a fellow of the Royal Society of Edinburgh, and a member of Academia Europaea.