Seminario Interdipartimentale di Algoritmica
 

Monday, October 26th, 2009, 12:00 noon
Partial Sorting Problems on Suffixes
Gianni Franceschini, "Sapienza" - University of Rome

DIS - Department of Computer Engineering, Via Ariosto 25
Aula Magna, first floor


Abstract

A partial sorting problem is a sorting problem where (i) only a subset I  of the N input objects is involved in the process and (ii) we want to find only the objects in I with ranks contained in a set  R that is a subset of {1,..,|I|}. Given their basic importance in Computer Science, various special instances have been widely studied in the past: if |I|=N and |R|=N we have the classic sorting problem, if |I|=N and |R|=1 we have the selection problem and so forth. Usually, such studies have been carried out within the settings of the comparison model and with simple unidimensional objects (i.e. comparable in O(1) time) as input elements.
In this talk we focus on partial sorting problems  on suffixes, that is where the input set S contains the suffixes of a given sequence T of N elements. Problems on suffixes have  practical and theoretical relevance in Computer Science in connection with various fields e.g. Text Indexing, Bioinformatics etc. The plain suffix sorting problem (i.e. where all the suffixes are sorted and all the ranks are sought) has been widely studied in the past and its asymptotic complexity is well-known (O(n log n) time in the worst case for a generic alphabet). This is not the case for more generic partial sorting problems on suffixes whose complexity has been discovered only recently (e.g. suffix selection) or is still unknown. Given the nature of the input objects (suffixes are neither unidimensional objects and not independent ones, i.e. they may share large portions of the input sequence T), solving partial sorting problems on suffixes is especially hard and seems to require the introduction of new algorithmic techniques.
In this talk we will first describe an optimal O(n) time worst-case solution to the suffix selection problem (appeared in [Franceschini Muthukrishnan, STOC 2007]) and then we will briefly discuss some of the other problems in this line of research.