Web Algorithmics and Data Mining

Research lines: 

  • Web Search and Mining
  • Graph and Text mining
  • Large-scale Complex Networks
  • On-line Social Networks
  • Algorithmic Mechanism Design and Network Economics

Luca Becchetti, Stefano Leonardi (leader). 

PhD Students: 
Ilaria Bordino, Ida Mele. 

Post Docs: 
Aris Anagnostopoulos, Piotr Sankowski. 

Our interest is in algorithmic methods for characterizing the structure of large-scale complex networks, with application to Web structure mining and Web usage mining. We have focussed so far on developing algorithms for graph-based feature extraction and for the detection of significant patterns that characterize social activities, trust relationships and content quality. 
In cooperation with the Yahoo! Research group in Barcelona, we developed, analyzed and tested effective, scalable and efficient techniques for the automatic detection of topological structures in the Web graph that are likely to be the result of spamming activity. This research has been expanded to provide efficient methods for estimating the distribution of small substructures that are typically related to specific forms of social interaction. We also developed algorithmic methods for extracting meaningful information from the massive data available in query logs, a task of critical importance for detecting semantic relations between users, queries and pages. Another major research direction of our group is the design and analysis of economic mechanisms for the Internet and the Web, together with the computational issues of implementing them, for instance in ad auctions for on-line advertising. In the last few years we have concentrated on the design of efficient cost-sharing and utilitarian mechanisms for network design and for single- and multi-objective optimization problems. 
The Web has evolved from an excellent medium for sharing information into a complex and attractive social environment for the delivery of content-rich information, products and services. In this respect, mining social network data to enhance and personalize Web search and retrieval is a major research direction. The development of algorithmic strategies and analytic tools for influence spreading, viral marketing and technology adoption is of crucial importance for many computer-mediated collaboration and commercial activities. E-commerce applications also require the implementation of economic mechanisms that address new problems, such as computerized auctions for Web ads. Marketing on the Web likewise requires sophisticated algorithmic tools for mining the huge amount of user activity data collected from search engines and other applications, whether to identify important trends or to provide fundamental tools such as recommendation services. Finally, the size of the Web and the increasing importance of the above applications pose serious scalability issues, which we plan to tackle by, for example, developing sophisticated ad and query caching techniques. 

DELIS - Dynamically Evolving Large Scale Information Systems
January 2004 - February 2008 - EU FP6 FET

Projects managed by DI - Sapienza: 
WEB RAM - Web Retrieval and Mining 
January 2007 - December 2008 - MIUR PRIN