HYPER

How Your Peers Exchange Resources



A project within the IBM Shared University Research program

Partners:
Start date: September 20, 2003



Objectives

The goal of the project is to set up a framework for data integration in peer-to-peer networking environments, and to develop methods, techniques, and algorithms for data integration in these environments. Unlike the traditional setting, data integration in a peer-to-peer networking environment does not rely on a global schema. Instead, each peer exports data in terms of its own schema, and information integration is achieved by establishing mappings among the various peer schemas. The notion of mapping is therefore central to this problem. We plan to investigate this notion, with the goal of devising techniques for representing mappings, for characterizing their semantics, and for carrying out query processing based on them.



Motivations

Most formal approaches to data integration refer to an architecture based on a global schema and a set of sources. The sources contain the real data, while the global schema provides a reconciled, integrated, and virtual view of the underlying sources. One of the challenging issues in these systems is answering queries posed to the global schema. Because of this architecture, query processing requires a reformulation step: the query over the global schema must be re-expressed as a set of queries over the sources.

Unlike the traditional setting, data integration in a peer-to-peer networking environment does not rely on a global schema. Instead, every node (peer) acts as both client and server, and provides part of the overall information available in the distributed environment, without relying on a single global view. A suitable infrastructure is adopted for managing the information in the various nodes. The mechanism for linking the nodes is based on the assumption that each peer exports data in terms of its own schema, and information integration is achieved by establishing mappings among the various peer schemas.

Current P2P systems focus strictly on handling semantic-free, large-granularity requests for objects by identifier, which both limits their utility and restricts the techniques that might be employed to distribute the data. These sharing systems are largely limited to applications in which objects are described by their name, and they are severely limited in their ability to establish complex links between peers. To overcome these limitations, we envisage a framework where the mappings between peers are established by specifying that certain views over one peer schema correspond to certain views over another peer schema. Two fundamental issues in this framework are (i) assigning a suitable semantics to these mappings, and (ii) designing query answering algorithms that are sound and complete with respect to that semantics.


Expected results

The project aims both at investigating the theoretical foundations of data integration in peer-to-peer networking environments and at developing techniques and a prototype implementation for query answering in peer-to-peer systems. Specifically, the expected results are the following:
  1. Methodology for data integration in a peer-to-peer networking environment.
    This methodology aims at specifying both the architecture of a single peer and how a peer exchanges information with other peers. Specifically, we envision a framework in which each peer exports its data in terms of a schema visible to other peers, and is equipped with two kinds of mappings, called local and peer-to-peer, respectively. Local mappings specify how data sources that are local to the peer are linked to the peer schema. Peer-to-peer mappings specify how views over one peer schema are related to views over other peer schemas. Each peer specifies its mappings to other peers autonomously. As a consequence, a distinguishing feature of our approach is that it places no restriction on the topology of the peer-to-peer network.
  2. Semantic characterization.
    We aim at investigating different methods for specifying the formal semantics of the framework outlined above, comparing their characteristics, and selecting the method best suited for devising effective query answering techniques.
  3. Query answering algorithms in peer-to-peer data integration.
    Another distinguishing feature of our approach is that the query answering algorithms will be proved to be sound and complete with respect to the semantic characterization mentioned above.
  4. Implementation via web services.
    The whole framework, including the query answering algorithms, will be implemented on a web-service based platform.
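
To make the peer architecture described in item 1 more concrete, the following is a minimal Python sketch, not the project's actual design or algorithm. All names in it (LocalMapping, P2PMapping, Peer, materialize, and the peers "rome" and "milan" with their relations) are hypothetical. It assumes a deliberately simplified setting: each peer-to-peer mapping states that the answers to a view over another peer are contained in a single relation of the peer that owns the mapping, all views are monotone and introduce no new values, and data is propagated by naive fixpoint iteration rather than by query reformulation. The two example peers are linked in a cycle, illustrating that the network topology is unrestricted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Row = Tuple[str, ...]
Relation = Set[Row]
Database = Dict[str, Relation]          # relation name -> set of rows
View = Callable[[Database], Relation]   # a query over one peer schema


@dataclass
class LocalMapping:
    """Links a local data source to a relation of the peer schema."""
    source: Callable[[], Relation]      # hypothetical accessor for local data
    target_relation: str


@dataclass
class P2PMapping:
    """Assumed semantics: answers of `source_view` over `source_peer`
    are contained in `target_relation` of the peer owning this mapping."""
    source_peer: str
    source_view: View
    target_relation: str


@dataclass
class Peer:
    name: str
    local_mappings: List[LocalMapping] = field(default_factory=list)
    p2p_mappings: List[P2PMapping] = field(default_factory=list)


def materialize(peers: Dict[str, Peer]) -> Dict[str, Database]:
    """Populate every peer schema: first from local sources, then by
    applying peer-to-peer mappings until nothing new is derived."""
    dbs: Dict[str, Database] = {name: {} for name in peers}
    for name, peer in peers.items():
        for lm in peer.local_mappings:
            dbs[name].setdefault(lm.target_relation, set()).update(lm.source())
    changed = True
    while changed:                      # terminates for monotone views over finite data
        changed = False
        for name, peer in peers.items():
            for pm in peer.p2p_mappings:
                new_rows = pm.source_view(dbs[pm.source_peer])
                target = dbs[name].setdefault(pm.target_relation, set())
                if not new_rows <= target:
                    target |= new_rows
                    changed = True
    return dbs


# Two hypothetical peers whose mappings form a cycle.
rome = Peer(
    name="rome",
    local_mappings=[LocalMapping(lambda: {("P2P Integration", "2003")}, "paper")],
    p2p_mappings=[
        # Every title published at milan is a title known at rome.
        P2PMapping("milan", lambda db: set(db.get("publication", set())), "known_title"),
    ],
)
milan = Peer(
    name="milan",
    local_mappings=[LocalMapping(lambda: {("Query Rewriting",)}, "publication")],
    p2p_mappings=[
        # Titles of rome's papers (a projection view) are publications at milan.
        P2PMapping("rome", lambda db: {(t,) for (t, _year) in db.get("paper", set())}, "publication"),
        # Titles known at rome are publications at milan (closes the cycle).
        P2PMapping("rome", lambda db: set(db.get("known_title", set())), "publication"),
    ],
)

dbs = materialize({"rome": rome, "milan": milan})
```

After the call, a query over one peer's schema can be evaluated directly on that peer's entry in `dbs`. A real system would more likely reformulate queries across mappings instead of copying data, and the choice of mapping semantics (item 2) determines whether such a procedure is sound and complete.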


Presentations



Publications


People


Links

 

Page maintained by Maurizio Lenzerini
Last modified: 23/09/2003