How Your Peers Exchange Resources
within the IBM Shared University Research program
date: September 20, 2003
The goal of the project is to set up a framework for data integration in
peer-to-peer networking environments, and to develop methods,
techniques, and algorithms for data integration in these environments.
Unlike in the traditional setting, data integration in a
peer-to-peer networking environment is not based
on the existence of a global schema. Instead, each peer exports data in
terms of its own schema, and information integration is achieved by
establishing mappings among the various peer schemas. Therefore, the
notion of mapping is central to this problem. We plan to investigate
this notion, with the goal of devising techniques for representing
mappings, for characterizing their semantics, and for carrying out
processing based on them.
The formal approaches to data integration refer to an architecture
based on a global schema and a set of sources. The sources contain the real
data, while the global schema provides a reconciled, integrated, and
virtual view of the underlying sources. One of the challenging issues
in these systems is to answer queries posed to the global schema. Due to
the architecture of the system, query processing requires a
reformulation step: the query over the global schema must be
re-expressed in terms of a set of queries over the sources. Unlike in
the traditional setting, data integration in a
peer-to-peer networking environment is not based
on the existence of a global schema. Instead, in these systems every
node (peer) acts as both client and server, and provides part of the
overall information available from a distributed environment, without
relying on a single global view. A suitable infrastructure is adopted
for managing the information in the various nodes. The mechanism for
linking the various nodes is based on the assumption that each peer
exports data in terms of its own schema, and information integration is
achieved by establishing mappings among the various peer schemas.
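
The reformulation step of the traditional, global-schema setting described above can be illustrated with a minimal Python sketch. It assumes a very simple form of mapping in which each global relation is defined as a union of projections over source relations. The relation names (person, s1_employees, s2_customers), the data, and the helper functions are hypothetical and serve only as illustration; they are not part of the project.

```python
from typing import Callable, Dict, List, Set, Tuple

Row = Tuple[str, ...]

# Hypothetical sources holding the real data.
SOURCES: Dict[str, List[Row]] = {
    "s1_employees": [("Ann", "Rome", "50000"), ("Bob", "Berlin", "42000")],
    "s2_customers": [("Rome", "Carla"), ("Madison", "Dan")],
}

# Assumed mapping: the global relation person(name, city) is the union of
# one view per source, each view rearranging a source row into (name, city).
GLOBAL_SCHEMA: Dict[str, List[Tuple[str, Callable[[Row], Row]]]] = {
    "person": [
        ("s1_employees", lambda r: (r[0], r[1])),
        ("s2_customers", lambda r: (r[1], r[0])),
    ],
}


def reformulate(relation: str,
                predicate: Callable[[Row], bool]) -> List[Callable[[], Set[Row]]]:
    """Re-express a selection over a global relation as one query per source."""
    def make_source_query(source: str, view: Callable[[Row], Row]):
        def run() -> Set[Row]:
            return {view(r) for r in SOURCES[source] if predicate(view(r))}
        return run
    return [make_source_query(s, v) for s, v in GLOBAL_SCHEMA[relation]]


def answer(relation: str, predicate: Callable[[Row], bool]) -> Set[Row]:
    """Evaluate the source queries and union their results."""
    result: Set[Row] = set()
    for source_query in reformulate(relation, predicate):
        result |= source_query()
    return result


# Query posed to the global schema: all persons located in Rome.
people_in_rome = answer("person", lambda p: p[1] == "Rome")
```

The user only ever names the global relation; which sources are contacted, and how their rows are rearranged, is decided entirely by the mapping.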
Current P2P systems focus strictly on handling semantic-free,
large-granularity requests for objects by identifier, which both limits
their utility and restricts the techniques that might be employed to
distribute the data. These current sharing systems are largely limited to
applications in which objects are described by their name, and exhibit
strong limitations in establishing complex links between peers. To
overcome these limitations, we envisage a framework where the mappings
between peers are established by specifying that certain views over one
peer schema correspond to certain views over another peer schema. Two
fundamental issues in this framework are (i) assigning a suitable semantics
to these mappings, and (ii)
designing query answering algorithms that are sound and complete with
respect to the semantics.
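
As a rough illustration of view-based mappings between peers, the following Python sketch assumes one possible reading of a mapping: every tuple returned by a view over the source peer must also hold at the target peer. Under this assumption, data can be propagated along the mappings until nothing new is derived, which also works when the mappings form a cycle. The classes Peer and Mapping, the function propagate, and the sample data are hypothetical; the sketch is not the semantics or the algorithm adopted by the project.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Row = Tuple[str, ...]


@dataclass
class Peer:
    name: str
    facts: Set[Row] = field(default_factory=set)


@dataclass
class Mapping:
    source: str                            # peer over which the view is evaluated
    view: Callable[[Set[Row]], Set[Row]]   # view over the source peer's data
    target: str                            # peer that must contain the view's tuples


def propagate(peers: Dict[str, Peer], mappings: List[Mapping]) -> None:
    """Apply all mappings repeatedly until a fixpoint is reached.

    Assumes views only select or rearrange existing values, so that the
    set of derivable tuples is finite and the loop terminates.
    """
    changed = True
    while changed:
        changed = False
        for m in mappings:
            derived = m.view(peers[m.source].facts)
            new_facts = derived - peers[m.target].facts
            if new_facts:
                peers[m.target].facts |= new_facts
                changed = True


peers = {
    "A": Peer("A", {("Ann", "Rome"), ("Bob", "Berlin")}),
    "B": Peer("B", {("Carla", "Rome")}),
    "C": Peer("C"),
}

# Three mappings forming a cycle A -> B -> C -> A.
mappings = [
    Mapping("A", lambda rows: {r for r in rows if r[1] == "Rome"}, "B"),
    Mapping("B", lambda rows: set(rows), "C"),
    Mapping("C", lambda rows: set(rows), "A"),
]

propagate(peers, mappings)
```

After propagation, a query posed to any peer can be evaluated over that peer's own facts. Mappings with richer views, for example views on the target side that are not the plain relation, would need a more careful treatment than this sketch provides.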
The project aims both at investigating
the theoretical foundations of data integration in peer-to-peer
networking environments and at developing techniques and a
prototype implementation for query answering in peer-to-peer systems.
Specifically, the expected results are the following:
- A methodology for data integration in a peer-to-peer networking environment.
This methodology aims at specifying both the architecture of a single
peer, and how a peer exchanges information with other peers.
Specifically, we envision a framework in which each peer exports its
data in terms of a schema visible to other peers, and is equipped with
two kinds of mappings, called local and peer-to-peer, respectively.
Local mappings specify how data sources that are local to the peer are
linked to the peer schema. Peer-to-peer mappings specify how views over
one peer schema are related to views over other peer schemas. Each peer
specifies its mappings to other peers autonomously. As a consequence, a
distinguishing feature of our approach is that it does not limit in any way the
topology of the peer-to-peer network.
We aim at investigating different methods for specifying formal
semantics of the framework outlined above, comparing their
characteristics, and selecting the method best suited for devising
effective query answering techniques.
- Query answering algorithms in peer-to-peer data integration.
Another distinguishing feature of our approach is that the query
answering algorithms will be proved to be sound and complete with
respect to the semantic characterization mentioned above.
- Implementation via web services.
The whole framework, including
the query answering algorithms, will be implemented on a web-service
- Slides of the talk by Maurizio Lenzerini at the International Workshop on Databases,
Information Systems, and P2P Computing, Berlin,
Germany, September 2003.
- Slides of the
invited tutorial on "Data integration: A
theoretical perspective", by Maurizio Lenzerini, at
the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database
Systems, PODS 2002, Madison, Wisconsin, USA, June 2002.
- Related projects:
- Related papers:
- Alon Halevy, Zack Ives, Dan Suciu and Igor Tatarinov. "Schema
Mediation in Peer Data Management Systems". Proceedings of the
International Conference on Data Engineering, ICDE, 2003.
- Luciano Serafini, Fausto Giunchiglia, John Mylopoulos, Philip
A. Bernstein. "Local Relational Model: A Logical Formalization of
Database Coordination". CONTEXT
Last modified: September 23, 2003