D2I
Integration, Warehousing, and Mining of Heterogeneous Sources
The ARTEMIS prototype for the construction of reconciled views based on affinity evaluation and interactive clustering

Silvana Castano, Alfio Ferrara, Michele Melchiori, Giorgio Ornetti

 
Theme: Theme 1: Integration of data from heterogeneous sources
Code: D1-P7
Date: 11 October 2002
Product type: Software prototype
Responsible unit: MI
Units involved: MI
Authors: Silvana Castano, Alfio Ferrara, Michele Melchiori, Giorgio Ornetti
Contact author: Alfio Ferrara
Dipartimento di Scienze dell'Informazione
Università degli Studi di Milano
Via Comelico 39, 20135 Milano
ferrara@dsi.unimi.it
Presentation of prototype D1-P7

Online documentation: http://isserver.usr.dico.unimi.it/artemis/d2i/
 


Description

The ARTEMIS tool environment performs the semantic integration of strongly heterogeneous data sources, both structured and semi-structured. The integration process builds a semantically rich representation of the data sources to be integrated, using a common data model based on the ODLi3 language.

The first phase of the ARTEMIS integration process is schema matching. Its goal is to identify the ODLi3 classes that are candidates for integration, that is, classes that describe the same or semantically related information in different source schemas. To this end, affinity coefficients are evaluated for all possible pairs of ODLi3 classes, based on the suitably strengthened relationships in the Common Thesaurus. To assess the affinity of two ODLi3 classes comprehensively, a Global Affinity coefficient is introduced that also takes into account the knowledge provided by extensional properties.

Global Affinity coefficients are then used by a hierarchical clustering algorithm to classify the ODLi3 classes. The output of the clustering procedure is an affinity tree, in which the ODLi3 classes are the leaves and each intermediate node has an associated affinity value that holds for the classes in the corresponding cluster. Clusters for integration (candidate clusters) are selected interactively from the affinity tree using a threshold-based mechanism. The mediation scheme is then defined through rule-based unification techniques, by which the classes belonging to a given cluster are reconciled into global classes.
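The clustering and threshold-based selection steps described above can be illustrated with a minimal Python sketch. It is not the ARTEMIS implementation: the class names, the affinity values, the function names (`build_affinity_tree`, `leaves`, `select_clusters`), and the choice of taking the maximum affinity when two clusters are merged are all assumptions made for illustration. The sketch takes precomputed pairwise affinities as input and does not attempt to reproduce how Global Affinity coefficients are derived from the Common Thesaurus.

```python
from itertools import combinations


def build_affinity_tree(names, affinity):
    """Agglomerative clustering over pairwise affinities in [0, 1].

    names: list of class names (the leaves of the tree).
    affinity: dict mapping frozenset({a, b}) -> affinity value;
              missing pairs are treated as affinity 0.0.
    Returns the root of the tree: a leaf is a string, an inner node is
    a tuple (affinity_value, left_subtree, right_subtree).
    """
    clusters = {i: name for i, name in enumerate(names)}
    sim = {
        frozenset({i, j}): affinity.get(frozenset({names[i], names[j]}), 0.0)
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the two active clusters with the highest affinity.
        pair = max(sim, key=sim.get)
        value = sim[pair]
        a, b = sorted(pair)
        node = (value, clusters.pop(a), clusters.pop(b))
        # Assumption: the merged cluster keeps the maximum affinity
        # that either of its parts had with each remaining cluster.
        for c in clusters:
            sim[frozenset({next_id, c})] = max(
                sim[frozenset({a, c})], sim[frozenset({b, c})]
            )
        # Drop entries that refer to the two clusters just merged.
        sim = {k: v for k, v in sim.items() if a not in k and b not in k}
        clusters[next_id] = node
        next_id += 1
    return next(iter(clusters.values()))


def leaves(tree):
    """Return the class names under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, left, right = tree
    return leaves(left) + leaves(right)


def select_clusters(tree, threshold):
    """Return the largest clusters whose node affinity is >= threshold."""
    if isinstance(tree, str):
        return [[tree]]
    value, left, right = tree
    if value >= threshold:
        return [leaves(tree)]
    return select_clusters(left, threshold) + select_clusters(right, threshold)


# Hypothetical classes from three source schemas, with made-up affinities.
names = ["S1.Person", "S2.Employee", "S1.Course", "S2.Lecture", "S3.Room"]
affinity = {
    frozenset({"S1.Person", "S2.Employee"}): 0.8,
    frozenset({"S1.Course", "S2.Lecture"}): 0.7,
    frozenset({"S1.Person", "S1.Course"}): 0.3,
    frozenset({"S2.Employee", "S2.Lecture"}): 0.2,
    frozenset({"S2.Lecture", "S3.Room"}): 0.1,
}

tree = build_affinity_tree(names, affinity)
for cluster in select_clusters(tree, threshold=0.5):
    print(cluster)
```

With the maximum-based update assumed here, affinity values never increase from the leaves towards the root, so cutting the tree at a threshold yields a well-defined partition. Lowering the threshold produces fewer, larger candidate clusters; raising it produces more, smaller ones, which is the trade-off a designer would explore interactively.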

 

Development and execution environment

Java JDK 1.4, CORBA 2.2
Runs on Windows 9x/NT and Unix

 


 
 
 
Site maintained by Domenico Lembo
lembo@dis.uniroma1.it