Profiling Transformations in Heterogeneous and Large Scale Metadata Harvesting Processes

Joao Sequeira, Joao Edmundo, Hugo Manguinhas, Gilberto Pedrosa, Jose Borbinha

Abstract


Most organizations communicate automatically with each other through their information systems and the Internet. When the data structures they use are not uniform, data transformation is a required process. Two fundamental issues of that process are the interpretation of the source and destination schemas, and the definition of the mapping relating them. Defining these mappings is also called schema matching. Matching can be performed in two distinct ways: manually by humans or automatically by algorithms. Since matching can be a non-deterministic process, it is best suited to manual work, but when the schemas are complex (and especially when knowledge of the source schema is incomplete) it can require a large intellectual effort. In these scenarios, automatic processes can be used to produce drafts of the mapping, to be corrected and accepted later by a human. The motivation of this work is to contribute to improving interoperability in heterogeneous and large-scale data integration processes in the scope of digital libraries. Libraries, archives, museums and other related organizations face the need to share their resource descriptive metadata.
