How Portable Are the Metadata Standards for Scientific Data? A Proposal for a Metadata Infrastructure

Jian Qin, Kai Li


The one-covers-all approach in current metadata standards for scientific data has
serious limitations in keeping up with the ever-growing volume of data and in being built as part of a metadata
infrastructure. This paper reports preliminary findings from a survey of metadata standards in
the scientific data domain and argues for the need for a metadata infrastructure. The survey collected
4400+ unique elements from 16 standards and categorized these elements into 9 categories. Preliminary
findings from the data include inconsistent naming of elements across standards, a fraction of
single-word element names, and varying linguistic forms of elements. The limitations of large,
complex standards and widely varied naming practices are the major hurdles to building a metadata
infrastructure. The paper articulates three principles for a metadata infrastructure: the least
effort principle is the premise on which the metadata infrastructure argument operates; being portable
is the essential condition or prerequisite for metadata schemes to be “infrastructurized” – a word
coined to denote the state of being built into or as part of the infrastructure; and the infrastructure
service principle means that metadata elements, vocabularies, entities, and other metadata artifacts
are established as the underlying foundation upon which the tools and applications as well as functions
of metadata services are built.

