Collaborate, Automate, Prepare, Prioritize: Creating Metadata for Legacy Research Data

Inna Kouper, Stacy R. Konkiel, Jennifer A. Liss, Juliet L. Hardesty


Data curation projects frequently deal with data that were not created for the
purposes of long-term preservation and re-use. How can curation of such legacy data be improved by
supplying necessary metadata? In this report, we address this and other questions by creating robust
metadata for twenty legacy research datasets. We report on quantitative and qualitative metrics of
creating domain-specific metadata and propose a four-pronged framework for creating metadata for legacy
research data: collaborate, automate, prepare, and prioritize. Our findings indicate a steep learning curve in encoding metadata using
the FGDC Content Standard for Digital Geospatial Metadata. Our project also demonstrates that data
curators who are handed research data “as is” and are tasked with incorporating such data into a
data sharing environment can be very successful in creating descriptive metadata, particularly
in conducting subject analysis and assigning keywords based on controlled vocabularies and thesauri.
At the same time, they need to be aware of limitations in their efforts when it comes to structural
and administrative metadata.