An Exploratory Analysis of Subject Metadata in the Digital Public Library of America

Hannah Tarver, Mark Phillips, Oksana Zavalina, Priya Kizhakkethil


This paper presents the results of an exploratory quantitative analysis of subject representation in a large dataset of more than 8 million item-level metadata records from the Digital Public Library of America (DPLA). The records originate from a number of institutions that serve as content hubs or service hubs of DPLA. The findings show both similarities and differences in subject representation across content and service hub providers. This benchmark study provides empirical data about the distribution of subjects at the hub level (e.g., minimum, maximum, and average number of subjects per record; number of records without subjects; and number of unique subjects), the distribution by hub type (content or service hub), and the subjects shared across similar hubs or across the entire aggregation.
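
As an illustration only, the hub-level measures listed above could be computed along the following lines. This is a minimal sketch, not the authors' method: the record layout (a dict with hypothetical keys "hub" and "subjects") and the function names are assumptions made for the example, and subject strings are compared exactly, without any normalization.

```python
from collections import defaultdict
from statistics import mean


def subject_stats_by_hub(records):
    """Summarize subject usage per hub.

    `records` is an iterable of dicts with the hypothetical keys
    "hub" (str) and "subjects" (list of str, possibly empty or missing).
    """
    counts = defaultdict(list)   # hub -> number of subjects in each record
    unique = defaultdict(set)    # hub -> distinct subject strings
    for rec in records:
        subjects = rec.get("subjects") or []
        counts[rec["hub"]].append(len(subjects))
        unique[rec["hub"]].update(subjects)

    stats = {}
    for hub, per_record in counts.items():
        stats[hub] = {
            "records": len(per_record),
            "min": min(per_record),
            "max": max(per_record),
            "mean": mean(per_record),
            "without_subjects": per_record.count(0),
            "unique_subjects": len(unique[hub]),
        }
    return stats


def shared_subjects(records):
    """Return the subject strings that appear in every hub."""
    per_hub = defaultdict(set)
    for rec in records:
        per_hub[rec["hub"]].update(rec.get("subjects") or [])
    if not per_hub:
        return set()
    return set.intersection(*per_hub.values())
```

Grouping the same records by hub type instead of by hub would only require swapping the grouping key, assuming each record also carried a field for its hub type.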
