Abstract
This paper presents the results of an exploratory quantitative analysis of subject representation in a large dataset of over 8 million item-level metadata records in the Digital Public Library of America (DPLA). The records originate from institutions that serve as DPLA content hubs or service hubs. The findings demonstrate both similarities and differences in subject representation across content and service hub providers. This benchmark study provides empirical data about the distribution of subjects at the hub level (e.g., minimum, maximum, and average number of subjects per record; number of records without subjects; and number of unique subjects), the distribution by hub type (content or service hub), and the subjects shared across similar hubs or across the entire aggregation.
The copyright for articles is retained by the author(s), with first publication rights granted to DCMI for publication in the electronic and print proceedings. By virtue of their appearance in this open access publication, articles are free to be used with proper attribution for educational and other non-commercial purposes. Other uses may require the permission of the author(s).