Project Report

Leverage Natural Language Processing (NLP) to improve the discoverability of academic resources

Charlene Chou, Shravan Khunti, Harshit Bhargava

DOI: 10.23106/dcmi.952586098

Abstract

This interdisciplinary project is a collaboration among library metadata librarians, data scientists, digital library technologists, university IT, and the university press. Its goal is to improve the discoverability of academic resources by enhancing metadata through Natural Language Processing (NLP) and embedding-based semantic search, addressing the limitations of traditional keyword-based retrieval. To support this pilot, a library NLP system architecture has been designed, including the development of a vector database to enable semantic search within discovery platforms.
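
As a rough illustration of what embedding-based semantic search over a vector store involves, the following is a minimal sketch and not the project's actual architecture. The class name ToyVectorIndex, the record identifiers, and the three-dimensional vectors are all hypothetical; in a real system the vectors would come from an embedding model applied to metadata or full text, and the index would live in a dedicated vector database.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (record_id, vector) pairs

    def add(self, record_id, vector):
        self._items.append((record_id, vector))

    def search(self, query_vector, k=2):
        # Score every stored record against the query, highest similarity first.
        scored = [(cosine(query_vector, vec), rid) for rid, vec in self._items]
        scored.sort(reverse=True)
        return [rid for _, rid in scored[:k]]


# Hand-made vectors (assumption): stand-ins for model-generated embeddings.
index = ToyVectorIndex()
index.add("rec-climate", [0.9, 0.1, 0.0])
index.add("rec-medieval", [0.0, 0.2, 0.9])
index.add("rec-warming", [0.8, 0.3, 0.1])

# The query is matched by vector proximity, not by shared keywords.
query = [1.0, 0.2, 0.0]
top = index.search(query, k=2)
print(top)
```

The point of the sketch is the retrieval step: records are ranked by how close their vectors lie to the query vector, so two records can match the same query even when their titles or subject headings share no terms with it.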

Author information

Charlene Chou

Division of Libraries, New York University, US

Shravan Khunti

Center for Data Science, New York University, US

Harshit Bhargava

Center for Data Science, New York University, US

Cite this article

Chou, C., Khunti, S., & Bhargava, H. (2025). Leverage Natural Language Processing (NLP) to improve the discoverability of academic resources. Proceedings of the International Conference on Dublin Core and Metadata Applications, 2025. https://doi.org/10.23106/dcmi.952586098

Issue

DCMI 2025 Conference Proceedings
Location: University of Barcelona, Barcelona, Spain
Dates: October 22-25, 2025
The metadata and citations of this article are published under the Creative Commons Zero Universal Public Domain Dedication (CC0), allowing unrestricted reuse. Anyone can freely use the metadata from DCPapers articles for any purpose without limitations.
The full text of this article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license allows use, sharing, adaptation, distribution, and reproduction in any medium or format, provided that appropriate credit is given to the original author(s) and the source is cited.