Short Paper

Metadata in Trustworthy AI: From Data Quality to ML Modeling

Metadata play a significant role in making AI models trustworthy by providing the information on inputs, outputs, models, pipelines, and other artifacts needed to meet the requirements for trustworthy AI. This concept paper focuses on the role metadata play in the AI lifecycle and how metadata research can ride out this AI wave with innovative creations. Specifically, we explore metadata’s role and potential in relation to data quality and ML models. The multidimensionality of metadata for data in AI is driving metadata to be micro-specific, embedded in data and models, highly computational, and fast-moving or agile. While there are no universally agreed-upon metadata schemas for documenting the artifacts in ML model development, there are some common areas or types of metadata for ML models. Data quality and ML models are tightly connected and can affect one another in significant ways. Trustworthy AI must rely on quality data and on responsible, ethical, reproducible, and verifiable ML models, and assuring these properties of data and ML models relies on metadata. The complex, fast-paced, and highly computational nature of metadata for AI artifacts (datasets, models, pipelines, algorithms, lineages, etc.) is making conventional metadata development processes and methods outdated, but it has also prompted some innovative metadata creations.

Author information

Jian Qin
Syracuse University, US
Bei Yu
Syracuse University, US

Cite this article

Qin, J., & Yu, B. (2024). Metadata in Trustworthy AI: From Data Quality to ML Modeling. International Conference on Dublin Core and Metadata Applications, 2023.

DOI: 10.23106/dcmi.953354037

The metadata and citations of this article are published under the Creative Commons Zero Universal Public Domain Dedication (CC0), allowing unrestricted reuse. Anyone can freely use the metadata from DCPapers articles for any purpose without limitations.
The full text of this article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license allows use, sharing, adaptation, distribution, and reproduction in any medium or format, provided that appropriate credit is given to the original author(s) and the source is cited.