Full Paper

Why Build Custom Categorizers Using Boolean Queries Instead of Machine Learning? Robert Wood Johnson Foundation Case Study

Joseph Busch, Vivian Bliss

DOI: 10.23106/dcmi.952139142

Abstract

This presentation covers a case study of using Boolean queries to scope custom categories, provides a Boolean query syntax primer, and then presents a step-by-step process for building a Boolean query categorizer. The Robert Wood Johnson Foundation (RWJF) is the largest philanthropy dedicated solely to health in the United States. Taxonomy Strategies has been working with RWJF to develop an enterprise metadata framework and taxonomy to support needs across areas including program management, research and evaluation, communications, and finance. We have also been working with RWJF on methods to apply automation to support taxonomy development and implementation within their various information management applications. Machine learning has become a popular and hyped method promoted by large information management application vendors including Microsoft, IBM, Salesforce and others. Its benefit is that no preparation is needed: content just gets processed. The problem is that machine learning is opaque, and the resulting categories are generic, may be irrelevant, can be biased, and are difficult to change or tune. Pre-defined categories (e.g., a controlled vocabulary or taxonomy) plus Boolean queries to scope the context for categories are much more transparent. The benefit is relevant categories. The problem is that pre-defined categories require work to set up, as well as specialized skills. But how hard is it to do this?
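To make the idea of scoping a category with a Boolean query concrete, the following is a minimal sketch in Python. It is not the authors' implementation or the syntax of any particular product. The category names, the query terms, and the helper functions (`term`, `AND`, `OR`, `NOT`, `categorize`) are all hypothetical and chosen only for illustration; they are not taken from the RWJF taxonomy.

```python
import re

def term(phrase):
    """Return a query that matches a word or phrase on word boundaries, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return lambda text: pattern.search(text) is not None

def AND(*queries):
    """Match only if every sub-query matches."""
    return lambda text: all(q(text) for q in queries)

def OR(*queries):
    """Match if at least one sub-query matches."""
    return lambda text: any(q(text) for q in queries)

def NOT(query):
    """Match only if the sub-query does not match."""
    return lambda text: not query(text)

# Hypothetical categories, each scoped by a readable Boolean query.
CATEGORIES = {
    "Childhood Obesity": AND(
        OR(term("child"), term("children"), term("youth")),
        OR(term("obesity"), term("overweight")),
    ),
    "Health Insurance Coverage": AND(
        OR(term("insurance"), term("coverage"), term("Medicaid")),
        NOT(term("auto insurance")),
    ),
}

def categorize(text):
    """Return the names of all categories whose query matches the text."""
    return [name for name, query in CATEGORIES.items() if query(text)]

if __name__ == "__main__":
    print(categorize("Rates of obesity among children fell in 2017."))
```

Because each category is defined by an explicit query, it is possible to see why a document was or was not assigned to it and to tune the result by editing terms, which is the transparency argument made above.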

Author information

Joseph Busch

Taxonomy Strategies, US

Vivian Bliss

Taxonomy Strategies, US

Cite this article

Busch, J., & Bliss, V. (2018). Why Build Custom Categorizers Using Boolean Queries Instead of Machine Learning? Robert Wood Johnson Foundation Case Study. Proceedings of the International Conference on Dublin Core and Metadata Applications, 2018. https://doi.org/10.23106/dcmi.952139142

Issue

DC-2018--The Porto, Portugal Proceedings
Location:
Porto, Portugal
Dates:
September 10-13, 2018
The metadata and citations of this article are published under the Creative Commons Zero Universal Public Domain Dedication (CC0), allowing unrestricted reuse. Anyone can freely use the metadata from DCPapers articles for any purpose without limitations.
The full text of this article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license allows use, sharing, adaptation, distribution, and reproduction in any medium or format, provided that appropriate credit is given to the original author(s) and the source is cited.