Data users need context and research expertise to effectively search
for and identify relevant datasets. Leading data providers, such as the
Inter-university Consortium for Political and Social Research (ICPSR), offer
standardized metadata and search tools to support data search. Metadata
standards emphasize the machine-readability of data and its documentation.
However, there remain opportunities to enhance dataset search by improving users' ability
to learn about, and make sense of, information about data. Prior research has
shown that limited context and expertise are two main barriers users face in
effectively searching for, evaluating, and deciding whether to reuse data. In
this paper, we propose a novel chatbot-based search system, DataChat, that
leverages a graph database and a large language model to provide new ways for
users to interact with and search for research data. DataChat complements data
archives' and institutional repositories' ongoing efforts to curate, preserve,
and share research data for reuse by making it easier for users to explore and
learn about available research data.

Comment: 6 pages, 2 figures, and 1 table. Accepted to the 86th Annual Meeting
of the Association for Information Science & Technology.