5 research outputs found
Towards More Usable Dataset Search: From Query Characterization to Snippet Generation
Reusing published datasets on the Web is of great interest to researchers and
developers. Their data needs may be met by submitting queries to a dataset
search engine to retrieve relevant datasets. In this ongoing work towards
developing a more usable dataset search engine, we characterize real data needs
by annotating the semantics of 1,947 queries using a novel fine-grained scheme,
to provide implications for enhancing dataset search. Based on the findings, we
present a query-centered framework for dataset search, and explore the
implementation of snippet generation and evaluate it with a preliminary user
study.
Comment: 4 pages, The 28th ACM International Conference on Information and
Knowledge Management (CIKM 2019)
DING! Dataset Ranking using Formal Descriptions
Considering that thousands, if not millions, of linked datasets will be published soon, we motivate in this paper the need for an efficient and effective way to rank interlinked datasets based on formal descriptions of their characteristics. We propose DING (from Dataset RankING) as a new approach to rank linked datasets using information provided by the voiD vocabulary. DING is a domain-independent link analysis that measures the popularity of datasets by considering the cardinality and types of the relationships. We also propose a methodology to automatically assign weights to link types. We evaluate the proposed ranking algorithm against other well-known ones, such as PageRank or HITS, using synthetic voiD descriptions. Early results show that DING performs better than the standard Web ranking algorithms.