QUEST: QUery-driven Exploration of Semistructured Data with ConflicTs and partial knowledge, CleanDB
Abstract
An important reality when integrating scientific data is the fact that data may often be “missing”, partially specified, or conflicting. Therefore, in this paper, we present an assertion-based data model that captures both value-based and structure-based “nulls” in data. We also introduce the QUEST system, which leverages the proposed model for Query-driven Exploration of Semistructured data with conflicTs and partial knowledge. Our approach to integration lies in enabling researchers to observe and resolve conflicts in the data by considering the context provided by the data requirements of a given research question. In particular, we discuss how path compatibility can be leveraged, within the context of a query, to develop a high-level understanding of conflicts and nulls in data.
1 Motivation and Related Work
Through a joint effort of archaeologists and computer scientists, we are developing an integrated framework of knowledge-based collaborative tools that will provide the foundation for a shared information infrastructure for archaeology and contribute substantially to a shared knowledge infrastructure of science [21]. Today, the incapacity to integrate data across projects cripples archaeologists’ and other scientists’ efforts to recognize phenomena operating on large spatio-temporal scales and to conduct crucial comparative studies [20, 21]. A major challenge with integration of data is that the meaning of an archaeological observation is rarely self-evident