An IR-based Approach Towards Automated Integration of Geo-spatial Datasets in Map-based Software Systems
Data is arguably the most valuable asset of the modern world. In this era,
the success of any data-intensive solution relies on the quality of data that
drives it. Among the vast amounts of data that are captured, managed, and analyzed
every day, geospatial data is one of the most interesting classes: it
holds geographical information about real-world phenomena and can be visualized as
digital maps. Geospatial data is the source of many enterprise solutions that
provide local information and insights. In order to increase the quality of
such solutions, companies continuously aggregate geospatial datasets from
various sources. However, the lack of a global standard model for geospatial
datasets makes merging and integrating them difficult and
error-prone. Traditionally, domain experts manually validate the data
integration process, checking merged new data sources and/or new versions of existing
data for conflicts and other requirement violations. However, this approach
does not scale and hinders rapid releases when dealing with
frequently changing big datasets. Thus, more automated approaches that require limited
interaction with domain experts are needed. As a first step to tackle this
problem, in this paper, we leverage Information Retrieval (IR) and geospatial
search techniques to propose a systematic and automated conflict identification
approach. To evaluate our approach, we conduct a case study in which we measure
its accuracy in several real-world scenarios and interview
software developers at Localintel Inc. (our industry partner) to get their
feedback.
Comment: ESEC/FSE 2019 - Industry trac