7,149 research outputs found

    A Probabilistic Data Fusion Modeling Approach for Extracting True Values from Uncertain and Conflicting Attributes

    Get PDF
    Real-world data obtained from integrating heterogeneous data sources are often multi-valued, uncertain, imprecise, error-prone, outdated, and have different degrees of accuracy and correctness. It is critical to resolve data uncertainty and conflicts to present quality data that reflect actual world values. This task is called data fusion. In this paper, we deal with the problem of data fusion based on probabilistic entity linkage and uncertainty management in conflict data. Data fusion has been widely explored in the research community. However, concerns such as explicit uncertainty management and on-demand data fusion, which can cope with dynamic data sources, have not been studied well. This paper proposes a new probabilistic data fusion modeling approach that attempts to find true data values under conditions of uncertain or conflicted multi-valued attributes. These attributes are generated from the probabilistic linkage and merging alternatives of multi-corresponding entities. Consequently, the paper identifies and formulates several data fusion cases and sample spaces that require further conditional computation using our computational fusion method. The identification is established to fit with a real-world data fusion problem. In the real world, there is always the possibility of heterogeneous data sources, the integration of probabilistic entities, single or multiple truth values for certain attributes, and different combinations of attribute values as alternatives for each generated entity. We validate our probabilistic data fusion approach through mathematical representation based on three data sources with different reliability scores. The validity of the approach was assessed via implementation into our probabilistic integration system to show how it can manage and resolve different cases of data conflicts and inconsistencies. The outcome showed improved accuracy in identifying true values due to the association of constructive evidence

    Data Credence in IoR: Vision and Challenges

    Get PDF
    As the Internet of Things permeates every aspect of human life, assessing the credence or integrity of the data generated by "things" becomes a central exercise for making decisions or in auditing events. In this paper, we present a vision of this exercise that includes the notion of data credence, assessing data credence in an efficient manner, and the use of technologies that are on the horizon for the very large scale Internet of Things

    Data Credence in IoT: Vision and Challenges

    Get PDF
    As the Internet of Things permeates every aspect of human life, assessing the credence or integrity of the data generated by "things" becomes a central exercise for making decisions or in auditing events. In this paper, we present a vision of this exercise that includes the notion of data credence, assessing data credence in an efficient manner, and the use of technologies that are on the horizon for the very large scale Internet of Things

    Node Classification in Uncertain Graphs

    Full text link
    In many real applications that use and analyze networked data, the links in the network graph may be erroneous, or derived from probabilistic techniques. In such cases, the node classification problem can be challenging, since the unreliability of the links may affect the final results of the classification process. If the information about link reliability is not used explicitly, the classification accuracy in the underlying network may be affected adversely. In this paper, we focus on situations that require the analysis of the uncertainty that is present in the graph structure. We study the novel problem of node classification in uncertain graphs, by treating uncertainty as a first-class citizen. We propose two techniques based on a Bayes model and automatic parameter selection, and show that the incorporation of uncertainty in the classification process as a first-class citizen is beneficial. We experimentally evaluate the proposed approach using different real data sets, and study the behavior of the algorithms under different conditions. The results demonstrate the effectiveness and efficiency of our approach

    EFFICIENT INFORMATION INTEGRATION SYSTEM FOR TEMPORAL AND SPATIAL DATA

    Get PDF
    In this dissertation, I develop a novel inconsistency detection and data fusion method for data integration systems. Inconsistent data may lead to incorrect query results and induce unexplainable outcomes. I propose an inconsistency detection method to find out which data items (e.g., temporal or spatial report) have the higher potential to cause data conflicts as well as to estimate a reasonable consistent reported value. My approach is based on representing overlapping data reports as a characteristic linear system. The characteristic linear system can be used to estimate consistent reported values within overlapping time and space intervals. I explore applicability of the proposed approach in different domains. In particular, I perform temporal data fusion with time-overlapping reports using a historical database. I also experiment with spatial data fusion involving space-overlapping reports using simulation of sensor data sets of robots performing search and rescue task. Finally, I apply the proposed approach to combine temporal and spatial fusion and demonstrate that such multidimensional fusion improves inconsistency detection and target value estimation
    corecore