2,295 research outputs found

    Certainty of outlier and boundary points processing in data mining

    Data certainty is an issue in real-world applications, caused by unwanted noise in data. Recently, more attention has been paid to overcoming this problem. We propose a new method based on neutrosophic set (NS) theory to detect boundary and outlier points, which are challenging points for clustering methods. First, a certainty value is assigned to each data point based on the proposed definition in NS. Then, a certainty set is presented for the proposed cost function in the NS domain by considering a set of main clusters and a noise cluster. After that, the proposed cost function is minimized by gradient descent. Data points are clustered based on their membership degrees: outlier points are assigned to the noise cluster, and boundary points are assigned to main clusters with almost the same membership degrees. To show the effectiveness of the proposed method, two types of datasets are used: 3 Scatter-type datasets and 4 UCI-type datasets. Results demonstrate that the proposed cost function handles boundary and outlier points with more accurate membership degrees and outperforms existing state-of-the-art clustering methods. Comment: Conference Paper, 6 page
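
    The abstract does not give the neutrosophic cost function, so the following is only a rough sketch of the general idea of membership degrees with a noise cluster. It uses the classical fuzzy noise-clustering membership formula with fuzzifier 2, not the paper's method. The `noise_distance` parameter and the function names are assumptions made for illustration.

```python
def memberships(point, centers, noise_distance):
    """Return (cluster_degrees, noise_degree) for one point.

    Assumption: inverse squared-distance weighting (fuzzifier m = 2),
    with the noise cluster treated as lying at a fixed distance
    `noise_distance` from every point.
    """
    sq = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    hits = sq.count(0.0)
    if hits:  # the point coincides with one or more centers
        return [1.0 / hits if d == 0.0 else 0.0 for d in sq], 0.0
    weights = [1.0 / d for d in sq]
    noise_weight = 1.0 / noise_distance ** 2
    total = sum(weights) + noise_weight
    return [w / total for w in weights], noise_weight / total


def label(point, centers, noise_distance):
    """Index of the best main cluster, or "noise" if the noise degree is larger."""
    degrees, noise = memberships(point, centers, noise_distance)
    best = max(range(len(degrees)), key=degrees.__getitem__)
    return "noise" if noise > degrees[best] else best
```

    With centers at 0 and 10 on a line, a point at 5 receives equal degrees for both main clusters (a boundary point), while a point at 100 is dominated by the noise degree (an outlier).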

    Closed-loop Bayesian Semantic Data Fusion for Collaborative Human-Autonomy Target Search

    In search applications, autonomous unmanned vehicles must be able to efficiently reacquire and localize mobile targets that can remain out of view for long periods of time in large spaces. As such, all available information sources must be actively leveraged, including imprecise but readily available semantic observations provided by humans. To achieve this, this work develops and validates a novel collaborative human-machine sensing solution for dynamic target search. Our approach uses continuous partially observable Markov decision process (CPOMDP) planning to generate vehicle trajectories that optimally exploit imperfect detection data from onboard sensors, as well as semantic natural language observations that can be specifically requested from human sensors. The key innovation is a scalable hierarchical Gaussian mixture model formulation for efficiently solving CPOMDPs with semantic observations in continuous dynamic state spaces. The approach is demonstrated and validated with a real human-robot team engaged in dynamic indoor target search and capture scenarios on a custom testbed. Comment: Final version accepted and submitted to 2018 FUSION Conference (Cambridge, UK, July 2018

    Context for Ubiquitous Data Management

    In response to the advance of ubiquitous computing technologies, we believe that for computer systems to be ubiquitous, they must be context-aware. In this paper, we address the impact of context-awareness on ubiquitous data management. To do this, we overview different characteristics of context in order to develop a clear understanding of context, as well as its implications and requirements for context-aware data management. References to recent research activities and applicable techniques are also provided

    A framework for distributed managing uncertain data in RFID traceability networks

    The ability to track and trace individual items, especially through large-scale and distributed networks, is the key to realizing many important business applications such as supply chain management, asset tracking, and counterfeit detection. Networked RFID (radio frequency identification), which uses the Internet to connect otherwise isolated RFID systems and software, is an emerging technology to support traceability applications. Despite its promising benefits, there remain many challenges to be overcome before these benefits can be realized. One significant challenge centers on dealing with the uncertainty of raw RFID data. In this paper, we propose a novel framework to effectively manage the uncertainty of RFID data in large-scale traceability networks. The framework consists of a global object tracking model and a local RFID data cleaning model. In particular, we propose a Markov-based model for tracking objects globally and a particle-filter-based approach for processing noisy, low-level RFID data locally. Our implementation validates the proposed approach and the experimental results show its effectiveness. Jiangang Ma, Quan Z. Sheng, Damith Ranasinghe, Jen Min Chuah and Yanbo W
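
    To illustrate what a Markov-based global tracking model might look like, here is a minimal sketch that propagates a probability distribution over an object's location through a transition table. The location names and transition probabilities are hypothetical; the abstract does not specify the authors' model, and the particle filter used for local data cleaning is not shown.

```python
def step(distribution, transitions):
    """Advance a location distribution by one Markov step.

    `transitions[src][dst]` is the assumed probability of moving from
    `src` to `dst` between two consecutive observations.
    """
    nxt = {location: 0.0 for location in transitions}
    for src, p in distribution.items():
        for dst, q in transitions[src].items():
            nxt[dst] += p * q
    return nxt


def predict(distribution, transitions, steps):
    """Apply `step` repeatedly to predict the distribution `steps` ahead."""
    for _ in range(steps):
        distribution = step(distribution, transitions)
    return distribution


# Hypothetical supply chain with three sites; values are illustrative only.
TRANSITIONS = {
    "factory": {"factory": 0.5, "warehouse": 0.5},
    "warehouse": {"warehouse": 0.5, "store": 0.5},
    "store": {"store": 1.0},
}
```

    Starting from an item known to be at the factory, `predict({"factory": 1.0}, TRANSITIONS, 2)` spreads the probability across all three sites, which is the kind of estimate a tracking query could use when no fresh reading is available.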

    The design and implementation of fuzzy query processing on sensor networks

    Sensor nodes and Wireless Sensor Networks (WSN) enable observation of the physical world at unprecedented levels of granularity. A growing number of environmental monitoring applications are being designed to leverage the data collection features of WSN, increasing the need for efficient data management techniques and for comparative analysis of those techniques. My research leverages aspects of fuzzy databases, specifically fuzzy data representation and fuzzy (flexible) queries, to improve upon the efficiency of existing data management techniques by exploiting the inherent uncertainty of the data collected by WSN. Herein I present my research contributions. I provide a classification of WSN middleware to illustrate varying approaches to data management for WSN. I identify a need to better handle the uncertainty inherent in data collected from physical environments, and to take advantage of the imprecision of the data to increase the efficiency of WSN by requiring less information to be transmitted to adequately answer queries posed by WSN monitoring applications. In this dissertation, I present a novel approach to querying WSN, in which semantic knowledge about sensor attributes is represented as fuzzy terms. I present an enhanced simulation environment that supports more flexible and realistic analysis by using cellular automata models to separately model the deployed WSN and the underlying physical environment. Simulation experiments are used to evaluate my fuzzy query approach for environmental monitoring applications. My analysis shows that using fuzzy queries improves upon other data management techniques by reducing the amount of data that needs to be collected to accurately satisfy application requests. This reduction in data transmission results in increased battery life within sensors, an important measure of cost and performance for WSN applications
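
    As a rough illustration of representing a sensor attribute with fuzzy terms, the sketch below uses trapezoidal membership functions and returns only the nodes whose reading matches a term to at least a given degree. The term boundaries, node names, and the `fuzzy_query` helper are assumptions for illustration, not the dissertation's implementation.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical fuzzy terms for a temperature attribute, in degrees Celsius.
TEMPERATURE_TERMS = {
    "cold": (-40.0, -40.0, 5.0, 15.0),
    "warm": (5.0, 15.0, 22.0, 28.0),
    "hot": (22.0, 28.0, 60.0, 60.0),
}


def fuzzy_query(readings, term, threshold):
    """Return {node: degree} for nodes matching `term` with degree >= threshold."""
    shape = TEMPERATURE_TERMS[term]
    matches = {}
    for node, value in readings.items():
        degree = trapezoid(value, *shape)
        if degree >= threshold:
            matches[node] = degree
    return matches
```

    In a deployment, the threshold test could run on the node itself so that readings below the threshold are never transmitted, which is where the savings in data transmission described above would come from.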

    Handling location uncertainty in probabilistic location-dependent queries

    Location-based services have motivated intensive research in the field of mobile computing, and particularly on location-dependent queries. Existing approaches usually assume that the location data are expressed at a fine geographic precision (physical coordinates such as GPS). However, many positioning mechanisms are subject to an inherent imprecision (e.g., the cell-id mechanism used in cellular networks can only determine the cell where a certain moving object is located). Moreover, even a GPS location can be subject to an error or be obfuscated for privacy reasons. Thus, moving objects can be considered to be associated not with an exact location, but with an uncertainty area where they can be located. In this paper, we analyze the problem introduced by the imprecision of the location data available in the data sources by modeling them using uncertainty areas. To do so, we propose to use a higher-level representation of locations which includes uncertainty, formalizing the concept of uncertainty location granule. This allows us to consider probabilistic location-dependent queries, among which we will focus on probabilistic inside (range) constraints. The adopted model allows us to develop a systematic and efficient approach for processing this kind of query. An experimental evaluation shows that these probabilistic queries can be supported efficiently
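
    A probabilistic inside (range) constraint can be sketched as follows, under two simplifying assumptions that are not taken from the paper: uncertainty areas and query ranges are axis-aligned rectangles, and an object is uniformly distributed over its uncertainty area. The probability of being inside the range is then the overlapping area divided by the uncertainty area. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self):
        # Degenerate or inverted rectangles have zero area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)

    def intersection(self, other):
        return Rect(
            max(self.x_min, other.x_min),
            max(self.y_min, other.y_min),
            min(self.x_max, other.x_max),
            min(self.y_max, other.y_max),
        )


def inside_probability(uncertainty, query_range):
    """P(object in query_range), assuming a uniform distribution over `uncertainty`."""
    total = uncertainty.area()
    if total == 0.0:
        raise ValueError("uncertainty area must have positive area")
    return uncertainty.intersection(query_range).area() / total


def probabilistic_inside(objects, query_range, threshold):
    """Return {name: probability} for objects inside the range with probability >= threshold."""
    result = {}
    for name, uncertainty in objects.items():
        p = inside_probability(uncertainty, query_range)
        if p >= threshold:
            result[name] = p
    return result
```

    For example, an object whose uncertainty area is the square from (0, 0) to (2, 2) lies inside the range from (1, 0) to (3, 2) with probability 0.5 under these assumptions.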

    Probabilistic Skyline Queries over Uncertain Moving Objects

    Data uncertainty inherently exists in a large number of applications due to factors such as limitations of measuring equipment, update delay, and network bandwidth. Recently, modeling and querying uncertain data have attracted considerable attention from the database community. However, how to perform advanced analysis on uncertain data remains an interesting question. In this paper, we focus on the execution of skyline computation over uncertain moving objects. We propose a novel probabilistic skyline model in which an uncertain object has a probability of being in the skyline at a certain time point; a p-t-skyline therefore contains those moving objects whose skyline probabilities are at least p at time point t. Computing the probabilistic skyline over a large number of uncertain moving objects is a daunting task in practice. In order to efficiently compute the probabilistic skyline query, we propose a discrete-and-conquer strategy, which follows the sampling-bounding-pruning-refining procedure. To further reduce the skyline computation cost, we propose an enhanced framework that is based on a multi-dimensional indexing structure combined with the discrete-and-conquer strategy. Through extensive experiments with synthetic datasets, we show that the framework can efficiently support skyline queries over uncertain moving objects and is scalable on large data sets
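
    The skyline probability at a single time point can be sketched by brute force, as below. This is not the discrete-and-conquer strategy or the indexing framework from the abstract. It assumes each object is represented by equally likely sampled positions at time t, that objects are independent, and that smaller values are better in every dimension.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every dimension and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(name, objects):
    """Probability that object `name` is in the skyline.

    `objects` maps each object name to a list of equally likely sample
    points (an assumed discretization of its uncertain position at time t).
    """
    samples = objects[name]
    total = 0.0
    for sample in samples:
        p_not_dominated = 1.0
        for other, other_samples in objects.items():
            if other == name:
                continue
            dominating = sum(dominates(o, sample) for o in other_samples)
            p_not_dominated *= 1.0 - dominating / len(other_samples)
        total += p_not_dominated / len(samples)
    return total


def p_skyline(objects, p):
    """Names of objects whose skyline probability is at least `p`."""
    return sorted(n for n in objects if skyline_probability(n, objects) >= p)
```

    The cost grows with the product of the sample counts of every pair of objects, which suggests why bounding and pruning steps matter when the number of objects is large.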