
    Inferring Anomalies from Data using Bayesian Networks

    Existing studies in data mining have largely focused on the design of measures and algorithms to identify outliers in large, high-dimensional categorical and numeric databases. However, little attention has been paid to the interestingness of the reported outliers. One way to ascertain the interestingness and usefulness of a reported outlier is to make use of domain knowledge. In this thesis, we present measures to discover outliers based on background knowledge represented by a Bayesian network. Using causal relationships between attributes encoded in the Bayesian framework, we demonstrate that meaningful outliers, i.e., outliers that encode important or new information, are those that violate the causal relationships encoded in the model. Depending on the nature of the data, several approaches are proposed to identify and explain anomalies using Bayesian knowledge. Outliers are often identified as data points that are "rare", "isolated", or "far away from their nearest neighbors". We show that these characteristics may not accurately describe interesting outliers. Through a critical analysis of several existing outlier detection techniques, we show why there is a mismatch between outliers as entities described by these characteristics and "real" outliers as identified using the Bayesian approach. We show that the Bayesian approaches presented in this thesis achieve better accuracy in mining genuine outliers while keeping a low false positive rate compared with traditional outlier detection techniques.
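The idea that an interesting outlier is a record that violates an encoded dependency, rather than one that is merely rare, can be illustrated with a toy example. The sketch below is not the thesis's measure: the two-node network (Rain -> WetGrass), its probabilities, the threshold, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical two-node Bayesian network: Rain -> WetGrass.
# All probabilities are made up for illustration.
P_RAIN = {True: 0.2, False: 0.8}
P_WET_GIVEN_RAIN = {
    True: {True: 0.9, False: 0.1},     # P(wet | rain)
    False: {True: 0.05, False: 0.95},  # P(wet | no rain)
}


def joint_probability(rain, wet):
    """P(rain, wet) under the network; measures how rare a record is."""
    return P_RAIN[rain] * P_WET_GIVEN_RAIN[rain][wet]


def conditional_score(rain, wet):
    """P(wet | rain); a low value means the record contradicts the edge."""
    return P_WET_GIVEN_RAIN[rain][wet]


def flag_outliers(records, threshold=0.15):
    """Return records whose child value is unlikely given its parent."""
    return [r for r in records if conditional_score(*r) < threshold]


records = [(True, True), (False, False), (True, False), (False, True)]
outliers = flag_outliers(records)
print(outliers)
```

In this toy setting, the record (rain, wet grass) is fairly uncommon under the joint distribution but agrees with the dependency, so it is not flagged; only the two records that contradict the Rain -> WetGrass edge are.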

    Communication-aware motion planning in mobile networks

    Over the past few years, considerable progress has been made in the area of networked robotic systems and mobile sensor networks. The vision of a mobile sensor network cooperatively learning and adapting in harsh unknown environments to achieve a common goal is closer than ever. In addition to sensing, communication plays a key role in the overall performance of a mobile network, as nodes need to cooperate to achieve their tasks and thus have to communicate vital information in environments that are typically challenging for communication. Therefore, in order to realize the full potential of such networks, an integrative approach to sensing (information gathering), communication (information exchange), and motion planning is needed, such that each mobile sensor considers the impact of its motion decisions on both sensing and communication, and optimizes its trajectory accordingly. This is the main motivation for this dissertation. This dissertation focuses on communication-aware motion planning of mobile networks in the presence of realistic communication channels that experience path loss, shadowing and multipath fading. This is a challenging multi-disciplinary task. It requires an assessment of wireless link qualities at places that are not yet visited by the mobile sensors, as well as a proper co-optimization of sensing, communication and navigation objectives, such that each mobile sensor chooses a trajectory that provides the best balance between its sensing and communication while satisfying the constraints on its connectivity, motion and energy consumption. While some trajectories allow the mobile sensors to sense efficiently, they may not result in good communication. On the other hand, trajectories that optimize communication may result in poor sensing. The main contribution of this dissertation is then to address these challenges by proposing a new paradigm for communication-aware motion planning in mobile networks.
We consider three examples from the networked robotics and mobile sensor network literature: target tracking, surveillance and dynamic coverage. For these examples, we show how probabilistic assessment of the channel can be used to integrate sensing, communication and navigation objectives when planning the motion, in order to guarantee satisfactory performance of the network in realistic communication settings. Specifically, we characterize the performance of the proposed framework mathematically and unveil new and considerably more efficient system behaviors. Finally, since multipath fading cannot be assessed, proper strategies are needed to increase the robustness of the network to multipath fading and other modeling/channel assessment errors. We further devise such robustness strategies in the context of our communication-aware surveillance scenario. Overall, our results show the superior performance of the proposed motion planning approaches in realistic fading environments and provide an in-depth understanding of the underlying design trade-off space.

    Automatic Bayesian Density Analysis

    Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for exploratory data analysis are usually not flexible enough to deal with the uncertainty inherent to real-world data: they are often restricted to fixed latent interaction models and homogeneous likelihoods; they are sensitive to missing, corrupt and anomalous data; moreover, their expressiveness generally comes at the price of intractable inference. As a result, supervision from statisticians is usually needed to find the right model for the data. However, since domain experts are not necessarily also experts in statistics, we propose Automatic Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible at large. Specifically, ABDA allows for automatic and efficient missing value estimation, statistical data type and likelihood discovery, anomaly detection and dependency structure mining, on top of providing accurate density estimation. Extensive empirical evidence shows that ABDA is a suitable tool for automatic exploratory analysis of mixed continuous and discrete tabular data. Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

    Probabilistic Multi-Dimensional Classification

    Multi-dimensional classification (MDC) can be employed in a range of applications where one needs to predict multiple class variables for each given instance. Many existing MDC methods suffer from at least one of the following: inaccuracy, poor scalability, applicability limited to certain types of data, difficulty of interpretation, or lack of probabilistic (uncertainty) estimates. This paper is an attempt to address all these disadvantages simultaneously. We propose a formal framework for probabilistic MDC in which learning an optimal multi-dimensional classifier can be decomposed, without loss of generality, into learning a set of (smaller) single-variable multi-class probabilistic classifiers and a directed acyclic graph. Current and future developments of both probabilistic classification and graphical model learning can directly enhance our framework, which is flexible and provably optimal. A collection of experiments is conducted to highlight the usefulness of this MDC framework. Comment: Accepted for the 39th Conference on Uncertainty in Artificial Intelligence (UAI 2023)
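One common way to read a decomposition of this kind is the standard Bayesian-network factorization of the joint class distribution. The notation below is an assumption for illustration and is not taken from the paper: $x$ denotes the features of an instance, $y_1, \dots, y_d$ the class variables, and $\mathrm{pa}(y_j)$ the parents of $y_j$ in the directed acyclic graph.

```latex
P(y_1, \dots, y_d \mid x) \;=\; \prod_{j=1}^{d} P\bigl(y_j \mid \mathrm{pa}(y_j),\, x\bigr)
```

Under this reading, each factor is a single-variable multi-class probabilistic classifier, and the graph determines which other class variables each classifier conditions on.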