
    Anomaly detection and dynamic decision making for stochastic systems

    Thesis (Ph.D.)--Boston University. This dissertation focuses on two types of problems, both of which concern systems with uncertainties. The first problem is network anomaly detection. We present several stochastic and deterministic methods for anomaly detection in networks whose normal behavior is not time-varying. Our methods cover most of the common techniques in the anomaly detection field. We evaluate all methods in a simulated network that consists of nominal data, three flow-level anomalies, and one packet-level attack. By analyzing the results, we summarize the advantages and disadvantages of each method. As a next step, we propose two robust stochastic anomaly detection methods for networks whose normal behavior is time-varying. We develop a procedure for learning the underlying family of patterns that characterize a time-varying network. This procedure first estimates a large class of patterns from network data and then refines it to select a representative subset; the refinement step is formulated as a set-covering problem solved via integer programming. We then propose two robust methods, one model-free and one model-based, to evaluate whether a sequence of observations is drawn from the learned patterns. Simulation results show that the robust methods have significant advantages over the alternative stationary methods in time-varying networks. The final anomaly detection setting we consider targets the detection of botnets before they launch an attack. Our method analyzes the social graph of the nodes in a network and consists of two stages: (i) network anomaly detection based on large deviations theory and (ii) community detection based on a refined modularity measure. We apply our method to real-world botnet traffic and compare its performance with other methods.

    The second problem considered by this dissertation concerns sequential decision making under uncertainty, which can be modeled by Markov Decision Processes (MDPs). We focus on methods with an actor-critic structure, where the critic estimates the gradient of the overall objective with respect to tunable policy parameters and the actor optimizes the policy with respect to these parameters. Most existing actor-critic methods use Temporal Difference (TD) learning to estimate the gradient and steepest gradient ascent to update the policies. Our first contribution is an actor-critic method that uses a Least Squares Temporal Difference (LSTD) method, which is known to converge faster than TD methods. Our second contribution is a new Newton-like actor-critic method that performs better, especially for ill-conditioned problems. We evaluate our methods on problems motivated by robot motion control.
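
    As a rough illustration of the critic step only (not the dissertation's full actor-critic algorithm), the sketch below computes a standalone LSTD(0) estimate of a linear value function from a single simulated trajectory; the one-hot features, discount factor, ridge term, and toy random-walk environment are illustrative assumptions.

```python
import numpy as np

def lstd_value_estimate(phis, rewards, gamma=0.95, ridge=1e-6):
    """Least Squares Temporal Difference (LSTD(0)) estimate of a linear
    value function V(s) ~ w . phi(s) from one trajectory.

    phis    : array of shape (T+1, k), feature vectors phi(s_0), ..., phi(s_T)
    rewards : array of shape (T,), rewards r_1, ..., r_T
    Returns the weight vector w.
    """
    phis = np.asarray(phis, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    k = phis.shape[1]
    A = ridge * np.eye(k)          # small ridge term keeps A invertible
    b = np.zeros(k)
    for t in range(len(rewards)):
        A += np.outer(phis[t], phis[t] - gamma * phis[t + 1])
        b += phis[t] * rewards[t]
    return np.linalg.solve(A, b)

# Toy usage: a random trajectory over 5 states with one-hot state features.
rng = np.random.default_rng(0)
states = rng.integers(0, 5, size=101)
phis = np.eye(5)[states]
rewards = (states[1:] == 4).astype(float)   # reward for entering state 4
w = lstd_value_estimate(phis, rewards)
print("estimated state values:", np.round(w, 3))
```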

    An Improved Wavelet‐Based Multivariable Fault Detection Scheme

    Data observed from environmental and engineering processes are usually noisy and correlated in time, which makes fault detection more difficult, since noise degrades detection quality. Multiscale representation of data using wavelets is a powerful feature-extraction tool that is well suited to denoising and decorrelating time-series data. In this chapter, we combine the advantages of multiscale partial least squares (MSPLS) modeling with those of the univariate EWMA (exponentially weighted moving average) monitoring chart, which results in an improved fault detection system, especially for detecting small faults in highly correlated, multivariate data. Toward this end, we apply the EWMA chart to the output residuals obtained from the MSPLS model. It is shown, using simulated distillation column data, that a significant improvement in fault detection can be obtained with the proposed methods compared to the conventional partial least squares (PLS)-based Q and EWMA methods and the MSPLS-based Q method.
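
    As a minimal sketch of the monitoring step only, the code below applies a standard univariate EWMA chart with time-varying control limits to a stream of model residuals; it assumes a generic residual sequence (no wavelet decomposition or PLS fitting), and the smoothing parameter, limit width, and injected fault are illustrative.

```python
import numpy as np

def ewma_chart(residuals, lam=0.2, L=3.0, sigma=None):
    """Apply a univariate EWMA monitoring chart to a sequence of model residuals.

    residuals : 1-D array of residuals (roughly zero-mean when in control)
    lam       : EWMA smoothing parameter (0 < lam <= 1)
    L         : control-limit width in standard deviations
    sigma     : residual std; ideally estimated from fault-free training data
    Returns (z, ucl, alarms) where alarms flags samples outside the limits.
    """
    r = np.asarray(residuals, dtype=float)
    if sigma is None:
        sigma = r.std(ddof=1)
    z = np.empty_like(r)
    z[0] = lam * r[0]                       # EWMA recursion, started at target 0
    for t in range(1, len(r)):
        z[t] = lam * r[t] + (1.0 - lam) * z[t - 1]
    t_idx = np.arange(1, len(r) + 1)
    ucl = L * sigma * np.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t_idx)))
    alarms = np.abs(z) > ucl
    return z, ucl, alarms

# Toy usage: in-control residuals, then a small mean shift injected at sample 200.
rng = np.random.default_rng(1)
res = rng.normal(0.0, 1.0, 300)
res[200:] += 1.5                            # simulated small fault
sigma0 = res[:200].std(ddof=1)              # estimate sigma on the fault-free part
_, _, alarms = ewma_chart(res, sigma=sigma0)
print("first alarm at sample:", int(np.argmax(alarms)) if alarms.any() else None)
```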

    Detection and optimization problems with applications in smart cities

    This dissertation proposes solutions to a selected set of detection and optimization problems whose applications focus on transportation systems. The goal is to help build smarter and more efficient transportation systems, and hence smarter cities. Problems with dynamics evolving on two different time-scales are considered. (1) On a fast time-scale, the dissertation considers the problem of detection, especially statistical anomaly detection in real time. From a theoretical perspective and under Markovian assumptions, novel threshold estimators are derived for the widely used Hoeffding test. This results in a test with a much better ability to control false alarms while maintaining a high detection rate. From a practical perspective, the improved test is applied to detecting non-typical traffic jams in the Boston road network using real traffic data reported by the Waze smartphone navigation application. The detection results can alert drivers to reroute so as to avoid the affected areas, and provide the most urgent "targets" for the transportation department and/or emergency services to intervene and remedy the underlying causes of these jams, thus improving transportation systems and contributing to the smart-city agenda.

    (2) On a slower time-scale, the dissertation investigates a host of optimization problems, including estimation and adjustment of Origin-Destination (OD) demand, traffic assignment, recovery of travel cost functions, and joint recovery of travel cost functions and OD demand (the joint problem). Integrating these problems leads to a data-driven predictive model which serves to diagnose, control, and optimize the transportation network. To ensure good accuracy of the predictive model and increase its robustness and consistency, several novel formulations for the travel cost function recovery problem and the joint problem are proposed. A data-driven framework is proposed to evaluate the Price of Anarchy (PoA), a metric assessing the degree of congestion under selfish user-centric routing versus socially optimal system-centric routing. For the case where the PoA is larger than expected, three viable strategies are proposed to reduce it. To demonstrate the effectiveness and efficiency of the proposed approaches, case studies are conducted on three benchmark transportation networks using synthetic data and on an actual road network from Eastern Massachusetts (EMA) using real traffic data. Moreover, to facilitate research in the transportation community, the largest highway subnetwork of EMA has been released as a new benchmark network.
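
    As a toy illustration of the kind of test involved (the classical i.i.d. Hoeffding test, not the dissertation's refined Markovian threshold estimators), the sketch below flags a window of discrete observations when the KL divergence between its empirical distribution and the nominal one exceeds an asymptotic chi-square threshold; the symbol alphabet, window size, and false-alarm level are assumptions.

```python
import numpy as np
from scipy.stats import chi2

def hoeffding_test(obs, nominal_pmf, alpha=0.01):
    """Classical (i.i.d.) Hoeffding test for anomaly detection on a window
    of discrete observations.

    obs         : 1-D array of symbols in {0, ..., K-1} (one monitoring window)
    nominal_pmf : length-K nominal probability mass function
    alpha       : target false-alarm probability
    Returns (is_anomaly, statistic, threshold).
    """
    obs = np.asarray(obs)
    pi = np.asarray(nominal_pmf, dtype=float)
    n, K = len(obs), len(pi)
    mu = np.bincount(obs, minlength=K) / n              # empirical distribution
    mask = mu > 0
    kl = np.sum(mu[mask] * np.log(mu[mask] / pi[mask]))
    # Under the null, 2*n*KL is asymptotically chi-square with K-1 degrees of freedom.
    eta = chi2.ppf(1.0 - alpha, df=K - 1) / (2.0 * n)
    return kl > eta, kl, eta

# Toy usage: nominal traffic over 4 symbol classes vs. a shifted (anomalous) window.
rng = np.random.default_rng(2)
pi = np.array([0.4, 0.3, 0.2, 0.1])
normal_window = rng.choice(4, size=500, p=pi)
anomalous_window = rng.choice(4, size=500, p=[0.25, 0.25, 0.25, 0.25])
print(hoeffding_test(normal_window, pi)[0], hoeffding_test(anomalous_window, pi)[0])
```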

    A Review of Kernel Methods for Feature Extraction in Nonlinear Process Monitoring

    Kernel methods are a class of learning machines for the fast recognition of nonlinear patterns in data. In this paper, the applications of kernel methods for feature extraction in industrial process monitoring are systematically reviewed. First, we describe the reasons for using kernel methods and contextualize them among other machine learning tools. Second, by reviewing a total of 230 papers, this work identifies 12 major issues surrounding the use of kernel methods for nonlinear feature extraction. Each issue is discussed in terms of why it is important and how it has been addressed over the years by many researchers. We also present a breakdown of the commonly used kernel functions, parameter selection routes, and case studies. Lastly, this review provides an outlook on the future of kernel-based process monitoring, which we hope will instigate more advanced yet practical solutions in the process industries.
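
    As one representative example of the kernel-based feature extractors covered by such reviews, the sketch below implements kernel PCA with an RBF kernel in plain NumPy (Gram matrix, feature-space centering, eigendecomposition); the kernel choice, bandwidth, and two-ring toy data are assumptions for illustration, not recommendations from the paper.

```python
import numpy as np

def rbf_kernel_pca(X, n_components=2, gamma=1.0):
    """Kernel PCA with an RBF kernel: a representative kernel-based
    nonlinear feature extractor.

    X : (n_samples, n_features) data matrix
    Returns the projections of the training samples onto the leading
    principal components in feature space.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances and RBF Gram matrix.
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * D2)
    # Center the Gram matrix in feature space.
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    # Eigendecomposition; keep the leading components.
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Scores (projections) of the training samples.
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# Toy usage: two concentric rings that linear PCA cannot separate.
rng = np.random.default_rng(3)
theta = rng.uniform(0, 2 * np.pi, 200)
radius = np.r_[np.full(100, 1.0), np.full(100, 3.0)] + rng.normal(0, 0.05, 200)
X = np.c_[radius * np.cos(theta), radius * np.sin(theta)]
scores = rbf_kernel_pca(X, n_components=2, gamma=2.0)
print(scores.shape)
```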

    Inferring Power Grid Information with Power Line Communications: Review and Insights

    High-frequency signals were widely studied in the last decade to identify grid and channel conditions in power line networks (PLNs). Power line modems (PLMs), which operate on the grid's physical layer, can transmit such signals to infer information about the grid. Hence, power line communication (PLC) is a suitable technology for smart grid (SG) applications, and it is especially well suited to grid monitoring and surveillance. In this paper, we provide several contributions: 1) a classification of PLC-based applications; 2) a taxonomy of the related methodologies; 3) a review of the literature in the area of PLC Grid Information Inference (GII); and 4) insights that can be leveraged to further advance the field. We found research contributions addressing PLMs for three main PLC-GII applications: topology inference, anomaly detection, and physical-layer key generation. In addition, various PLC-GII measurement, processing, and analysis approaches were found to provide distinctive features in terms of measurement resolution, computational complexity, and analysis accuracy. We use the outcome of our review to shed light on the current limitations of the research contributions and to suggest future research directions in this field.

    Distributions based Regression Techniques for Compositional Data

    A systematic study of regression methods for compositional data, which remain relatively rare, is presented in this thesis. We start with the basic machine learning concept of regression and use regression equations to solve classification problems: with partial least squares discriminant analysis (PLS-DA), we address tasks such as spam filtering and intrusion detection. After establishing how regression works, we move on to more complex, distribution-based regression algorithms. We first explore the one-dimensional case, beta regression, which shows how predictions can be made when the outcome is assumed to follow a beta distribution. To further extend this idea, we consider the Dirichlet distribution, which covers the multi-dimensional case; unlike traditional regression, the prediction here is a compositional outcome. Two novel distribution-based regression approaches are proposed for compositional data, namely generalized Dirichlet regression and Beta-Liouville regression. They extend beta regression to the multi-dimensional setting, in the same spirit as Dirichlet regression. The models are learned by maximum likelihood estimation using a Newton-Raphson approach. A performance comparison between the proposed models and other popular solutions is given, and both synthetic and real data sets drawn from challenging applications, such as market share analysis using Google Trends and occupancy estimation in smart buildings, are evaluated to show the merits of the proposed approaches. Our work can act as a tool for product-based companies to estimate how their investments in advertising have translated into market share: Google Trends gives an estimate of a company's popularity, which reflects the effect of advertisements, and this thesis bridges the gap between open-source Google Trends data and market shares.
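
    As a minimal sketch of the general approach (plain Dirichlet regression with a log link rather than the generalized Dirichlet or Beta-Liouville models, and a quasi-Newton optimizer in place of Newton-Raphson), the code below fits a compositional regression by maximum likelihood; the toy data, link function, and optimizer choice are assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def fit_dirichlet_regression(X, Y):
    """Maximum-likelihood Dirichlet regression with a log link:
    alpha_j(x) = exp(x @ beta_j), fitted by numerical optimization.

    X : (n, d) design matrix (include a column of ones for an intercept)
    Y : (n, K) compositional outcomes (rows are positive and sum to 1)
    Returns the (d, K) coefficient matrix beta.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, d = X.shape
    K = Y.shape[1]
    logY = np.log(np.clip(Y, 1e-12, None))      # guard against zeros in the compositions

    def neg_log_likelihood(beta_flat):
        beta = beta_flat.reshape(d, K)
        alpha = np.exp(X @ beta)                 # (n, K), all positive via the log link
        ll = (gammaln(alpha.sum(axis=1))
              - gammaln(alpha).sum(axis=1)
              + ((alpha - 1.0) * logY).sum(axis=1))
        return -ll.sum()

    res = minimize(neg_log_likelihood, np.zeros(d * K), method="L-BFGS-B")
    return res.x.reshape(d, K)

# Toy usage: one covariate shifting mass between three components.
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, size=(200, 1))
X = np.c_[np.ones(200), x]
true_alpha = np.exp(X @ np.array([[1.0, 1.0, 1.0], [2.0, 0.0, -2.0]]))
Y = np.vstack([rng.dirichlet(a) for a in true_alpha])
print(np.round(fit_dirichlet_regression(X, Y), 2))
```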

    Foundational principles for large scale inference: Illustrations through correlation mining

    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics, the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far smaller than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much recent work has focused on understanding the computational complexity of proposed methods for "Big Data." Sample complexity, however, has received relatively less attention, especially in the setting where the sample size n is fixed and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime, where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime, where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime, where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche, but only the latter regime applies to exa-scale data dimensions. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that is of interest. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
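
    As a toy illustration of the sample-starved regime (fixed n, growing p) rather than the paper's framework, the sketch below counts how many pairs of genuinely independent variables exceed a sample-correlation threshold; the sample size, dimensions, and threshold are arbitrary assumptions.

```python
import numpy as np

def count_large_correlations(n, p, rho, seed=0):
    """Count variable pairs whose *sample* correlation exceeds rho when the
    p variables are in fact independent: spurious discoveries in the
    fixed-n, growing-p regime."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    R = np.corrcoef(X, rowvar=False)          # (p, p) sample correlation matrix
    upper = np.triu_indices(p, k=1)           # each pair counted once
    return int(np.sum(np.abs(R[upper]) > rho))

# Fixed sample size, increasing dimension: spurious discoveries grow with p.
for p in (50, 200, 800):
    print(p, count_large_correlations(n=20, p=p, rho=0.6))
```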

    Hyperspectral Remote Sensing Data Analysis and Future Challenges

