418 research outputs found

    Sequential anomaly detection in the presence of noise and limited feedback

    This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: (1) filtering, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations, and (2) hedging, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models that can be used in a wide variety of contexts and applications, and that yield methods achieving sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over high-dimensional binary vectors, as well as on the Enron email dataset.
    Comment: 19 pages, 12 PDF figures; final version to be published in IEEE Transactions on Information Theory
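The hedging step above can be sketched as a simple feedback-driven threshold update. This is a minimal illustration, not the paper's primal-dual algorithm: the parameter names, the learning rate `eta`, and the miss/false-alarm update rule are all assumptions chosen for clarity.

```python
def hedge_thresholds(beliefs, labels, eta=0.1, tau0=0.5):
    """Sketch of the hedging step: flag measurement t as anomalous when its
    belief falls below a threshold tau, then nudge tau using end-user
    feedback. eta and tau0 are illustrative, not values from the paper.

    beliefs : per-round likelihoods in (0, 1) produced by the filtering step
    labels  : feedback; True if round t really was an anomaly
    """
    tau = tau0
    flags = []
    for p, is_anomaly in zip(beliefs, labels):
        flag = p < tau            # hedge: low belief -> potential anomaly
        flags.append(flag)
        # feedback-driven update: raise tau after a miss, lower it after a false alarm
        if is_anomaly and not flag:
            tau = min(1.0, tau + eta)
        elif flag and not is_anomaly:
            tau = max(0.0, tau - eta)
    return flags, tau
```

A correctly flagged round leaves the threshold untouched; only disagreements between the flag and the feedback move it, which mirrors the mistake-bound framing of the analysis.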

    Discrete Denoising with Shifts

    We introduce S-DUDE, a new algorithm for denoising DMC-corrupted data. The algorithm, which generalizes the recently introduced DUDE (Discrete Universal DEnoiser) of Weissman et al., aims to compete with a genie that has access, in addition to the noisy data, also to the underlying clean data, and can choose to switch, up to m times, between sliding-window denoisers in a way that minimizes the overall loss. When the underlying data form an individual sequence, we show that the S-DUDE performs essentially as well as this genie, provided that m is sub-linear in the size of the data. When the clean data is emitted by a piecewise stationary process, we show that the S-DUDE achieves the optimum distribution-dependent performance, provided that the same sub-linearity condition is imposed on the number of switches. To further substantiate the universal optimality of the S-DUDE, we show that when the number of switches is allowed to grow linearly with the size of the data, any (sequence of) scheme(s) fails to compete in the above senses. Using dynamic programming, we derive an efficient implementation of the S-DUDE, whose complexity (time and memory) grows only linearly with the data size and the number of switches m. Preliminary experimental results are presented, suggesting that S-DUDE has the capacity to significantly improve on the performance attained by the original DUDE in applications where the nature of the data changes abruptly in time (or space), as is often the case in practice.
    Comment: 30 pages, 3 figures, submitted to IEEE Transactions on Information Theory
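A toy sliding-window denoiser conveys the basic object that both DUDE and S-DUDE switch between. The sketch below simply replaces each symbol by the majority symbol seen in the same two-sided context; the real DUDE rule also accounts for the channel's transition matrix, and S-DUDE additionally switches among such rules, neither of which is shown here.

```python
from collections import Counter, defaultdict

def sliding_window_denoise(noisy, k=1):
    """Illustrative sliding-window denoiser for a binary sequence: count
    symbol occurrences per two-sided context of radius k (first pass), then
    replace each symbol by the majority symbol observed in its context
    (second pass). Boundary symbols are left unchanged.
    """
    n = len(noisy)
    counts = defaultdict(Counter)
    for i in range(k, n - k):
        ctx = (tuple(noisy[i - k:i]), tuple(noisy[i + 1:i + 1 + k]))
        counts[ctx][noisy[i]] += 1
    out = list(noisy)
    for i in range(k, n - k):
        ctx = (tuple(noisy[i - k:i]), tuple(noisy[i + 1:i + 1 + k]))
        out[i] = counts[ctx].most_common(1)[0][0]
    return out
```

An isolated flipped bit inside a constant run is corrected because its context is dominated by the unflipped symbol, while genuinely structured patterns (whose contexts consistently predict them) pass through unchanged.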

    Evolutionary computing and particle filtering: a hardware-based motion estimation system

    Particle filters are a highly powerful estimation tool, especially when dealing with non-linear, non-Gaussian systems. However, traditional approaches present several limitations that significantly reduce their performance. Evolutionary algorithms, and more specifically their optimization capabilities, may be used to overcome these particle-filtering weaknesses. In this paper, a novel FPGA-based particle filter that takes advantage of evolutionary computation to estimate motion patterns is presented. The evolutionary algorithm, which has been included inside the resampling stage, mitigates the well-known sample-impoverishment phenomenon, very common in particle-filtering systems. In addition, a hybrid mutation technique using two different mutation operators, each with a specific purpose, is proposed to enhance estimation results and make the system more robust. Moreover, implementing the proposed Evolutionary Particle Filter as a hardware accelerator has led to faster processing times than different software implementations of the same algorithm.
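The resampling-plus-mutation idea can be sketched in a few lines. This is a software illustration of the general technique (systematic resampling followed by a two-operator hybrid mutation), not the paper's FPGA design; the operator probabilities and mutation scales are assumptions.

```python
import random

def evolutionary_resample(particles, weights, p_small=0.8,
                          s_small=0.05, s_large=1.0, rng=None):
    """Sketch of a resampling stage with hybrid mutation: after systematic
    resampling, each particle is perturbed by one of two Gaussian mutation
    operators -- a small one for local refinement and a large one to restore
    diversity and counter sample impoverishment. All scales are illustrative.
    """
    rng = rng or random.Random(0)
    n = len(particles)
    total = sum(weights)
    # systematic resampling: one uniform draw, then evenly spaced targets
    step = total / n
    u = rng.uniform(0.0, step)
    resampled, cum, i = [], weights[0], 0
    for j in range(n):
        target = u + j * step
        while cum < target:
            i += 1
            cum += weights[i]
        resampled.append(particles[i])
    # hybrid mutation: pick an operator per particle
    mutated = []
    for x in resampled:
        scale = s_small if rng.random() < p_small else s_large
        mutated.append(x + rng.gauss(0.0, scale))
    return mutated
```

Without the mutation step, repeated resampling collapses the population onto a few identical copies; the large-scale operator plays the role of the diversity-restoring mutation described above.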

    Feedback Coding for Efficient Interactive Machine Learning

    When training machine learning systems, the most basic scenario consists of the learning algorithm operating on a fixed batch of data, provided in its entirety before training. However, in many applications one can choose which data points are selected for labeling, and this choice can be made "on the fly" after each selected data point is labeled. In such interactive machine learning (IML) systems, it is possible to train a model with far fewer labels than would be required with random sampling. In this thesis, we identify and model query structures in IML to develop direct information maximization solutions as well as approximations that allow for computationally efficient query selection. To do so, we frame IML as a feedback communications problem and directly apply principles and tools from coding theory to design and analyze new interaction selection algorithms. First, we directly apply a recently developed feedback coding scheme to sequential human-computer interaction systems. We then identify simplifying query structures to develop approximate methods for efficient, informative query selection in interactive ordinal embedding construction and preference learning systems. Finally, we combine the direct application of feedback coding with approximate information maximization to design and analyze a general active learning algorithm, which we study in detail for logistic regression.
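A common, computationally cheap approximation to information-maximizing query selection is to score each unlabeled point by the entropy of the model's predictive distribution. The sketch below does this for logistic regression; it illustrates the general idea of informative query selection discussed above, not the thesis's feedback-coding selection rule.

```python
import math

def most_informative_query(weights, pool):
    """Greedy query selection for logistic regression: score each unlabeled
    point by the entropy of the predicted label distribution and return the
    index of the maximizer. Points where the model is most uncertain
    (probability near 0.5) carry the most label information.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def entropy(p):
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

    scores = []
    for x in pool:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        scores.append(entropy(p))
    return max(range(len(pool)), key=scores.__getitem__)
```

With weight vector `[1.0]`, the candidate `[0.0]` (predicted probability 0.5) beats confidently classified candidates such as `[5.0]` or `[-5.0]`, so it is selected for labeling first.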

    Adaptive Channel and Source Coding Using Approximate Inference

    Channel coding and source coding are two important problems in communications. Although both channel coding and source coding (especially distributed source coding (DSC)) can achieve their ultimate performance given perfect knowledge of the channel noise and the source correlation, respectively, such information may not always be available at the decoder side, for example because of the time-varying characteristics of some communication systems and of the sources themselves. In this dissertation, I focus on online channel-noise estimation and correlation estimation using both stochastic and deterministic approximate inference on factor graphs.
    In channel coding, belief propagation (BP) is a powerful algorithm for decoding low-density parity-check (LDPC) codes over additive white Gaussian noise (AWGN) channels. However, the traditional BP algorithm cannot adapt efficiently to statistical changes of the SNR in an AWGN channel. Two common workarounds in approximate inference are stochastic methods (e.g., particle filtering (PF)) and deterministic methods (e.g., expectation propagation (EP)). Generally, deterministic methods are much faster than stochastic methods, while stochastic methods are more flexible and can handle arbitrary distributions. In this dissertation, I propose two adaptive LDPC decoding schemes that perform online estimation of time-varying channel state information (in particular, the signal-to-noise ratio (SNR)) at the bit level by incorporating the PF and EP algorithms. Experimental results comparing the PF-based and EP-based approaches show that the EP-based approach achieves comparable estimation accuracy with less computational complexity than the PF-based method for both stationary and time-varying SNR, while simultaneously enhancing the BP decoding performance. Moreover, the EP estimator converges very quickly, and the additional computational overhead of the proposed decoder is less than 10% of that of the standard BP decoder.
    Given the close relationship between source coding and channel coding, the proposed ideas are then extended to source-correlation estimation. First, I study the correlation-estimation problem in the lossless DSC setup, considering both asymmetric and non-asymmetric Slepian-Wolf (SW) coding of two correlated binary sources. The PF-based and EP-based approaches are extended to handle the correlation between the two binary sources, modeled as a virtual binary symmetric channel (BSC) with a time-varying crossover probability. In addition, to handle the correlation-estimation problem in Wyner-Ziv (WZ) coding, a lossy DSC setup, I design a joint bit-plane model by which the PF-based approach can track the correlation between non-binary sources. Experimental results show that the proposed correlation-estimation approaches significantly improve the compression performance of DSC.
    Finally, owing to its ultra-low encoding complexity, DSC is a promising technique for tasks in which the encoder has only limited computing and communication power, e.g., space imaging systems. I therefore consider a real-world application of the proposed correlation-estimation scheme to onboard low-complexity compression of solar stereo images, since such solutions are essential to reduce onboard storage, processing, and communication resources. I propose an adaptive distributed compression solution using PF that tracks the correlation, as well as performing disparity estimation, at the decoder side. The proposed algorithm is tested on stereo solar images captured by the twin-satellite system of NASA's STEREO project. The experimental results show a significant PSNR improvement over traditional separate bit-plane decoding without dynamic correlation and disparity estimation.
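The correlation-tracking idea can be sketched with a basic bootstrap particle filter over the BSC crossover probability. This is a generic illustration of the setup described above, not the dissertation's decoder-integrated scheme: the particle count, jitter scale (standing in for a drift model), and resampling choice are all assumptions.

```python
import random

def track_crossover(disagreements, n_particles=500, jitter=0.02, seed=0):
    """Particle-filter tracking of a time-varying BSC crossover probability.
    Each observation is 1 when the two correlated binary sources disagree.
    Each particle is a candidate crossover probability p in [0, 0.5]; it is
    reweighted by the Bernoulli likelihood of the observation, resampled,
    and jittered to follow slow drift. Returns the per-step posterior mean.
    """
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 0.5) for _ in range(n_particles)]
    estimates = []
    for d in disagreements:
        # Bernoulli likelihood: P(disagree) = p, P(agree) = 1 - p
        weights = [p if d else (1.0 - p) for p in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(p * w for p, w in zip(particles, weights)))
        # multinomial resampling, then Gaussian jitter clipped to [0, 0.5]
        particles = rng.choices(particles, weights=weights, k=n_particles)
        particles = [min(0.5, max(0.0, p + rng.gauss(0.0, jitter)))
                     for p in particles]
    return estimates
```

Feeding in a long run of agreements drives the estimate toward a small crossover probability, i.e., the filter learns that the two sources are highly correlated; the same machinery follows the probability when it drifts over time.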