    Coping with Training Contamination in Unsupervised Distributional Anomaly Detection

    Abstract—In previous work [1], we presented several distributional approaches to anomaly detection for a speech activity detector by training a model on purely nominal data and estimating the divergence between it and other input. Here, we reformulate the problem in an unsupervised framework and allow for anomalous contamination of the training data. After noting the instability of Gaussian mixture models (GMMs) in this context, we focus on non-parametric methods using regularly binned histograms. While the performance of the log-likelihood baseline suffered as the amount of contamination was increased, many of the distributional approaches were not affected. We found that the L1 distance, χ² statistic, and information-theoretic divergences consistently outperformed the other methods for a variety of contamination levels and test segment lengths.
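    The histogram-based comparison described in the abstract can be sketched as follows. This is an illustrative example, not the paper's implementation: the bin count, histogram range, and smoothing constant are assumptions, and `histogram_divergences` is a hypothetical helper. It bins a nominal sample and a test segment on a common regular grid, then computes the L1 distance, a symmetric χ² statistic, and the KL divergence between the resulting distributions.

    ```python
    import numpy as np

    def histogram_divergences(nominal, test, bins=20, hist_range=(0.0, 1.0), eps=1e-10):
        """Compare a test segment against nominal data via regularly binned
        histograms. Bin count, range, and smoothing are illustrative choices."""
        p, _ = np.histogram(nominal, bins=bins, range=hist_range)
        q, _ = np.histogram(test, bins=bins, range=hist_range)
        # Normalize counts to probability mass functions; the small eps
        # smooths empty bins so ratios and logs below stay finite.
        p = (p + eps) / (p + eps).sum()
        q = (q + eps) / (q + eps).sum()
        l1 = np.abs(p - q).sum()               # L1 distance
        chi2 = ((p - q) ** 2 / (p + q)).sum()  # symmetric chi-squared statistic
        kl = (q * np.log(q / p)).sum()         # KL divergence D(q || p)
        return {"l1": l1, "chi2": chi2, "kl": kl}
    ```

    In use, a test segment drawn from the nominal distribution should yield divergences near zero, while an anomalous segment yields larger values; a threshold on the chosen divergence then flags anomalies.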