10,646 research outputs found

    The model of an anomaly detector for HiLumi LHC magnets based on Recurrent Neural Networks and adaptive quantization

    Full text link
    This paper focuses on an examination of an applicability of Recurrent Neural Network models for detecting anomalous behavior of the CERN superconducting magnets. In order to conduct the experiments, the authors designed and implemented an adaptive signal quantization algorithm and a custom GRU-based detector and developed a method for the detector parameters selection. Three different datasets were used for testing the detector. Two artificially generated datasets were used to assess the raw performance of the system whereas the 231 MB dataset composed of the signals acquired from HiLumi magnets was intended for real-life experiments and model training. Several different setups of the developed anomaly detection system were evaluated and compared with state-of-the-art OC-SVM reference model operating on the same data. The OC-SVM model was equipped with a rich set of feature extractors accounting for a range of the input signal properties. It was determined in the course of the experiments that the detector, along with its supporting design methodology, reaches F1 equal or very close to 1 for almost all test sets. Due to the profile of the data, the best_length setup of the detector turned out to perform the best among all five tested configuration schemes of the detection system. The quantization parameters have the biggest impact on the overall performance of the detector with the best values of input/output grid equal to 16 and 8, respectively. The proposed solution of the detection significantly outperformed OC-SVM-based detector in most of the cases, with much more stable performance across all the datasets.Comment: Related to arXiv:1702.0083

    Realtime market microstructure analysis: online Transaction Cost Analysis

    Full text link
    Motivated by the practical challenge in monitoring the performance of a large number of algorithmic trading orders, this paper provides a methodology that leads to automatic discovery of the causes that lie behind a poor trading performance. It also gives theoretical foundations to a generic framework for real-time trading analysis. Academic literature provides different ways to formalize these algorithms and show how optimal they can be from a mean-variance, a stochastic control, an impulse control or a statistical learning viewpoint. This paper is agnostic about the way the algorithm has been built and provides a theoretical formalism to identify in real-time the market conditions that influenced its efficiency or inefficiency. For a given set of characteristics describing the market context, selected by a practitioner, we first show how a set of additional derived explanatory factors, called anomaly detectors, can be created for each market order. We then will present an online methodology to quantify how this extended set of factors, at any given time, predicts which of the orders are underperforming while calculating the predictive power of this explanatory factor set. Armed with this information, which we call influence analysis, we intend to empower the order monitoring user to take appropriate action on any affected orders by re-calibrating the trading algorithms working the order through new parameters, pausing their execution or taking over more direct trading control. Also we intend that use of this method in the post trade analysis of algorithms can be taken advantage of to automatically adjust their trading action.Comment: 33 pages, 12 figure

    A survey of outlier detection methodologies

    Get PDF
    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set and as such to purify the data for processing. The original outlier detection methods were arbitrary but now, principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review

    Automated novelty detection in the WISE survey with one-class support vector machines

    Get PDF
    Wide-angle photometric surveys of previously uncharted sky areas or wavelength regimes will always bring in unexpected sources whose existence and properties cannot be easily predicted from earlier observations: novelties or even anomalies. Such objects can be efficiently sought for with novelty detection algorithms. Here we present an application of such a method, called one-class support vector machines (OCSVM), to search for anomalous patterns among sources preselected from the mid-infrared AllWISE catalogue covering the whole sky. To create a model of expected data we train the algorithm on a set of objects with spectroscopic identifications from the SDSS DR13 database, present also in AllWISE. OCSVM detects as anomalous those sources whose patterns - WISE photometric measurements in this case - are inconsistent with the model. Among the detected anomalies we find artefacts, such as objects with spurious photometry due to blending, but most importantly also real sources of genuine astrophysical interest. Among the latter, OCSVM has identified a sample of heavily reddened AGN/quasar candidates distributed uniformly over the sky and in a large part absent from other WISE-based AGN catalogues. It also allowed us to find a specific group of sources of mixed types, mostly stars and compact galaxies. By combining the semi-supervised OCSVM algorithm with standard classification methods it will be possible to improve the latter by accounting for sources which are not present in the training sample but are otherwise well-represented in the target set. Anomaly detection adds flexibility to automated source separation procedures and helps verify the reliability and representativeness of the training samples. It should be thus considered as an essential step in supervised classification schemes to ensure completeness and purity of produced catalogues.Comment: 14 pages, 15 figure

    Detecting Outliers in Data with Correlated Measures

    Full text link
    Advances in sensor technology have enabled the collection of large-scale datasets. Such datasets can be extremely noisy and often contain a significant amount of outliers that result from sensor malfunction or human operation faults. In order to utilize such data for real-world applications, it is critical to detect outliers so that models built from these datasets will not be skewed by outliers. In this paper, we propose a new outlier detection method that utilizes the correlations in the data (e.g., taxi trip distance vs. trip time). Different from existing outlier detection methods, we build a robust regression model that explicitly models the outliers and detects outliers simultaneously with the model fitting. We validate our approach on real-world datasets against methods specifically designed for each dataset as well as the state of the art outlier detectors. Our outlier detection method achieves better performances, demonstrating the robustness and generality of our method. Last, we report interesting case studies on some outliers that result from atypical events.Comment: 10 page
    corecore