22 research outputs found

    Anomaly Detection for Vision-based Railway Inspection

    Riccardo Gasparini; Stefano Pini; Guido Borghi; Giuseppe Scaglione; Simone Calderara; Eugenio Fedeli; Rita Cucchiara

    Integrated Clustering and Anomaly Detection (INCAD) for Streaming Data (Revised)

    Most current clustering-based anomaly detection methods use scoring schemes and thresholds to classify anomalies. These methods are often tailored to specific data sets with a "known" number of clusters. The paper provides a streaming clustering and anomaly detection algorithm that requires neither strict arbitrary thresholds on the anomaly scores nor knowledge of the number of clusters, while performing probabilistic anomaly detection and clustering simultaneously. This ensures that cluster formation is not affected by the presence of anomalous data, thereby leading to a more reliable definition of "normal vs. abnormal" behavior. The motivations behind developing the INCAD model and the path that leads to the streaming model are discussed. Comment: 13 pages; fixes typos in equations 5, 6, 9, 10 on inference using Gibbs sampling

    Stochastic motion of test particle implies that G varies with time

    The aim of this letter is to propose a new description of the time-varying gravitational constant problem, which naturally implements Dirac's large numbers hypothesis in a newly proposed holographic scenario for the origin of gravity as an entropic force. We survey the effect of the stochastic motion of the test particle in Verlinde's scenario for gravity \cite{Verlinde}. First, we show that we must recover the equipartition values for $t\rightarrow\infty$, which leads to the usual Newtonian gravitational constant. Second, the stochastic (Brownian) nature of the motion of the test particle modifies Newton's second law. The direct result is that the Newtonian constant becomes time dependent, in resemblance to \cite{Running}. Comment: Accepted in International Journal of Theoretical Physics

    Supervised Hyperparameter Estimation for Anomaly Detection

    The version of record of this article, first published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), is available online at the publisher's website: https://doi.org/10.1007/978-3-030-61705-9_20. The detection of anomalies, i.e. of those points found in a dataset which do not seem to be generated by the underlying distribution, is crucial in machine learning. Their presence is likely to make model predictions less accurate than we would like; thus, they should be identified before any model is built, which, in turn, may require the optimal selection of the detector hyperparameters. However, the unsupervised nature of this problem makes that task difficult. In this work, we propose a new estimator composed of an anomaly detector followed by a supervised model; we can then take advantage of this second model to transform model estimation into a supervised problem and, as a consequence, the estimation of the detector hyperparameters can be done in a supervised setting. We apply these ideas to optimally hyperparametrize four different anomaly detectors, namely Robust Covariance, Local Outlier Factor, Isolation Forests and One-class Support Vector Machines, over different classification and regression problems. We also show experimentally the usefulness of our proposal for estimating the best detector hyperparameters in an objective and automatic way. The authors acknowledge financial support from the European Regional Development Fund and from the Spanish Ministry of Economy, Industry, and Competitiveness - State Research Agency, project TIN2016-76406-P (AEI/FEDER, UE). They also thank the UAM–ADIC Chair for Data Science and Machine Learning and gratefully acknowledge the use of the facilities of Centro de Computación Científica (CCC) at UAM
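
    The idea in the abstract above (a detector followed by a supervised model, with the detector hyperparameter chosen by the supervised score) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the nearest-neighbour distance detector, the least-squares line, the held-out validation set, and all function names are hypothetical stand-ins, not the paper's detectors or evaluation protocol.

```python
import math

def knn_scores(points):
    """Toy anomaly score: distance from each point to its nearest neighbour."""
    return [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]

def drop_outliers(points, contamination):
    """Remove the `contamination` fraction of points with the highest scores."""
    n_drop = int(contamination * len(points))
    scores = knn_scores(points)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    flagged = set(order[:n_drop])
    return [p for i, p in enumerate(points) if i not in flagged]

def fit_line(points):
    """Ordinary least-squares fit of y = a * x + b to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x

def validation_mae(train, valid, contamination):
    """Supervised score of the detector-plus-model pipeline on held-out data."""
    a, b = fit_line(drop_outliers(train, contamination))
    return sum(abs(y - (a * x + b)) for x, y in valid) / len(valid)

def select_contamination(train, valid, candidates):
    """Pick the detector hyperparameter that minimises the supervised error."""
    return min(candidates, key=lambda c: validation_mae(train, valid, c))

# Ten points on the line y = 2x plus two injected outliers (made-up data).
train = [(float(x), 2.0 * x) for x in range(10)] + [(3.0, 40.0), (7.0, -30.0)]
valid = [(10.0, 20.0), (11.0, 22.0)]
best = select_contamination(train, valid, [0.0, 0.1, 0.2])
print(best)
```

    On this toy data the search settles on the candidate that removes both injected outliers, because only then does the fitted line match the held-out points. In practice a cross-validated score would likely replace the single validation split used here.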

    Efficient Training of Graph-Regularized Multitask SVMs

    We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and thus are inapplicable when the number of training examples n is large (they are typically limited to n < 20,000, even for just a few tasks). In this paper, we present a primal optimization criterion, allowing for general loss functions, and derive its dual representation. Building on the work of Hsieh et al. [1,2], we derive an algorithm for optimizing the large-margin objective and prove its convergence. Our computational experiments show a speedup of up to three orders of magnitude over LibSVM and SVMLight for several standard benchmarks as well as challenging data sets from the application domain of computational biology. Combining our optimization methodology with the COFFIN large-scale learning framework [3], we are able to train a multi-task SVM using over 1,000,000 training points stemming from 4 different tasks. An efficient C++ implementation of our algorithm is being made publicly available as a part of the SHOGUN machine learning toolbox [4].

    Ensembles of lasso screening rules

    In order to solve large-scale lasso problems, screening algorithms have been developed that discard features with zero coefficients based on a computationally efficient screening rule. Most existing screening rules were developed from a spherical constraint and half-space constraints on a dual optimal solution. However, existing rules admit at most two half-space constraints due to the computational cost incurred by the half-spaces, even though additional constraints may be useful to discard more features. In this paper, we present AdaScreen, an adaptive lasso screening rule ensemble, which allows combining any one sphere with multiple half-space constraints on a dual optimal solution. Thanks to geometrical considerations that lead to a simple closed-form solution for AdaScreen, we can incorporate multiple half-space constraints at small computational cost. In our experiments, we show that AdaScreen with multiple half-space constraints simultaneously improves screening performance and speeds up lasso solvers.
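
    To make the notion of a screening rule concrete, here is a minimal sketch of a test that uses a single sphere constraint only. It is not AdaScreen and uses no half-space constraints. It assumes the lasso objective 0.5 * ||y - X b||^2 + lam * ||b||_1, for which the dual optimum lies in a ball centred at y / lam whose radius is the distance to the dual-feasible point y / lam_max. The function names and the toy data are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def sphere_screen(columns, y, lam):
    """Return indices of features that a basic sphere test proves inactive.

    `columns` holds the feature columns of the design matrix X.
    A feature j is discarded when the largest value |x_j . theta| can take
    over the ball containing the dual optimum stays strictly below 1.
    """
    lam_max = max(abs(dot(x, y)) for x in columns)
    if lam >= lam_max:
        # At or above lam_max the all-zero solution is optimal.
        return list(range(len(columns)))
    norm_y = math.sqrt(dot(y, y))
    # Ball centre is y / lam; y / lam_max is dual feasible, so the distance
    # between the two bounds the distance from the centre to the optimum.
    radius = norm_y * (1.0 / lam - 1.0 / lam_max)
    discarded = []
    for j, x in enumerate(columns):
        upper_bound = abs(dot(x, y)) / lam + radius * math.sqrt(dot(x, x))
        if upper_bound < 1.0:
            discarded.append(j)
    return discarded

# Toy orthogonal design: the second feature is only weakly correlated with y.
columns = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.1]
print(sphere_screen(columns, y, lam=2.0))  # prints [1]
```

    Extra half-space constraints of the kind the abstract describes would shrink the region that must be bounded, which is why they can discard more features than a sphere alone.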

    Interactive Anomaly Detection Based on Clustering and Online Mirror Descent

    In several applications, when anomalies are detected, human experts have to investigate or verify them one by one. As they investigate, they unwittingly produce a label: true positive (TP) or false positive (FP). In this paper, we propose a method (called OMD-Clustering) that exploits this label feedback to minimize the FP rate and detect more relevant anomalies, while minimizing the expert effort required to investigate them. The OMD-Clustering method iteratively suggests the top-1 anomalous instance to a human expert and receives feedback. Before suggesting the next anomaly, the method re-ranks instances so that the top anomalous instances are similar to the TP instances and dissimilar to the FP instances. This is achieved by learning to score anomalies differently in various regions of the feature space. An experimental evaluation on several real-world datasets is conducted. The results show that OMD-Clustering achieves significant improvement in both detection precision and expert effort compared to state-of-the-art interactive anomaly detection methods.