11,398 research outputs found
Information theoretic novelty detection
We present a novel approach to online change detection problems when the training sample size is small. The proposed approach is based on estimating the expected information content of a new data point and allows an accurate control of the false positive rate even for small data sets. In the case of the Gaussian distribution, our approach is analytically tractable and closely related
to classical statistical tests. We then propose an approximation scheme to extend our approach to the case of the mixture of Gaussians. We evaluate extensively our approach on synthetic data and on three real benchmark data
sets. The experimental validation shows that our method maintains a good overall accuracy, but significantly improves the control over the false positive rate
Kernel Ellipsoidal Trimming
Ellipsoid estimation is an issue of primary importance in many practical areas such as control, system identification, visual/audio tracking, experimental design, data mining, robust statistics and novelty/outlier detection. This paper presents a new method of kernel information matrix ellipsoid estimation (KIMEE) that finds an ellipsoid in a kernel defined feature space based on a centered information matrix. Although the method is very general and can be applied to many of the aforementioned problems, the main focus in this paper is the problem of novelty or outlier detection associated with fault detection. A simple iterative algorithm based on Titterington's minimum volume ellipsoid method is proposed for practical implementation. The KIMEE method demonstrates very good performance on a set of real-life and simulated datasets compared with support vector machine methods
Producing PID controllers for testing clustering - Investigating novelty detection for use in classifying PID parameters
PID controllers performance depend on how they are tuned. Tuning a controller is not easy either and many use their experience and intuition, or automatic software for tuning. We present a way to test the quality of controllers using statistics. The method uses multivariate extreme value statistics with novelty detection. With the analyser presented in this paper one can compare fresh PID parameters to those that have been tuned well. This tool can help in troubleshooting with PID controller tuning. Conventional novelty detection methods use a Gaussian mixture model, the analyser here uses a variational mixture model instead. This made the fitting process easier for the user.
Part of this work was to create PID parameter configurations to test the analyser with. We needed both well tuned and poorly tuned parameters for testing the algorithm, as well as several examples of both cases. A genetic algorithm was seen as a tool that would meet these requirements. Genetic algorithms have previously been used for both test parameters generation and PID controller tuning in many applications. The genetic algorithm was written in Matlab. The reason for using Matlab is that the genetic algorithm uses a Simulink model of a PID control process in its fitness function.
The parameters were simulated and plots of their step response were drawn. The best configurations according to the genetic algorithm had little error compared to the reference value. The error seemed to rise according to the index of goodness used by the genetic algorithm. We set three criterions on the parameters: maximum overshoot, settling time, and sum of absolute error. Each of these criterions had a threshold. Each parameter configuration that crossed at least one of these thresholds were classed abnormal.
The performance of the analyser was assessed with these parameters. The analyser were first trained with a set of normal parameters, then tested with a set of normal and a set of abnormal parameters. The results showed 2 false alarms in both cases out of 104 possible. This gave us an accuracy of 98%, which is a very high one for a novelty detection method.fi=OpinnƤytetyƶ kokotekstinƤ PDF-muodossa.|en=Thesis fulltext in PDF format.|sv=LƤrdomsprov tillgƤngligt som fulltext i PDF-format
A survey of outlier detection methodologies
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set and as such to purify the data for processing. The original outlier detection methods were arbitrary but now, principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review
Anomaly Detection in Multivariate Non-stationary Time Series for Automatic DBMS Diagnosis
Anomaly detection in database management systems (DBMSs) is difficult because
of increasing number of statistics (stat) and event metrics in big data system.
In this paper, I propose an automatic DBMS diagnosis system that detects
anomaly periods with abnormal DB stat metrics and finds causal events in the
periods. Reconstruction error from deep autoencoder and statistical process
control approach are applied to detect time period with anomalies. Related
events are found using time series similarity measures between events and
abnormal stat metrics. After training deep autoencoder with DBMS metric data,
efficacy of anomaly detection is investigated from other DBMSs containing
anomalies. Experiment results show effectiveness of proposed model, especially,
batch temporal normalization layer. Proposed model is used for publishing
automatic DBMS diagnosis reports in order to determine DBMS configuration and
SQL tuning.Comment: 8 page
- ā¦