Kernel Ellipsoidal Trimming
Ellipsoid estimation is an issue of primary importance in many practical areas such as control, system identification, visual/audio tracking, experimental design, data mining, robust statistics and novelty/outlier detection. This paper presents a new method of kernel information matrix ellipsoid estimation (KIMEE) that finds an ellipsoid in a kernel-defined feature space based on a centered information matrix. Although the method is very general and can be applied to many of the aforementioned problems, the main focus in this paper is the problem of novelty or outlier detection associated with fault detection. A simple iterative algorithm based on Titterington's minimum volume ellipsoid method is proposed for practical implementation. The KIMEE method demonstrates very good performance on a set of real-life and simulated datasets compared with support vector machine methods.
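As a rough illustration of the kind of iteration the abstract refers to, the sketch below applies Titterington-style multiplicative weight updates to fit a minimum volume ellipsoid in the original input space. It is not the kernelized KIMEE method itself. The function names (`mvee_weights`, `novelty_score`), the synthetic Gaussian data, and the iteration count are all assumptions made for this example.

```python
import numpy as np

def mvee_weights(X, n_iter=500):
    """Multiplicative weight updates for a minimum volume enclosing ellipsoid.

    Input-space sketch only; the kernel feature-space version is not shown.
    """
    n, d = X.shape
    Z = np.hstack([X, np.ones((n, 1))])       # lift to (x, 1) so the centre is free
    w = np.full(n, 1.0 / n)                   # uniform starting weights
    for _ in range(n_iter):
        M = Z.T @ (w[:, None] * Z)            # weighted information matrix
        g = np.einsum("ij,jk,ik->i", Z, np.linalg.inv(M), Z)
        w = w * g / (d + 1)                   # update keeps the weights summing to 1
    M_inv = np.linalg.inv(Z.T @ (w[:, None] * Z))
    return w, M_inv

def novelty_score(x, M_inv):
    """Quadratic form z^T M^{-1} z for the lifted point z = (x, 1)."""
    z = np.append(x, 1.0)
    return float(z @ M_inv @ z)

# Hypothetical data: 100 two-dimensional "normal operation" samples.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))
weights, M_inv = mvee_weights(X_train)

threshold = X_train.shape[1] + 1              # d + 1 marks the ellipsoid boundary
inside = novelty_score(np.zeros(2), M_inv)
outside = novelty_score(np.array([10.0, 10.0]), M_inv)
print(inside < threshold, outside > threshold)
```

Under these assumptions, a score above `d + 1` indicates a point outside the fitted ellipsoid, which is how the ellipsoid can be used as a novelty detector. The multiplicative update converges slowly, so the boundary is only approximate after a fixed number of iterations.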
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.
Comment: 24 pages + 11 pages of references, 8 figures
Planning as Optimization: Dynamically Discovering Optimal Configurations for Runtime Situations
The large number of possible configurations of modern software-based systems,
combined with the large number of possible environmental situations of such
systems, prohibits enumerating all adaptation options at design time and
necessitates planning at run time to dynamically identify an appropriate
configuration for a situation. While numerous planning techniques exist, they
typically assume a detailed state-based model of the system and that the
situations that warrant adaptations are known. Both of these assumptions can be
violated in complex, real-world systems. As a result, adaptation planning must
rely on simple models that capture what can be changed (input parameters) and
observed in the system and environment (output and context parameters). We
therefore propose planning as optimization: the use of optimization strategies
to discover optimal system configurations at runtime for each distinct
situation that is also dynamically identified at runtime. We apply our approach
to CrowdNav, an open-source traffic routing system with the characteristics of
a real-world system. We identify situations via clustering and conduct an
empirical study that compares Bayesian optimization and two types of
evolutionary optimization (NSGA-II and novelty search) in CrowdNav.
Voxel selection in fMRI data analysis based on sparse representation
Multivariate pattern analysis approaches toward detection of brain regions from fMRI data have been gaining attention recently. In this study, we introduce an iterative sparse-representation-based algorithm for detection of voxels in functional MRI (fMRI) data with task-relevant information. In each iteration of the algorithm, a linear programming problem is solved and a sparse weight vector is subsequently obtained. The final weight vector is the mean of those obtained in all iterations. The characteristics of our algorithm are as follows: 1) the weight vector (output) is sparse; 2) the magnitude of each entry of the weight vector represents the significance of its corresponding variable or feature in a classification or regression problem; and 3) due to the convergence of this algorithm, a stable weight vector is obtained. To demonstrate the validity of our algorithm and illustrate its application, we apply the algorithm to the Pittsburgh Brain Activity Interpretation Competition 2007 fMRI dataset to select the voxels that are most relevant to the tasks of the subjects. Based on this dataset, the aforementioned characteristics of our algorithm are analyzed, and a comparison between our method and the univariate general-linear-model-based statistical parametric mapping is performed. Using our method, a combination of voxels is selected based on the principle of effective/sparse representation of a task. Data analysis results in this paper show that this combination of voxels is suitable for decoding tasks and demonstrate the effectiveness of our method.
Automated novelty detection in the WISE survey with one-class support vector machines
Wide-angle photometric surveys of previously uncharted sky areas or
wavelength regimes will always bring in unexpected sources whose existence and
properties cannot be easily predicted from earlier observations: novelties or
even anomalies. Such objects can be efficiently sought with novelty
detection algorithms. Here we present an application of such a method, called
one-class support vector machines (OCSVM), to search for anomalous patterns
among sources preselected from the mid-infrared AllWISE catalogue covering the
whole sky. To create a model of expected data we train the algorithm on a set
of objects with spectroscopic identifications from the SDSS DR13 database,
present also in AllWISE. OCSVM detects as anomalous those sources whose
patterns - WISE photometric measurements in this case - are inconsistent with
the model. Among the detected anomalies we find artefacts, such as objects with
spurious photometry due to blending, but most importantly also real sources of
genuine astrophysical interest. Among the latter, OCSVM has identified a sample
of heavily reddened AGN/quasar candidates distributed uniformly over the sky
and in a large part absent from other WISE-based AGN catalogues. It also
allowed us to find a specific group of sources of mixed types, mostly stars and
compact galaxies. By combining the semi-supervised OCSVM algorithm with
standard classification methods it will be possible to improve the latter by
accounting for sources which are not present in the training sample but are
otherwise well-represented in the target set. Anomaly detection adds
flexibility to automated source separation procedures and helps verify the
reliability and representativeness of the training samples. It should thus be
considered an essential step in supervised classification schemes to ensure
completeness and purity of produced catalogues.
Comment: 14 pages, 15 figures
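The train-on-known, flag-the-inconsistent workflow described in this abstract can be sketched with scikit-learn's `OneClassSVM`. This is a minimal illustration on synthetic data, not the authors' pipeline. The three Gaussian features stand in for photometric measurements, and the `nu` and `gamma` settings, array names, and injected outliers are assumptions chosen for the example.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical "known" sample: 500 sources with 3 photometric-like features.
known = rng.normal(loc=0.0, scale=1.0, size=(500, 3))

# Hypothetical target set: mostly similar sources plus 5 injected outliers.
target = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(200, 3)),
    np.full((5, 3), 8.0),
])

# Fit the model of expected data on the known sample only.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(known)

# predict() returns +1 for sources consistent with the model, -1 for anomalies.
labels = model.predict(target)
anomalies = target[labels == -1]
print(len(anomalies), "sources flagged as anomalous")
```

In this sketch `nu` roughly bounds the fraction of training sources treated as outliers, so it controls how aggressively the target set is flagged. Flagged sources would still need inspection to separate artefacts from genuinely interesting objects, as the abstract notes.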