A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on the data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
The effect of transparency on recognition of overlapping objects
Are overlapping objects easier to recognize when the objects are transparent or opaque? It is important to know whether the transparency of X-ray images of luggage contributes to the difficulty in searching those images for targets. Transparency provides extra information about objects that would normally be occluded but creates potentially ambiguous depth relations at the region of overlap. Two experiments investigated the threshold durations at which adult participants could accurately name pairs of overlapping objects that were opaque or transparent. In Experiment 1, the transparent displays included monocular cues to relative depth. Recognition of the back object was possible at shorter durations for transparent displays than for opaque displays. In Experiment 2, the transparent displays had no monocular depth cues. There was no difference in the duration at which the back object was recognized across transparent and opaque displays. The results of the two experiments suggest that transparent displays, even though less familiar than opaque displays, do not make object recognition more difficult, and possibly show a benefit. These findings call into question the importance of edge junctions in object recognition.
Finding strong lenses in CFHTLS using convolutional neural networks
We train and apply convolutional neural networks, a machine learning
technique developed to learn from and classify image data, to
Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) imaging for the
identification of potential strong lensing systems. An ensemble of four
convolutional neural networks was trained on images of simulated galaxy-galaxy
lenses. The training sets consisted of a total of 62,406 simulated lenses and
64,673 non-lens negative examples generated with two different methodologies.
The networks were able to learn the features of simulated lenses with accuracy
of up to 99.8% and a purity and completeness of 94-100% on a test set of 2000
simulations. An ensemble of trained networks was applied to all of the 171
square degrees of the CFHTLS wide field image data, identifying 18,861
candidates including 63 known and 139 other potential lens candidates. A second
search of 1.4 million early type galaxies selected from the survey catalog as
potential deflectors, identified 2,465 candidates including 117 previously
known lens candidates, 29 confirmed lenses/high-quality lens candidates, 266
novel probable or potential lenses and 2097 candidates we classify as false
positives. For the catalog-based search we estimate a completeness of 21-28%
with respect to detectable lenses and a purity of 15%, with a false-positive
rate of 1 in 671 images tested. We predict a human astronomer reviewing
candidates produced by the system would identify ~20 probable lenses and 100
possible lenses per hour in a sample selected by the robot. Convolutional
neural networks are therefore a promising tool for use in the search for lenses
in current and forthcoming surveys such as the Dark Energy Survey and the Large
Synoptic Survey Telescope. Comment: 16 pages, 8 figures. Accepted by MNRAS.
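
The purity and false-positive figures quoted for the catalog-based search can be checked with simple arithmetic. The sketch below is not taken from the paper; it assumes that purity means the fraction of flagged candidates not classified as false positives, and it uses the rounded "1.4 million" galaxy count from the abstract, so the false-positive rate it gives is only approximately the quoted 1 in 671. The function and variable names are illustrative.

```python
# Figures quoted in the abstract for the catalog-based search.
CANDIDATES = 2465            # candidates flagged by the network ensemble
FALSE_POSITIVES = 2097       # candidates the authors classify as false positives
IMAGES_TESTED = 1_400_000    # "1.4 million" galaxies; a rounded figure (assumption)


def purity(n_candidates: int, n_false: int) -> float:
    """Fraction of flagged candidates that are not false positives."""
    return (n_candidates - n_false) / n_candidates


def images_per_false_positive(n_images: int, n_false: int) -> float:
    """Average number of images tested for each false positive produced."""
    return n_images / n_false


if __name__ == "__main__":
    print(f"purity: {purity(CANDIDATES, FALSE_POSITIVES):.1%}")
    rate = images_per_false_positive(IMAGES_TESTED, FALSE_POSITIVES)
    print(f"about 1 false positive per {rate:.0f} images")
```

Under these assumptions the purity comes out at roughly 15%, consistent with the abstract; the small gap between the computed false-positive rate and 1 in 671 is presumably due to the rounding of the galaxy count.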
Fast and accurate NN approach for multi-event annotation of time series
Technical report. Similarity search in time-series subsequences is an important time series data mining task. Searching in time series subsequences for matches for a set of shapes is an extension of this task and is equally important. In this work we propose a simple but efficient approach for finding matches for a group of shapes or events in a given time series using a Nearest Neighbor approach. We provide various improvements of this approach including one using the GNAT data structure. We also propose a technique for finding similar shapes of widely varying temporal width. Both of these techniques for primitive shape matching allow us to more accurately and efficiently form an event representation of a time-series, leading in turn to finding complex events which are composites of primitive events. We demonstrate the robustness of our approaches in detecting complex shapes even in the presence of "don't care" symbols. We evaluate the success of our approach in detecting both primitive and complex shapes using a data set from the Fluid Dynamics domain. We also show a speedup of up to 5 times over a naive nearest neighbor approach.
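
To make the baseline concrete, here is a minimal sketch of the kind of naive nearest-neighbor annotation that the abstract uses as its point of comparison. It is not the authors' method: it uses no GNAT index, does not handle shapes of varying temporal width or "don't care" symbols, and the names `annotate`, `shapes`, and `threshold` are hypothetical. It slides each template shape over the series and labels a position with the nearest shape when the Euclidean distance falls within a threshold.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def annotate(series, shapes, threshold):
    """Brute-force nearest-neighbor annotation.

    For every start index, compare the window against each template shape
    and record the closest shape if its distance is within `threshold`.
    Returns a list of (start_index, shape_name, distance) tuples.
    """
    events = []
    for start in range(len(series)):
        best = None
        for name, shape in shapes.items():
            window = series[start:start + len(shape)]
            if len(window) < len(shape):
                continue  # the shape does not fit at the end of the series
            d = euclidean(window, shape)
            if d <= threshold and (best is None or d < best[1]):
                best = (name, d)
        if best is not None:
            events.append((start, best[0], best[1]))
    return events


# Toy data, invented for illustration only.
series = [0, 0, 1, 2, 1, 0, 0, -1, -2, -1, 0]
shapes = {"peak": [1, 2, 1], "dip": [-1, -2, -1]}
events = annotate(series, shapes, threshold=0.5)

if __name__ == "__main__":
    for start, name, dist in events:
        print(start, name, round(dist, 3))
```

Every window is compared against every shape, so the cost grows with the product of series length, number of shapes, and shape length; that is the cost an indexing structure aims to reduce. Raw distances are also not directly comparable across shapes of different lengths, so a practical version would likely normalize them.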