Analysis of video sequences: table of content and index creation
This paper deals with the representation of video sequences useful
for tasks such as long-term analysis, indexing or browsing. A Table
Of Content and index creation algorithm is presented, as well as
additional tools involved in their creation. The proposed method
does not assume any a priori knowledge about the content or the
structure of the video. It is therefore a generic technique. Some
examples are presented in order to assess the performance of the
algorithm. Peer reviewed. Postprint (published version).
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on the data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
A Short Survey on Data Clustering Algorithms
With rapidly increasing data, clustering algorithms are important tools for
data analytics in modern research. They have been successfully applied to a
wide range of domains; for instance, bioinformatics, speech recognition, and
financial analysis. Formally speaking, given a set of data instances, a
clustering algorithm is expected to divide the set of data instances into
subsets that maximize the intra-subset similarity and inter-subset
dissimilarity, where a similarity measure is defined beforehand. In this work,
state-of-the-art clustering algorithms are reviewed from design concept to
methodology. Different clustering paradigms are discussed, along with advanced
clustering algorithms. After that, the existing clustering evaluation
metrics are reviewed. A summary with future insights is provided at the end.
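The objective stated in this abstract (high intra-subset similarity, high inter-subset dissimilarity) can be illustrated with a minimal sketch. It assumes Euclidean distance as the predefined (dis)similarity measure; the function name `intra_inter_distance` and the toy points are hypothetical and are not taken from the survey.

```python
from itertools import combinations
from math import dist


def intra_inter_distance(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance).

    Assumption: Euclidean distance stands in for the similarity measure,
    so a good partition has a small first value and a large second value.
    """
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        # Pairs with the same label count as intra-cluster, others as inter-cluster.
        (intra if lp == lq else inter).append(dist(p, q))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


# Toy data: two tight groups far apart (hypothetical example).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = intra_inter_distance(points, [0, 0, 1, 1])  # groups match the geometry
bad = intra_inter_distance(points, [0, 1, 0, 1])   # groups cut across it
```

Under these assumptions, the partition that follows the geometry has the smaller intra-cluster mean and the larger inter-cluster mean, which is the behaviour the definition asks a clustering algorithm to favour.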
Visually driven analysis of movement data by progressive clustering
The paper investigates the possibilities of using clustering techniques in visual exploration and analysis of large numbers of trajectories, that is, sequences of time-stamped locations of some moving entities. Trajectories are complex spatio-temporal constructs characterized by diverse non-trivial properties. To assess the degree of (dis)similarity between trajectories, specific methods (distance functions) are required. A single distance function accounting for all properties of trajectories (1) is difficult to build, (2) would require much time to compute, and (3) might be difficult to understand and to use. We suggest the procedure of progressive clustering, where a simple distance function with a clear meaning is applied at each step, which leads to easily interpretable outcomes. Successive application of several different functions enables sophisticated analyses through gradual refinement of previously obtained results. Besides the advantages from the sense-making perspective, progressive clustering enables a rational work organization where time-consuming computations are applied to relatively small, potentially interesting subsets obtained by means of ‘cheap’ distance functions producing quick results. We introduce the concept of progressive clustering through an example of analyzing a large real data set. We also review the existing clustering methods, describe the method OPTICS suitable for progressive clustering of trajectories, and briefly present several distance functions for trajectories.
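The progressive procedure described in this abstract can be sketched as two passes, each with one simple distance function. This is an illustration under stated assumptions only: the paper uses OPTICS, whereas the sketch swaps in a greedy threshold grouping to stay short, and the names `threshold_clusters`, `start_distance`, `end_distance` and the toy trajectories are hypothetical.

```python
from math import dist


def threshold_clusters(items, distance, eps):
    """Greedy grouping: an item joins the first cluster whose seed is within eps.

    Assumption: a simple stand-in for a density-based method such as OPTICS.
    """
    clusters = []
    for item in items:
        for cluster in clusters:
            if distance(item, cluster[0]) <= eps:
                cluster.append(item)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([item])
    return clusters


def start_distance(a, b):
    """Distance between the first positions of two trajectories."""
    return dist(a[0], b[0])


def end_distance(a, b):
    """Distance between the last positions of two trajectories."""
    return dist(a[-1], b[-1])


# Toy trajectories: lists of (x, y) positions in time order (hypothetical).
trajectories = [
    [(0, 0), (5, 5), (10, 10)],
    [(0, 1), (5, 4), (10, 11)],
    [(1, 0), (6, 0), (20, 0)],
    [(50, 50), (40, 40), (30, 30)],
]

# Step 1: group all trajectories by where they start.
by_start = threshold_clusters(trajectories, start_distance, eps=2.0)

# Step 2: refine only the largest group, now by where trajectories end.
largest = max(by_start, key=len)
by_end = threshold_clusters(largest, end_distance, eps=2.0)
```

Each step uses a distance with a clear meaning (common origin, then common destination), and the second pass runs only on the subset selected after the first.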
Data clustering using a model granular magnet
We present a new approach to clustering, based on the physical properties of
an inhomogeneous ferromagnet. No assumption is made regarding the underlying
distribution of the data. We assign a Potts spin to each data point and
introduce an interaction between neighboring points, whose strength is a
decreasing function of the distance between the neighbors. This magnetic system
exhibits three phases. At very low temperatures it is completely ordered; all
spins are aligned. At very high temperatures the system does not exhibit any
ordering, and in an intermediate regime clusters of relatively strongly coupled
spins become ordered, whereas different clusters remain uncorrelated. This
intermediate phase is identified by a jump in the order parameters. The
spin-spin correlation function is used to partition the spins and the
corresponding data points into clusters. We demonstrate on three synthetic and
three real data sets how the method works. Detailed comparison to the
performance of other techniques clearly indicates the relative success of our
method.
Comment: 46 pages, postscript, 15 ps figures include
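The interaction described in this abstract, a coupling between neighboring points that decreases with their distance, can be sketched as follows. The abstract does not give the functional form, so the Gaussian decay, the fixed neighbor radius, and the names `couplings`, `radius` and `a` are assumptions made for illustration. The sketch covers only the coupling step, not the Potts-model simulation or the correlation-based partitioning.

```python
from math import dist, exp


def couplings(points, radius, a):
    """Map each neighboring pair (i, j), i < j, to a coupling strength.

    Assumptions: two points are neighbors if they lie within `radius`,
    and the strength decays as exp(-d^2 / (2 a^2)) with distance d.
    """
    strengths = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = dist(points[i], points[j])
            if d <= radius:
                strengths[(i, j)] = exp(-d * d / (2 * a * a))
    return strengths


# Toy data on a line: three nearby points and one distant point (hypothetical).
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
J = couplings(points, radius=4.0, a=1.0)
```

With these settings the distant point has no couplings at all, and closer pairs are coupled more strongly than farther ones, which is the property the abstract requires of the interaction.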