2,414 research outputs found
Fuzzy clustering of univariate and multivariate time series by genetic multiobjective optimization
Given a set of time series, it is of interest to discover subsets that share similar properties. For instance, this may be useful for identifying and estimating a single model that may fit conveniently several time series, instead of performing the usual identification and estimation steps for each one. On the other hand time series in the same cluster are related with respect to the measures assumed for cluster analysis and are suitable for building multivariate time series models. Though many approaches to clustering time series exist, in this view the most effective method seems to have to rely on choosing some features relevant for the problem at hand and seeking for clusters according to their measurements, for instance the autoregressive coe±cients, spectral measures or the eigenvectors of the covariance matrix. Some new indexes based on goodnessof-fit criteria will be proposed in this paper for fuzzy clustering of multivariate time series. A general purpose fuzzy clustering algorithm may be used to estimate the proper cluster structure according to some internal criteria of cluster validity. Such indexes are known to measure actually definite often conflicting cluster properties, compactness or connectedness, for instance, or distribution, orientation, size and shape. It is argued that the multiobjective optimization supported by genetic algorithms is a most effective choice in such a di±cult context. In this paper we use the Xie-Beni index and the C-means functional as objective functions to evaluate the cluster validity in a multiobjective optimization framework. The concept of Pareto optimality in multiobjective genetic algorithms is used to evolve a set of potential solutions towards a set of optimal non-dominated solutions. Genetic algorithms are well suited for implementing di±cult optimization problems where objective functions do not usually have good mathematical properties such as continuity, differentiability or convexity. In addition the genetic algorithms, as population based methods, may yield a complete Pareto front at each step of the iterative evolutionary procedure. The method is illustrated by means of a set of real data and an artificial multivariate time series data set.Fuzzy clustering, Internal criteria of cluster validity, Genetic algorithms, Multiobjective optimization, Time series, Pareto optimality
Louvain-like Methods for Community Detection in Multi-Layer Networks
In many complex systems, entities interact with each other through
complicated patterns that embed different relationships, thus generating
networks with multiple levels and/or multiple types of edges. When trying to
improve our understanding of those complex networks, it is of paramount
importance to explicitly take the multiple layers of connectivity into account
in the analysis. In this paper, we focus on detecting community structures in
multi-layer networks, i.e., detecting groups of well-connected nodes shared
among the layers, a very popular task that poses a lot of interesting questions
and challenges. Most of the available algorithms in this context either reduce
multi-layer networks to a single-layer network or try to extend algorithms for
single-layer networks by using consensus clustering. Those approaches have
anyway been criticized lately. They indeed ignore the connections among the
different layers, hence giving low accuracy. To overcome these issues, we
propose new community detection methods based on tailored Louvain-like
strategies that simultaneously handle the multiple layers. We consider the
informative case, where all layers show a community structure, and the noisy
case, where some layers only add noise to the system. We report experiments on
both artificial and real-world networks showing the effectiveness of the
proposed strategies.Comment: 16 pages, 4 figure
Multi-criteria Anomaly Detection using Pareto Depth Analysis
We consider the problem of identifying patterns in a data set that exhibit
anomalous behavior, often referred to as anomaly detection. In most anomaly
detection algorithms, the dissimilarity between data samples is calculated by a
single criterion, such as Euclidean distance. However, in many cases there may
not exist a single dissimilarity measure that captures all possible anomalous
patterns. In such a case, multiple criteria can be defined, and one can test
for anomalies by scalarizing the multiple criteria using a linear combination
of them. If the importance of the different criteria are not known in advance,
the algorithm may need to be executed multiple times with different choices of
weights in the linear combination. In this paper, we introduce a novel
non-parametric multi-criteria anomaly detection method using Pareto depth
analysis (PDA). PDA uses the concept of Pareto optimality to detect anomalies
under multiple criteria without having to run an algorithm multiple times with
different choices of weights. The proposed PDA approach scales linearly in the
number of criteria and is provably better than linear combinations of the
criteria.Comment: Removed an unnecessary line from Algorithm
Clustering Solutions of Multiobjective Function Inlining Problem
Hard real time-systems are often small devices operating on batteries that must react within a given deadline, so they must satisfy their timing, code size, and energy consumption requirements. Since these three objectives contradict each other, compilers for real-time systems go towards multiobjective optimizations which result in sets of trade-off solutions. A system designer can use the solution sets to choose the most suitable system configuration. Evolutionary algorithms can find trade-off solutions but the solution set might be large which complicates the task of the system designer. We propose to divide the solution set into clusters, so the system designer chooses the most suitable cluster and examines a smaller subset in detail. In contrast to other clustering techniques, our method guarantees that the sizes of all clusters are less than a predefined limit. Our method clusters a set by using any existing clustering method, divides clusters with sizes exceeding the predefined size into smaller clusters, and reduces the number of clusters by merging small clusters. The method guarantees that the final clusters satisfy the size constraint. We demonstrate our approach by considering a well-known compiler-based optimization called function inlining. It substitutes function calls by the function bodies which decreases the execution time and energy consumption of a program but increases its code size
- …