k-means clustering of extremes
The k-means clustering algorithm and its variant, the spherical k-means
clustering, are among the most important and popular methods in unsupervised
learning and pattern detection. In this paper, we explore how the spherical
k-means algorithm can be applied in the analysis of only the extremal
observations from a data set. By making use of multivariate extreme value
analysis we show how it can be adopted to find "prototypes" of extremal
dependence and we derive a consistency result for our suggested estimator. In
the special case of max-linear models we show furthermore that our procedure
provides an alternative way of statistical inference for this class of models.
Finally, we provide data examples which show that our method is able to find
relevant patterns in extremal observations and allows us to classify extremal
events.
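The idea in the abstract above can be illustrated with a minimal sketch: keep only the observations with the largest norms, project them onto the unit sphere, and run spherical k-means on the resulting directions. This is not the authors' implementation. The function name `spherical_kmeans_extremes`, the Euclidean norm, the quantile threshold, and the random initialization are all assumptions made for illustration.

```python
import numpy as np

def spherical_kmeans_extremes(X, k, quantile=0.9, n_iter=50, seed=0):
    """Hypothetical helper: spherical k-means on the angular parts of extremes.

    Assumptions (not taken from the paper): extremes are the rows whose
    Euclidean norm exceeds the given empirical quantile, and centroids are
    initialized from randomly chosen extremal directions.
    """
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(X, axis=1)
    threshold = np.quantile(norms, quantile)
    mask = norms > threshold                      # which rows count as extremal
    angles = X[mask] / norms[mask, None]          # project extremes onto the unit sphere

    # Start from k distinct extremal directions (fancy indexing returns a copy).
    centroids = angles[rng.choice(len(angles), size=k, replace=False)]

    for _ in range(n_iter):
        # Assign each direction to the centroid with the largest cosine similarity.
        labels = np.argmax(angles @ centroids.T, axis=1)
        for j in range(k):
            members = angles[labels == j]
            if len(members) == 0:
                continue                          # keep the old centroid for an empty cluster
            s = members.sum(axis=0)
            s_norm = np.linalg.norm(s)
            if s_norm > 0:
                centroids[j] = s / s_norm         # renormalized mean direction

    labels = np.argmax(angles @ centroids.T, axis=1)
    return centroids, labels, mask
```

In this sketch the returned centroids play the role of the "prototypes" of extremal dependence, and `labels` classifies each extremal observation by its nearest prototype in cosine similarity.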
A data driven equivariant approach to constrained Gaussian mixture modeling
Maximum likelihood estimation of Gaussian mixture models with different
class-specific covariance matrices is known to be problematic. This is due to
the unboundedness of the likelihood, together with the presence of spurious
maximizers. Existing methods to bypass this obstacle are based on the fact that
unboundedness is avoided if the eigenvalues of the covariance matrices are
bounded away from zero. This can be done by imposing some constraints on the
covariance matrices, i.e. by incorporating a priori information on the
covariance structure of the mixture components. The present work introduces a
constrained equivariant approach, where the class conditional covariance
matrices are shrunk towards a pre-specified matrix Psi. Data-driven choices of
the matrix Psi, when a priori information is not available, and the optimal
amount of shrinkage are investigated. The effectiveness of the proposal is
evaluated on the basis of a simulation study and an empirical example.
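To make concrete why shrinking towards a target matrix keeps eigenvalues away from zero, here is a minimal sketch that uses a plain convex combination of a component covariance and the target Psi. The abstract does not specify the exact form of the constraint, so the convex-combination rule, the function name `shrink_covariance`, and the parameter `lam` are assumptions for illustration only, not the authors' estimator.

```python
import numpy as np

def shrink_covariance(S, Psi, lam):
    """Hypothetical helper: shrink covariance S towards a target matrix Psi.

    Assumed rule (not taken from the paper):
        Sigma = (1 - lam) * S + lam * Psi,  with 0 <= lam <= 1.
    If S is positive semidefinite and Psi is positive definite, the smallest
    eigenvalue of Sigma is at least lam * (smallest eigenvalue of Psi).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    S = np.asarray(S, dtype=float)
    Psi = np.asarray(Psi, dtype=float)
    return (1.0 - lam) * S + lam * Psi

# A singular component covariance, the degenerate case that makes the
# unconstrained likelihood unbounded.
S = np.array([[1.0, 1.0],
              [1.0, 1.0]])
Psi = np.eye(2)
Sigma = shrink_covariance(S, Psi, lam=0.2)
min_eig = np.linalg.eigvalsh(Sigma).min()
```

With `lam = 0` the covariance is left untouched, and with `lam = 1` every component shares the covariance Psi; intermediate values trade flexibility against stability, which is the choice the abstract describes as the amount of shrinkage.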
Adaptive inferential sensors based on evolving fuzzy models
A new technique for the design and use of inferential sensors in the process industry is proposed in this paper, based on the recently introduced concept of evolving fuzzy models (EFMs). EFMs address the challenge that the modern process industry faces today, namely, to develop adaptive and self-calibrating online inferential sensors that reduce maintenance costs while retaining high precision and interpretability/transparency. The proposed methodology makes it possible for inferential sensors to recalibrate automatically, which significantly reduces the life-cycle effort for their maintenance. This is achieved by the adaptive and flexible open-structure EFM used. The novelty of this paper lies in the following: (1) the overall concept of inferential sensors with evolving and self-developing structure from the data streams; (2) the new methodology for online automatic selection of the input variables that are most relevant for the prediction; (3) the technique to automatically detect a shift in the data pattern using the age of the clusters (and fuzzy rules); (4) the online standardization technique used by the learning procedure of the evolving model; and (5) the application of this innovative approach to several real-life industrial processes from the chemical industry (evolving inferential sensors, namely, eSensors, were used for predicting the chemical properties of different products in The Dow Chemical Company, Freeport, TX). It should be noted, however, that the methodology and conclusions of this paper are valid for the broader area of chemical and process industries in general. The results demonstrate that interpretable inferential sensors with a simple structure can be designed automatically from the data stream in real time to predict various process variables of interest.
The proposed approach can be used as a basis for the development of a new generation of adaptive and evolving inferential sensors that can address the challenges of the modern advanced process industry.
What are the Best Hierarchical Descriptors for Complex Networks?
This work reviews several hierarchical measurements of the topology of
complex networks and then applies feature selection concepts and methods in
order to quantify the relative importance of each measurement with respect to
the discrimination between four representative theoretical network models,
namely Erdős-Rényi, Barabási-Albert, Watts-Strogatz as well as a
geographical type of network. The obtained results confirmed that the four
models can be well-separated by using a combination of measurements. In
addition, the relative contribution of each considered feature to the overall
discrimination of the models was quantified in terms of the respective weights
in the canonical projection into two dimensions, with the traditional
clustering coefficient, hierarchical clustering coefficient and neighborhood
clustering coefficient proving particularly effective. Interestingly, the
average shortest path length and hierarchical node degrees contributed little
to the separation of the four network models. Comment: 9 pages, 4 figures