55,053 research outputs found

Probabilistic clustering of interval data

Author: Brito Paula
Dias José G.
Silva A. Pedro Duarte
Publication venue: 'IOS Press'
Publication date: 01/01/2015

In this paper we address the problem of clustering interval data, adopting a model-based approach. To this end, parametric models for interval-valued variables are used, which consider configurations of the variance-covariance matrix that take the nature of interval data directly into account. Results on both synthetic and empirical data clearly show that the proposed approach is well founded. The method succeeds in finding parsimonious heteroscedastic models, which is a critical feature in many applications. Furthermore, the analysis of the different data sets made clear the need to explicitly consider the intrinsic variability present in interval data.

Hierarchical Clustering with Simple Matching and Joint Entropy Dissimilarity Measure

Author: Ergüt Özlem
Çilingtürk A Mete
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2014

Conventional clustering algorithms are restricted to data with ratio- or interval-scale variables, because they rely on distances. Because social studies often work only with categorical data, the literature has been enriched with more sophisticated techniques and algorithms for clustering categorical data. These techniques are based on similarity or dissimilarity matrices, and the algorithms use density-based or pattern-based approaches. A probabilistic approach to the similarity structure is proposed. The entropy dissimilarity measure gives results comparable to those of simple matching dissimilarity in hierarchical clustering. It overcomes the increase in dimension caused by binarization of the categorical data. This approach also works with clustering methods where a priori information on the number of clusters is available.
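As an illustration of the simple matching dissimilarity mentioned in this abstract, the following is a minimal Python sketch, assuming the usual definition (the fraction of attributes on which two categorical records differ). The function names and the sample records are hypothetical and not taken from the paper; the joint entropy measure proposed by the authors is not reproduced here.

```python
def simple_matching_dissimilarity(a, b):
    """Fraction of attributes on which two categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    if not a:
        raise ValueError("records must have at least one attribute")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def dissimilarity_matrix(records):
    """Pairwise simple matching dissimilarities, as a list of lists."""
    n = len(records)
    return [
        [simple_matching_dissimilarity(records[i], records[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical categorical records (colour, size, shape).
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
matrix = dissimilarity_matrix(records)
```

A matrix of this kind is the usual input to a hierarchical clustering routine that accepts precomputed dissimilarities.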

Digital Commons@Wayne State University

Clustering of TS-fuzzy system

Author: Igrejas Getúlio
Salgado Paulo
Publication date: 01/01/2007

This paper presents a fuzzy c-means clustering method for partitioning symbolic interval data, namely T-S fuzzy rules. The proposed method furnishes a fuzzy partition and a prototype for each cluster by optimizing an adequacy criterion based on suitable squared Euclidean distances between vectors of intervals. This methodology leads to a fuzzy partition of the TS-fuzzy rules, one for each cluster, which corresponds to a new set of fuzzy sub-systems. When applied to the clustering of a TS-fuzzy system, the result is a set of additively decomposed TS-fuzzy sub-systems. In this work a generalized Probabilistic Fuzzy C-Means algorithm is proposed and applied to TS-fuzzy system clustering.
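To make the distance-based criterion concrete, here is a minimal Python sketch of one membership update in standard fuzzy c-means, not the generalized algorithm proposed in the paper. It assumes that the squared Euclidean distance between two interval vectors is the sum of squared differences of their lower and upper bounds, which is one common choice; the abstract does not state the exact definition. All names are hypothetical.

```python
def squared_interval_distance(x, y):
    """Sum of squared differences of lower and upper bounds (assumed definition)."""
    return sum(
        (xl - yl) ** 2 + (xu - yu) ** 2
        for (xl, xu), (yl, yu) in zip(x, y)
    )


def memberships(item, prototypes, m=2.0):
    """Standard fuzzy c-means membership degrees of one item, fuzzifier m > 1."""
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [squared_interval_distance(item, p) for p in prototypes]
    # If the item coincides with one or more prototypes, share full membership.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(dists))]
    exponent = 1.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
        for d_i in dists
    ]


# Hypothetical one-variable example: each item is a list of (lower, upper) pairs.
prototypes = [[(0.0, 1.0)], [(2.0, 3.0)]]
u = memberships([(0.5, 1.5)], prototypes)
```

The membership degrees of an item sum to one across clusters, and items closer to a prototype receive a larger share for that cluster.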

Biblioteca Digital do IPB

MAINT.Data: modelling and analysing interval data in R

Author: Brito Paula
Dias José
Filzmoser Peter
Silva Pedro Duarte
Publication venue: 'The R Foundation'
Publication date: 01/01/2021

We present the CRAN R package MAINT.Data for the modelling and analysis of multivariate interval data, i.e., where units are described by variables whose values are intervals of ℝ, representing intrinsic variability. Parametric inference methodologies based on probabilistic models for interval variables have been developed, where each interval is represented by its midpoint and log-range, for which multivariate Normal and Skew-Normal distributions are assumed. The intrinsic nature of the interval variables leads to special structures of the variance-covariance matrix, which are represented by four different possible configurations. MAINT.Data implements the proposed methodologies in the S4 object system, introducing a specific data class for representing interval data. It includes functions and methods for modelling and analysing interval data, in particular maximum likelihood estimation, statistical tests for the different configurations, (M)ANOVA and Discriminant Analysis. For the Gaussian model, Model-based Clustering, robust estimation, outlier detection and Robust Discriminant Analysis are also available.
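The midpoint and log-range representation described in this abstract can be sketched in a few lines. The following Python example is only an illustration of the transformation, assuming natural logarithms and non-degenerate intervals (lower bound strictly below upper bound); it does not use or reproduce the MAINT.Data API, and the function names are hypothetical.

```python
import math


def to_midpoint_log_range(lower, upper):
    """Map an interval [lower, upper] to (midpoint, log of the range)."""
    if not lower < upper:
        raise ValueError("a non-degenerate interval requires lower < upper")
    midpoint = (lower + upper) / 2.0
    log_range = math.log(upper - lower)
    return midpoint, log_range


def to_interval(midpoint, log_range):
    """Inverse mapping: recover (lower, upper) from (midpoint, log-range)."""
    half_range = math.exp(log_range) / 2.0
    return midpoint - half_range, midpoint + half_range


# Hypothetical interval observation, e.g. a daily temperature range.
mid, log_rng = to_midpoint_log_range(2.0, 4.0)
lower, upper = to_interval(mid, log_rng)
```

Taking the logarithm of the range maps a strictly positive quantity onto the whole real line, which is what makes a Normal or Skew-Normal assumption on the pair plausible.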

The Advantage of Evidential Attributes in Social Networks

Author: adar
adar
khan
leskovec
newman
scott
scott
shafer
Publication date: 10/07/2017

Nowadays, there are many approaches designed for the task of detecting communities in social networks. Among them, some methods consider only the topological graph structure, while others make use of both the graph structure and the node attributes. In real-world networks, there are many uncertain and noisy attributes in the graph. In this paper, we present how we detect communities in graphs with uncertain attributes. In the first step, numerical, probabilistic and evidential attributes are generated according to the graph structure. In the second step, noise is added to the attributes. We perform experiments on graphs with different types of attributes and compare the detection results in terms of Normalized Mutual Information (NMI) values. The experimental results show that clustering with evidential attributes gives better results than clustering with probabilistic or numerical attributes. This illustrates the advantages of evidential attributes.

Comment: 20th International Conference on Information Fusion, Jul 2017, Xi'an, China
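For readers unfamiliar with the evaluation measure named in this abstract, the following is a minimal Python sketch of Normalized Mutual Information between two partitions. It assumes normalization by the arithmetic mean of the two entropies, which is one of several common variants; the abstract does not say which variant the authors use. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a labelling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labellings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalized by the arithmetic mean of the entropies (assumed variant)."""
    if len(a) != len(b) or not a:
        raise ValueError("labellings must be non-empty and of equal length")
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0.0:
        return 1.0  # both labellings put every item in a single cluster
    return 2.0 * mutual_information(a, b) / (h_a + h_b)


# Hypothetical ground-truth and detected community labels.
truth = [0, 0, 1, 1]
detected = [1, 1, 0, 0]
score = nmi(truth, detected)
```

The score does not depend on how the clusters are named, so a detected partition that matches the ground truth up to a relabelling still receives the maximum value.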

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-Rennes 1