Median evidential c-means algorithm and its application to community detection
Median clustering is of great value for partitioning relational data. This
paper proposes a new prototype-based clustering method, called Median Evidential
C-Means (MECM), which extends median c-means and median fuzzy
c-means within the theoretical framework of belief functions. The
median variant relaxes the restriction of a metric space embedding for the
objects but constrains the prototypes to be in the original data set. Due to
these properties, MECM could be applied to graph clustering problems. A
community detection scheme for social networks based on MECM is investigated
and the obtained credal partitions of graphs, which are more refined than crisp
and fuzzy ones, enable us to have a better understanding of the graph
structures. An initial prototype-selection scheme based on evidential
semi-centrality is presented to avoid premature local convergence, and an
evidential modularity function is defined to choose the optimal number of
communities. Finally, experiments on synthetic and real data sets illustrate
the performance of MECM and show how it differs from other methods.
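
The median constraint described in the abstract above (prototypes must be objects of the data set, and only pairwise dissimilarities are needed) can be illustrated with a minimal sketch of plain hard median c-means. This is not MECM itself: it has no belief functions or credal partition, and the function names `assign` and `median_c_means` as well as the toy data are assumptions made for illustration.

```python
def assign(D, prototypes):
    """Give each object the index of its closest prototype."""
    return [
        min(range(len(prototypes)), key=lambda k: D[i][prototypes[k]])
        for i in range(len(D))
    ]


def median_c_means(D, prototypes, max_iter=100):
    """Hard median c-means on a dissimilarity matrix D (list of lists).

    `prototypes` holds indices of data objects, so no metric-space
    embedding of the objects is required (illustrative sketch only).
    """
    prototypes = list(prototypes)
    n = len(D)
    for _ in range(max_iter):
        labels = assign(D, prototypes)
        updated = []
        for k, old in enumerate(prototypes):
            members = [i for i in range(n) if labels[i] == k]
            if not members:
                updated.append(old)  # keep the prototype of an empty cluster
                continue
            # New prototype: the data object with the smallest total
            # dissimilarity to the members of cluster k.
            updated.append(
                min(range(n), key=lambda c: sum(D[i][c] for i in members))
            )
        if updated == prototypes:
            break
        prototypes = updated
    return prototypes, assign(D, prototypes)


# Toy relational data: dissimilarities between six objects.
points = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in points] for a in points]
prototypes, labels = median_c_means(D, prototypes=[0, 1])
```

Only the matrix `D` is used by the algorithm; the coordinates in `points` serve solely to build it, which is why the same loop applies to graph dissimilarities.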
A similarity-based community detection method with multiple prototype representation
Communities are of great importance for understanding graph structures in
social networks. Some existing community detection algorithms use a single
prototype to represent each group. In real applications, this may not
adequately model the different types of communities and hence limits the
clustering performance on social networks. To address this problem, a
Similarity-based Multi-Prototype (SMP) community detection approach is proposed
in this paper. In SMP, vertices in each community carry various weights to
describe their degree of representativeness. This mechanism enables each
community to be represented by more than one node. The centrality of nodes is
used to calculate prototype weights, while similarity is used to guide
the partitioning of the graph. Experimental results on computer-generated and
real-world networks clearly show that SMP performs well for detecting
communities. Moreover, with the help of prototype weights, the method can
provide richer information about the inner structure of the detected
communities than existing community detection models.
General fuzzy min-max neural network for clustering and classification
This paper describes a general fuzzy min-max (GFMM) neural network which is a generalization and extension of the fuzzy min-max clustering and classification algorithms of Simpson (1992, 1993). The GFMM method combines supervised and unsupervised learning in a single training algorithm. The fusion of clustering and classification results in an algorithm that can be used for pure clustering, pure classification, or hybrid clustering-classification. It finds decision boundaries between classes while clustering patterns that cannot be said to belong to any of the existing classes. As in the original algorithms, hyperbox fuzzy sets are used to represent clusters and classes. Learning is usually completed in a few passes and consists of placing and adjusting the hyperboxes in the pattern space; this is an expansion-contraction process. The classification results can be crisp or fuzzy. New data can be included without the need for retraining. While retaining all the interesting features of the original algorithms, a number of modifications to their definition have been made in order to accommodate fuzzy input patterns in the form of lower and upper bounds, combine supervised and unsupervised learning, and improve the effectiveness of operations. A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
Adaptive imputation of missing values for incomplete pattern classification
In the classification of incomplete patterns, the missing values can either play a
crucial role in the class determination, or have little or no influence on the
classification results, depending on the context. We
propose a credal classification method for incomplete patterns with adaptive
imputation of missing values based on belief function theory. First, we try
to classify the object (incomplete pattern) based only on the available
attribute values. As an underlying principle, we assume that the missing
information is not crucial for the classification if a specific class for the
object can be found using only the available information. In this case, the
object is committed to this particular class. However, if the object cannot be
classified without ambiguity, the missing values play a major role
in achieving an accurate classification. In this case, the missing values are
imputed based on the K-nearest neighbor (K-NN) and self-organizing map (SOM)
techniques, and the edited pattern with the imputation is then classified. The
(original or edited) pattern is classified with respect to each
training class, and the classification results, represented by basic belief
assignments, are fused with proper combination rules to make the credal
classification. The object is allowed to belong with different masses of belief
to the specific classes and meta-classes (which are particular disjunctions of
several single classes). The credal classification captures the
uncertainty and imprecision of classification well, and effectively reduces the rate
of misclassifications thanks to the introduction of meta-classes. The
effectiveness of the proposed method with respect to other classical methods is
demonstrated through several experiments using artificial and real data sets.
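
The imputation step mentioned in the abstract above can be illustrated with a minimal sketch of plain K-NN imputation. It covers only the generic K-NN idea, not the paper's combination with SOM or its belief-function fusion; the function name `knn_impute`, the use of `None` for missing values, and the toy data are assumptions made for illustration.

```python
import math


def knn_impute(pattern, training, k=3):
    """Fill the missing entries (None) of `pattern` with the mean of the
    corresponding attribute over its k nearest complete training patterns.

    Distances use only the attributes observed in `pattern`
    (illustrative sketch, Euclidean distance assumed).
    """
    observed = [j for j, x in enumerate(pattern) if x is not None]

    def distance(row):
        return math.sqrt(sum((pattern[j] - row[j]) ** 2 for j in observed))

    neighbors = sorted(training, key=distance)[:k]
    return [
        x if x is not None else sum(row[j] for row in neighbors) / len(neighbors)
        for j, x in enumerate(pattern)
    ]


# Toy complete training patterns and one incomplete pattern.
training = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [10.0, 100.0]]
edited = knn_impute([2.0, None], training, k=3)
```

In the adaptive scheme described above, such an imputation would only be triggered when the available attributes alone do not yield an unambiguous class.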
Evidential Label Propagation Algorithm for Graphs
Community detection has attracted considerable attention across many areas
as it can be used for discovering the structure and features of complex
networks. With the increasing size of social networks in the real world, community
detection approaches should be fast and accurate. The Label Propagation
Algorithm (LPA) is known to be one of the near-linear solutions and benefits from
easy implementation; it thus forms a good basis for efficient community
detection methods. In this paper, we extend the update rule and propagation
criterion of LPA in the framework of belief functions. A new community
detection approach, called Evidential Label Propagation (ELP), is proposed as
an enhanced version of conventional LPA. The node influence is first defined to
guide the propagation process. The plausibility is used to determine the domain
label of each node. The update order of nodes is discussed to improve the
robustness of the method. The ELP algorithm converges once the domain labels
of all the nodes no longer change. Finally, the mass assignments are calculated
as memberships of nodes. The overlapping nodes and outliers can be detected
simultaneously through the proposed method. The experimental results
demonstrate the effectiveness of ELP. Comment: 19th International Conference on Information Fusion, Jul 2016,
Heidelberg, Germany
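
For orientation, the conventional LPA that the abstract above takes as its starting point can be sketched as follows. This is the plain algorithm, not ELP: it uses neither node influence nor plausibility, and the function name `label_propagation`, the adjacency-dict input format, and the rule of keeping the current label on ties are assumptions made for illustration.

```python
import random
from collections import Counter


def label_propagation(adj, max_sweeps=100, seed=0):
    """Conventional asynchronous label propagation (illustrative sketch).

    `adj` maps each node to a list of its neighbors. Every node starts
    with its own label and repeatedly adopts the most frequent label
    among its neighbors until no label changes.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # random update order in each sweep
        changed = False
        for v in nodes:
            if not adj[v]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            if labels[v] in candidates:
                continue  # current label is already a most frequent one
            labels[v] = rng.choice(candidates)  # break ties at random
            changed = True
        if not changed:
            break
    return labels


# Two disconnected triangles: each should end up with a single label.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
communities = label_propagation(graph)
```

Each sweep touches every edge once, which is where the near-linear cost mentioned in the abstract comes from.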
Evidential Evolving Gustafson-Kessel Algorithm For Online Data Streams Partitioning Using Belief Function Theory.
A new online clustering method called E2GK (Evidential Evolving Gustafson-Kessel) is introduced. This partitional clustering algorithm is based on the concept of credal partition defined in the theoretical framework of belief functions. A credal partition is derived online by applying an algorithm resulting from the adaptation of the Evolving Gustafson-Kessel (EGK) algorithm. Online partitioning of data streams is then possible with a meaningful interpretation of the data structure. A comparative study with the original online procedure shows that E2GK outperforms EGK on different input data sets. To show the performance of E2GK, several experiments have been conducted on synthetic data sets as well as on data collected from a real application problem. A study of parameter sensitivity is also carried out and solutions are proposed to limit complexity issues.
E2GK : Evidential evolving Gustafsson-Kessel algorithm for data streams partitioning using belief functions.
A new online clustering method, called E2GK (Evidential Evolving Gustafson-Kessel), is introduced in the theoretical framework of belief functions. The algorithm enables online partitioning of data streams based on two existing and efficient algorithms: Evidential c-Means (ECM) and Evolving Gustafson-Kessel (EGK). E2GK uses the concept of credal partition of ECM and adapts EGK, offering a better interpretation of the data structure. Experiments with synthetic data sets show good performance of the proposed algorithm compared to the original online procedure.
Land cover classification using fuzzy rules and aggregation of contextual information through evidence theory
Land cover classification using multispectral satellite images is a very
challenging task with numerous practical applications. We propose a multi-stage
classifier that involves fuzzy rule extraction from the training data and then
generation of a possibilistic label vector for each pixel using the fuzzy rule
base. To exploit the spatial correlation of land cover types we propose four
different information aggregation methods which use the possibilistic class
label of a pixel and those of its eight spatial neighbors for making the final
classification decision. Three of the aggregation methods use Dempster-Shafer
theory of evidence while the remaining one is modeled after the fuzzy k-NN
rule. The proposed methods are tested with two benchmark seven-channel
satellite images and the results are found to be quite satisfactory. They are
also compared with a Markov random field (MRF) model-based contextual
classification method and found to perform consistently better. Comment: 14 pages, 2 figures
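
Evidence aggregation in Dempster-Shafer theory, referred to in the abstract above, is commonly done with Dempster's rule of combination. The sketch below shows that standard rule on two mass functions; it does not reproduce the paper's specific aggregation methods, and the function name `dempster_combine` and the land cover labels are assumptions made for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal
    element) to its mass; the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = (
                    combined.get(intersection, 0.0) + mass_a * mass_b
                )
            else:
                conflict += mass_a * mass_b  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize so that the remaining masses sum to 1.
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}


# Hypothetical evidence about one pixel from two sources.
water = frozenset({"water"})
forest = frozenset({"forest"})
either = water | forest
pixel = {water: 0.6, either: 0.4}
neighbor = {forest: 0.5, either: 0.5}
fused = dempster_combine(pixel, neighbor)
```

Here the conflicting mass is 0.6 × 0.5 = 0.3, so the fused masses are 3/7 for water, 2/7 for forest, and 2/7 for the set containing both.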
Clustering as an example of optimizing arbitrarily chosen objective functions
This paper reflects on the common practice of solving various types of learning problems by optimizing arbitrarily chosen criteria in the hope that they are well correlated with the criterion actually used to assess the results. The issue is investigated using clustering as an example. A unified view of clustering as an optimization problem is first proposed, stemming from the belief that typical design choices in clustering, such as the number of clusters or the similarity measure, can be, and often are, suboptimal, also from the point of view of the clustering quality measures later used for algorithm comparison and ranking. To illustrate the point, we propose a generalized clustering framework and provide a proof of concept using standard benchmark datasets and two popular clustering methods for comparison.