23,366 research outputs found

Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches

Author: A Clauset
A Clauset
A Friggeri
A Lancichinetti
A Lancichinetti
A Van Raan
Alexander Struck
B Ball
C Lee
C Lee
D Sullivan
F Havemann
F Havemann
F Janssens
F Janssens
F Radicchi
Frank Havemann
G Tibély
H Small
IV Marshakova
J Baumes
J Baumes
J Gläser
J Xie
Jochen Gläser
M Rosvall
M Sales-Pardo
M Zitt
Michael Heinz
O Amsterdamska
O Mitesser
R Klavans
Renaud Lambiotte
S Fortunato
S Ghosh
S Gregory
S Gregory
T Evans
V Blondel
W Zachary
X Wang
Y Ahn
Y Kim
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 26/07/2011

We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three pre-defined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields. Comment: 18 pages, 9 figures

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Electricity load profile classification using Fuzzy C-Means method

Author: Bradley D.
King David J.
Ozveren C. S.
Prahastono Iswan
Publication date: 01/01/2008

This paper presents the Fuzzy C-Means (FCM) clustering method. The FCM technique assigns a degree of membership for each data set to several clusters, thus offering the opportunity to deal with load profiles that could belong to more than one group at the same time. The FCM algorithm is based on minimising a c-means objective function to determine an optimal classification. The simulation of FCM was carried out using actual sample data from Indonesia and the results are presented. Some validity index measurements were carried out to estimate the compactness of the resulting clusters or to find the optimal number of clusters for a data set.
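The abstract above does not give the update equations, so the following is only a minimal sketch of the standard fuzzy c-means iteration (alternating weighted-mean centres and inverse-distance memberships), assuming Euclidean distance and a fuzzifier `m = 2`. The function names `fcm` and `partition_coefficient`, the toy points, and all parameter defaults are illustrative assumptions, not taken from the paper. The partition coefficient is included as one common example of a validity index; the paper may use different ones.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Minimal fuzzy c-means sketch (assumed standard formulation).

    points: list of equal-length numeric tuples
    c:      number of clusters
    m:      fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships), memberships[i][k] in [0, 1].
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d2 = [
                max(sum((points[i][d] - centers[k][d]) ** 2
                        for d in range(dim)), 1e-12)
                for k in range(c)
            ]
            inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift,
                        max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        if shift < tol:  # memberships have stopped changing
            break

    return centers, u


def partition_coefficient(u):
    """Mean of squared memberships: 1/c for a fully fuzzy partition, 1 for a crisp one."""
    return sum(v * v for row in u for v in row) / len(u)


# Toy data (hypothetical): two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fcm(points, c=2)
pc = partition_coefficient(u)
```

Running the sketch for several values of `c` and comparing the resulting validity index is one simple way to look for a suitable number of clusters.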

Abertay Research Portal

Possibilistic and fuzzy clustering methods for robust analysis of non-precise data

Author: Ferraro Maria Brigida
Giordani Paolo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

This work focuses on robust clustering of data affected by imprecision. The imprecision is managed in terms of fuzzy sets. The clustering process is based on the fuzzy and possibilistic approaches. In both approaches the observations are assigned to the clusters by means of membership degrees. In fuzzy clustering the membership degrees express the degree to which each observation is shared among the clusters. In contrast, in possibilistic clustering the membership degrees are degrees of typicality. These two sources of information are complementary because the former helps to discover the best fuzzy partition of the observations while the latter reflects how well the observations are described by the centroids and, therefore, is helpful to identify outliers. First, a fully possibilistic k-means clustering procedure is suggested. Then, in order to exploit the benefits of both approaches, a joint possibilistic and fuzzy clustering method for fuzzy data is proposed. A selection procedure for choosing the parameters of the new clustering method is introduced. The effectiveness of the proposal is investigated by means of simulated and real-life data.
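To illustrate the contrast between degrees of sharing and degrees of typicality, here is a small sketch on crisp (non-fuzzy) points. It assumes the classical possibilistic c-means typicality formula with a per-cluster scale parameter `eta`; the abstract does not state which formulas the authors use, and their method targets fuzzy data, so the function names, the formula choice, and the toy values are all assumptions made for illustration.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Degrees of sharing: relative closeness to each centre, summing to 1."""
    d2 = [
        max(sum((p - c) ** 2 for p, c in zip(point, ctr)), 1e-12)
        for ctr in centers
    ]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def typicalities(point, centers, eta, m=2.0):
    """Degrees of typicality: absolute closeness to each centre, no sum constraint.

    eta[k] is an assumed scale parameter for cluster k (roughly its squared radius).
    """
    out = []
    for ctr, e in zip(centers, eta):
        d2 = sum((p - c) ** 2 for p, c in zip(point, ctr))
        out.append(1.0 / (1.0 + (d2 / e) ** (1.0 / (m - 1.0))))
    return out


# Hypothetical centres and scales.
centers = [(0.0, 0.0), (10.0, 0.0)]
eta = [1.0, 1.0]

# A far-away point that is equidistant from both centres.
outlier = (5.0, 100.0)
outlier_fuzzy = fuzzy_memberships(outlier, centers)
outlier_typ = typicalities(outlier, centers, eta)

# A point close to the first centre.
inlier = (0.0, 0.5)
inlier_typ = typicalities(inlier, centers, eta)
```

In this sketch the far-away point still receives equal fuzzy memberships in both clusters, because they must sum to one, while its typicalities are close to zero for both. That difference is what makes typicality useful for flagging outliers.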

Archivio della ricerca - Università di Roma La Sapienza

clValid: An R Package for Cluster Validation

Author: Guy Brock
Somnath Datta
Susmita Datta
Vasyl Pihur

The R package clValid contains functions for validating the results of a clustering analysis. There are three main types of cluster validation measures available, "internal", "stability", and "biological". The user can choose from nine clustering algorithms in existing R packages, including hierarchical, K-means, self-organizing maps (SOM), and model-based clustering. In addition, we provide a function to perform the self-organizing tree algorithm (SOTA) method of clustering. Any combination of validation measures and clustering methods can be requested in a single function call. This allows the user to simultaneously evaluate several clustering algorithms while varying the number of clusters, to help determine the most appropriate method and number of clusters for the dataset of interest. Additionally, the package can automatically make use of the biological information contained in the Gene Ontology (GO) database to calculate the biological validation measures, via the annotation packages available in Bioconductor. The function returns an object of S4 class "clValid", which has summary, plot, print, and additional methods which allow the user to display the optimal validation scores and extract clustering results.

Research Papers in Economics

Dynamic Fuzzy c-Means (dFCM) Clustering and its Application to Calorimetric Data Reconstruction in High Energy Physics

Author: Agostinelli
Allison
Ambriolaa
Chattopadhyaya
Dave
Fan
Freeman
Jeans
Mjahed
Muller
Nock
Pal
Pal
Radha Pyari Sandhir
Salvatore
Sanjib Muhuri
Suliman
Tapan K. Nayak
Wang
Whiteson
Xie
Yeung
Yu
Zadeh
Publication venue: 'Elsevier BV'
Publication date: 16/04/2012

In high energy physics experiments, calorimetric data reconstruction requires a suitable clustering technique in order to obtain accurate information about the shower characteristics such as position of the shower and energy deposition. Fuzzy clustering techniques have high potential in this regard, as they assign data points to more than one cluster, thereby acting as a tool to distinguish between overlapping clusters. Fuzzy c-means (FCM) is one such clustering technique that can be applied to calorimetric data reconstruction. However, it has a drawback: it cannot easily identify and distinguish clusters that are not uniformly spread. A version of the FCM algorithm called dynamic fuzzy c-means (dFCM) allows clusters to be generated and eliminated as required, with the ability to resolve non-uniformly distributed clusters. Both the FCM and dFCM algorithms have been studied and successfully applied to simulated data of a sampling tungsten-silicon calorimeter. It is seen that the FCM technique works reasonably well, and at the same time, the use of the dFCM technique improves the performance. Comment: 15 pages, 10 figures. It is accepted for publication in NIM

arXiv.org e-Print Archive

Crossref

Capturing and Treating Unobserved Heterogeneity by Response Based Segmentation in PLS Path Modeling. A Comparison of Alternative Methods by Computational Experiments

Author: Esposito Vinzi Vincenzo
Ringle Christian M.
Squillacciotti Silvia
Trinchera Laura

Segmentation in the PLS path modeling framework is a critical issue in the social sciences. The assumption that data is collected from a single homogeneous population is often unrealistic. Sequential clustering techniques at the manifest-variable level are ineffective at accounting for heterogeneity in path model estimates. Three statistical approaches related to PLS path modeling have been developed as solutions to this problem. The purpose of this paper is to present a study on sets of simulated data with different characteristics that allows a primary assessment of these methodologies. Keywords: Partial Least Squares; Path Modeling; Unobserved Heterogeneity

Research Papers in Economics

Fuzzy clustering with volume prototypes and adaptive cluster merging

Author: Kaymak U
Setnes M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002

Two extensions to the objective function-based fuzzy clustering are proposed. First, the (point) prototypes are extended to hypervolumes, whose size can be fixed or can be determined automatically from the data being clustered. It is shown that clustering with hypervolume prototypes can be formulated as the minimization of an objective function. Second, a heuristic cluster merging step is introduced where the similarity among the clusters is assessed during optimization. Starting with an overestimation of the number of clusters in the data, similar clusters are merged in order to obtain a suitable partitioning. An adaptive threshold for merging is proposed. The extensions proposed are applied to Gustafson–Kessel and fuzzy c-means algorithms, and the resulting extended algorithm is given. The properties of the new algorithm are illustrated by various examples.
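The merging step described above can be sketched roughly as follows. This is not the authors' similarity measure or threshold rule, neither of which is given in the abstract. It assumes a fuzzy Jaccard similarity between the membership columns of two clusters and a fixed, caller-supplied threshold; the names `jaccard_similarity` and `merge_most_similar` and the toy membership matrix are hypothetical.

```python
def jaccard_similarity(u, i, j):
    """Fuzzy Jaccard similarity of clusters i and j.

    u[k][i] is the membership of data point k in cluster i.
    """
    num = sum(min(row[i], row[j]) for row in u)
    den = sum(max(row[i], row[j]) for row in u)
    return num / den if den > 0 else 0.0


def merge_most_similar(u, threshold):
    """Merge the most similar pair of clusters if its similarity exceeds threshold.

    Returns (memberships, merged_flag). The merged cluster is appended as the
    last column, and its membership is the sum of the two originals, so every
    row still sums to 1.
    """
    c = len(u[0])
    best, pair = -1.0, None
    for i in range(c):
        for j in range(i + 1, c):
            s = jaccard_similarity(u, i, j)
            if s > best:
                best, pair = s, (i, j)

    if pair is None or best <= threshold:
        return u, False

    i, j = pair
    merged = []
    for row in u:
        new_row = [v for k, v in enumerate(row) if k not in (i, j)]
        new_row.append(row[i] + row[j])
        merged.append(new_row)
    return merged, True


# Hypothetical memberships: clusters 0 and 1 overlap heavily, cluster 2 is distinct.
u0 = [
    [0.45, 0.45, 0.10],
    [0.40, 0.50, 0.10],
    [0.05, 0.05, 0.90],
]
u1, changed1 = merge_most_similar(u0, threshold=0.5)
u2, changed2 = merge_most_similar(u1, threshold=0.5)
```

In a full algorithm this check would run between optimisation iterations, starting from a deliberately large number of clusters and stopping once no pair exceeds the threshold.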

University of Salford Institutional Repository

Crossref

Repository TU/e

EUR Research Repository