Bipartite graph for topic extraction

Author: Faleiros Thiago de Paulo
Lopes Alneu de Andrade
Publication venue: Buenos Aires

This article presents a bipartite graph propagation method that can be applied to different unsupervised machine learning tasks, such as topic extraction and clustering. We introduce the objectives and hypothesis that motivate the use of a graph-based method, and we give the intuition behind the proposed Bipartite Graph Propagation Algorithm. The contribution of this study is a new method that makes it easier to use heuristic knowledge to discover topics in textual data than is possible in the traditional mathematical formalism based on Latent Dirichlet Allocation (LDA). Initial experiments demonstrate that our Bipartite Graph Propagation algorithm returns good results in a static context (offline algorithm). Our research now focuses on large amounts of data and a dynamic context (online algorithm).

São Paulo Research Foundation (FAPESP) (proj. number 2011/23689-9)

A Framework for Comparing Groups of Documents

Author: Maiya Arun S.
Publication date: 01/01/2015

We present a general framework for comparing multiple groups of documents. A bipartite graph model is proposed where document groups are represented as one node set and the comparison criteria are represented as the other node set. Using this model, we present basic algorithms to extract insights into similarities and differences among the document groups. Finally, we demonstrate the versatility of our framework through an analysis of NSF funding programs for basic research.

Comment: 6 pages; 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP '15)
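The bipartite model described in this abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the author's framework: the group names, criteria, weights, and the helper functions `build_bipartite`, `shared_criteria`, and `distinctive_criteria` are all hypothetical.

```python
from collections import defaultdict


def build_bipartite(edges):
    """Build a group -> {criterion: weight} map from (group, criterion, weight) triples.

    Document groups form one node set, comparison criteria the other;
    each triple is a weighted edge between the two sets.
    """
    group_to_criteria = defaultdict(dict)
    for group, criterion, weight in edges:
        group_to_criteria[group][criterion] = weight
    return dict(group_to_criteria)


def shared_criteria(graph, a, b):
    """Criteria linked to both groups (a simple notion of similarity)."""
    return sorted(set(graph[a]) & set(graph[b]))


def distinctive_criteria(graph, a, b):
    """Criteria linked to group a but not to group b (a simple notion of difference)."""
    return sorted(set(graph[a]) - set(graph[b]))


# Hypothetical toy data: two document groups and three comparison criteria.
edges = [
    ("group_A", "machine learning", 0.6),
    ("group_A", "graph theory", 0.4),
    ("group_B", "machine learning", 0.3),
    ("group_B", "genomics", 0.7),
]
graph = build_bipartite(edges)
similar = shared_criteria(graph, "group_A", "group_B")
different = distinctive_criteria(graph, "group_A", "group_B")
```

Set intersection and difference over the criterion node set are the simplest possible comparison; the algorithms in the paper itself may differ.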

arXiv.org e-Print Archive

CiteSeerX

Crossref

Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches

Author: A Clauset
A Friggeri
A Lancichinetti
A Van Raan
Alexander Struck
B Ball
C Lee
D Sullivan
F Havemann
F Janssens
F Radicchi
Frank Havemann
G Tibély
H Small
IV Marshakova
J Baumes
J Gläser
J Xie
Jochen Gläser
M Rosvall
M Sales-Pardo
M Zitt
Michael Heinz
O Amsterdamska
O Mitesser
R Klavans
Renaud Lambiotte
S Fortunato
S Ghosh
S Gregory
T Evans
V Blondel
W Zachary
X Wang
Y Ahn
Y Kim
Publication venue: Public Library of Science (PLoS)
Publication date: 26/07/2011

We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three pre-defined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields.

Comment: 18 pages, 9 figures
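A network of papers "coupled via their cited sources" can be illustrated with a minimal sketch. It assumes that coupling strength is the raw count of shared cited sources; the abstract does not say which weighting the authors use, and the paper identifiers, reference lists, and the function `coupling_network` are hypothetical.

```python
from itertools import combinations


def coupling_network(references):
    """Return {(paper_a, paper_b): number_of_shared_cited_sources}.

    `references` maps each paper to the sources it cites. Two papers are
    linked when they cite at least one source in common.
    """
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared:
            edges[(a, b)] = shared
    return edges


# Hypothetical toy data: three papers and the sources they cite.
references = {
    "paper_1": ["src_A", "src_B", "src_C"],
    "paper_2": ["src_B", "src_C", "src_D"],
    "paper_3": ["src_E"],
}
network = coupling_network(references)
```

In this toy data only `paper_1` and `paper_2` are linked; `paper_3` shares no cited source with the others and stays isolated. Cluster algorithms such as those compared in the paper would then run on the resulting weighted graph.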

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Overlapping Community Detection Optimization and Nash Equilibrium

Author: Crampes Michel
Plantié Michel
Publication date: 01/01/2014

Community detection in graphs and social networks is the focus of many algorithms. Recent methods that optimize the so-called modularity function proceed by maximizing relations within communities while minimizing inter-community relations. However, given the NP-completeness of the problem, these algorithms are heuristics that do not guarantee an optimum. In this paper, we introduce a new algorithm along with a function that takes an approximate solution and modifies it in order to reach an optimum. This reassignment function is considered a 'potential function' and becomes a necessary condition for asserting that the computed optimum is indeed a Nash Equilibrium. We also use this function to simultaneously show partitioning and overlapping communities, two detection and visualization modes of great value in revealing interesting features of a social network. Our approach is successfully illustrated through several experiments on real unipartite, multipartite, and directed graphs from medium and large datasets.

Comment: Submitted to KD
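As a rough illustration of the ideas in this abstract, the sketch below computes the standard Newman modularity of a partition of an undirected, unweighted graph and then greedily reassigns single nodes until no node can raise modularity by moving alone. That stopping condition resembles a Nash-style stability criterion, but this is an assumption-laden toy, not the authors' algorithm or their reassignment function; the graph, the starting partition, and the function names are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity: sum over communities c of L_c/m - (d_c/(2m))**2.

    L_c is the number of edges inside c, d_c the total degree of its nodes,
    and m the number of edges in the graph.
    """
    m = len(edges)
    internal, total_degree = {}, {}
    for u, v in edges:
        for node in (u, v):
            c = community[node]
            total_degree[c] = total_degree.get(c, 0) + 1
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in total_degree.items())


def improve_by_reassignment(edges, community):
    """Move one node at a time to the community that most increases modularity.

    Stops when no single-node move yields a strict improvement.
    """
    community = dict(community)
    improved = True
    while improved:
        improved = False
        for node in sorted(community):
            best_label = community[node]
            best_q = modularity(edges, community)
            for label in set(community.values()):
                trial = dict(community)
                trial[node] = label
                q = modularity(edges, trial)
                if q > best_q + 1e-12:  # strict improvement only
                    best_label, best_q = label, q
            if best_label != community[node]:
                community[node] = best_label
                improved = True
    return community


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
start = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}  # node 2 starts in the wrong group
final = improve_by_reassignment(edges, start)
```

Because every accepted move strictly increases modularity and the number of partitions is finite, the loop terminates. Recomputing modularity from scratch for each trial move is quadratic and only suitable for toy graphs.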

arXiv.org e-Print Archive