19,661 research outputs found

Clustering Memes in Social Media

Author: Becker H.
Metaxas P.
Ratkiewicz J.
Reisinger J.
Sayyadi H.
Simmons M.
Yih W.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/10/2013

The increasing pervasiveness of social media creates new opportunities to study human social behavior, while challenging our capability to analyze their massive data streams. One of the emerging tasks is to distinguish between different kinds of activities, for example engineered misinformation campaigns versus spontaneous communication. Such detection problems require a formal definition of meme, or unit of information that can spread from person to person through the social network. Once a meme is identified, supervised learning methods can be applied to classify different types of communication. The appropriate granularity of a meme, however, is hardly captured from existing entities such as tags and keywords. Here we present a framework for the novel task of detecting memes by clustering messages from large streams of social data. We evaluate various similarity measures that leverage content, metadata, network features, and their combinations. We also explore the idea of pre-clustering on the basis of existing entities. A systematic evaluation is carried out using a manually curated dataset as ground truth. Our analysis shows that pre-clustering and a combination of heterogeneous features yield the best trade-off between number of clusters and their quality, demonstrating that a simple combination based on pairwise maximization of similarity is as effective as a non-trivial optimization of parameters. Our approach is fully automatic, unsupervised, and scalable for real-time detection of memes in streaming data.

Comment: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM'13), 201
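The "pairwise maximization of similarity" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the message fields (`words`, `users`), the choice of Jaccard similarity, and the function names are all assumptions made for illustration. The idea shown is only that, for each pair of messages, the combined score is the largest of the per-feature similarities.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(msg_a, msg_b):
    """Combine per-feature similarities by taking their maximum.

    The two features used here (shared words, shared users) are
    hypothetical stand-ins for content and network features.
    """
    return max(
        jaccard(msg_a["words"], msg_b["words"]),  # content feature
        jaccard(msg_a["users"], msg_b["users"]),  # network feature
    )


# Two toy messages: they share one of three distinct words and no users.
m1 = {"words": {"vote", "election"}, "users": {"alice", "bob"}}
m2 = {"words": {"vote", "poll"}, "users": {"carol"}}
score = combined_similarity(m1, m2)
print(round(score, 3))
```

A score like this could then feed any clustering routine that accepts pairwise similarities; the abstract does not say which one the authors used.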

arXiv.org e-Print Archive

Crossref

Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches

Author: A Clauset
A Friggeri
A Lancichinetti
A Van Raan
Alexander Struck
B Ball
C Lee
D Sullivan
F Havemann
F Janssens
F Radicchi
Frank Havemann
G Tibély
H Small
IV Marshakova
J Baumes
J Gläser
J Xie
Jochen Gläser
M Rosvall
M Sales-Pardo
M Zitt
Michael Heinz
O Amsterdamska
O Mitesser
R Klavans
Renaud Lambiotte
S Fortunato
S Ghosh
S Gregory
T Evans
V Blondel
W Zachary
X Wang
Y Ahn
Y Kim
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 26/07/2011

We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three pre-defined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields.

Comment: 18 pages, 9 figure
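A network of papers "coupled via their cited sources" can be sketched as follows. This is an illustrative assumption about how such a network might be built, not the authors' code: the function name, the input format (a dict mapping each paper to its set of cited sources), and the use of the raw shared-reference count as edge weight are all hypothetical.

```python
from itertools import combinations


def coupling_edges(references):
    """Build weighted edges between papers that share cited sources.

    references: dict mapping a paper id to the set of sources it cites.
    Returns a dict mapping (paper_a, paper_b) to the number of shared
    sources; pairs with nothing in common get no edge.
    """
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            edges[(a, b)] = shared
    return edges


# Toy data: p1 and p2 share two sources, p3 shares none.
refs = {
    "p1": {"r1", "r2", "r3"},
    "p2": {"r2", "r3"},
    "p3": {"r9"},
}
edges = coupling_edges(refs)
print(edges)
```

The resulting weighted graph is the kind of input a community-detection or hierarchical clustering algorithm would take; the abstract does not specify how the edge weights were normalised.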

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Search based software engineering: Trends, techniques and applications

Author: Adamopoulos K.
Afzal W.
Aguilar
Alander J. T.
Alba E.
Amoui M.
Antoniol G.
Arcuri A.
Aversano L.
Bodhuin T.
Bouktif S.
Canfora G.
Chang C. K.
Chao C.
Chicano F.
Clark J. A.
Cortellessa V.
Cowan G. S.
Dolado J. J.
Doval D.
Dozier G.
El-Faki H K.
Erformat M.
Evett M. P.
Fatiregun D.
Feather M. S.
Feldt R.
Ferreira M.
Funes P.
Gross H.-G.
Harman M.
Hart J.
He P.
Hodjat B.
Jaeger M. C.
Jarillo G.
Jiang H.
Joshi A. M.
Katz G.
Khoshgoftaar T. M.
Kirsopp C.
Lefley M.
Li C.
Liu Y.
Mahanti P. K.
Mahdavi K.
Mancoridis S.
Mark Harman
Minohara T.
Mitchell B. S.
Monnier Y.
Nguyen C.
Pohlheim H.
Raiha O.
Ruhe G.
S. Afshin Mansouri
Sahraoui H. A.
Shan Y.
Shepperd M.
Shyang W.
Simons C. L.
Stephenson M.
Su S.
van Belle T.
Van Den Akker M.
Vivanco R.
Wang Z.
Wegener J.
Yoo S.
Yuanyuan Zhang
Zhang X.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2012

© ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version is available from the link below.

In the past five years there has been a dramatic increase in work on Search-Based Software Engineering (SBSE), an approach to Software Engineering (SE) in which Search-Based Optimization (SBO) algorithms are used to address problems in SE. SBSE has been applied to problems throughout the SE lifecycle, from requirements and project planning to maintenance and reengineering. The approach is attractive because it offers a suite of adaptive automated and semiautomated solutions in situations typified by large complex problem spaces with multiple competing and conflicting objectives. This article provides a review and classification of literature on SBSE. The work identifies research trends and relationships between the techniques applied and the applications to which they have been applied and highlights gaps in the literature and avenues for further research.

EPSRC and E
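To make the idea of applying a search-based optimization algorithm to an SE problem concrete, here is a minimal sketch. It is not taken from the article: the problem (choosing requirements for a release under a cost budget), the first-improvement hill climbing strategy, and all names and numbers are assumptions chosen for illustration.

```python
def fitness(selection, values, costs, budget):
    """Total value of the selected requirements, or -1 if over budget."""
    cost = sum(c for c, chosen in zip(costs, selection) if chosen)
    if cost > budget:
        return -1
    return sum(v for v, chosen in zip(values, selection) if chosen)


def hill_climb(values, costs, budget):
    """First-improvement hill climbing over single bit flips.

    Starts from the empty selection and accepts any flip that raises
    the fitness, stopping when a full pass finds no improvement.
    """
    current = [False] * len(values)
    best = fitness(current, values, costs, budget)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            neighbour = current.copy()
            neighbour[i] = not neighbour[i]
            score = fitness(neighbour, values, costs, budget)
            if score > best:
                current, best, improved = neighbour, score, True
    return current, best


# Hypothetical requirements: value to customers, cost to build, budget.
values = [10, 7, 4]
costs = [5, 4, 3]
selection, total = hill_climb(values, costs, budget=8)
print(selection, total)
```

Hill climbing can stop at a local optimum, which is one reason work in this area also uses population-based and multi-objective search methods.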

Crossref

UCL Discovery

Brunel University Research Archive

An integrative clustering approach combining particle swarm optimization and formal concept analysis

Author: A. Alizadeh
A. Brazma
E. Tsiporkova
G. Rustici
J. Besson
J. Handl
J. Kennedy
J.K. Choi
M. Kaytoue-Uberall
P. Rousseeuw
S. Maere
T. Golub
V. Boeva
V. Choi
Zhou
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2012

Crossref

Ghent University Academic Bibliography

Benchmarking in cluster analysis: A white paper

Author: Boulesteix Anne-Laure
Dangl Rainer
Dean Nema
Guyon Isabelle
Hennig Christian
Leisch Friedrich
Steinley Douglas
Van Mechelen Iven
Publication venue
Publication date: 01/10/2018

To achieve scientific progress in terms of building a cumulative body of knowledge, careful attention to benchmarking is of the utmost importance. This means that proposals of new methods of data pre-processing, new data-analytic techniques, and new methods of output post-processing, should be extensively and carefully compared with existing alternatives, and that existing methods should be subjected to neutral comparison studies. To date, benchmarking and recommendations for benchmarking have been frequently seen in the context of supervised learning. Unfortunately, there has been a dearth of guidelines for benchmarking in an unsupervised setting, with the area of clustering as an important subdomain. To address this problem, discussion is given to the theoretical conceptual underpinnings of benchmarking in the field of cluster analysis by means of simulated as well as empirical data. Subsequently, the practicalities of how to address benchmarking questions in clustering are dealt with, and foundational recommendations are made.
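One common building block of a clustering benchmark is an agreement score between a method's output and a reference partition. The sketch below computes the plain Rand index; choosing this measure is an assumption for illustration and is not a recommendation taken from the white paper, and the function name and toy labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    items together, or both put them apart. Label values themselves
    do not matter, only the grouping they induce.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


truth = [0, 0, 1, 1]
same_grouping = [1, 1, 0, 0]   # identical partition, labels renamed
other_grouping = [0, 1, 1, 1]  # a different partition
print(rand_index(truth, same_grouping), rand_index(truth, other_grouping))
```

The plain Rand index is not corrected for chance, so benchmark studies often report a chance-adjusted variant alongside or instead of it.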

arXiv.org e-Print Archive

Proceedings - University of Groningen

ARTS repository - University of Groningen

Enlighten

Dissertations of the University of Groningen