33 research outputs found

Comparing the hierarchy of author given tags and repository given tags in a large document archive

Author: Palla Gergely
Pollner Péter
Tibély Gergely
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/06/2015

Folksonomies - large databases arising from collaborative tagging of items by independent users - are becoming an increasingly important way of categorizing information. In these systems users can tag items with free words, resulting in a tripartite item-tag-user network. Although there are no prescribed relations between tags, the way users think about the different categories presumably has some built-in hierarchy, in which more specific concepts are descendants of some more general categories. Several applications would benefit from the knowledge of this hierarchy. Here we apply a recent method to check the differences and similarities of hierarchies resulting from tags given by independent individuals and from tags given by a centrally managed repository system. The results from our method showed substantial differences between the lower parts of the hierarchies and, in contrast, a relatively high similarity at the top of the hierarchies. Comment: 10 pages

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Extracting tag hierarchies

Author: Palla Gergely
Pollner Péter
Tibély Gergely
Vicsek Tamás
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Tagging items with descriptive annotations or keywords is a very natural way to compress and highlight information about the properties of the given entity. Over the years several methods have been proposed for extracting a hierarchy between the tags for systems with a "flat", egalitarian organization of the tags, which is very common when the tags correspond to free words given by numerous independent people. Here we present a complete framework for automated tag hierarchy extraction based on tag occurrence statistics. Along with proposing new algorithms, we also introduce different quality measures enabling the detailed comparison of competing approaches from different aspects. Furthermore, we set up a synthetic, computer-generated benchmark providing a versatile tool for testing, with a couple of tunable parameters capable of generating a wide range of test beds. Besides the computer-generated input, we also use real data in our studies, including a biological example with a pre-defined hierarchy between the tags. The encouraging similarity between the pre-defined and reconstructed hierarchy, as well as the seemingly meaningful hierarchies obtained for other real systems, indicate that tag hierarchy extraction is a very promising direction for further research with great potential for practical applications. Comment: 25 pages with 21 pages of supporting information, 25 figures

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

ELTE Digital Institutional Repository (EDIT)

FigShare

Ontologies and tag-statistics

Author: Palla Gergely
Pollner Péter
Tibély Gergely
Vicsek Tamás
Publication date: 01/01/2012

Due to the increasing popularity of collaborative tagging systems, the research on tagged networks, hypergraphs, ontologies, folksonomies and other related concepts is becoming an important interdisciplinary topic that is both timely and relevant for practical applications. In most collaborative tagging systems the tagging by the users is completely "flat", while in some cases they are allowed to define a shallow hierarchy for their own tags. However, usually no overall hierarchical organisation of the tags is given, and one of the interesting challenges of this area is to provide an algorithm generating the ontology of the tags from the available data. In contrast, there are also other types of tagged networks available for research, where the tags are already organised into a directed acyclic graph (DAG), encapsulating the "is a sub-category of" type of hierarchy between each other. In this paper we study how this DAG affects the statistical distribution of tags on the nodes marked by the tags in various real networks. We analyse the relation between the tag-frequency and the position of the tag in the DAG in two large sub-networks of the English Wikipedia and a protein-protein interaction network. We also study the tag co-occurrence statistics by introducing a 2d tag-distance distribution preserving both the difference in the levels and the absolute distance in the DAG for the co-occurring pairs of tags. Our most interesting finding is that the local relevance of tags in the DAG (i.e., their rank or significance as characterised by, e.g., the length of the branches starting from them) is much more important than their global distance from the root. Furthermore, we also introduce a simple tagging model based on random walks on the DAG, capable of reproducing the main statistical features of tag co-occurrence. Comment: Submitted to New Journal of Physics

arXiv.org e-Print Archive

ELTE Digital Institutional Repository (EDIT)

Társadalomtudományi doktori iskolák társpublikációs hálózatának elemzése [Analysis of the co-publication network of doctoral schools in the social sciences]

Author: Palla Gergely
Sasvári Péter László
Tibély Gergely
Urbanovics Anna
Publication venue: Ludovika Egyetemi Kiadó Nonprofit Kft. – Ludovika Press
Publication date: 01/01/2019

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

Spectrum, Intensity and Coherence in Weighted Networks of a Financial Market

Author: Albert
Caldarelli
Di Matteo
Gergely Tibély
Jari Saramäki
Jukka-Pekka Onnela
János Kertész
Kimmo Kaski
Laloux
Mantegna
Markowitz
Marsili
Onnela
Papp
Plerou
Vandewalle
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006

We construct a correlation-matrix-based financial network for a set of stocks traded on the New York Stock Exchange (NYSE), with stocks corresponding to nodes and the links between them added one after the other, according to the strength of the correlation between the nodes. The eigenvalue spectrum of the correlation matrix reflects the structure of the market, which also shows in the cluster structure of the emergent network. The stronger and more compact a cluster is, the earlier the eigenvalue representing the corresponding business sector occurs in the spectrum. On the other hand, if groups of stocks belonging to a given business sector are considered as a fully connected subgraph of the final network, their intensity and coherence can be monitored as a function of time. This approach indicates to what extent the business sector classifications are visible in market prices, which in turn enables us to gauge the extent of group behaviour exhibited by stocks belonging to a given business sector. Comment: 10 pages, 3 figures
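
A minimal Python sketch of the construction described in this abstract: pairwise correlations are ranked, and links are added strongest first. The Pearson estimator, the function names (`pearson`, `ranked_links`, `network_after`) and the toy return series are assumptions made for illustration; they are not the authors' code or data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranked_links(returns):
    """Return (correlation, stock_a, stock_b) tuples, strongest correlation first."""
    links = [
        (pearson(returns[a], returns[b]), a, b)
        for a, b in combinations(sorted(returns), 2)
    ]
    links.sort(key=lambda link: link[0], reverse=True)
    return links


def network_after(links, k):
    """Adjacency sets of the network after the k strongest links were added."""
    adj = {}
    for _, a, b in links[:k]:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


# Hypothetical toy return series for four stocks (illustrative values only).
toy_returns = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 3.0, 2.0, 4.0],
}

links = ranked_links(toy_returns)
for k in range(1, len(links) + 1):
    print(k, network_after(links, k))
```

Printing the network after each added link shows how clusters of strongly correlated stocks appear before weakly correlated pairs are joined.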

arXiv.org e-Print Archive

CiteSeerX

Crossref

Note on the equivalence of the label propagation method of community detection and a Potts model approach

Author: Albert
Blatt
Clauset
Eriksen
Ferrero
Flom
Freeman
Gergely Tibély
Gonzalez
Guimerà
Hellsten
Jeong
János Kertész
Kumpula
Lambiotte
Massen
Newman
Palla
Radicchi
Raghavan
Ravasz
Reichardt
Rosvall
Spirin
Sundaramurthy
Zachary
Publication venue: 'Elsevier BV'
Publication date: 31/03/2008

We show that the recently introduced label propagation method for detecting communities in complex networks is equivalent to finding the local minima of a simple Potts model. When applied to empirical data, the number of such local minima was found to be very high, much larger than the number of nodes in the graph. The aggregation method for combining information from several local minima shows a tendency to fragment the communities into very small pieces. Comment: 6 pages
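
A minimal Python sketch of label propagation, together with a Potts-type energy that it locally minimises. Several choices here are assumptions rather than details taken from the abstract: the update keeps a node's current label when it ties for the most common neighbour label, the energy counts each edge with equal labels at its endpoints as -1, and the function names and toy graph are hypothetical.

```python
import random
from collections import Counter


def label_propagation(adj, seed=0, max_sweeps=100):
    """Asynchronous label propagation on an undirected graph.

    adj maps each node to the set of its neighbours. Every node starts with
    a unique label and repeatedly adopts the most common label among its
    neighbours, until no label changes during a full sweep.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)
        changed = False
        for v in nodes:
            if not adj[v]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            top = sorted(label for label, c in counts.items() if c == best)
            if labels[v] not in top:  # assumption: keep the current label on ties
                labels[v] = rng.choice(top)
                changed = True
        if not changed:
            break
    return labels


def potts_energy(adj, labels):
    """Minus the number of edges whose two endpoints carry the same label."""
    return -sum(
        1 for v in adj for u in adj[v] if v < u and labels[u] == labels[v]
    )


# Hypothetical toy graph: two disconnected triangles.
toy_graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
final_labels = label_propagation(toy_graph)
print(final_labels, potts_energy(toy_graph, final_labels))
```

At a fixed point no single-node relabelling can increase the number of same-label edges, which is what makes the final labelling a local minimum of this energy. Different seeds or sweep orders can end in different minima on larger graphs.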

arXiv.org e-Print Archive

Crossref

Comparing the hierarchy of keywords in on-line news portals

Author: A Clauset
A Trusina
AL Barabási
B Corominas-Murtra
C Cattuto
C Goessmann
CV Damme
D Czégel
D Pumain
David Sousa-Rodrigues
DW McShea
E Mones
E Ravasz
ET Wimberley
F Floeck
FJ Brandenburg
G Ghosal
G Palla
G Tibély
Gergely Palla
Gergely Tibély
H Fushing
H Hirata
HW Ma
J Wickens
JI Perotti
K Juszczyszyn
L Lu
M Batty
M Fattore
M Kaiser
M Nagy
N Eldredge
P Heymann
P Mika
P Pollner
P Spyns
Peter Csermely
PR Krugman
Péter Pollner
R Guimerà
R Lambiotte
S Valverde
SN Dorogovtsev
V Zlatić
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2016

The tagging of on-line content with informative keywords is a widespread phenomenon from scientific article repositories through blogs to on-line news portals. In most cases, the tags on a given item are free words chosen by the authors independently. Therefore, relations among keywords in a collection of news items are unknown. However, in most cases the topics and concepts described by these keywords form a latent hierarchy, with the more general topics and categories at the top, and more specialised ones at the bottom. Here we apply a recent, co-occurrence-based tag hierarchy extraction method to sets of keywords obtained from four different on-line news portals. The resulting hierarchies show substantial differences not just in the topics rendered as important (being at the top of the hierarchy) or of less interest (categorised low in the hierarchy), but also in the underlying network structure. This reveals discrepancies between the plausible keyword association frameworks in the studied news portals.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

ELTE Digital Institutional Repository (EDIT)

FigShare