1,406 research outputs found

Rules for Inducing Hierarchies from Social Tagging Data

Author: Coenen Frans
Dong Hang
Wang Wei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Automatic generation of hierarchies from social tags is a challenging task. We identified three rules from the literature (set inclusion, graph centrality and information-theoretic condition) and proposed two new rules, fuzzy set inclusion and probabilistic association, to induce hierarchical relations. We proposed a hierarchy generation algorithm, which can incorporate each rule with different data representations, i.e., resource-based and Probabilistic Topic Model-based representations. The learned hierarchies were compared to some of the widely used reference concept hierarchies. We found that the probabilistic association and set inclusion rules helped produce better-quality hierarchies according to the evaluation metrics.
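
As a rough illustration of the set inclusion rule named in this abstract, the Python sketch below treats one tag as a candidate broader term for another when most resources carrying the narrower tag also carry the broader one. The function name `set_inclusion_pairs`, the `threshold` value and the toy data are assumptions made for illustration; they are not taken from the paper, whose exact formulation may differ.

```python
from collections import defaultdict


def set_inclusion_pairs(annotations, threshold=0.8):
    """Return (broader, narrower) tag pairs under a simple set inclusion rule.

    annotations: iterable of (resource, tag) pairs.
    threshold: assumed minimum share of the narrower tag's resources
               that must also carry the broader tag.
    """
    resources_by_tag = defaultdict(set)
    for resource, tag in annotations:
        resources_by_tag[tag].add(resource)

    pairs = []
    for broad, broad_res in resources_by_tag.items():
        for narrow, narrow_res in resources_by_tag.items():
            if broad == narrow:
                continue
            # Require the broader tag to cover strictly more resources,
            # so that two equally sized tag sets never subsume each other.
            if len(broad_res) <= len(narrow_res):
                continue
            overlap = len(broad_res & narrow_res) / len(narrow_res)
            if overlap >= threshold:
                pairs.append((broad, narrow))
    return sorted(pairs)


# Toy data (hypothetical): every "python" resource is also tagged "programming".
toy = [
    ("r1", "programming"), ("r2", "programming"), ("r3", "programming"),
    ("r1", "python"), ("r2", "python"),
    ("r3", "cooking"), ("r4", "cooking"),
]
print(set_inclusion_pairs(toy))
```

On this toy data only the pair ("programming", "python") passes the rule; "cooking" shares just half of its resources with "programming" and is left unattached.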

University of Liverpool Repository

Oxford University Research Archive

Growing a Tree in the Forest: Constructing Folksonomies by Integrating Structured Metadata

Author: Getoor Lise
Lerman Kristina
Plangprasopchok Anon
Publication date: 01/01/2010

Many social Web sites allow users to annotate the content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured metadata provide valuable evidence for learning how a community organizes knowledge. For instance, we can aggregate many personal hierarchies into a common taxonomy, also known as a folksonomy, that will aid users in visualizing and browsing social content and help them organize their own content. However, learning from social metadata presents several challenges, since it is sparse, shallow, ambiguous, noisy, and inconsistent. We describe an approach to folksonomy learning based on relational clustering, which exploits structured metadata contained in personal hierarchies. Our approach clusters similar hierarchies using their structure and tag statistics, then incrementally weaves them into a deeper, bushier tree. We study folksonomy learning using social metadata extracted from the photo-sharing site Flickr, and demonstrate that the proposed approach addresses the challenges. Moreover, compared to previous work, the approach produces larger, more accurate folksonomies and scales better. Comment: 10 pages, To appear in the Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 201

arXiv.org e-Print Archive

CiteSeerX

Tagging, Folksonomy & Co - Renaissance of Manual Indexing?

Author: Voss Jakob
Publication date: 01/01/2007

This paper gives an overview of current trends in manual indexing on the Web. Along with a general rise of user-generated content, there are more and more tagging systems that allow users to annotate digital resources with tags (keywords) and share their annotations with other users. Tagging is frequently seen in contrast to traditional knowledge organization systems or as something completely new. This paper shows that tagging is better seen as a popular form of manual indexing on the Web. The difference between controlled and free indexing blurs with sufficient feedback mechanisms. A revised typology of tagging systems is presented that includes different user roles and knowledge organization systems with hierarchical relationships and vocabulary control. A detailed bibliography of current research in collaborative tagging is included. Comment: Preprint. 12 pages, 1 figure, 54 references

arXiv.org e-Print Archive

E-LIS

Extracting tag hierarchies

Author: Palla Gergely
Pollner Péter
Tibély Gergely
Vicsek Tamás
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Tagging items with descriptive annotations or keywords is a very natural way to compress and highlight information about the properties of the given entity. Over the years several methods have been proposed for extracting a hierarchy between the tags for systems with a "flat", egalitarian organization of the tags, which is very common when the tags correspond to free words given by numerous independent people. Here we present a complete framework for automated tag hierarchy extraction based on tag occurrence statistics. Along with proposing new algorithms, we also introduce different quality measures enabling the detailed comparison of competing approaches from different aspects. Furthermore, we set up a synthetic, computer-generated benchmark providing a versatile tool for testing, with a couple of tunable parameters capable of generating a wide range of test beds. Besides the computer-generated input, we also use real data in our studies, including a biological example with a pre-defined hierarchy between the tags. The encouraging similarity between the pre-defined and reconstructed hierarchy, as well as the seemingly meaningful hierarchies obtained for other real systems, indicates that tag hierarchy extraction is a very promising direction for further research with great potential for practical applications. Comment: 25 pages with 21 pages of supporting information, 25 figures

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

ELTE Digital Institutional Repository (EDIT)

FigShare

The horse before the cart: improving the accuracy of taxonomic directions when building tag hierarchies

Author: Almoqhim Fahad
Millard David E.
Shadbolt Nigel
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 15/12/2015

Content on the Web is huge and constantly growing, and building taxonomies for such content can help with navigation and organisation, but building taxonomies manually is costly and time-consuming. An alternative is to allow users to construct folksonomies: collective social classifications. Yet, folksonomies are inconsistent and their use for searching and browsing is limited. Approaches have been suggested for acquiring implicit hierarchical structures from folksonomies, but these approaches suffer from the ‘popularity-generality’ problem, in that popularity is assumed to be a proxy for generality, i.e. high-level taxonomic terms will occur more often than low-level ones. To tackle this problem, we propose in this paper an improved approach. It is based on the Heymann–Benz algorithm, and works by checking the taxonomic directions against a corpus of text. Our results show that popularity works as a proxy for generality in at most 90.91% of cases, but this can be improved to 95.45% using our approach, which should translate to higher-quality tag hierarchy structure.

Southampton (e-Prints Soton)

Crossref

Knowledge Base Enrichment by Relation Learning from Social Tagging Data

Author: Coenen Frans
Dong Hang
Huang Kaizhu
Wang Wei
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

There has been considerable interest in transforming unstructured social tagging data into structured knowledge for semantic-based retrieval and recommendation. Research in this line mostly exploits data co-occurrence and often overlooks the complex and ambiguous meanings of tags. Furthermore, there have been few comprehensive evaluation studies regarding the quality of the discovered knowledge. We propose a supervised learning method to discover subsumption relations from tags. The key to this method is quantifying the probabilistic association among tags to better characterise their relations. We further develop an algorithm to organise tags into hierarchies based on the learned relations. Experiments were conducted using a large, publicly available dataset, Bibsonomy, and three popular, human-engineered or data-driven knowledge bases: DBpedia, Microsoft Concept Graph, and ACM Computing Classification System. We performed a comprehensive evaluation using three strategies: relation-level, ontology-level, and knowledge base enrichment based evaluation. The results clearly show that the proposed method can extract knowledge of better quality than the existing methods against the gold standard knowledge bases. The proposed approach can also enrich knowledge bases with new subsumption relations, having the potential to significantly reduce time and human effort for knowledge base maintenance and ontology evolution.

University of Liverpool Repository

Edinburgh Research Explorer

Oxford University Research Archive

Hierarchical networks of scientific journals

Author: Enys Mones
Gergely Palla
Gergely Tibély
Péter Pollner
Tamás Vicsek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Academic journals are the repositories of mankind’s gradually accumulating knowledge of the surrounding world. Just as knowledge is organized into classes ranging from major disciplines, subjects and fields, to increasingly specific topics, journals can also be categorized into groups using various metrics. In addition, they can be ranked according to their overall influence. However, according to recent studies, the impact, prestige and novelty of journals cannot be characterized by a single parameter such as, for example, the impact factor. To increase understanding of journal impact, the knowledge gap we set out to explore in our study is the evaluation of journal relevance using complex multi-dimensional measures. Thus, for the first time, our objective is to organize journals into multiple hierarchies based on citation data. The two approaches we use are designed to address this problem from different perspectives. We use a measure related to the notion of m-reaching centrality and find a network that shows a journal’s level of influence in terms of the direction and efficiency with which information spreads through the network. We find we can also obtain an alternative network using a suitably modified nested hierarchy extraction method applied to the same data. In this case, in a self-organized way, the journals become branches according to the major scientific fields, where the local structure of the branches reflects the hierarchy within the given field, with usually the most prominent journal (according to other measures) in the field chosen by the algorithm as the local root, and more specialized journals positioned deeper in the branch. This can make the navigation within different scientific fields and sub-fields very simple, and equivalent to navigating in the different branches of the nested hierarchy. We expect this to be particularly helpful, for example, when choosing the most appropriate journal for a given manuscript.
According to our results, the two alternative hierarchies show a somewhat different, but also consistent, picture of the intricate relations between scientific journals, and, as such, they also provide a new perspective on how scientific knowledge is organized into networks.
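
The abstract refers to a measure related to m-reaching centrality without defining it. A minimal sketch of one common reading is given below: the number of distinct nodes reachable from a node in at most m steps along directed edges. The function name `m_reach`, the edge direction convention and the toy graph are assumptions for illustration only, and the measure actually used in the paper may differ.

```python
from collections import deque


def m_reach(graph, source, m):
    """Count nodes reachable from `source` in at most `m` directed steps.

    graph: dict mapping each node to an iterable of successor nodes.
    The source itself is not counted.
    """
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == m:
            continue  # do not expand beyond m steps
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return len(seen) - 1


# Hypothetical graph: an edge A -> B is assumed to mean influence flows from A to B.
toy_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}
scores = {node: m_reach(toy_graph, node, 2) for node in toy_graph}
print(scores)
```

Ranking nodes by this score places "A" at the top of the toy graph, since it reaches three other nodes within two steps, while "E" reaches none.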

ELTE Digital Institutional Repository (EDIT)