Search CORE

746 research outputs found

Measuring Similarity in Large-Scale Folksonomies

Author: Capra Licia
De Meo Pasquale
Ferrara Emilio
Quattrone Giovanni
Publication venue
Publication date: 01/01/2011
Field of study

Social (or folksonomic) tagging has become a very popular way to describe content within Web 2.0 websites. Unlike\ud taxonomies, which overimpose a hierarchical categorisation of content, folksonomies enable end-users to freely create and choose the categories (in this case, tags) that best\ud describe some content. However, as tags are informally de-\ud ﬁned, continually changing, and ungoverned, social tagging\ud has often been criticised for lowering, rather than increasing, the efﬁciency of searching, due to the number of synonyms, homonyms, polysemy, as well as the heterogeneity of\ud users and the noise they introduce. To address this issue, a\ud variety of approaches have been proposed that recommend\ud users what tags to use, both when labelling and when looking for resources. As we illustrate in this paper, real world\ud folksonomies are characterized by power law distributions\ud of tags, over which commonly used similarity metrics, including the Jaccard coefﬁcient and the cosine similarity, fail\ud to compute. We thus propose a novel metric, speciﬁcally\ud developed to capture similarity in large-scale folksonomies,\ud that is based on a mutual reinforcement principle: that is,\ud two tags are deemed similar if they have been associated to\ud similar resources, and vice-versa two resources are deemed\ud similar if they have been labelled by similar tags. We offer an efﬁcient realisation of this similarity metric, and assess its quality experimentally, by comparing it against cosine similarity, on three large-scale datasets, namely Bibsonomy, MovieLens and CiteULike

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery

CogPrints Cognitive Sciences Eprint Archive

Growing a Tree in the Forest: Constructing Folksonomies by Integrating Structured Metadata

Author: Getoor Lise
Lerman Kristina
Plangprasopchok Anon
Publication venue
Publication date: 01/01/2010
Field of study

Many social Web sites allow users to annotate the content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured metadata provide valuable evidence for learning how a community organizes knowledge. For instance, we can aggregate many personal hierarchies into a common taxonomy, also known as a folksonomy, that will aid users in visualizing and browsing social content, and also to help them in organizing their own content. However, learning from social metadata presents several challenges, since it is sparse, shallow, ambiguous, noisy, and inconsistent. We describe an approach to folksonomy learning based on relational clustering, which exploits structured metadata contained in personal hierarchies. Our approach clusters similar hierarchies using their structure and tag statistics, then incrementally weaves them into a deeper, bushier tree. We study folksonomy learning using social metadata extracted from the photo-sharing site Flickr, and demonstrate that the proposed approach addresses the challenges. Moreover, comparing to previous work, the approach produces larger, more accurate folksonomies, and in addition, scales better.Comment: 10 pages, To appear in the Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining(KDD) 201

arXiv.org e-Print Archive

CiteSeerX

Exploring The Value Of Folksonomies For Creating Semantic Metadata

Author: Al-Khalifa Hend S.
Davis Hugh C.
Publication venue
Publication date: 01/01/2007
Field of study

Finding good keywords to describe resources is an on-going problem: typically we select such words manually from a thesaurus of terms, or they are created using automatic keyword extraction techniques. Folksonomies are an increasingly well populated source of unstructured tags describing web resources. This paper explores the value of the folksonomy tags as potential source of keyword metadata by examining the relationship between folksonomies, community produced annotations, and keywords extracted by machines. The experiment has been carried-out in two ways: subjectively, by asking two human indexers to evaluate the quality of the generated keywords from both systems; and automatically, by measuring the percentage of overlap between the folksonomy set and machine generated keywords set. The results of this experiment show that the folksonomy tags agree more closely with the human generated keywords than those automatically generated. The results also showed that the trained indexers preferred the semantics of folksonomy tags compared to keywords extracted automatically. These results can be considered as evidence for the strong relationship of folksonomies to the human indexer’s mindset, demonstrating that folksonomies used in the del.icio.us bookmarking service are a potential source for generating semantic metadata to annotate web resources

CiteSeerX

Southampton (e-Prints Soton)

Preliminary results in tag disambiguation using DBpedia

Author: Alani Harith
Corcho Oscar
Garcia Andres
Szomszor Martin
Publication venue
Publication date: 01/01/2009
Field of study

The availability of tag-based user-generated content for a variety of Web resources (music, photos, videos, text, etc.) has largely increased in the last years. Users can assign tags freely and then use them to share and retrieve information. However, tag-based sharing and retrieval is not optimal due to the fact that tags are plain text labels without an explicit or formal meaning, and hence polysemy and synonymy should be dealt with appropriately. To ameliorate these problems, we propose a context-based tag disambiguation algorithm that selects the meaning of a tag among a set of candidate DBpedia entries, using a common information retrieval similarity measure. The most similar DBpedia en-try is selected as the one representing the meaning of the tag. We describe and analyze some preliminary results, and discuss about current challenges in this area

Southampton (e-Prints Soton)

Open Research Online (The Open University)

Archivo Digital UPM

Folks in Folksonomies: Social Link Prediction from Shared Metadata

Author: Barrat Alain
Cattuto Ciro
Markines Benjamin
Menczer Filippo
Schifanella Rossano
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010
Field of study

Web 2.0 applications have attracted a considerable amount of attention because their open-ended nature allows users to create light-weight semantic scaffolding to organize and share content. To date, the interplay of the social and semantic components of social media has been only partially explored. Here we focus on Flickr and Last.fm, two social media systems in which we can relate the tagging activity of the users with an explicit representation of their social network. We show that a substantial level of local lexical and topical alignment is observable among users who lie close to each other in the social network. We introduce a null model that preserves user activity while removing local correlations, allowing us to disentangle the actual local alignment between users from statistical effects due to the assortative mixing of user activity and centrality in the social network. This analysis suggests that users with similar topical interests are more likely to be friends, and therefore semantic similarity measures among users based solely on their annotation metadata should be predictive of social links. We test this hypothesis on the Last.fm data set, confirming that the social network constructed from semantic similarity captures actual friendship more accurately than Last.fm's suggestions based on listening patterns.Comment: http://portal.acm.org/citation.cfm?doid=1718487.171852

arXiv.org e-Print Archive

CiteSeerX

HAL AMU

Extracting tag hierarchies

Author: Palla Gergely
Pollner Péter
Tibély Gergely
Vicsek Tamás
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013
Field of study

Tagging items with descriptive annotations or keywords is a very natural way to compress and highlight information about the properties of the given entity. Over the years several methods have been proposed for extracting a hierarchy between the tags for systems with a "flat", egalitarian organization of the tags, which is very common when the tags correspond to free words given by numerous independent people. Here we present a complete framework for automated tag hierarchy extraction based on tag occurrence statistics. Along with proposing new algorithms, we are also introducing different quality measures enabling the detailed comparison of competing approaches from different aspects. Furthermore, we set up a synthetic, computer generated benchmark providing a versatile tool for testing, with a couple of tunable parameters capable of generating a wide range of test beds. Beside the computer generated input we also use real data in our studies, including a biological example with a pre-defined hierarchy between the tags. The encouraging similarity between the pre-defined and reconstructed hierarchy, as well as the seemingly meaningful hierarchies obtained for other real systems indicate that tag hierarchy extraction is a very promising direction for further research with a great potential for practical applications.Comment: 25 pages with 21 pages of supporting information, 25 figure

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

ELTE Digital Institutional Repository (EDIT)

FigShare

Finding Influential Users in Social Media Using Association Rule Learning

Author: Anton Borg
Au
Cha
Demšar
Erlandsson
Flach
Fredrik Erlandsson
Goethals
Henric Johnson
Hotho
Jankowski
Nancy
Piotr Bródka
Schmitz
Sheskin
Publication venue: 'MDPI AG'
Publication date: 01/01/2016
Field of study

Influential users play an important role in online social networks since users tend to have an impact on one other. Therefore, the proposed work analyzes users and their behavior in order to identify influential users and predict user participation. Normally, the success of a social media site is dependent on the activity level of the participating users. For both online social networking sites and individual users, it is of interest to find out if a topic will be interesting or not. In this article, we propose association learning to detect relationships between users. In order to verify the findings, several experiments were executed based on social network analysis, in which the most influential users identified from association rule learning were compared to the results from Degree Centrality and Page Rank Centrality. The results clearly indicate that it is possible to identify the most influential users using association rule learning. In addition, the results also indicate a lower execution time compared to state-of-the-art methods

arXiv.org e-Print Archive

Blekinge Institute of Technology

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line