196 research outputs found
The horse before the cart: improving the accuracy of taxonomic directions when building tag hierarchies
Content on the Web is huge and constantly growing, and building taxonomies for such content can help with navigation and organisation, but building taxonomies manually is costly and time-consuming. An alternative is to allow users to construct folksonomies: collective social classifications. Yet, folksonomies are inconsistent and their use for searching and browsing is limited. Approaches have been suggested for acquiring implicit hierarchical structures from folksonomies, however, but these approaches suffer from the ‘popularity-generality’ problem, in that popularity is assumed to be a proxy for generality, i.e. high-level taxonomic terms will occur more often than low-level ones. To tackle this problem, we propose in this paper an improved approach. It is based on the Heymann–Benz algorithm, and works by checking the taxonomic directions against a corpus of text. Our results show that popularity works as a proxy for generality in at most 90.91% of cases, but this can be improved to 95.45% using our approach, which should translate to higher-quality tag hierarchy structure
Preliminary results in tag disambiguation using DBpedia
The availability of tag-based user-generated content for a variety of Web resources (music, photos, videos, text, etc.) has largely increased in the last years. Users can assign tags freely and then use them to share and retrieve information. However, tag-based sharing and retrieval is not optimal due to the fact that tags are plain text labels without an explicit or formal meaning, and hence polysemy and synonymy should be dealt with appropriately. To ameliorate these problems, we propose a context-based tag disambiguation algorithm that selects the meaning of a tag among a set of candidate DBpedia entries, using a common information retrieval similarity measure. The most similar DBpedia en-try is selected as the one representing the meaning of the tag. We describe and analyze some preliminary results, and discuss about current challenges in this area
Semantically enriching folksonomies with FLOR
While the increasing popularity of folksonomies has lead to a vast quantity of tagged data, resource retrieval in these systems is limited by them being agnostic to the meaning (i.e., semantics) of tags. Our goal is to automatically enrich folksonomy tags (and implicitly the related resources) with formal semantics by associating them to relevant concepts defined in online ontologies. We introduce FLOR, a mechanism for automatic folksonomy enrichment by combining knowledge from WordNet and online ontologies.We experimentally tested FLOR on tag sets drawn
from 226 Flickr photos and obtained a precision value of 93% and an approximate recall of 49%
Improving search in folksonomies: a task based comparison of WordNet and ontologies
Search in folksonomies is hampered by the fact that the meaning of tags and their relations are not made explicit in the system. This is typically addressed by using knowledge sources (KS) to semantically enrich tagspaces, most notably WordNet and (online) ontologies. However, there is no insight of how the different characteristics of these KS contribute to search improvement in folksonomies. In this work we compare these two KS in the context of folksonomy search. We show that while WordNet leads to richer tag structures than online ontologies do, its fine-grained sense hierarchy renders these structures less effective in search compared to the ones generated from ontologies
Recommended from our members
Reusing Ontologies to Enrich Semantically User Content in Web2.0: A Case Study on Folksonomies
Semantic Web and Web2.0 emerged during the past decade promising to achieve new frontiers for the Web. On the one hand, the Semantic Web is an interlinked web of data, supported by ontological semantics and allowing for intelligent applications such as semantic search and integration of heterogeneous content across systems and applications. On the other hand, Web2.0 represents the new technologies and paradigms that revolutionised the user engagement in content creation and introduced novel means towards social interaction. Bridging the gap between Web2.0 and the Semantic Web has been proposed as a means to better manage and interact with the large amounts of user contributed content, which is a new challenge for Web2.0. This thesis focuses on a popular paradigm of Web2.0, folksonomies. In particular, we investigate the semantic enrichment of folksonomy tagspaces by reusing ontologies available in the Semantic Web. We identify the need for methods that automatically apply semantic descriptions to user generated content without requiring user intervention or alteration of the current tagging paradigm. We use an iterative approach in order to identify the characteristics of folksonomies and the attributes of knowledge sources that influence the semantic enrichment of tagspaces. We build on the results of our experimental studies to implement a folksonomy enrichment algorithm, that given an input tagspace, automatically creates a semantic structure that describes the meaning and relations of tags. We introduce measures for the evaluation of enriched tagspaces and finally, we propose a search algorithm that exploits the semantic structures to improve folksonomy search
Recommended from our members
What can be done with the Semantic Web? An overview of Watson-based applications
Thanks to the huge efforts deployed in the community for creating, building and generating semantic information for the Semantic Web, large amounts of machine processable knowledge are now openly available. Watson is an infrastructure component for the Semantic Web, a gateway that provides the necessary functions to support applications in using the Semantic Web. In this paper, we describe a number of applications relying on Watson, with the purpose of demonstrating what can be achieved with the Semantic Web nowadays and what sort of new, smart and useful features can be derived from the exploitation of this large, distributed and heterogeneous base of semantic information
Semantics, sensors, and the social web: The live social semantics experiments
The Live Social Semantics is an innovative application that encourages and guides social networking between researchers at conferences and similar events. The application integrates data and technologies from the Semantic Web, online social networks, and a face-to-face contact sensing platform. It helps researchers to find like-minded and influential researchers, to identify and meet people in their community of practice, and to capture and later retrace their real-world networking activities at conferences. The application was successfully deployed at two international conferences, attracting more than 300 users in total. This paper describes this application, and discusses and evaluates the results of its two deployment
Recommended from our members
Linking Data Across Universities: An Integrated Video Lectures Dataset
This paper presents our work and experience interlinking educational information across universities through the use of Linked Data principles and technologies. More specifically this paper is focused on selecting, extracting, structuring and interlinking information of video lectures produced by 27 different educational institutions. For this purpose, selected information from several websites and YouTube channels have been scraped and structured according to well-known vocabularies, like FOAF 1, or the W3C Ontology for Media Resources 2. To integrate this information, the extracted videos have been categorized under a common classification space, the taxonomy defined by the Open Directory Project 3. An evaluation of this categorization process has been conducted obtaining a 98% degree of coverage and 89% degree of correctness. As a result of this process a new Linked Data dataset has been released containing more than 14,000 video lectures from 27 different institutions and categorized under a common classification scheme
Enrichment and ranking of the YouTube tag space and integration with the Linked Data cloud
The increase of personal digital cameras with video functionality and video-enabled camera phones has increased the amount of user-generated videos on the Web. People are spending more and more time viewing online videos as a major source of entertainment and “infotainment”. Social websites allow users to assign shared free-form tags to user-generated multimedia resources, thus generating annotations for objects with a minimum amount of effort. Tagging allows communities to organise their multimedia items into browseable sets, but these tags may be poorly chosen and related tags may be omitted. Current techniques to retrieve, integrate and present this media to users are deficient and could do with improvement. In this paper, we describe a framework for semantic enrichment, ranking and integration of web video tags using Semantic Web technologies. Semantic enrichment of folksonomies can bridge the gap between the uncontrolled and flat structures typically found in user-generated content and structures provided by the Semantic Web. The enhancement of tag spaces with semantics has been accomplished through two major tasks: a tag space expansion and ranking step; and through concept matching and integration with the Linked Data cloud. We have explored social, temporal and spatial contexts to enrich and extend the existing tag space. The resulting semantic tag space is modelled via a local graph based on co-occurrence distances for ranking. A ranked tag list is mapped and integrated with the Linked Data cloud through the DBpedia resource repository. Multi-dimensional context filtering for tag expansion means that tag ranking is much easier and it provides less ambiguous tag to concept matching
Review of the state of the art: discovering and associating semantics to tags in folksonomies
This paper describes and compares the most relevant approaches for associating tags with semantics in order to make explicit the meaning of those tags. We identify a common set of steps that are usually considered across all these approaches and frame our descriptions according to them, providing a unified view of how each approach tackles the different problems that appear during the semantic association process. Furthermore, we provide some recommendations on (a) how and when to use each of the approaches according to the characteristics of the data source, and (b) how to improve results by leveraging the strengths of the different approaches
- …