    Large-Scale information extraction from textual definitions through deep syntactic and semantic analysis

    We present DEFIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions, we leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DEFIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations.

    Enrichment and ranking of the YouTube tag space and integration with the Linked Data cloud

    The spread of personal digital cameras with video functionality and of video-enabled camera phones has increased the amount of user-generated video on the Web. People are spending more and more time viewing online videos as a major source of entertainment and “infotainment”. Social websites allow users to assign shared free-form tags to user-generated multimedia resources, thus generating annotations for objects with a minimum amount of effort. Tagging allows communities to organise their multimedia items into browseable sets, but these tags may be poorly chosen and related tags may be omitted. Current techniques for retrieving, integrating and presenting this media to users are deficient. In this paper, we describe a framework for semantic enrichment, ranking and integration of web video tags using Semantic Web technologies. Semantic enrichment of folksonomies can bridge the gap between the uncontrolled and flat structures typically found in user-generated content and the structures provided by the Semantic Web. The enhancement of tag spaces with semantics has been accomplished through two major tasks: a tag space expansion and ranking step, and concept matching and integration with the Linked Data cloud. We have explored social, temporal and spatial contexts to enrich and extend the existing tag space. The resulting semantic tag space is modelled via a local graph based on co-occurrence distances for ranking. A ranked tag list is mapped and integrated with the Linked Data cloud through the DBpedia resource repository. Multi-dimensional context filtering for tag expansion makes tag ranking much easier and yields less ambiguous tag-to-concept matching.
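    The idea of ranking related tags from a co-occurrence graph can be illustrated with a minimal sketch. This is not the authors' method: the abstract mentions co-occurrence distances over social, temporal and spatial contexts, whereas the sketch below simply counts how often a candidate tag appears alongside a seed tag. The function name `rank_tags` and the sample tag sets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rank_tags(tag_sets, seed_tags):
    """Rank candidate tags by how often they co-occur with the seed tags.

    Assumption: raw co-occurrence counts stand in for the
    'co-occurrence distances' mentioned in the abstract.
    """
    # Build a symmetric co-occurrence table over all tag pairs per item.
    cooccurrence = Counter()
    for tags in tag_sets:
        for a, b in combinations(sorted(set(tags)), 2):
            cooccurrence[(a, b)] += 1
            cooccurrence[(b, a)] += 1

    # Score every non-seed tag by its total co-occurrence with the seeds.
    seeds = set(seed_tags)
    scores = Counter()
    for (a, b), count in cooccurrence.items():
        if a in seeds and b not in seeds:
            scores[b] += count

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical tag sets for three videos.
videos = [
    ["dublin", "concert", "music"],
    ["dublin", "music", "festival"],
    ["dublin", "rain"],
]
ranking = rank_tags(videos, ["dublin"])
print(ranking)
```

    In this toy data, "music" ranks first because it appears with "dublin" in two videos; a fuller system would then map the top-ranked tags to Linked Data concepts.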

    Review of the state of the art: discovering and associating semantics to tags in folksonomies

    This paper describes and compares the most relevant approaches for associating tags with semantics in order to make explicit the meaning of those tags. We identify a common set of steps that are usually considered across all these approaches and frame our descriptions according to them, providing a unified view of how each approach tackles the different problems that appear during the semantic association process. Furthermore, we provide some recommendations on (a) how and when to use each of the approaches according to the characteristics of the data source, and (b) how to improve results by leveraging the strengths of the different approaches.

    DBpedia Mashups

    If you see Wikipedia as a main place where the knowledge of mankind is concentrated, then DBpedia – which is extracted from Wikipedia – is the best place to find a machine representation of that knowledge. DBpedia constitutes a major part of the semantic data on the web. Its sheer size and wide coverage enable you to use it in many kinds of mashups: it contains biographical, geographical and bibliographical data, as well as discographies, movie metadata, technical specifications, links to social media profiles and much more. Just like Wikipedia, DBpedia is a truly cross-language effort, e.g., it provides descriptions and other information in various languages. In this chapter we introduce its structure, its contents and its connections to outside resources. We describe how the structured information in DBpedia is gathered, what you can expect from it, and what its characteristics and limitations are. We analyze how other mashups exploit DBpedia and present best practices for its usage. In particular, we describe how Sztakipedia – an intelligent writing aid based on DBpedia – can help Wikipedia contributors to improve the quality and integrity of articles. DBpedia offers a myriad of ways of accessing the information it contains, ranging from SPARQL to bulk download. We compare the pros and cons of these methods. We conclude that DBpedia is an unavoidable resource for applications dealing with commonly known entities such as notable persons and places, and for others looking for a rich hub connecting other semantic resources.

    From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

    Over the past years, distributed semantic representations have proved to be effective and flexible keepers of prior knowledge to be integrated into downstream applications. This survey focuses on the representation of meaning. We start from the theoretical background behind word vector space models and highlight one of their major limitations: the meaning conflation deficiency, which arises from representing a word with all its possible meanings as a single vector. Then, we explain how this deficiency can be addressed through a transition from the word level to the more fine-grained level of word senses (in its broader acceptation) as a method for modelling unambiguous lexical meaning. We present a comprehensive overview of the wide range of techniques in the two main branches of sense representation, i.e., unsupervised and knowledge-based. Finally, this survey covers the main evaluation procedures and applications for this type of representation, and provides an analysis of four of its important aspects: interpretability, sense granularity, adaptability to different domains and compositionality. Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence Research.
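    The meaning conflation deficiency can be made concrete with a toy example. The sketch below is not taken from the survey: the two-dimensional vectors are invented, and modelling the single word vector as the average of its sense vectors is a simplifying assumption made only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


# Invented 2-D vectors: axis 0 ~ "finance", axis 1 ~ "nature".
bank_finance = (1.0, 0.0)   # sense vector: bank as a financial institution
bank_river = (0.0, 1.0)     # sense vector: bank as the side of a river
money = (1.0, 0.0)
water = (0.0, 1.0)

# Assumption: a word-level model ends up with one vector that blends
# both senses; here it is modelled as their plain average.
bank_word = tuple((a + b) / 2 for a, b in zip(bank_finance, bank_river))

sim_word_money = cosine(bank_word, money)      # conflated vector vs. "money"
sim_word_water = cosine(bank_word, water)      # conflated vector vs. "water"
sim_sense_money = cosine(bank_finance, money)  # finance sense vs. "money"
sim_river_money = cosine(bank_river, money)    # river sense vs. "money"

print(sim_word_money, sim_word_water, sim_sense_money, sim_river_money)
```

    Under these assumptions the conflated vector is equally and only moderately similar to both "money" and "water", while each sense vector matches exactly one of them, which is the distinction that sense-level representations aim to recover.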

    An Ontology based Text-to-Picture Multimedia m-Learning System

    Multimedia Text-to-Picture is the process of building mental representation from words associated with images. From the research aspect, multimedia instructional message items are illustrations of material using words and pictures that are designed to promote user realization. Illustrations can be presented in a static form such as images, symbols, icons, figures, tables, charts, and maps; or in a dynamic form such as animation or video clips. Due to the intuitiveness and vividness of visual illustration, many text-to-picture systems have been proposed in the literature, such as Word2Image, Chat with Illustrations, and many others, as discussed in the literature review chapter of this thesis. However, we found that some common limitations exist in these systems, especially for the presented images. In fact, the retrieved materials are not fully suitable for educational purposes. Many of them are not context-based and do not take into consideration the needs of learners (i.e., they are general-purpose images). Manually finding the required pedagogic images to illustrate educational content for learners is inefficient and requires considerable effort. In addition, the available learning systems that mine text based on keyword or sentence selection provide incomplete pedagogic illustrations. This is because words and their semantically related terms are not considered during the process of finding illustrations. In this dissertation, we propose new approaches based on the semantic conceptual graph and semantically distributed weights to mine optimal illustrations that match Arabic text in the children’s story domain. We combine these approaches with the best keyword and sentence selection algorithms in order to improve the retrieval of images matching the Arabic text. Our findings show significant improvements in modelling Arabic vocabulary with the most meaningful images and the best coverage of the domain of discourse.
We also develop a mobile Text-to-Picture System that has two novel features: (1) a conceptual graph visualization (CGV) and (2) a visual illustrative assessment. The CGV shows the relationship between terms associated with a picture. It enables the learners to discover the semantic links between Arabic terms and improve their understanding of Arabic vocabulary. The assessment component allows the instructor to automatically follow up on the performance of learners. Our experiments demonstrate the efficiency of our multimedia text-to-picture system in enhancing the learners’ knowledge and boosting their comprehension of Arabic vocabulary.

    Content Recommendation Through Linked Data

    Nowadays, people can easily obtain a huge amount of information from the Web, but often they have no criteria to discern it. This issue is known as information overload. Recommender systems are software tools to suggest interesting items to users and can help them to deal with a vast amount of information. Linked Data is a set of best practices to publish data on the Web, and it is the basis of the Web of Data, an interconnected global dataspace. This thesis discusses how to discover information useful for the user from the vast amount of structured data, and notably Linked Data available on the Web. The work addresses this issue by considering three research questions: how to exploit existing relationships between resources published on the Web to provide recommendations to users; how to represent the user and his context to generate better recommendations for the current situation; and how to effectively visualize the recommended resources and their relationships. To address the first question, the thesis proposes a new algorithm based on Linked Data which exploits existing relationships between resources to recommend related resources. The algorithm was integrated into a framework to deploy and evaluate Linked Data based recommendation algorithms. In fact, a related problem is how to compare them and how to evaluate their performance when applied to a given dataset. The user evaluation showed that our algorithm improves the rate of new recommendations, while maintaining a satisfying prediction accuracy. To represent the user and their context, this thesis presents the Recommender System Context ontology, which is exploited in a new context-aware approach that can be used with existing recommendation algorithms. The evaluation showed that this method can significantly improve the prediction accuracy. 
As regards the problem of effectively visualizing the recommended resources and their relationships, this thesis proposes a visualization framework for DBpedia (the Linked Data version of Wikipedia) and mobile devices, which is designed to be extended to other datasets. In summary, this thesis shows how it is possible to exploit structured data available on the Web to recommend useful resources to users. Linked Data were successfully exploited in recommender systems. Various proposed approaches were implemented and applied to use cases of Telecom Italia.

    Collaborative recommendations with content-based filters for cultural activities via a scalable event distribution platform

    Nowadays, most people have limited leisure time, and the range of (cultural) activities on which to spend this time is enormous. Consequently, picking the most appropriate events becomes increasingly difficult for end-users. This complexity of choice reinforces the necessity of filtering systems that assist users in finding and selecting relevant events. Whereas traditional filtering tools enable, e.g., the use of keyword-based or filtered searches, innovative recommender systems draw on user ratings, preferences, and metadata describing the events. Existing collaborative recommendation techniques, developed for suggesting web-shop products or audio-visual content, have difficulties with sparse rating data and cannot cope at all with event-specific restrictions like availability, time, and location. Moreover, aggregating, enriching, and distributing these events are additional requisites for an optimal communication channel. In this paper, we propose a highly scalable event recommendation platform which considers event-specific characteristics. Personal suggestions are generated by an advanced collaborative filtering algorithm, which is more robust on sparse data by extending user profiles with presumable future consumptions. The events, which are described using an RDF/OWL representation of the EventsML-G2 standard, are categorized and enriched via smart indexing and open linked data sets. This metadata model enables additional content-based filters, which consider event-specific characteristics, on the recommendation list. The integration of these different functionalities is realized by a scalable and extendable bus architecture. Finally, focus group conversations were organized with external experts, cultural mediators, and potential end-users to evaluate the event distribution platform and investigate the possible added value of recommendations for cultural participation.