
    Medical WordNet: A new methodology for the construction and validation of information resources for consumer health

    A consumer health information system must be able to comprehend both expert and non-expert medical vocabulary and to map between the two. We describe an ongoing project to create a new lexical database called Medical WordNet (MWN), consisting of medically relevant terms used by and intelligible to non-expert subjects, supplemented by a corpus of natural-language sentences that is designed to provide medically validated contexts for MWN terms. The corpus derives primarily from online health information sources targeted at consumers, and comprises two sub-corpora, called Medical FactNet (MFN) and Medical BeliefNet (MBN), respectively. The former consists of statements accredited as true on the basis of a rigorous process of validation, the latter of statements which non-experts believe to be true. We summarize the MWN/MFN/MBN project and describe some of its applications.
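    The split between validated facts (MFN) and lay beliefs (MBN) suggests a simple data model. Below is a minimal, hypothetical sketch of such a structure in Python; the class and field names are illustrative assumptions, not taken from the project itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContextSentence:
    text: str
    source_url: str          # consumer health source the sentence came from
    expert_validated: bool   # accredited as true -> belongs to Medical FactNet (MFN)
    lay_believed: bool       # believed true by non-experts -> Medical BeliefNet (MBN)

@dataclass
class MWNTerm:
    lay_term: str            # non-expert form, e.g. "heart attack"
    expert_term: str         # expert form, e.g. "myocardial infarction"
    contexts: List[ContextSentence] = field(default_factory=list)

    def factnet(self) -> List[ContextSentence]:
        return [c for c in self.contexts if c.expert_validated]

    def beliefnet(self) -> List[ContextSentence]:
        return [c for c in self.contexts if c.lay_believed]
```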

    Semantic levels of domain-independent commonsense knowledgebase for visual indexing and retrieval applications

    Building intelligent tools for searching, indexing and retrieval applications is needed to manage the rapidly increasing amount of visual data. This raises the need to build and maintain ontologies and knowledgebases that support the textual semantic representation of visual content, an important building block in such applications. This paper proposes a commonsense knowledgebase that forms the link between the visual world and its semantic textual representation. This domain-independent knowledge is provided at different levels of semantics by a fully automated engine that analyses, fuses and integrates previous commonsense knowledgebases. The knowledgebase covers these levels of semantics and adds two new ones: temporal event scenarios and psycholinguistic understanding. Statistical properties and an experimental evaluation show the coherency and effectiveness of the proposed knowledgebase in providing the knowledge needed for wide-domain visual applications.
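    As a rough illustration of the fusion step, the sketch below merges triples from several source knowledgebases and tags each with a semantic level; the source names, relation labels, and level names are assumptions for the example, not the paper's actual schema.

```python
from collections import defaultdict

SEMANTIC_LEVELS = {"object", "scene", "temporal_event_scenario", "psycholinguistic"}

def fuse(sources):
    """Merge (subject, relation, object, level) triples from several
    commonsense knowledgebases, dropping exact duplicates."""
    kb = defaultdict(set)
    for name, triples in sources.items():
        for subj, rel, obj, level in triples:
            if level not in SEMANTIC_LEVELS:
                continue                           # keep only the supported levels
            kb[level].add((subj.lower(), rel, obj.lower()))
    return kb

sources = {
    "kb_a": [("dog", "IsA", "animal", "object")],
    "kb_b": [("wedding", "HasSubevent", "exchange rings", "temporal_event_scenario")],
}
print({level: len(triples) for level, triples in fuse(sources).items()})
```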

    Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture

    The World Wide Web holds a wealth of information in the form of unstructured texts such as customer reviews for products, events and more. By extracting and analyzing the opinions expressed in customer reviews in a fine-grained way, valuable opportunities and insights for customers and businesses can be gained. We propose a neural network based system to address the task of Aspect-Based Sentiment Analysis, competing in Task 2 of the ESWC-2016 Challenge on Semantic Sentiment Analysis. Our proposed architecture divides the task into two subtasks: aspect term extraction and aspect-specific sentiment extraction. This approach is flexible in that it allows each subtask to be addressed independently. As a first step, a recurrent neural network is used to extract aspects from a text by framing the problem as a sequence labeling task. In a second step, a recurrent network processes each extracted aspect with respect to its context and predicts a sentiment label. The system uses pretrained semantic word embedding features, which we experimentally enhance with semantic knowledge extracted from WordNet. Further features extracted from SenticNet prove to be beneficial for the extraction of sentiment labels. As the best performing system in its category, our proposed system proves to be an effective approach for Aspect-Based Sentiment Analysis.
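    A minimal sketch of the two-step architecture is given below, assuming PyTorch; the dimensions, tag set, and class names are illustrative and do not reproduce the authors' implementation or their WordNet/SenticNet features.

```python
import torch
import torch.nn as nn

class AspectTagger(nn.Module):
    """Step 1: sequence labeling over tokens with B/I/O aspect tags."""
    def __init__(self, vocab_size, emb_dim=100, hidden=64, num_tags=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_tags)

    def forward(self, token_ids):                  # (batch, seq_len)
        h, _ = self.rnn(self.emb(token_ids))       # (batch, seq_len, 2*hidden)
        return self.out(h)                         # per-token tag logits

class AspectSentimentClassifier(nn.Module):
    """Step 2: sentiment of one extracted aspect, given its context window."""
    def __init__(self, vocab_size, emb_dim=100, hidden=64, num_labels=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_labels)   # negative / neutral / positive

    def forward(self, context_ids):                # (batch, seq_len)
        _, (h_n, _) = self.rnn(self.emb(context_ids))
        return self.out(h_n[-1])                   # one sentiment per aspect context

tagger = AspectTagger(vocab_size=5000)
tag_logits = tagger(torch.randint(0, 5000, (2, 12)))   # -> shape (2, 12, 3)
```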

    Map equation for link community

    Community structure exists in many real-world networks and has been reported to be related to several functional properties of the networks. The conventional approach partitions nodes into communities, while some recent studies instead partition links to find overlapping communities of nodes efficiently. We extend the map equation method, which was originally developed for node communities, to find link communities in networks. The method is tested on various kinds of networks and compared with the metadata of the networks, and the results show that it can identify the overlapping roles of nodes effectively. The advantage of this method is that the node community scheme and the link community scheme can be compared quantitatively by measuring the information left unexplained in the network beyond the community structure. It can thus be used to decide quantitatively whether the link community scheme should be used instead of the node community scheme. Furthermore, the method can be easily extended to directed and weighted networks since it is based on the random walk.
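    For reference, the method builds on the two-level map equation of Rosvall and Bergstrom, shown below in its standard node-partition form; the paper's extension applies the same objective to a partition of links rather than nodes.

```latex
L(\mathsf{M}) = q_{\curvearrowright}\, H(\mathcal{Q})
              + \sum_{i=1}^{m} p_{\circlearrowright}^{\,i}\, H(\mathcal{P}^{i})
```

    The first term is the per-step description length of movements between modules (the probability that the random walker switches modules times the entropy of the module-exit codebook); the second term sums, over the m modules, each module's rate of codebook use times the entropy of its internal codebook. The partition minimizing this description length is the reported community structure.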

    WordNet: An Electronic Lexical Reference System Based on Theories of Lexical Memory

    This paper describes WordNet, an on-line lexical reference system whose design is based on psycholinguistic theories of human lexical organization and memory. English nouns, verbs, and adjectives are organized into synonym sets ("synsets"), each representing one underlying lexical concept. Synonym sets are related via three principal conceptual relations: hyponymy, meronymy, and antonymy. Verbs are additionally specified for the presupposition relations that hold among them and for their most common semantic/syntactic frames. By attempting to mirror the organization of the mental lexicon, WordNet strives to serve the linguistically unsophisticated user.
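    The relations described above can be explored today through NLTK's WordNet interface, which post-dates this paper; a small sketch follows (it assumes nltk is installed and the wordnet corpus has been downloaded).

```python
from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet')

car = wn.synset('car.n.01')             # one synset = one lexical concept
print(car.lemma_names())                # the synonyms grouped in the synset
print(car.hypernyms())                  # hyponymy/hypernymy relation
print(car.part_meronyms())              # meronymy (part-whole) relation

good = wn.synset('good.a.01')
print(good.lemmas()[0].antonyms())      # antonymy, defined between lemmas
```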

    An Infrastructure for acquiring high quality semantic metadata

    Because the metadata that underlies semantic web applications is gathered from distributed and heterogeneous data sources, it is important to ensure its quality (i.e., reduce duplicates, spelling errors, ambiguities). However, current infrastructures that acquire and integrate semantic data have only marginally addressed the issue of metadata quality. In this paper we present our metadata acquisition infrastructure, ASDI, which pays special attention to ensuring that high quality metadata is derived. Central to the architecture of ASDI is a verification engine that relies on several semantic web tools to check the quality of the derived data. We tested our prototype in the context of building a semantic web portal for our lab, KMi. An experimental evaluation comparing the automatically extracted data against manual annotations indicates that the verification engine enhances the quality of the extracted semantic metadata.
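    As an illustration of the kind of check such a verification engine might perform, the sketch below flags near-duplicate labels in extracted metadata; it is a hypothetical example rather than ASDI's actual pipeline, and the function name and similarity threshold are assumptions.

```python
import difflib

def near_duplicates(labels, cutoff=0.9):
    """Flag label pairs that are likely duplicates (typos, case variants)."""
    seen, flagged = [], []
    for label in labels:
        norm = label.strip().lower()
        match = difflib.get_close_matches(norm, seen, n=1, cutoff=cutoff)
        if match:
            flagged.append((label, match[0]))
        else:
            seen.append(norm)
    return flagged

print(near_duplicates(["Knowledge Media Institute", "knowledge media institue", "Open University"]))
```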

    The Long-Short Story of Movie Description

    Generating descriptions for videos has many applications, including assisting blind people and human-robot interaction. Recent advances in image captioning, as well as the release of large-scale movie description datasets such as MPII Movie Description, allow this task to be studied in more depth. Many of the proposed methods for image captioning rely on pre-trained object classifier CNNs and Long Short-Term Memory recurrent networks (LSTMs) for generating descriptions. While image description focuses on objects, we argue that it is important to distinguish verbs, objects, and places in the challenging setting of movie description. In this work we show how to learn robust visual classifiers from the weak annotations of the sentence descriptions. Based on these visual classifiers we learn how to generate a description using an LSTM. We explore different design choices to build and train the LSTM and achieve the best performance to date on the challenging MPII-MD dataset. We compare and analyze our approach and prior work along various dimensions to better understand the key challenges of the movie description task.
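    A rough sketch of the generation stage, assuming PyTorch: an LSTM decoder conditioned on the scores of the learned visual classifiers. The conditioning scheme, layer sizes, and names are illustrative guesses, not the paper's exact design.

```python
import torch
import torch.nn as nn

class DescriptionLSTM(nn.Module):
    def __init__(self, vocab_size, num_classifiers, emb_dim=300, hidden=512):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.visual = nn.Linear(num_classifiers, hidden)    # project classifier scores
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, word_ids, classifier_scores):
        # initialize the LSTM state from the visual classifier scores
        h0 = torch.tanh(self.visual(classifier_scores)).unsqueeze(0)  # (1, batch, hidden)
        c0 = torch.zeros_like(h0)
        h, _ = self.lstm(self.emb(word_ids), (h0, c0))
        return self.out(h)                                   # next-word logits per step

model = DescriptionLSTM(vocab_size=10000, num_classifiers=300)
logits = model(torch.randint(0, 10000, (2, 8)), torch.rand(2, 300))  # -> (2, 8, 10000)
```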

    Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation

    Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD), despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD that uses inductive logic programming to learn theories from first-order logic representations, allowing corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.
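    One simple way to quantify such contributions is a leave-one-source-out ablation. The sketch below uses scikit-learn as a stand-in for the paper's ILP-based system and measures the accuracy drop when each knowledge source's features are removed; the function name, classifier choice, and feature layout are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def source_contributions(feature_blocks, labels):
    """feature_blocks: dict mapping a knowledge-source name to its feature matrix."""
    clf = lambda: LogisticRegression(max_iter=1000)
    full = np.hstack(list(feature_blocks.values()))
    base = cross_val_score(clf(), full, labels, cv=5).mean()
    drops = {}
    for name in feature_blocks:
        reduced = np.hstack([m for k, m in feature_blocks.items() if k != name])
        drops[name] = base - cross_val_score(clf(), reduced, labels, cv=5).mean()
    return drops   # larger drop = larger contribution of that knowledge source
```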