Medical WordNet: A new methodology for the construction and validation of information resources for consumer health
A consumer health information system must be able to comprehend both expert and non-expert medical vocabulary and to map between the two. We describe an ongoing
project to create a new lexical database called Medical WordNet (MWN), consisting of
medically relevant terms used by and intelligible to non-expert subjects and supplemented by a corpus of natural-language sentences that is designed to provide
medically validated contexts for MWN terms. The corpus derives primarily from online health information sources targeted at consumers and comprises two sub-corpora, called Medical FactNet (MFN) and Medical BeliefNet (MBN), respectively. The former consists of statements accredited as true through a rigorous validation process; the latter consists of statements that non-experts believe to be true. We summarize the MWN / MFN / MBN project and describe some of its applications.
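As a rough illustration of the resource structure described in this abstract (a lexical entry mapping lay and expert vocabulary, backed by validated and believed sentences), the sketch below uses hypothetical class and field names that are not taken from the paper:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of how an MWN entry might tie consumer-health terms to
# validated (MFN) and believed (MBN) sentences; names are illustrative only.

@dataclass
class CorpusSentence:
    text: str
    validated: bool          # True -> Medical FactNet, False -> Medical BeliefNet

@dataclass
class MWNEntry:
    lay_term: str            # term intelligible to non-experts, e.g. "heart attack"
    expert_term: str         # corresponding expert term, e.g. "myocardial infarction"
    contexts: List[CorpusSentence] = field(default_factory=list)

    def factnet(self) -> List[str]:
        return [s.text for s in self.contexts if s.validated]

    def beliefnet(self) -> List[str]:
        return [s.text for s in self.contexts if not s.validated]
```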
Semantic levels of domain-independent commonsense knowledgebase for visual indexing and retrieval applications
Intelligent tools for searching, indexing and retrieval are needed to cope with the rapidly increasing amount of visual data. This has created a need to build and maintain ontologies and knowledgebases that support textual semantic representation of visual content, an important building block in such applications. This paper proposes a commonsense knowledgebase that forms the link between the visual world and its semantic textual representation. This domain-independent knowledge is provided at different levels of semantics by a fully automated engine that analyses, fuses and integrates previous commonsense knowledgebases. The knowledgebase covers the required levels of semantics by adding two new levels: temporal event scenarios and psycholinguistic understanding. Statistical properties and an experimental evaluation show the coherency and effectiveness of the proposed knowledgebase in providing the knowledge needed for wide-domain visual applications.
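The paper's fusion engine is not reproduced here; as a hedged sketch of the general idea of fusing commonsense sources and grouping facts by semantic level, the snippet below merges triple sets from two hypothetical sources (all names, relations and level labels are assumptions):

```python
from collections import defaultdict

# Illustrative only: fuse two commonsense sources given as
# (concept, relation, concept) triples and index them by a semantic level.

SOURCE_A = [("rain", "causes", "wet street"), ("kitchen", "location_of", "cooking")]
SOURCE_B = [("rain", "causes", "wet street"), ("cooking", "precedes", "eating")]

LEVEL_OF_RELATION = {
    "location_of": "spatial",
    "causes": "causal",
    "precedes": "temporal_event_scenario",   # one of the two added levels
}

def fuse(*sources):
    """Merge triple sources, dropping exact duplicates, and group by level."""
    by_level = defaultdict(set)
    for source in sources:
        for subj, rel, obj in source:
            level = LEVEL_OF_RELATION.get(rel, "unclassified")
            by_level[level].add((subj, rel, obj))
    return by_level

if __name__ == "__main__":
    for level, triples in fuse(SOURCE_A, SOURCE_B).items():
        print(level, sorted(triples))
```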
Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture
The World Wide Web holds a wealth of information in the form of unstructured
texts such as customer reviews for products, events and more. By extracting and
analyzing the expressed opinions in customer reviews in a fine-grained way,
valuable opportunities and insights for customers and businesses can be gained.
We propose a neural network based system to address the task of Aspect-Based
Sentiment Analysis to compete in Task 2 of the ESWC-2016 Challenge on Semantic
Sentiment Analysis. Our proposed architecture divides the task into two subtasks:
aspect term extraction and aspect-specific sentiment extraction. This approach
is flexible in that it allows each subtask to be addressed independently. As a first
step, a recurrent neural network is used to extract aspects from a text by
framing the problem as a sequence labeling task. In a second step, a recurrent
network processes each extracted aspect with respect to its context and
predicts a sentiment label. The system uses pretrained semantic word embedding
features which we experimentally enhance with semantic knowledge extracted from
WordNet. Further features extracted from SenticNet prove to be beneficial for
the extraction of sentiment labels. As the best-performing system in its
category, our proposed system proves to be an effective approach to
Aspect-Based Sentiment Analysis.
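A minimal sketch of the two-step idea described above, assuming a BIO tagging scheme for aspect terms; the layer sizes, class names and PyTorch realization are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class AspectTagger(nn.Module):
    """Step 1: label each token as B/I/O to extract aspect terms."""
    def __init__(self, vocab_size, emb_dim=100, hidden=64, num_tags=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_tags)

    def forward(self, token_ids):                  # (batch, seq_len)
        h, _ = self.rnn(self.emb(token_ids))
        return self.out(h)                         # (batch, seq_len, num_tags)

class AspectSentiment(nn.Module):
    """Step 2: classify the sentiment of one extracted aspect in its context."""
    def __init__(self, vocab_size, emb_dim=100, hidden=64, num_labels=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_labels)   # negative / neutral / positive

    def forward(self, context_ids):                # context around one aspect
        _, (h_n, _) = self.rnn(self.emb(context_ids))
        return self.out(h_n[-1])                   # (batch, num_labels)

if __name__ == "__main__":
    tokens = torch.randint(0, 1000, (2, 12))
    print(AspectTagger(1000)(tokens).shape)        # torch.Size([2, 12, 3])
    print(AspectSentiment(1000)(tokens).shape)     # torch.Size([2, 3])
```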
Map equation for link community
Community structure exists in many real-world networks and has been reported
to be related to several functional properties of the networks. The
conventional approach partitions nodes into communities, while some
recent studies partition links instead of nodes to find overlapping
communities of nodes efficiently. We extended the map equation method, which
was originally developed for node communities, to find link communities in
networks. This method is tested on various kinds of networks and compared with
the metadata of the networks, and the results show that our method can identify
the overlapping role of nodes effectively. The advantage of this method is that
the node community scheme and link community scheme can be compared
quantitatively by measuring the unknown information left in the networks
besides the community structure. It can be used to decide quantitatively
whether or not the link community scheme should be used instead of the node
community scheme. Furthermore, this method can be easily extended to
directed and weighted networks since it is based on the random walk.
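For reference, the standard two-level map equation for a node partition M into m modules, which this work extends to partitions of links, minimizes the expected per-step description length of a random walk (the link-community variant is not reproduced here):

```latex
L(\mathsf{M}) \;=\; q_{\curvearrowright}\, H(\mathcal{Q})
  \;+\; \sum_{i=1}^{m} p_{\circlearrowright}^{\,i}\, H(\mathcal{P}^{i})
```

Here q_↷ is the total probability that the walker switches modules per step, H(Q) is the entropy of the module exit probabilities, p^i_↻ is the fraction of time the walker spends in (and exits) module i, and H(P^i) is the entropy of the within-module visit and exit rates.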
WordNet: An Electronic Lexical Reference System Based on Theories of Lexical Memory
This paper describes WordNet, an on-line lexical reference system whose design is based on psycholinguistic theories of human lexical organization and memory. English nouns, verbs, and adjectives are organized into synonym sets ("synsets"), each representing one underlying lexical concept. Synonym sets are then related via three principal conceptual relations: hyponymy, meronymy, and antonymy. Verbs are additionally specified for the presupposition relations that hold among them, and for their most common semantic/syntactic frames. By attempting to mirror the organization of the mental lexicon, WordNet strives to serve the linguistically unsophisticated user.
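As a concrete illustration of the synset structure and the hyponymy/meronymy/antonymy relations described above, the snippet below queries WordNet through the NLTK interface (NLTK is a later, independent distribution of WordNet, not part of the original paper; printed values are examples):

```python
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

# Each synset is a set of synonymous lemmas standing for one lexical concept.
car = wn.synsets('car', pos=wn.NOUN)[0]
print(car.lemma_names())                 # e.g. ['car', 'auto', 'automobile', ...]

# Hyponymy: more specific concepts below this synset.
print([s.name() for s in car.hyponyms()][:5])

# Meronymy: parts of the concept.
print([s.name() for s in car.part_meronyms()][:5])

# Antonymy is a relation between lemmas rather than synsets.
good = wn.synsets('good', pos=wn.ADJ)[0].lemmas()[0]
print([a.name() for a in good.antonyms()])   # e.g. ['bad']
```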
An Infrastructure for acquiring high quality semantic metadata
Because the metadata that underlies semantic web applications is gathered from distributed and heterogeneous data sources, it is important to ensure its quality (i.e., reduce duplicates, spelling errors, and ambiguities). However, current infrastructures that acquire and integrate semantic data have only marginally addressed the issue of metadata quality. In this paper we present our metadata acquisition infrastructure, ASDI, which pays special attention to ensuring that high quality metadata is derived. Central to the architecture of ASDI is a verification engine that relies on several semantic web tools to check the quality of the derived data. We tested our prototype in the context of building a semantic web portal for our lab, KMi. An experimental evaluation comparing the automatically extracted data against manual annotations indicates that the verification engine enhances the quality of the extracted semantic metadata.
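ASDI's verification engine is not reproduced here; as a rough illustration of one kind of quality check it motivates, the sketch below flags exact and near-duplicate entity labels in a batch of extracted metadata (the threshold and function name are assumptions):

```python
from difflib import SequenceMatcher

def near_duplicates(labels, threshold=0.9):
    """Flag pairs of extracted labels that are exact or near duplicates.

    Illustrative only; a real verification engine would also consult
    ontologies, lexical resources and spelling checkers.
    """
    flagged = []
    normalized = [label.strip().lower() for label in labels]
    for i in range(len(normalized)):
        for j in range(i + 1, len(normalized)):
            ratio = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
            if ratio >= threshold:
                flagged.append((labels[i], labels[j], round(ratio, 2)))
    return flagged

print(near_duplicates(["Knowledge Media Institute", "knowledge media institute",
                       "Knowledge Media Inst.", "Open University"]))
```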
The Long-Short Story of Movie Description
Generating descriptions for videos has many applications including assisting
blind people and human-robot interaction. The recent advances in image
captioning as well as the release of large-scale movie description datasets
such as MPII Movie Description allow this task to be studied in more depth. Many of
the proposed methods for image captioning rely on pre-trained object classifier
CNNs and Long Short-Term Memory recurrent networks (LSTMs) for generating
descriptions. While image description focuses on objects, we argue that it is
important to distinguish verbs, objects, and places in the challenging setting
of movie description. In this work we show how to learn robust visual
classifiers from the weak annotations of the sentence descriptions. Based on
these visual classifiers we learn how to generate a description using an LSTM.
We explore different design choices to build and train the LSTM and achieve the
best performance to date on the challenging MPII-MD dataset. We compare and
analyze our approach and prior work along various dimensions to better
understand the key challenges of the movie description task.
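A minimal sketch of the second stage described above, assuming the robust visual classifiers have already produced a score vector per clip; the dimensions, conditioning scheme and names are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class ClipDescriber(nn.Module):
    """LSTM decoder conditioned on visual classifier scores (verbs/objects/places)."""
    def __init__(self, num_classifiers=500, vocab_size=10000, emb=256, hidden=512):
        super().__init__()
        self.visual_proj = nn.Linear(num_classifiers, hidden)  # scores -> init state
        self.word_emb = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.vocab_out = nn.Linear(hidden, vocab_size)

    def forward(self, classifier_scores, prev_words):
        # Project the classifier responses and use them as the initial hidden state.
        h0 = torch.tanh(self.visual_proj(classifier_scores)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        out, _ = self.lstm(self.word_emb(prev_words), (h0, c0))
        return self.vocab_out(out)                 # (batch, seq_len, vocab_size) logits

if __name__ == "__main__":
    scores = torch.rand(2, 500)                    # classifier responses per clip
    words = torch.randint(0, 10000, (2, 8))        # previous words (teacher forcing)
    print(ClipDescriber()(scores, words).shape)    # torch.Size([2, 8, 10000])
```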
Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation
Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD), despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD that uses inductive logic programming to learn theories from first-order logic representations, which allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.
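The paper's ILP-based system is not reproduced here; as a hedged illustration of the shallow-versus-deep distinction it studies, the sketch below extracts both kinds of evidence for a target word using NLTK (feature names and the window size are assumptions):

```python
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

def shallow_features(tokens, idx, window=2):
    """Shallow, corpus-style evidence: surrounding words and collocations."""
    left = tokens[max(0, idx - window):idx]
    right = tokens[idx + 1:idx + 1 + window]
    return {
        "bag_of_words": set(left + right),
        "bigram_left": tuple(tokens[max(0, idx - 1):idx + 1]),
        "bigram_right": tuple(tokens[idx:idx + 2]),
    }

def deep_features(word):
    """Deeper lexical knowledge: a hypernym chain for each candidate noun sense."""
    return {
        sense.name(): [h.name() for h in sense.hypernym_paths()[0]]
        for sense in wn.synsets(word, pos=wn.NOUN)
    }

tokens = "the bank raised its interest rate".split()
print(shallow_features(tokens, tokens.index("bank")))
print(list(deep_features("bank").items())[:2])
```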
