5,378 research outputs found
Word Sense Disambiguation for Ontology Learning
Ontology learning aims to automatically extract ontological concepts and relationships from related text repositories and is expected to be more efficient and scalable than manual ontology development. One of the challenging issues associated with ontology learning is word sense disambiguation (WSD). Most WSD research employs resources such as WordNet, text corpora, or a hybrid approach. Motivated by the large volume and richness of user-generated content in social media, this research explores the role of social media in ontology learning. Specifically, our approach exploits social media as a dynamic, context-rich data source for WSD. This paper presents the method and preliminary evidence of its efficacy. The research is in progress toward a formal evaluation of the social-media-based WSD method, and we plan to incorporate the WSD routine into an ontology learning system in the future.
Evaluating the semantic web: a task-based approach
The increased availability of online knowledge has led to the design of several algorithms that solve a variety of tasks by harvesting the Semantic Web, i.e. by dynamically selecting and exploring a multitude of online ontologies. Our hypothesis is that the performance of such novel algorithms implicitly provides an insight into the quality of the ontologies used and thus opens the way to a task-based evaluation of the Semantic Web. We have investigated this hypothesis by studying the lessons learnt about online ontologies when used to solve three tasks: ontology matching, folksonomy enrichment, and word sense disambiguation. Our analysis leads to a suite of conclusions about the status of the Semantic Web, which highlight a number of strengths and weaknesses of the semantic information available online and complement the findings of other analyses of the Semantic Web landscape.
Using Distributed Representations to Disambiguate Biomedical and Clinical Concepts
In this paper, we report a knowledge-based method for Word Sense
Disambiguation in the domains of biomedical and clinical text. We combine word
representations created on large corpora with a small number of definitions
from the UMLS to create concept representations, which we then compare to
representations of the context of ambiguous terms. Using no relational
information, we obtain comparable performance to previous approaches on the
MSH-WSD dataset, which is a well-known dataset in the biomedical domain.
Additionally, our method is fast and easy to set up and extend to other
domains. Supplementary materials, including source code, can be found at
https://github.com/clips/yarn
Comment: 6 pages, 1 figure, presented at the 15th Workshop on Biomedical
Natural Language Processing, Berlin 201
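
The idea summarized in this abstract, building a concept representation from the words of its definition and comparing it with a representation of the ambiguous term's context, can be illustrated with a small sketch. This is not the authors' implementation: the toy three-dimensional vectors, the two hypothetical sense labels, the averaging step, and the function names (`mean_vector`, `cosine`, `disambiguate`) are all assumptions made for illustration. A real system would use embeddings trained on large corpora and definitions taken from the UMLS.

```python
import math

# Toy 3-dimensional word vectors (assumed values, not real embeddings).
WORD_VECTORS = {
    "temperature": [1.0, 0.0, 0.0],
    "low": [0.9, 0.1, 0.0],
    "weather": [0.8, 0.0, 0.2],
    "virus": [0.0, 1.0, 0.0],
    "infection": [0.1, 0.9, 0.0],
    "cough": [0.0, 0.8, 0.2],
    "patient": [0.1, 0.7, 0.2],
}

# Hypothetical sense inventory: each concept is described by definition words.
DEFINITIONS = {
    "cold_temperature": ["low", "temperature", "weather"],
    "common_cold": ["virus", "infection", "cough"],
}


def mean_vector(words):
    """Average the vectors of all known words; return None if none are known."""
    vectors = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def disambiguate(context_words, definitions):
    """Return the concept whose definition vector is closest to the context."""
    context_vec = mean_vector(context_words)
    if context_vec is None:
        return None
    best_concept, best_score = None, float("-inf")
    for concept, definition_words in definitions.items():
        concept_vec = mean_vector(definition_words)
        if concept_vec is None:
            continue
        score = cosine(context_vec, concept_vec)
        if score > best_score:
            best_concept, best_score = concept, score
    return best_concept


if __name__ == "__main__":
    # Context words surrounding an ambiguous mention of "cold".
    print(disambiguate(["patient", "cough", "virus"], DEFINITIONS))
```

Note that the sketch uses no relational information between concepts, only definition text and context words, which mirrors the setting the abstract describes.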