    Word Sense Determination from Wikipedia Data Using Neural Networks

    Many words have multiple meanings. For example, “plant” can mean a type of living organism or a factory. Determining the intended sense of such words is very useful in natural language processing tasks such as speech synthesis, question answering, and machine translation. For the project described in this report, we used a modular model to classify the sense of words to be disambiguated. This model consisted of two parts: the first was a neural-network-based language model that computed continuous vector representations of words from data sets created from Wikipedia pages; the second classified the meaning of the given word without explicitly knowing what that meaning is. In this unsupervised word sense determination task, we needed neither human-tagged training data nor a dictionary of senses for each word. We tested the model on several naturally ambiguous words and compared our experimental results with the related work by Schütze (1998). Our model achieved accuracy comparable to Schütze’s for some words.
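
    As a rough illustration of the two-part design described above, the Python sketch below induces senses by clustering context vectors, in the spirit of Schütze's context-group discrimination. The embedding table, window size, and sense count are illustrative assumptions, not the authors' actual setup.

        import numpy as np
        from sklearn.cluster import KMeans

        def context_vector(tokens, position, embeddings, window=5):
            """Average the embeddings of words around one occurrence of the target."""
            neighbors = (tokens[max(0, position - window):position]
                         + tokens[position + 1:position + 1 + window])
            vecs = [embeddings[w] for w in neighbors if w in embeddings]
            return np.mean(vecs, axis=0)

        def induce_senses(occurrences, embeddings, n_senses=2):
            """occurrences: list of (tokens, position) pairs for one ambiguous word.
            Clusters the context vectors; each cluster acts as one unlabeled sense."""
            X = np.vstack([context_vector(t, p, embeddings) for t, p in occurrences])
            return KMeans(n_clusters=n_senses, n_init=10).fit(X)  # .predict() tags new contexts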

    Co-occurrence Vectors from Corpora vs. Distance Vectors from Dictionaries

    A comparison was made between vectors derived from ordinary co-occurrence statistics over large text corpora and vectors derived by measuring inter-word distances in dictionary definitions. The precision of word sense disambiguation using co-occurrence vectors from the 1987 Wall Street Journal (20M total words) was higher than that using distance vectors from the Collins English Dictionary (60K head words + 1.6M definition words). However, other experimental results suggest that distance vectors contain semantic information not present in co-occurrence vectors.
    Comment: 6 pages; appeared in the Proc. of COLING-94 (pp. 304-309)
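
    A minimal sketch of the co-occurrence side of this comparison, assuming a tokenized corpus and per-sense seed vectors (both illustrative stand-ins, not the paper's configuration): build a co-occurrence count vector around the target and pick the sense whose vector is most similar.

        from collections import Counter

        def cooccurrence_vector(sentences, word, window=4):
            """Count words appearing within `window` tokens of `word` across a corpus."""
            counts = Counter()
            for sent in sentences:
                for i, w in enumerate(sent):
                    if w == word:
                        counts.update(sent[max(0, i - window):i] + sent[i + 1:i + 1 + window])
            return counts

        def cosine(a, b):
            num = sum(a[w] * b[w] for w in set(a) & set(b))
            den = (sum(v * v for v in a.values()) * sum(v * v for v in b.values())) ** 0.5
            return num / den if den else 0.0

        def disambiguate(context_counts, sense_vectors):
            """sense_vectors: {sense_label: Counter} built from seed usages of each sense."""
            return max(sense_vectors, key=lambda s: cosine(context_counts, sense_vectors[s]))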

    Adaptive Word Sense Tagging on Chinese Corpus

    This study describes a general framework for adaptive word sense disambiguation. The proposed framework begins with knowledge acquisition from the relatively easy contexts of a corpus, and relies heavily on an adaptive step that enriches the initial knowledge base with knowledge gleaned from the partially disambiguated text. Once adjusted to fit the text at hand, the knowledge base is applied to the text again to finalize the disambiguation decisions. The effectiveness of this approach was examined on sentences from the Sinica corpus. Experimental results indicated that adaptation significantly improved WSD performance; the adaptive approach raised applicability from 33.0% to 74.9% with comparable precision.
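
    The adaptive step lends itself to a short bootstrapping sketch. The two-pass loop below is a guess at the general shape, with a hypothetical knowledge base mapping (target, collocate) pairs to senses; the paper's actual knowledge representation is richer.

        from collections import Counter

        def adaptive_tag(sentences, target, initial_kb):
            """initial_kb: {(target, collocate): sense}; returns {sentence_index: sense}."""
            kb, tags = dict(initial_kb), {}
            for _pass in range(2):  # pass 1 uses the seed knowledge; pass 2 the enriched base
                for idx, sent in enumerate(sentences):
                    if target not in sent or idx in tags:
                        continue
                    votes = Counter(kb[(target, w)] for w in sent if (target, w) in kb)
                    if votes:
                        tags[idx] = votes.most_common(1)[0][0]
                # adaptation: collocates of already-tagged occurrences become new evidence
                for idx, sense in tags.items():
                    for w in sentences[idx]:
                        kb.setdefault((target, w), sense)
            return tags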

    Corpus-Based Techniques for Word Sense Disambiguation

    The need for robust and easily extensible systems for word sense disambiguation, coupled with successes in training systems for a variety of tasks using large on-line corpora, has led to extensive research into corpus-based statistical approaches to this problem. Promising results have been achieved by vector space representations of context, clustering combined with a semantic knowledge base, and decision lists based on collocational relations. We evaluate these techniques with respect to three important criteria: how their definition of context affects their ability to incorporate different types of disambiguating information, how they define similarity among senses, and how easily they can generalize to new senses. The strengths and weaknesses of these systems provide guidance for future systems that must capture and model a variety of disambiguating information, both syntactic and semantic.
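
    Of the three families surveyed, decision lists are the easiest to make concrete. The sketch below follows the usual collocation-based recipe (cf. Yarowsky): score each feature-to-sense rule by a smoothed log-likelihood ratio and apply the strongest matching rule. The smoothing constant and data format are illustrative.

        import math
        from collections import defaultdict

        def build_decision_list(training_pairs):
            """training_pairs: list of (features, sense), where features is a set of
            collocational cues observed in one disambiguated context."""
            counts = defaultdict(lambda: defaultdict(int))
            for feats, sense in training_pairs:
                for f in feats:
                    counts[f][sense] += 1
            rules = []
            for f, per_sense in counts.items():
                best = max(per_sense, key=per_sense.get)
                rest = sum(n for s, n in per_sense.items() if s != best)
                rules.append((math.log((per_sense[best] + 0.1) / (rest + 0.1)), f, best))
            return sorted(rules, reverse=True)  # strongest evidence first

        def classify(features, rules, default=None):
            return next((sense for _, f, sense in rules if f in features), default)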

    The interaction of knowledge sources in word sense disambiguation

    Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the artificial intelligence tradition of combining different knowledge sources. An important step in exploring this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results. We present a sense tagger that uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus. Our system attempts to disambiguate all content words in running text rather than limiting itself to a restricted vocabulary of words. We argue that this approach is more likely to assist the creation of practical systems.
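
    One simple way to picture the combination is a weighted vote over per-source sense scores, sketched below. The sources, weights, and score scales are hypothetical; the paper's actual combination method may differ.

        def combine_knowledge_sources(distributions, weights=None):
            """distributions: one {sense: score} dict per knowledge source
            (e.g., part-of-speech cues, collocations, selectional preferences)."""
            weights = weights or [1.0] * len(distributions)
            merged = {}
            for dist, w in zip(distributions, weights):
                for sense, score in dist.items():
                    merged[sense] = merged.get(sense, 0.0) + w * score
            return max(merged, key=merged.get) if merged else None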

    A Similarity Based Concordance Approach to Word Sense Disambiguation

    This study attempts to solve the problem of word sense disambiguation using a combination of statistical, probabilistic, and word-matching algorithms. These algorithms assume that words and sentences have hidden similarities and that the polysemous words in any context should be assigned a sense after each execution of the algorithm. The algorithm was tested on a sufficient sample of data, and disambiguation performance proved to increase significantly after the inclusion of the concordance methodology.

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Machine learning has been a big success story during the AI resurgence. One standout success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit that knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability to understand and exploit multimodal data, and continued incorporation of knowledge in learning techniques.
    Comment: Pre-print of the paper accepted at the 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770