Search CORE

622 research outputs found

Word sense disambiguation and information retrieval

Author: Sanderson M.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/1994
Field of study

It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval (IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will increase. However, recent research into the application of a word sense disambiguator to an IR system failed to show any performance increase. From these results it has become clear that more basic research is needed to investigate the relationship between sense ambiguity, disambiguation, and IR. Using a technique that introduces additional sense ambiguity into a collection, this paper presents research that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system when it is retrieving from very short queries. In addition we argue that if a word sense disambiguator is to be of any use to an IR system, the disambiguator must be able to resolve word senses to a high degree of accuracy

White Rose Research Online

Word sense disambiguation and information retrieval

Author: Sanderson M.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/1914
Field of study

MIT Libraries Dome

White Rose Research Online

Sense resolution properties of logical imaging

Author: Crestani F.
Sanderson M.
van Rijsbergen C.J.
Publication venue: Taylor Graham Publishing
Publication date: 01/01/1995
Field of study

The evaluation of an implication by Imaging is a logical technique developed in the framework of modal logic. Its interpretation in the context of a “possible worlds” semantics is very appealing for IR. In 1994, Crestani and Van Rijsbergen proposed an interpretation of Imaging in the context of IR based on the assumption that “a term is a possibleworld”. This approach enables the exploitation of term– term relationshipswhich are estimated using an information theoretic measure. Recent analysis of the probability kinematics of Logical Imaging in IR have suggested that this technique has some interesting sense resolution properties. In this paper we will present this new line of research and we will relate it to more classical research into word senses

CiteSeerX

White Rose Research Online

The interaction of knowledge sources in word sense disambiguation

Author: Brill Eric
Daelemans Walter
Daelemans Walter
Ide Nancy
Kilgarriff Adam
Marcus Mitchell
Mark Stevenson
Masterman Margaret
McRoy Susan
Yorick Wilks
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2001
Field of study

Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial in telligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results. We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus.Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems

CiteSeerX

Crossref

White Rose Research Online

Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning

Author: Mooney Raymond J.
Publication venue
Publication date: 01/01/1996
Field of study

This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word ``line'' using the words in the current and proceeding sentence as context. The statistical and neural-network methods perform the best on this particular problem and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.Comment: 10 page

arXiv.org e-Print Archive

CiteSeerX

Retrieving with good sense

Author: Sanderson M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2000
Field of study

Although always present in text, word sense ambiguity only recently became regarded as a problem to information retrieval which was potentially solvable. The growth of interest in word senses resulted from new directions taken in disambiguation research. This paper first outlines this research and surveys the resulting efforts in information retrieval. Although the majority of attempts to improve retrieval effectiveness were unsuccessful, much was learnt from the research. Most notably a notion of under what circumstance disambiguation may prove of use to retrieval

CiteSeerX

White Rose Research Online

Selective Sampling for Example-based Word Sense Disambiguation

Author: Fujii Atsushi
Inui Kentaro
Tanaka Hozumi
Tokunaga Takenobu
Publication venue
Publication date: 01/01/1998
Field of study

This paper proposes an efficient example sampling method for example-based word sense disambiguation systems. To construct a database of practical size, a considerable overhead for manual sense disambiguation (overhead for supervision) is required. In addition, the time complexity of searching a large-sized database poses a considerable problem (overhead for search). To counter these problems, our method selectively samples a smaller-sized effective subset from a given example set for use in word sense disambiguation. Our method is characterized by the reliance on the notion of training utility: the degree to which each example is informative for future example sampling when used for the training of the system. The system progressively collects examples by selecting those with greatest utility. The paper reports the effectiveness of our method through experiments on about one thousand sentences. Compared to experiments with other example sampling methods, our method reduced both the overhead for supervision and the overhead for search, without the degeneration of the performance of the system.Comment: 25 pages, 14 Postscript figure

arXiv.org e-Print Archive

CiteSeerX

Computational linguistics for metadata building: Aggregating text processing technologies for enhanced image access

Author: Abels Eileen
Beaudoin Joan E.
Jenemann Laura
Klavans Judith
Lin Jimmy
Lippincott Tom
Passonneau Rebecca
Sheffield Carolyn
Sidhu Tandeep
Soergel Dagobert
Yano Tae
Publication venue: DigitalCommons@WayneState
Publication date: 26/08/2008
Field of study

We present a system which applies text mining using computational linguistic techniques to automatically extract, categorize, disambiguate and filter metadata for image access. Candidate subject terms are identified through standard approaches; novel semantic categorization using machine learning and disambiguation using both WordNet and a domain specific thesaurus are applied. The resulting metadata can be manually edited by image catalogers or filtered by semi-automatic rules. We describe the implementation of this workbench created for, and evaluated by, image catalogers. We discuss the system\u27s current functionality, developed under the Computational Linguistics for Metadata Building (CLiMB) research project. The CLiMB Toolkit has been tested with several collections, including: Art Images for College Teaching (AICT), ARTStor, the National Gallery of Art (NGA), the Senate Museum, and from collaborative projects such as the Landscape Architecture Image Resource (LAIR) and the field guides of the Vernacular Architecture Group (VAG)

Digital Commons@Wayne State University