Word sense disambiguation and information retrieval
It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval
(IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will
increase. However, recent research into the application of a word sense disambiguator to an IR system failed
to show any performance increase. From these results it has become clear that more basic research is needed
to investigate the relationship between sense ambiguity, disambiguation, and IR.
Using a technique that introduces additional sense ambiguity into a collection, this paper presents research
that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have
on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system
when it is retrieving from very short queries. In addition we argue that if a word sense disambiguator is to
be of any use to an IR system, the disambiguator must be able to resolve word senses to a high degree of
accuracy.
A Flexible Shallow Approach to Text Generation
In order to support the efficient development of NL generation systems, two
orthogonal methods are currently pursued with emphasis: (1) reusable, general,
and linguistically motivated surface realization components, and (2) simple,
task-oriented template-based techniques. In this paper we argue that, from an
application-oriented perspective, the benefits of both are still limited. In
order to improve this situation, we suggest and evaluate shallow generation
methods associated with increased flexibility. We advise a close connection
between domain-motivated and linguistic ontologies that supports the quick
adaptation to new tasks and domains, rather than the reuse of general
resources. Our method is especially designed for generating reports with
limited linguistic variations.
Comment: LaTeX, 10 page
Three Approaches to Generating Texts in Different Styles
Natural Language Generation (NLG) systems generate texts in English and other human languages from non-linguistic input data. Usually a large number of possible texts can communicate the input data, and NLG systems must choose one of these. We argue that style can be used by NLG systems to choose between possible texts, and explore how this can be done by (1) explicit stylistic parameters, (2) imitating a genre style, and (3) imitating an individual’s style.
Generating readable texts for readers with low basic skills
Most NLG systems generate texts for readers with good reading ability, but SkillSum adapts its output for readers with poor literacy. Evaluation with low-skilled readers confirms that SkillSum's knowledge-based microplanning choices enhance readability. We also discuss future readability improvements.
Knowledge Acquisition for Content Selection
An important part of building a natural-language generation (NLG) system is
knowledge acquisition, that is, deciding on the specific schemas, plans, grammar
rules, and so forth that should be used in the NLG system. We discuss some
experiments we have performed with KA for content-selection rules, in the
context of building an NLG system which generates health-related material.
These experiments suggest that it is useful to supplement corpus analysis with
KA techniques developed for building expert systems, such as structured group
discussions and think-aloud protocols. They also raise the point that KA issues
may influence architectural design issues, in particular the decision on
whether a planning approach is used for content selection. We suspect that in
some cases, KA may be easier if other constructive expert-system techniques
(such as production rules, or case-based reasoning) are used to determine the
content of a generated text.
Comment: To appear in the 1997 European NLG workshop. 10 pages, postscrip
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints
Text generation from a knowledge base aims to translate knowledge triples to
natural language descriptions. Most existing methods ignore the faithfulness
between a generated text description and the original table, leading to
generated information that goes beyond the content of the table. In this paper,
for the first time, we propose a novel Transformer-based generation framework
to achieve the goal. The core techniques in our method to enforce faithfulness
include a new table-text optimal-transport matching loss and a table-text
embedding similarity loss based on the Transformer model. Furthermore, to
evaluate faithfulness, we propose a new automatic metric specialized to the
table-to-text generation problem. We also provide detailed analysis on each
component of our model in our experiments. Automatic and human evaluations show
that our framework can significantly outperform the state of the art by a large
margin.
Comment: Accepted at ACL202