Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 table
Visual Question Answering: A Survey of Methods and Datasets
Visual Question Answering (VQA) is a challenging task that has received
increasing attention from both the computer vision and the natural language
processing communities. Given an image and a question in natural language, it
requires reasoning over visual elements of the image and general knowledge to
infer the correct answer. In the first part of this survey, we examine the
state of the art by comparing modern approaches to the problem. We classify
methods by their mechanism to connect the visual and textual modalities. In
particular, we examine the common approach of combining convolutional and
recurrent neural networks to map images and questions to a common feature
space. We also discuss memory-augmented and modular architectures that
interface with structured knowledge bases. In the second part of this survey,
we review the datasets available for training and evaluating VQA systems. The
various datasets contain questions at different levels of complexity, which
require different capabilities and types of reasoning. We examine in depth the
question/answer pairs from the Visual Genome project, and evaluate the
relevance of the structured annotations of images with scene graphs for VQA.
Finally, we discuss promising future directions for the field, in particular
the connection to structured knowledge bases and the use of natural language
processing models. Comment: 25 pages
Human-Level Performance on Word Analogy Questions by Latent Relational Analysis
This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason/stone is analogous to the pair carpenter/wood; the relations between mason and stone are highly similar to the relations between carpenter and wood. Past work on semantic similarity measures has mainly been concerned with attributional similarity. For instance, Latent Semantic Analysis (LSA) can measure the degree of similarity between two words, but not between two relations. Recently the Vector Space Model (VSM) of information retrieval has been adapted to the task of measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus (they are not predefined), (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data (it is also used this way in LSA), and (3) automatically generated synonyms are used to explore reformulations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying noun-modifier relations, LRA achieves similar gains over the VSM, while using a smaller corpus.
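The VSM step described in this abstract can be illustrated with a minimal sketch, assuming cosine similarity as the measure for comparing pattern-frequency vectors. The patterns, the counts, and the helper names (`cosine`, `best_analogy`) are hypothetical and chosen only for illustration; they do not come from the paper, and the sketch omits LRA's automatic pattern derivation, SVD smoothing, and synonym reformulation.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector carries no relational evidence
    return dot / (norm_u * norm_v)

# Hypothetical joining patterns; each vector position counts one pattern.
PATTERNS = ["X cuts Y", "X works with Y", "X eats Y", "X teaches Y"]

# Hypothetical corpus counts of each pattern for each word pair.
FREQS = {
    ("mason", "stone"): [30, 45, 0, 0],
    ("carpenter", "wood"): [25, 50, 0, 1],
    ("teacher", "chalk"): [0, 10, 0, 2],
    ("bird", "seed"): [0, 1, 40, 0],
}

def best_analogy(stem, choices):
    """Pick the choice pair whose pattern vector is closest to the stem's."""
    return max(choices, key=lambda pair: cosine(FREQS[stem], FREQS[pair]))

stem = ("mason", "stone")
choices = [("teacher", "chalk"), ("carpenter", "wood"), ("bird", "seed")]
print(best_analogy(stem, choices))
```

With these made-up counts the stem mason/stone is matched to carpenter/wood, mirroring the example in the abstract. A multiple-choice analogy question is answered the same way: score every choice against the stem and return the highest-scoring pair.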
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use cases, and socio-economic
and legal aspects. These were assessed through two central studies: first, a concerted vision of the functional breakdown
of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of the
requirements for technological challenges. Both studies were carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as
national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges.
Similarity of Semantic Relations
There are at least two kinds of similarity. Relational similarity is
correspondence between relations, in contrast with attributional similarity,
which is correspondence between attributes. When two words have a high
degree of attributional similarity, we call them synonyms. When two pairs
of words have a high degree of relational similarity, we say that their
relations are analogous. For example, the word pair mason:stone is analogous
to the pair carpenter:wood. This paper introduces Latent Relational Analysis (LRA),
a method for measuring relational similarity. LRA has potential applications in many
areas, including information extraction, word sense disambiguation,
and information retrieval. Recently the Vector Space Model (VSM) of information
retrieval has been adapted to measuring relational similarity,
achieving a score of 47% on a collection of 374 college-level multiple-choice
word analogy questions. In the VSM approach, the relation between a pair of words is
characterized by a vector of frequencies of predefined patterns in a large corpus.
LRA extends the VSM approach in three ways: (1) the patterns are derived automatically
from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency
data, and (3) automatically generated synonyms are used to explore variations of the
word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the
average human score of 57%. On the related problem of classifying semantic relations, LRA
achieves similar gains over the VSM.
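The reported scores can be put in context with a short arithmetic sketch. The abstract does not say which statistical test underlies "statistically equivalent", so the normal-approximation 95% interval below is an assumption made for illustration, and it treats the human average of 57% as a fixed reference value. The function name `proportion_interval` is hypothetical.

```python
import math

def proportion_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    p_hat: observed fraction correct; n: number of questions;
    z: critical value (1.96 corresponds to roughly 95% coverage).
    """
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se

# Figures quoted in the abstract: LRA scores 56% on 374 questions.
low, high = proportion_interval(0.56, 374)
print(f"LRA 95% interval: {low:.3f} to {high:.3f}")

human_average = 0.57  # average human score quoted in the abstract
vsm_score = 0.47      # earlier VSM score quoted in the abstract
print("human average inside interval:", low <= human_average <= high)
print("VSM score inside interval:", low <= vsm_score <= high)
```

Under this assumption the interval spans roughly 51% to 61%, so the human average falls inside it while the earlier VSM score does not.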
A spiral model for adding automatic, adaptive authoring to adaptive hypermedia
At present a large amount of research exists into the design and implementation of adaptive systems. However, not many studies target the complex task of authoring in such systems, or their evaluation. In order to tackle these problems, we have looked into the causes of the complexity. Manual annotation has proven to be a bottleneck for authoring of adaptive hypermedia. One solution is the reuse of automatically generated metadata. In our previous work we proposed the integration of the generic Adaptive Hypermedia authoring environment, MOT (My Online Teacher), and a semantic desktop environment, indexed by Beagle++. A prototype, Sesame2MOT Enricher v1, was built based upon this integration approach and evaluated. After the initial evaluations, a web-based prototype (Sesame2MOT Enricher v2) was built and integrated in MOT v2, conforming with the findings of the first set of evaluations. This new prototype underwent another evaluation. This paper thus presents a synthesis of the approach in general, the initial prototype with its first evaluations, the improved prototype, and the first results from the most recent evaluation round, following the next implementation cycle of the spiral model [Boehm, 88].