Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table
Using the Journalistic Metaphor to Design User Interfaces That Explain Sensor Data
Facilitating general access to data from sensor networks (including traffic, hydrology and other domains) increases their utility. In this paper we argue that the journalistic metaphor can be effectively used to automatically generate multimedia presentations that help non-expert users analyze and understand sensor data. The journalistic layout and style are familiar to most users. Furthermore, the journalistic approach of ordering information from most general to most specific helps users obtain a high-level understanding while providing them the freedom to choose the depth of analysis to which they want to go. We describe the general characteristics and architectural requirements for an interactive intelligent user interface for exploring sensor data that uses the journalistic metaphor. We also describe our experience in developing this interface in real-world domains (e.g., hydrology)
Generating Adaptive Presentations of Hydrologic Behavior
This paper describes a knowledge-based approach for summarizing and presenting the behavior of hydrologic networks. This approach has been designed for visualizing data from sensors and simulations in the context of emergencies caused by floods. It follows a solution for event summarization that exploits physical properties of the dynamic system to automatically generate summaries of relevant data. The summarized information is presented using different modes such as text, 2D graphics and 3D animations on virtual terrains. The presentation is automatically generated using a hierarchical planner with abstract presentation fragments corresponding to discourse patterns, taking into account the characteristics of the user who receives the information and constraints imposed by the communication devices (mobile phone, computer, fax, etc.). An application following this approach has been developed for a national hydrologic information infrastructure of Spain
Current Challenges and Visions in Music Recommender Systems Research
Music recommender systems (MRS) have experienced a boom in recent years,
thanks to the emergence and success of online streaming services, which
nowadays make available almost all music in the world at the user's fingertips.
While today's MRS considerably help users to find interesting music in these
huge catalogs, MRS research is still facing substantial challenges. In
particular, when it comes to building, incorporating, and evaluating recommendation
strategies that integrate information beyond simple user-item interactions or
content-based descriptors and instead dig deep into the very essence of listener
needs, preferences, and intentions, MRS research becomes a big endeavor and
related publications are quite sparse.
The purpose of this trends and survey article is twofold. We first identify
and shed light on what we believe are the most pressing challenges MRS research
is facing, from both academic and industry perspectives. We review the state of
the art towards solving these challenges and discuss its limitations. Second,
we detail possible future directions and visions we contemplate for the further
evolution of the field. The article should therefore serve two purposes: giving
the interested reader an overview of current challenges in MRS research and
providing guidance for young researchers by identifying interesting, yet
under-researched, directions in the field
Combining data-driven MT systems for improved sign language translation
In this paper, we investigate the feasibility of combining two data-driven machine translation (MT) systems for the translation of sign languages (SLs). We take the MT systems of two prominent data-driven research groups, the MaTrEx system developed at DCU and the Statistical Machine
Translation (SMT) system developed at RWTH Aachen University, and apply their respective approaches to the task of translating Irish Sign Language and German Sign Language into English and German. In a set of experiments supported by automatic evaluation results, we show that
there is a definite value to the prospective merging of MaTrEx’s Example-Based MT chunks and distortion limit increase with RWTH’s constraint reordering
Automatic design of multimodal presentations
We describe our attempt to integrate multiple AI components such as planning, knowledge representation, natural language generation, and graphics generation into a functioning prototype called WIP that plans and coordinates multimodal presentations in which all material is generated by the system. WIP allows the generation of alternate presentations of the same content taking into account various contextual factors such as the user's degree of expertise and preferences for a particular output medium or mode. The current prototype of WIP generates multimodal explanations and instructions for assembling, using, maintaining or repairing physical devices. This paper introduces the task, the functionality and the architecture of the WIP system. We show that in WIP the design of a multimodal document is viewed as a non-monotonic process that includes various revisions of preliminary results, massive replanning and plan repairs, and many negotiations between design and realization components in order to achieve an optimal division of work between text and graphics. We describe how the plan-based approach to presentation design can be exploited so that graphics generation influences the production of text and vice versa. Finally, we discuss the generation of cross-modal expressions that establish referential relationships between text and graphics elements
Text Generation Based on Generative Adversarial Nets with Latent Variable
In this paper, we propose a model using a generative adversarial net (GAN) to
generate realistic text. Instead of using a standard GAN, we combine a variational
autoencoder (VAE) with a generative adversarial net. The use of high-level latent
random variables is helpful to learn the data distribution and solve the
problem that a generative adversarial net always emits similar data. We
propose the VGAN model, where the generative model is composed of a recurrent
neural network and a VAE. The discriminative model is a convolutional neural
network. We train the model via policy gradient. We apply the proposed model to
the task of text generation and compare it to other recent neural network based
models, such as recurrent neural network language model and SeqGAN. We evaluate
the performance of the model by calculating negative log-likelihood and the
BLEU score. We conduct experiments on three benchmark datasets, and results
show that our model outperforms other previous models
Movie Description
Audio Description (AD) provides linguistic descriptions of movies and allows
visually impaired people to follow a movie along with their peers. Such
descriptions are by design mainly visual and thus naturally form an interesting
data source for computer vision and computational linguistics. In this work we
propose a novel dataset which contains transcribed ADs, which are temporally
aligned to full length movies. In addition we also collected and aligned movie
scripts used in prior work and compare the two sources of descriptions. In
total the Large Scale Movie Description Challenge (LSMDC) contains a parallel
corpus of 118,114 sentences and video clips from 202 movies. First we
characterize the dataset by benchmarking different approaches for generating
video descriptions. Comparing ADs to scripts, we find that ADs are indeed more
visual and describe precisely what is shown rather than what should happen
according to the scripts created prior to movie production. Furthermore, we
present and compare the results of several teams who participated in a
challenge organized in the context of the workshop "Describing and
Understanding Video & The Large Scale Movie Description Challenge (LSMDC)", at
ICCV 2015
Commodity-based Freight Activity on Inland Waterways through the Fusion of Public Datasets for Multimodal Transportation Planning
Within the U.S., the 18.6 billion tons of goods currently moved along the multimodal transportation system are expected to grow 51% by 2045. Most of those goods are transported by roadways. However, several benefits can be realized by shippers and consumers by shifting freight to more efficient modes, such as inland waterways, or adopting a multimodal scheme. To support such freight growth sustainably and efficiently, federal legislation calls for the development of plans, methods, and tools to identify and prioritize future multimodal transportation infrastructure needs. However, given the historical mode-specific approach to freight data collection, analysis, and modeling, challenges remain to adopt a fully multimodal approach that integrates underrepresented modes, such as waterways, into multimodal forecasting tools to identify and prioritize transportation infrastructure needs. Examples of such challenges are data heterogeneity, confidentiality, limitations in terms of spatial and temporal coverage, high cost associated with data collection, and subjectivity in survey responses. To overcome these challenges, this work fuses data across a variety of novel transportation sources to close existing gaps in freight data needed to support multimodal long-range freight planning. In particular, the objective of this work is to develop methods to allow integration of inland waterway transportation into commodity-based freight forecasting models, by leveraging Automatic Identification System (AIS) data. The following approaches are presented in this dissertation:
i) Maritime Automatic Identification System (AIS) data is mapped to a detailed inland navigable waterway network, allowing for an improved representation of waterway modes in multimodal freight travel demand models, which currently suffer from unbalanced representation of waterways. Validation results show the model correctly identifies 84% of stops at inland waterway ports and 83.5% of trips crossing locks.
ii) AIS and truck Global Positioning System (GPS) data are fused to a multimodal network to identify the area of impact of a freight investment, providing a single methodology and data source to compare and contrast diverse transportation infrastructure investments. This method identifies parallel truck and vessel flows indicating potential for modal shift.
iii) Truck GPS and maritime Lock Performance Monitoring System (LPMS) data are fused via a multi-commodity assignment model to characterize and quantify annual commodity throughput at port terminals on inland waterways, generating new data from public datasets, to support estimation of commodity-based freight fluidity performance measures. Results show that 84% of ports had less than a 20% difference between estimated and observed truck volumes.
iv) AIS, LPMS, and truck GPS datasets are fused to disaggregate estimated annual commodity port throughput to vessel trips on inland waterways. Vessel trips characterized by port of origin, destination, path, timestamp, and commodity carried, are mapped to a detailed inland waterway network, allowing for a detailed commodity flow analysis, previously unavailable in the public domain.
The novel, repeatable, data-driven methods and models proposed in this work are applied to the 43 freight port terminals located on the Arkansas River. These models help to evaluate network performance, identify and prioritize multimodal freight transportation infrastructure needs, and introduce a unique focus on modal shift towards inland waterway transportation