    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table

    Using the Journalistic Metaphor to Design User Interfaces That Explain Sensor Data

    Facilitating general access to data from sensor networks (including traffic, hydrology and other domains) increases their utility. In this paper we argue that the journalistic metaphor can be effectively used to automatically generate multimedia presentations that help non-expert users analyze and understand sensor data. The journalistic layout and style are familiar to most users. Furthermore, the journalistic approach of ordering information from most general to most specific helps users obtain a high-level understanding while providing them the freedom to choose the depth of analysis to which they want to go. We describe the general characteristics and architectural requirements for an interactive intelligent user interface for exploring sensor data that uses the journalistic metaphor. We also describe our experience in developing this interface in real-world domains (e.g., hydrology)

    Generating Adaptive Presentations of Hydrologic Behavior

    This paper describes a knowledge-based approach for summarizing and presenting the behavior of hydrologic networks. This approach has been designed for visualizing data from sensors and simulations in the context of emergencies caused by floods. It follows a solution for event summarization that exploits physical properties of the dynamic system to automatically generate summaries of relevant data. The summarized information is presented using different modes such as text, 2D graphics and 3D animations on virtual terrains. The presentation is automatically generated using a hierarchical planner with abstract presentation fragments corresponding to discourse patterns, taking into account the characteristics of the user who receives the information and constraints imposed by the communication devices (mobile phone, computer, fax, etc.). An application following this approach has been developed for a national hydrologic information infrastructure of Spain

    Current Challenges and Visions in Music Recommender Systems Research

    Music recommender systems (MRS) have experienced a boom in recent years, thanks to the emergence and success of online streaming services, which nowadays make almost all music in the world available at the user's fingertips. While today's MRS considerably help users to find interesting music in these huge catalogs, MRS research is still facing substantial challenges. In particular, when it comes to building, incorporating, and evaluating recommendation strategies that integrate information beyond simple user-item interactions or content-based descriptors and dig deep into the very essence of listener needs, preferences, and intentions, MRS research becomes a big endeavor and related publications are quite sparse. The purpose of this trends and survey article is twofold. We first identify and shed light on what we believe are the most pressing challenges MRS research is facing, from both academic and industry perspectives. We review the state of the art towards solving these challenges and discuss its limitations. Second, we detail possible future directions and visions we contemplate for the further evolution of the field. The article should therefore serve two purposes: giving the interested reader an overview of current challenges in MRS research and providing guidance for young researchers by identifying interesting, yet under-researched, directions in the field

    Combining data-driven MT systems for improved sign language translation

    In this paper, we investigate the feasibility of combining two data-driven machine translation (MT) systems for the translation of sign languages (SLs). We take the MT systems of two prominent data-driven research groups, the MaTrEx system developed at DCU and the Statistical Machine Translation (SMT) system developed at RWTH Aachen University, and apply their respective approaches to the task of translating Irish Sign Language and German Sign Language into English and German. In a set of experiments supported by automatic evaluation results, we show that there is a definite value to the prospective merging of MaTrEx’s Example-Based MT chunks and distortion limit increase with RWTH’s constraint reordering

    Automatic design of multimodal presentations

    We describe our attempt to integrate multiple AI components such as planning, knowledge representation, natural language generation, and graphics generation into a functioning prototype called WIP that plans and coordinates multimodal presentations in which all material is generated by the system. WIP allows the generation of alternate presentations of the same content taking into account various contextual factors such as the user's degree of expertise and preferences for a particular output medium or mode. The current prototype of WIP generates multimodal explanations and instructions for assembling, using, maintaining or repairing physical devices. This paper introduces the task, the functionality and the architecture of the WIP system. We show that in WIP the design of a multimodal document is viewed as a non-monotonic process that includes various revisions of preliminary results, massive replanning and plan repairs, and many negotiations between design and realization components in order to achieve an optimal division of work between text and graphics. We describe how the plan-based approach to presentation design can be exploited so that graphics generation influences the production of text and vice versa. Finally, we discuss the generation of cross-modal expressions that establish referential relationships between text and graphics elements

    Text Generation Based on Generative Adversarial Nets with Latent Variable

    In this paper, we propose a model using a generative adversarial net (GAN) to generate realistic text. Instead of using a standard GAN, we combine a variational autoencoder (VAE) with a generative adversarial net. The use of high-level latent random variables helps to learn the data distribution and to solve the problem that a generative adversarial net always emits similar data. We propose the VGAN model, where the generative model is composed of a recurrent neural network and a VAE. The discriminative model is a convolutional neural network. We train the model via policy gradient. We apply the proposed model to the task of text generation and compare it to other recent neural network based models, such as a recurrent neural network language model and SeqGAN. We evaluate the performance of the model by calculating negative log-likelihood and the BLEU score. We conduct experiments on three benchmark datasets, and results show that our model outperforms other previous models

    Movie Description

    Audio Description (AD) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset which contains transcribed ADs, which are temporally aligned to full length movies. In addition we also collected and aligned movie scripts used in prior work and compare the two sources of descriptions. In total the Large Scale Movie Description Challenge (LSMDC) contains a parallel corpus of 118,114 sentences and video clips from 202 movies. First we characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing ADs to scripts, we find that ADs are indeed more visual and describe precisely what is shown rather than what should happen according to the scripts created prior to movie production. Furthermore, we present and compare the results of several teams who participated in a challenge organized in the context of the workshop "Describing and Understanding Video & The Large Scale Movie Description Challenge (LSMDC)", at ICCV 2015

    Commodity-based Freight Activity on Inland Waterways through the Fusion of Public Datasets for Multimodal Transportation Planning

    Within the U.S., the 18.6 billion tons of goods currently moved along the multimodal transportation system are expected to grow 51% by 2045. Most of those goods are transported by roadways. However, several benefits can be realized by shippers and consumers by shifting freight to more efficient modes, such as inland waterways, or adopting a multimodal scheme. To support such freight growth sustainably and efficiently, federal legislation calls for the development of plans, methods, and tools to identify and prioritize future multimodal transportation infrastructure needs. However, given the historical mode-specific approach to freight data collection, analysis, and modeling, challenges remain to adopt a fully multimodal approach that integrates underrepresented modes, such as waterways, into multimodal forecasting tools to identify and prioritize transportation infrastructure needs. Examples of such challenges are data heterogeneity, confidentiality, limitations in terms of spatial and temporal coverage, high cost associated with data collection, subjectivity in survey responses, etc. To overcome these challenges, this work fuses data across a variety of novel transportation sources to close existing gaps in freight data needed to support multimodal long-range freight planning. In particular, the objective of this work is to develop methods to allow integration of inland waterway transportation into commodity-based freight forecasting models, by leveraging Automatic Identification System (AIS) data. The following approaches are presented in this dissertation: i) Maritime Automatic Identification System (AIS) data is mapped to a detailed inland navigable waterway network, allowing for an improved representation of waterway modes in multimodal freight travel demand models, which currently suffer from unbalanced representation of waterways. Validation results show the model correctly identifies 84% of stops at inland waterway ports and 83.5% of trips crossing locks. ii) AIS and truck Global Positioning System (GPS) data are fused to a multimodal network to identify the area of impact of a freight investment, providing a single methodology and data source to compare and contrast diverse transportation infrastructure investments. This method identifies parallel truck and vessel flows indicating potential for modal shift. iii) Truck GPS and maritime Lock Performance Monitoring System (LPMS) data are fused via a multi-commodity assignment model to characterize and quantify annual commodity throughput at port terminals on inland waterways, generating new data from public datasets, to support estimation of commodity-based freight fluidity performance measures. Results show that 84% of ports had less than a 20% difference between estimated and observed truck volumes. iv) AIS, LPMS, and truck GPS datasets are fused to disaggregate estimated annual commodity port throughput to vessel trips on inland waterways. Vessel trips characterized by port of origin, destination, path, timestamp, and commodity carried are mapped to a detailed inland waterway network, allowing for a detailed commodity flow analysis, previously unavailable in the public domain. The novel, repeatable, data-driven methods and models proposed in this work are applied to the 43 freight port terminals located on the Arkansas River. These models help to evaluate network performance, identify and prioritize multimodal freight transportation infrastructure needs, and introduce a unique focus on modal shift towards inland waterway transportation