Generating multiple summaries based on computational model of perspective
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (leaves 87-92). Every story about an event offers a unique perspective on that event. A popular sporting event, such as a Major League Baseball game, is followed by several summary articles that show different points of view. The goal of this research is to build a computational model of perspective and a system for automatically generating multiple summary articles showing different perspectives. My approach is to take a neutral summary article, reorder its content based on event features extracted from the description of the game, and produce two new summaries showing the local team perspectives. I will present an initial user survey that validated the hypothesis that content ordering has a significant effect on users' perception of perspective. I will also discuss collecting and analyzing a parallel corpus of baseball game data and summary articles showing local team perspectives. I will then describe the reordering algorithm, the implementation of the system, and a user study to evaluate the output of the system. by Alice H. Oh. Ph.D.
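The content-reordering idea can be illustrated with a minimal sketch. The scoring function and sentences below are hypothetical stand-ins, not the author's actual algorithm or data: sentences from a neutral summary are scored by relevance to one team and re-sorted so that team-relevant content leads.

```python
def reorder_for_perspective(sentences, team, score):
    """Stable-sort summary sentences so those most relevant to
    `team` (per the supplied scoring function) come first."""
    return sorted(sentences, key=lambda s: -score(s, team))

# Toy scoring: count mentions of the team name. This stands in for
# the event features extracted from the game description.
def mention_score(sentence, team):
    return sentence.lower().count(team.lower())

neutral = [
    "Attendance was 35,000 on a cold night.",
    "The Red Sox rallied in the ninth inning.",
    "The Yankees' starter struck out ten.",
]
red_sox_view = reorder_for_perspective(neutral, "Red Sox", mention_score)
```

Because the sort is stable, sentences with equal scores keep their original (neutral) order, so only team-relevant content is promoted.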
TRANSLATING VISUALIZATION INTERACTION INTO NATURAL LANGUAGE
Richly interactive visualization tools are increasingly popular for data exploration and analysis in a wide variety of domains.
Recent advances in data collection and storage call for more complex analytical tasks to make sense of readily available datasets, and more sophisticated tools are needed to complete those tasks. However, as these visualization tools grow more complex, it becomes increasingly difficult to learn interaction sequences, recall past queries asked of a visualization, and correctly interpret visual states while foraging the data. Moreover, the high interactivity of such tools makes it harder to connect low-level acquired information to the higher-level analytical questions and hypotheses needed to support, reason about, and eventually present insights. This makes studying the usability of complex interactive visualizations, both while foraging and while making sense of data, an essential part of visual analytics research. This research can be approached in at least two major ways. One can study new techniques and guidelines for designing complex interactive visualizations that are easy to use and understand. Alternatively, one can keep the capabilities of existing complex visualizations but provide supporting capabilities that increase their usability. The latter is an emerging area of research in visual analytics, and it is the focus of this dissertation.
This dissertation describes six contributions to the field of visual analytics. The first contribution is an architecture of a query-to-question supporting system that automatically records user interactions and presents them contextually using natural written language. The architecture takes into account the domain knowledge of experts/designers and uses natural language generation (NLG) techniques to translate and transcribe a progression of interactive visualization states into a log of text that can be visualized.
The second contribution is query-to-question (Q2Q), an implemented system that translates low-level user interactions into high-level analytical questions and presents them as a log of styled text that complements and effectively extends the functionality of visualization tools.
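The translation step Q2Q performs can be sketched minimally. The event names and question templates here are hypothetical illustrations, not Q2Q's actual vocabulary, which is supplied by the visualization designer:

```python
# Hypothetical interaction-to-question templates; in Q2Q the
# mapping encodes the designer's domain knowledge.
TEMPLATES = {
    "filter": "Which records have {field} = {value}?",
    "sort":   "How do the records rank by {field}?",
    "zoom":   "What happens in the {field} range {value}?",
}

def translate(event):
    """Render one logged low-level interaction event as a
    high-level analytical question."""
    template = TEMPLATES[event["action"]]
    return template.format(field=event["field"], value=event["value"])

log = [
    {"action": "filter", "field": "year", "value": 2008},
    {"action": "sort", "field": "price", "value": None},
]
questions = [translate(e) for e in log]
```

Running the full interaction log through such a mapping yields the styled textual log that accompanies the visualization.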
The third contribution is a demonstration that accompanying a visualization with a textual translation of user interaction improves the usability of the visualization. The presence of the translation interface produces considerable improvements in learnability, efficiency, and memorability, measured by speed and the length of interaction sequences that users perform, along with a modest decrease in error ratio.
The fourth contribution is a set of design guidelines for translating user interactions into natural language, taking into account variation in user knowledge and roles, the types of data being visualized, and the types of interaction supported.
The fifth contribution is a history organizer interface that enables users to organize their analytical process. The structured textual translations output from Q2Q are input into a history organizer tool (HOT) that imposes reordering, sequencing, and grouping of the translated interactions. HOT provides a reasoning framework for users to organize and present hypotheses and insight acquired from a visualization.
The sixth contribution is a demonstration of the efficiency of a suite of arrangement options for organizing questions asked in a visualization. Integration of query translation and history organization improves users' speed, error ratio, and the number of reordering actions performed while organizing translated interactions. Overall, this dissertation contributes to the analysis and discovery of user storytelling patterns and behaviours, thereby paving the way to the creation of more intelligent, effective, and user-oriented visual analysis presentation tools.
Automatic Image Captioning with Style
This thesis connects two core topics in machine learning, vision
and language. The problem of choice is image caption generation:
automatically constructing natural language descriptions of image
content. Previous research into image caption generation has
focused on generating purely descriptive captions; I focus on
generating visually relevant captions with a distinct linguistic
style. Captions with style have the potential to ease
communication and add a new layer of personalisation.
First, I consider naming variations in image captions, and
propose a method for predicting context-dependent names that
takes into account visual and linguistic information. This method
makes use of a large-scale image caption dataset, which I also
use to explore naming conventions and report naming conventions
for hundreds of animal classes. Next I propose the SentiCap
model, which relies on recent advances in artificial neural
networks to generate visually relevant image captions with
positive or negative sentiment. To balance descriptiveness and
sentiment, the SentiCap model dynamically switches between two
recurrent neural networks, one tuned for descriptive words and
one for sentiment words. As the first published model for
generating captions with sentiment, SentiCap has influenced a
number of subsequent works. I then investigate the sub-task of
modelling styled sentences without images. The specific task
chosen is sentence simplification: rewriting news article
sentences to make them easier to understand.
For this task I design a neural sequence-to-sequence model that can work with
limited training data, using novel adaptations for word copying and sharing
word embeddings. Finally, I present SemStyle, a system for generating visually
relevant image captions in the style of an arbitrary text corpus. A shared
term space allows a neural network for vision and content planning to
communicate with a network for styled language generation. SemStyle achieves
competitive results in human and automatic evaluations of descriptiveness and
style.
As a whole, this thesis presents two complete systems for styled
caption generation that are first of their kind and demonstrate,
for the first time, that automatic style transfer for image
captions is achievable. Contributions also include novel ideas
for object naming and sentence simplification. This thesis opens
up inquiries into highly personalised image captions; large scale
visually grounded concept naming; and more generally, styled text
generation with content control.
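The switching mechanism described above for SentiCap can be sketched abstractly. The word lists, gate, and probabilities below are toy stand-ins, not the trained model: in the real system both streams and the gate are recurrent neural networks conditioned on the image and the generated prefix.

```python
import random

def generate_switched(descriptive_words, sentiment_words,
                      switch_prob, n, seed=0):
    """Toy stand-in for SentiCap's mechanism: at each step a gate
    chooses whether the next word comes from the descriptive
    stream or the sentiment stream."""
    rng = random.Random(seed)  # seeded for reproducibility
    caption = []
    for _ in range(n):
        stream = (sentiment_words if rng.random() < switch_prob
                  else descriptive_words)
        caption.append(rng.choice(stream))
    return caption

words = generate_switched(
    ["dog", "park", "running"], ["happy", "lovely"],
    switch_prob=0.4, n=5)
```

The key design point the sketch captures is that descriptiveness and sentiment are balanced per word, not per caption, so sentiment words can be interleaved without displacing visually grounded content.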
Generating Natural Language Summaries from Multiple On-Line Sources: Language Reuse and Regeneration
The abundance of news wire on the World-Wide Web has resulted in at least four major problems, which seem to present the most interesting challenges to users and researchers alike: size, heterogeneity, change, and disagreement. Size: several hundred newspapers and news agencies maintain their Web sites, with thousands of news stories in each. Heterogeneity: some of the data related to news is in structured format (e.g., tables); more exists in semi-structured format (e.g., Web pages, encyclopedias, textual databases); while the rest is in textual form (e.g., newswire). Change: most Web sites, and certainly all news sources, change on a daily basis. Disagreement: different sources present conflicting, or at least different, views of the same event. We have approached the second, third, and fourth of these problems from the point of view of text generation. We have developed a system, SUMMONS, which, when coupled with appropriate information extraction technology, generates a specific genre of natural language summaries of a particular event (which we call briefings) in a restricted domain. The briefings are concise; they contain facts from multiple and heterogeneous sources, incorporate evolving information, and highlight agreements and contradictions among sources on the same topic. We have developed novel techniques and algorithms for combining data from multiple sources at the conceptual level (using natural language understanding), for identifying new information on a given topic, and for presenting the information in natural language form to the user. We have named the framework that we developed for these problems language reuse and regeneration (LRR). Its novelty lies in the ability to produce text by collating together text already written by humans on the Web.
The main features of LRR are increased robustness through a simplified parsing/generation component, leverage of text already written by humans, and facilities for the inclusion of structured data in computer-generated text. The present thesis contains an introduction to LRR and its use in multi-document summarization. We have paid special attention to the techniques for producing conceptual summaries of multiple sources, to the creation and use of an LRR-based lexicon for text generation, to a methodology used to identify new and old information in threads of documents, and to the generation of fluent natural language text using all the components above. The thesis contains evaluations of the different components of SUMMONS as well as certain aspects of LRR as a methodology. A review of the relevant literature is included as a separate chapter.
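The source-combination step can be sketched minimally. The flat slot/value fact representation below is a hypothetical simplification; SUMMONS operates on the conceptual output of information-extraction systems. Facts about the same slot from multiple sources are compared, and agreements versus contradictions are labeled for the briefing:

```python
from collections import defaultdict

def compare_sources(reports):
    """Group per-source values by slot and label each slot as an
    agreement (all sources concur) or a contradiction."""
    slots = defaultdict(dict)
    for source, facts in reports.items():
        for slot, value in facts.items():
            slots[slot][source] = value
    result = {}
    for slot, by_source in slots.items():
        label = "agreement" if len(set(by_source.values())) == 1 else "contradiction"
        result[slot] = (label, by_source)
    return result

# Toy example: two wire services report on the same event.
reports = {
    "Reuters": {"casualties": 4, "location": "Miami"},
    "AP":      {"casualties": 5, "location": "Miami"},
}
status = compare_sources(reports)
```

A generation component would then verbalize contradicted slots with source attribution ("Reuters reported four casualties, while AP reported five") and agreed slots directly.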