Developing a dataset for evaluating approaches for document expansion with images
Motivated by the adage that a "picture is worth a thousand words", it can be reasoned that automatically enriching the textual content of
a document with relevant images can increase the readability of a document. Moreover, features extracted from the additional image
data inserted into the textual content of a document may, in principle, also be used by a retrieval engine to better match the topic of a
document with that of a given query. In this paper, we describe our approach of building a ground truth dataset to enable further research
into automatic addition of relevant images to text documents. The dataset is comprised of the official ImageCLEF 2010 collection (a
collection of images with textual metadata) to serve as the images available for automatic enrichment of text, a set of 25 benchmark
documents that are to be enriched, which in this case are children's short stories, and a set of manually judged relevant images for each
query story obtained by the standard procedure of depth pooling. We use this benchmark dataset to evaluate the effectiveness of standard
information retrieval methods as simple baselines for this task. The results indicate that using the whole story as a weighted query,
where the weight of each query term is its tf-idf value, achieves a precision of 0.1714 within the top 5 retrieved images on average
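The baseline described above, a whole story used as a tf-idf weighted query and scored by precision within the top 5 results, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the whitespace tokenizer, the unsmoothed `log(N/df)` idf, the dot-product scoring, and all function names and toy data are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not specified here.
    return text.lower().split()


def idf_table(docs):
    # docs maps an image id to its textual metadata.
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    # Assumption: plain log(N / df) without smoothing.
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_query_weights(story, idf):
    # Each query term is weighted by its term frequency in the story times its idf.
    tf = Counter(tokenize(story))
    return {term: tf[term] * idf[term] for term in tf if term in idf}


def rank_images(story, docs):
    idf = idf_table(docs)
    weights = tfidf_query_weights(story, idf)
    scored = []
    for doc_id, text in docs.items():
        doc_tf = Counter(tokenize(text))
        score = sum(w * doc_tf[term] for term, w in weights.items())
        scored.append((score, doc_id))
    # Highest score first; ties broken by id so the ranking is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def precision_at_k(ranked, relevant, k=5):
    # Fraction of the top-k retrieved ids that were judged relevant.
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


# Hypothetical toy collection and story, for illustration only.
images = {"img1": "red fox forest", "img2": "blue sea boat", "img3": "fox den"}
ranking = rank_images("the fox ran into the forest", images)
p_at_2 = precision_at_k(ranking, {"img1"}, k=2)
```

Averaging `precision_at_k(..., k=5)` over all query stories would give a figure comparable in kind to the one reported above, though the actual value depends on the retrieval model and collection.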
Text-to-picture tools, systems, and approaches: a survey
Text-to-picture systems attempt to facilitate high-level, user-friendly communication between humans and computers while promoting understanding of natural language. These systems interpret a natural language text and transform it into a visual format as pictures or images that are either static or dynamic. In this paper, we aim to identify current difficulties and the main problems faced by prior systems, and in particular, we seek to investigate the feasibility of automatic visualization of Arabic story text through multimedia. Hence, we analyzed a number of well-known text-to-picture systems, tools, and approaches. We showed their constituent steps, such as knowledge extraction, mapping, and image layout, as well as their performance and limitations. We also compared these systems based on a set of criteria, mainly natural language processing, natural language understanding, and input/output modalities. Our survey showed that currently emerging techniques in natural language processing tools and computer vision have made promising advances in analyzing general text and understanding images and videos. Furthermore, important remarks and findings have been deduced from these prior works, which would help in developing an effective text-to-picture system for learning and educational purposes. © 2019, The Author(s). This work was made possible by NPRP grant #10-0205-170346 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors
Supporting Story Synthesis: Bridging the Gap between Visual Analytics and Storytelling
Visual analytics usually deals with complex data and uses sophisticated algorithmic, visual, and interactive techniques. Findings of the analysis often need to be communicated to an audience that lacks visual analytics expertise. This requires analysis outcomes to be presented in simpler ways than those typically used in visual analytics systems. However, not only the analytical visualizations but also the information that needs to be presented may be too complex for the target audience. Hence, there exists a gap on the path from obtaining analysis findings to communicating them, which involves two aspects: information and display complexity. We propose a general framework where data analysis and result presentation are linked by story synthesis, in which the analyst creates and organizes story contents. Unlike previous research, where analytic findings are represented by stored display states, we treat findings as data constructs. In story synthesis, findings are selected, assembled, and arranged in views using meaningful layouts that take into account the structure of information and inherent properties of its components. We propose a workflow for applying the proposed framework in designing visual analytics systems and demonstrate the generality of the approach by applying it to two domains: social media and movement analysis
An automated pipeline for the discovery of conspiracy and conspiracy theory narrative frameworks: Bridgegate, Pizzagate and storytelling on the web
Although a great deal of attention has been paid to how conspiracy theories
circulate on social media and their factual counterpart conspiracies, there has
been little computational work done on describing their narrative structures.
We present an automated pipeline for the discovery and description of the
generative narrative frameworks of conspiracy theories on social media, and
actual conspiracies reported in the news media. We base this work on two
separate repositories of posts and news articles describing the well-known
conspiracy theory Pizzagate from 2016, and the New Jersey conspiracy Bridgegate
from 2013. We formulate a graphical generative machine learning model where
nodes represent actors/actants, and multi-edges and self-loops among nodes
capture context-specific relationships. Posts and news items are viewed as
samples of subgraphs of the hidden narrative network. The problem of
reconstructing the underlying structure is posed as a latent model estimation
problem. We automatically extract and aggregate the actants and their
relationships from the posts and articles. We capture context specific actants
and interactant relationships by developing a system of supernodes and
subnodes. We use these to construct a network, which constitutes the underlying
narrative framework. We show how the Pizzagate framework relies on the
conspiracy theorists' interpretation of "hidden knowledge" to link otherwise
unlinked domains of human interaction, and hypothesize that this multi-domain
focus is an important feature of conspiracy theories. While Pizzagate relies on
the alignment of multiple domains, Bridgegate remains firmly rooted in the
single domain of New Jersey politics. We hypothesize that the narrative
framework of a conspiracy theory might stabilize quickly in contrast to the
narrative framework of an actual one, which may develop more slowly as
revelations come to light. Comment: conspiracy theory, narrative structur
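The idea that each post or news item is a sample of a subgraph of a hidden narrative network can be illustrated with a small aggregation sketch. It does not implement the latent model estimation the abstract describes; it only counts how often each (actant, relationship, actant) triple appears across posts, keeping multi-edges and self-loops. The triple format, function names, and placeholder actants are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_narrative_graph(posts):
    # Each post is a list of (source actant, relationship, target actant) triples,
    # assumed to have been extracted beforehand. Each post is treated as one
    # sampled subgraph, so a triple counts at most once per post.
    edge_counts = defaultdict(int)
    for triples in posts:
        for source, relation, target in set(triples):
            edge_counts[(source, relation, target)] += 1
    return dict(edge_counts)


def actants(graph):
    # All nodes that appear at either end of an edge.
    return sorted({node for (source, _, target) in graph for node in (source, target)})


def parallel_edges(graph, source, target):
    # Distinct relationship labels between the same ordered pair form a multi-edge.
    return sorted(rel for (s, rel, t) in graph if s == source and t == target)


# Hypothetical placeholder triples, for illustration only.
posts = [
    [("A", "meets", "B"), ("A", "emails", "B")],
    [("A", "meets", "B"), ("B", "promotes", "B")],
]
graph = aggregate_narrative_graph(posts)
```

Edges seen in many posts are stronger candidates for the underlying framework than those seen once, which is the intuition behind treating posts as noisy samples of a single network.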
Foundations of Human-Aware Planning -- A Tale of Three Models
A critical challenge in the design of AI systems that operate with humans in the loop is to be able to model the intentions and capabilities of the humans, as well as their beliefs and expectations of the AI system itself. This allows the AI system to be "human-aware" -- i.e. the human task model enables it to envisage desired roles of the human in joint action, while the human mental model allows it to anticipate how its own actions are perceived from the point of view of the human. In my research, I explore how these concepts of human-awareness manifest themselves in the scope of planning or sequential decision making with humans in the loop. To this end, I will show (1) how the AI agent can leverage the human task model to generate symbiotic behavior; and (2) how the introduction of the human mental model in the deliberative process of the AI agent allows it to generate explanations for a plan or resort to explicable plans when explanations are not desired. The latter is in addition to traditional notions of human-aware planning which typically use the human task model alone and thus enables a new suite of capabilities of a human-aware AI agent. Finally, I will explore how the AI agent can leverage emerging mixed-reality interfaces to realize effective channels of communication with the human in the loop. Dissertation/Thesis Doctoral Dissertation Computer Science 201