36,275 research outputs found
A methodology for the capture and analysis of hybrid data: a case study of program debugging
No description supplie
Language-Based Image Editing with Recurrent Attentive Models
We investigate the problem of Language-Based Image Editing (LBIE). Given a
source image and a natural language description, we want to generate a target
image by editing the source image based on the description. We propose a
generic modeling framework for two sub-tasks of LBIE: language-based image
segmentation and image colorization. The framework uses recurrent attentive
models to fuse image and language features. Instead of using a fixed step size,
we introduce for each region of the image a termination gate to dynamically
determine after each inference step whether to continue extrapolating
additional information from the textual description. The effectiveness of the
framework is validated on three datasets. First, we introduce a synthetic
dataset, called CoSaL, to evaluate the end-to-end performance of our LBIE
system. Second, we show that the framework leads to state-of-the-art
performance on image segmentation on the ReferIt dataset. Third, we present the
first language-based colorization result on the Oxford-102 Flowers dataset.Comment: Accepted to CVPR 2018 as a Spotligh
Interpretation of Natural Language Rules in Conversational Machine Reading
Most work in machine reading focuses on question answering problems where the
answer is directly expressed in the text to read. However, many real-world
question answering problems require the reading of text not because it contains
the literal answer, but because it contains a recipe to derive an answer
together with the reader's background knowledge. One example is the task of
interpreting regulations to answer "Can I...?" or "Do I have to...?" questions
such as "I am working in Canada. Do I have to carry on paying UK National
Insurance?" after reading a UK government website about this topic. This task
requires both the interpretation of rules and the application of background
knowledge. It is further complicated due to the fact that, in practice, most
questions are underspecified, and a human assistant will regularly have to ask
clarification questions such as "How long have you been working abroad?" when
the answer cannot be directly derived from the question and text. In this
paper, we formalise this task and develop a crowd-sourcing strategy to collect
32k task instances based on real-world rules and crowd-generated questions and
scenarios. We analyse the challenges of this task and assess its difficulty by
evaluating the performance of rule-based and machine-learning baselines. We
observe promising results when no background knowledge is necessary, and
substantial room for improvement whenever background knowledge is needed.Comment: EMNLP 201
Teaching Machines to Read and Comprehend
Teaching machines to read natural language documents remains an elusive
challenge. Machine reading systems can be tested on their ability to answer
questions posed on the contents of documents that they have seen, but until now
large scale training and test datasets have been missing for this type of
evaluation. In this work we define a new methodology that resolves this
bottleneck and provides large scale supervised reading comprehension data. This
allows us to develop a class of attention based deep neural networks that learn
to read real documents and answer complex questions with minimal prior
knowledge of language structure.Comment: Appears in: Advances in Neural Information Processing Systems 28
(NIPS 2015). 14 pages, 13 figure
Pathologies of Neural Models Make Interpretations Difficult
One way to interpret neural model predictions is to highlight the most
important input features---for example, a heatmap visualization over the words
in an input sentence. In existing interpretation methods for NLP, a word's
importance is determined by either input perturbation---measuring the decrease
in model confidence when that word is removed---or by the gradient with respect
to that word. To understand the limitations of these methods, we use input
reduction, which iteratively removes the least important word from the input.
This exposes pathological behaviors of neural models: the remaining words
appear nonsensical to humans and are not the ones determined as important by
interpretation methods. As we confirm with human experiments, the reduced
examples lack information to support the prediction of any label, but models
still make the same predictions with high confidence. To explain these
counterintuitive results, we draw connections to adversarial examples and
confidence calibration: pathological behaviors reveal difficulties in
interpreting neural models trained with maximum likelihood. To mitigate their
deficiencies, we fine-tune the models by encouraging high entropy outputs on
reduced examples. Fine-tuned models become more interpretable under input
reduction without accuracy loss on regular examples.Comment: EMNLP 2018 camera read
The Limited Effect of Graphic Elements in Video and Augmented Reality on Children’s Listening Comprehension
There is currently significant interest in the use of instructional strategies in learning environments thanks to the emergence of new multimedia systems that combine text, audio, graphics and video, such as augmented reality (AR). In this light, this study compares the effectiveness of AR and video for listening comprehension tasks. The sample consisted of thirty-two elementary school students with different reading comprehension. Firstly, the experience, instructions and objectives were introduced to all the students. Next, they were divided into two groups to perform activities—one group performed an activity involving watching an Educational Video Story of the Laika dog and her Space Journey available by mobile devices app Blue Planet Tales, while the other performed an activity involving the use of AR, whose contents of the same history were visualized by means of the app Augment Sales. Once the activities were completed participants answered a comprehension test. Results (p = 0.180) indicate there are no meaningful differences between the lesson format and test performance. But there are differences between the participants of the AR group according to their reading comprehension level. With respect to the time taken to perform the comprehension test, there is no significant difference between the two groups but there is a difference between participants with a high and low level of comprehension. To conclude SUS (System Usability Scale) questionnaire was used to establish the measure usability for the AR app on a smartphone. An average score of 77.5 out of 100 was obtained in this questionnaire, which indicates that the app has fairly good user-centered design
- …