323,132 research outputs found
Human Mobility Question Answering (Vision Paper)
Question answering (QA) systems have attracted much attention from the
artificial intelligence community as they can learn to answer questions based
on the given knowledge source (e.g., images in visual question answering).
However, research into question answering systems for human mobility data
remains unexplored. Mining human mobility data is crucial for various
applications such as smart city planning, pandemic management, and personalised
recommendation systems. In this paper, we aim to address this gap and introduce a
novel task, that is, human mobility question answering (MobQA). The aim of the
task is to let the intelligent system learn from mobility data and answer
related questions. This task represents a paradigm shift in mobility
prediction research and further facilitates the research of human mobility
recommendation systems. To better support this novel research topic, this
vision paper also proposes an initial design of the dataset and a potential
deep learning model framework for the introduced MobQA task. We hope that this
paper will provide novel insights and open new directions in human mobility
research and question answering research.
A Dataset and Baselines for Visual Question Answering on Art
Answering questions related to art pieces (paintings) is a difficult task, as
it implies the understanding of not only the visual information that is shown
in the picture, but also the contextual knowledge that is acquired through the
study of the history of art. In this work, we introduce our first attempt
towards building a new dataset, coined AQUA (Art QUestion Answering). The
question-answer (QA) pairs are automatically generated using state-of-the-art
question generation methods based on paintings and comments provided in an
existing art understanding dataset. The QA pairs are cleansed by crowdsourcing
workers with respect to their grammatical correctness, answerability, and
answers' correctness. Our dataset inherently consists of visual
(painting-based) and knowledge (comment-based) questions. We also present a
two-branch model as baseline, where the visual and knowledge questions are
handled independently. We extensively compare our baseline model against the
state-of-the-art models for question answering, and we provide a comprehensive
study about the challenges and potential future directions for visual question
answering on art.
Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change
Pirá is a reading comprehension dataset focused on the ocean, the Brazilian
coast, and climate change, built from a collection of scientific abstracts and
reports on these topics. This dataset represents a versatile language resource,
particularly useful for testing the ability of current machine learning models
to acquire expert scientific knowledge. Despite its potential, a detailed set
of baselines has not yet been developed for Pirá. By creating these
baselines, researchers can more easily utilize Pirá as a resource for testing
machine learning models across a wide range of question answering tasks. In
this paper, we define six benchmarks over the Pirá dataset, covering closed
generative question answering, machine reading comprehension, information
retrieval, open question answering, answer triggering, and multiple choice
question answering. As part of this effort, we have also produced a curated
version of the original dataset, where we fixed a number of grammar issues,
repetitions, and other shortcomings. Furthermore, the dataset has been extended
in several new directions to support the aforementioned benchmarks:
translation of supporting texts from English into Portuguese, classification
labels for answerability, automatic paraphrases of questions and answers, and
multiple choice candidates. The results described in this paper provide several
points of reference for researchers interested in exploring the challenges
provided by the Pirá dataset. Comment: Accepted at Data Intelligence. Online ISSN 2641-435
Scalable Methodologies and Analyses for Modality Bias and Feature Exploitation in Language-Vision Multimodal Deep Learning
Multimodal machine learning benchmarks have exponentially grown in both capability and popularity over the last decade. Language-vision question-answering tasks such as Visual Question Answering (VQA) and Video Question Answering (video-QA) have, thanks to their high difficulty, become a particularly popular means through which to develop and test new modelling designs and methodology for multimodal deep learning. The challenging nature of VQA and video-QA tasks leaves plenty of room for innovation at every component of the deep learning pipeline: from dataset to modelling methodology. Such circumstances are ideal for innovating in the space of language-vision multimodality. Furthermore, the wider field is currently undergoing an incredible period of growth and increasing interest. I therefore aim to contribute to multiple key components of the VQA and video-QA pipeline, but specifically in a manner such that my contributions remain relevant, ‘scaling’ with the revolutionary new benchmark models and datasets of the near future instead of being rendered obsolete by them. The work in this thesis: highlights and explores the disruptive and problematic presence of language bias in the popular TVQA video-QA dataset, and proposes a dataset-invariant method to identify subsets that respond to different modalities; thoroughly explores the suitability of bilinear pooling as a language-vision fusion technique in video-QA, offering experimental and theoretical insight, and highlighting the parallels in multimodal processing with neurological theories; explores the nascent visual equivalent of language modelling (‘visual modelling’) in order to boost the power of visual features; and proposes a dataset-invariant neurolinguistically-inspired labelling scheme for use in multimodal question-answering. I explore the positive and negative results that my experiments across this thesis yield.
I conclude by discussing the limitations of my contributions and proposing future directions of study in the areas I contribute to.
Grounded and Consistent Question Answering
This thesis describes advancements in question answering along three general directions: model architecture extensions, explainable question answering, and data augmentation.
Chapter 2 describes the first state-of-the-art model for the Natural Questions dataset based on pretrained transformers. Chapters 3 and 4 describe extensions to the model architecture designed to accommodate long textual inputs and multimodal text+image inputs, establishing new state-of-the-art results on the Natural Questions and VCR datasets.
Chapter 5 shows that significant improvements can be obtained with data augmentation on the SQuAD and Natural Questions datasets, introducing roundtrip consistency as a simple heuristic to improve the quality of synthetic data. In Chapters 6 and 7 we explore explainable question answering, demonstrating the usefulness of a new concrete kind of structured explanation, QED, and proposing a semantic analysis of why-questions in Natural Questions as a way of better understanding the nature of real-world explanations.
Finally, in Chapters 8 and 9 we delve into more exploratory data augmentation techniques for question answering. We look respectively at how straight-through gradients can be utilized to optimize roundtrip consistency in a pipeline of models on the fly, and at how very recent large language models like PaLM can be used to generate synthetic question answering datasets for new languages given as few as five representative examples per language.
Thread-level information for comment classification in community question answering
Community Question Answering (cQA) is a new application of QA in social contexts (e.g., fora). It presents interesting new challenges and research directions, e.g., exploiting the dependencies between the different comments of a thread to select the best answer for a given question. In this paper, we explored two ways of modeling such dependencies: (i) by designing specific features looking globally at the thread; and (ii) by applying structured prediction models. We trained and evaluated our models on data from SemEval-2015 Task 3 on Answer Selection in cQA. Our experiments show that: (i) the thread-level features consistently improve the performance for a variety of machine learning models, yielding state-of-the-art results; and (ii) sequential dependencies between the answer labels captured by structured prediction models are not enough to improve the results, indicating that more information is needed in the joint model.
SUSTAINABLE IT-SPECIFIC HUMAN CAPITAL: COPING WITH THE THREAT OF PROFESSIONAL OBSOLESCENCE
This study contributes to research examining how IT professionals cope with the threat of professional obsolescence. In answering this question, this study draws on theories of occupational stress, specifically the theory of conservation of resources (Hobfoll 2002; Hobfoll and Freedy 1993), to relate the threat of professional obsolescence with IT professionals’ coping behaviors. This study extends the theory of conservation of resources in several directions, namely by theorizing and testing the job mobility intentions of turnover and turnaway as consequences, and by proposing organizational updating climate as a proximal contextual moderating factor. The results obtained from a large sample of IT professionals are both consistent with and contrary to theorized relationships. We also uncover several new findings pertaining to the role played by organizational updating climate and its potential limit in supporting updating activities of IT professionals. We conclude this study with a discussion of the results and propose future research directions.