ChartCheck: An Evidence-Based Fact-Checking Dataset over Real-World Chart Images
Data visualizations are common in the real world. They appear in sources such as scientific documents, news articles, textbooks, and social media to summarize key information in visual form. However, charts can also mislead their audience by communicating false information or biasing them towards a specific agenda. Verifying claims against charts is not a straightforward process: it requires analyzing both the textual and visual components of the chart, considering characteristics such as colors, positions, and orientations. Moreover, determining whether a claim is supported by the chart content often requires different types of reasoning. To address this challenge, we introduce ChartCheck, a novel dataset for fact-checking against chart images. ChartCheck is the first large-scale dataset of its kind, with 1.7k real-world charts and 10.5k human-written claims and explanations. We evaluated state-of-the-art models on the dataset and achieved an accuracy of 73.9 in the finetuned setting. Additionally, we identified chart characteristics and reasoning types that challenge the models.
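To make the task format concrete, the following is a minimal sketch of a verdict-accuracy evaluation over such a dataset. The record fields and the `predict_verdict` interface are hypothetical illustrations, not ChartCheck's actual schema or the authors' code.

```python
from dataclasses import dataclass

@dataclass
class ChartClaim:
    """One hypothetical chart fact-checking example (illustrative only)."""
    chart_image_path: str  # path to the chart image
    claim: str             # human-written claim about the chart
    label: str             # gold verdict, e.g. "supported" or "refuted"
    explanation: str       # human-written rationale for the verdict

def verdict_accuracy(model, examples: list[ChartClaim]) -> float:
    """Fraction of claims whose predicted verdict matches the gold label.

    `model.predict_verdict(image_path, claim) -> str` is an assumed
    interface standing in for any chart fact-checking model.
    """
    correct = sum(
        model.predict_verdict(ex.chart_image_path, ex.claim) == ex.label
        for ex in examples
    )
    return correct / len(examples)
```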
Advancing Multi-Modal Deep Learning: Towards Language-Grounded Visual Understanding
Using deep learning, computer vision now rivals people at object recognition and detection, opening doors to new challenges in image understanding. Among these challenges, understanding and reasoning about language-grounded visual content is of fundamental importance to advancing artificial intelligence. Recently, multiple datasets and algorithms have been created as proxy tasks towards this goal, with visual question answering (VQA) being the most widely studied. In VQA, an algorithm needs to produce an answer to a natural language question about an image. However, our survey of datasets and algorithms for VQA uncovered several sources of dataset bias and sub-optimal evaluation metrics that allowed algorithms to perform well by merely exploiting superficial statistical patterns. In this dissertation, we describe new algorithms and datasets that address these issues. We developed two new datasets and evaluation metrics that enable a more accurate measurement of the abilities of a VQA model, and also expand VQA to include new abilities, such as reading text, handling out-of-vocabulary words, and understanding data visualizations. We also created new algorithms for VQA that have helped advance the state of the art, including an algorithm that surpasses humans on two different chart question answering datasets about bar charts, line graphs, and pie charts. Finally, we provide a holistic overview of several yet-unsolved challenges in not only VQA but vision and language research at large. Despite enormous progress, we find that a robust understanding and integration of vision and language is still an elusive goal, and much of the progress may be misleading due to dataset bias, superficial correlations, and flaws in standard evaluation metrics. We carefully study and categorize these issues for several vision and language tasks and outline several possible paths towards the development of safe, robust, and trustworthy AI for language-grounded visual understanding.
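Because the abstract repeatedly points to flawed evaluation metrics, it may help to see the standard VQA accuracy metric most of this work builds on: each question comes with ten human answers, and a prediction scores min(matches/3, 1). This is the widely used benchmark metric, not necessarily the improved metrics the dissertation proposes, and the simplified version below skips the official averaging over annotator subsets.

```python
def vqa_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Simplified standard VQA accuracy: an answer counts as fully
    correct if at least 3 of the (typically 10) annotators gave it.

        Acc(a) = min(#annotators who answered a / 3, 1)
    """
    matches = sum(1 for ans in human_answers if ans == predicted)
    return min(matches / 3.0, 1.0)

# Example: 4 of 10 annotators said "blue", so "blue" scores 1.0,
# while "navy" (given by only 1 annotator) scores about 0.33.
answers = ["blue"] * 4 + ["dark blue"] * 5 + ["navy"]
assert vqa_accuracy("blue", answers) == 1.0
```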
Introducing Implicit Bias: Why this Book Matters
Written by a diverse range of scholars, this accessible introductory volume asks: What is implicit bias? How does implicit bias compromise our knowledge of others and social reality? How does implicit bias affect us, as individuals and participants in larger social and political institutions, and what can we do to combat biases? An interdisciplinary enterprise, the volume brings together the philosophical perspective of the humanities with the perspective of the social sciences to develop rich lines of inquiry. It is written in a non-technical style, using relatable examples that help readers understand what implicit bias is, its significance, and the controversies surrounding it. Each chapter includes discussion questions and additional reading suggestions. A companion webpage contains teaching resources. The volume will be an invaluable resource for students and researchers seeking to understand criticisms surrounding implicit bias, as well as how one might answer them by adopting a more nuanced understanding of bias and its role in maintaining social injustice.
Chatbot-Based Natural Language Interfaces for Data Visualisation: A Scoping Review
Rapid growth in the generation of data from various sources has made data visualisation a valuable tool for analysing data. However, visual analysis can be a challenging task, not only due to intricate dashboards but also when dealing with complex and multidimensional data. In this context, advances in Natural Language Processing technologies have led to the development of Visualisation-oriented Natural Language Interfaces (V-NLIs). In this paper, we carry out a scoping review that analyses synergies between the fields of Data Visualisation and Natural Language Interaction. Specifically, we focus on chatbot-based V-NLI approaches and explore and discuss three research questions. The first two research questions study how chatbot-based V-NLIs contribute to interactions with the Data and Visual Spaces of the visualisation pipeline, while the third examines how chatbot-based V-NLIs enhance users' interaction with visualisations. Our findings show that the works in the literature put a strong focus on exploring tabular data with basic visualisations, with visual mapping primarily reliant on fixed layouts. Moreover, V-NLIs provide users with restricted guidance strategies, and few of them support high-level and follow-up queries. We identify challenges and possible research opportunities for the V-NLI community, such as supporting high-level queries with complex data; integrating V-NLIs with more advanced systems such as Augmented Reality (AR) or Virtual Reality (VR), particularly for advanced visualisations; expanding guidance strategies beyond current limitations; adopting intelligent visual mapping techniques; and incorporating more sophisticated interaction methods.
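As a concrete illustration of the Data Space to Visual Space mapping these systems perform, here is a toy sketch that turns a natural-language query into a declarative chart specification. The rule-based parsing and the Vega-Lite-style output are assumptions chosen for illustration, not a system from the review; real V-NLIs use far richer NLP.

```python
def query_to_chart_spec(query: str, columns: list[str]) -> dict:
    """Toy V-NLI: map a natural-language query over tabular data to a
    Vega-Lite-style chart spec (illustrative rules, not a real system)."""
    q = query.lower()
    # Pick a mark type from trigger words in the query.
    if "trend" in q or "over time" in q:
        mark = "line"
    elif "compare" in q or "distribution" in q:
        mark = "bar"
    else:
        mark = "point"
    # Naively bind the first two columns mentioned in the query to the
    # x and y channels, falling back to the table's column order.
    mentioned = [c for c in columns if c.lower() in q]
    x, y = (mentioned + [c for c in columns if c not in mentioned])[:2]
    return {"mark": mark, "encoding": {"x": {"field": x}, "y": {"field": y}}}

# Example usage:
spec = query_to_chart_spec("show the trend of sales by date", ["date", "sales", "region"])
# -> {'mark': 'line', 'encoding': {'x': {'field': 'date'}, 'y': {'field': 'sales'}}}
```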
Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs
Building cross-modal intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community. The capability to uncover the underlying table data of chart figures is key to automatic chart understanding. We introduce ChartT5, a V+L model that learns how to interpret table information from chart images via cross-modal pre-training on plot-table pairs. Specifically, we propose two novel pre-training objectives, Masked Header Prediction (MHP) and Masked Value Prediction (MVP), to equip the model with different skills for interpreting table information. We have conducted extensive experiments on chart question answering and chart summarization to verify the effectiveness of the proposed pre-training strategies. In particular, on the ChartQA benchmark, our ChartT5 outperforms state-of-the-art non-pretraining methods by over 8%.
Comment: Accepted by Findings of ACL 2023
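To convey the flavour of these objectives, the sketch below masks headers and cell values in a chart's underlying data table to form prediction targets. The masking scheme, rate, and data layout are illustrative assumptions; the paper's exact MHP/MVP formulation may differ.

```python
import random

MASK = "[MASK]"

def mask_table(columns: list[tuple[str, list[str]]], p: float = 0.15):
    """Build masked-table pre-training targets in the spirit of
    ChartT5's Masked Header Prediction (MHP) and Masked Value
    Prediction (MVP). Illustrative assumption, not the paper's recipe.

    `columns` is a list of (header, cell_values) pairs, e.g. the table
    underlying a chart. Returns the masked columns and the original
    strings the model must reconstruct.
    """
    masked, targets = [], []
    for header, values in columns:
        # MHP: occasionally mask the column header itself.
        if random.random() < p:
            targets.append(header)
            header = MASK
        # MVP: independently mask individual cell values.
        new_values = []
        for value in values:
            if random.random() < p:
                targets.append(value)
                new_values.append(MASK)
            else:
                new_values.append(value)
        masked.append((header, new_values))
    return masked, targets

# Example: a small table recovered from a bar chart.
cols = [("year", ["2019", "2020", "2021"]), ("sales", ["12", "18", "25"])]
masked_cols, targets = mask_table(cols, p=0.3)
```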