ChartCheck: An Evidence-Based Fact-Checking Dataset over Real-World Chart Images
Data visualizations are common in the real world. They appear in sources such as scientific documents, news articles, textbooks, and social media to summarize key information in visual form. However, charts can also mislead their audience by communicating false information or biasing them towards a specific agenda. Verifying claims against charts is not a straightforward process: it requires analyzing both the textual and visual components of the chart, considering characteristics such as colors, positions, and orientations. Moreover, determining whether a claim is supported by the chart content often requires different types of reasoning. To address this challenge, we introduce ChartCheck, a novel dataset for fact-checking against chart images. ChartCheck is the first large-scale dataset of its kind, with 1.7k real-world charts and 10.5k human-written claims and explanations. We evaluated state-of-the-art models on the dataset and achieved an accuracy of 73.9 in the finetuned setting. Additionally, we identified chart characteristics and reasoning types that challenge the models.
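To make the task format concrete, the following is a minimal sketch of a verdict-accuracy evaluation over such a dataset. The record fields and the `predict_verdict` interface are hypothetical illustrations, not ChartCheck's actual schema or the authors' code.

```python
from dataclasses import dataclass

@dataclass
class ChartClaim:
    """One hypothetical chart fact-checking example (illustrative only)."""
    chart_image_path: str  # path to the chart image
    claim: str             # human-written claim about the chart
    label: str             # gold verdict, e.g. "supported" or "refuted"
    explanation: str       # human-written rationale for the verdict

def verdict_accuracy(model, examples: list[ChartClaim]) -> float:
    """Fraction of claims whose predicted verdict matches the gold label.

    `model.predict_verdict(image_path, claim) -> str` is an assumed
    interface standing in for any chart fact-checking model.
    """
    correct = sum(
        model.predict_verdict(ex.chart_image_path, ex.claim) == ex.label
        for ex in examples
    )
    return correct / len(examples)
```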
Advancing Multi-Modal Deep Learning: Towards Language-Grounded Visual Understanding
Using deep learning, computer vision now rivals people at object recognition and detection, opening doors to new challenges in image understanding. Among these challenges, understanding and reasoning about language-grounded visual content is of fundamental importance to advancing artificial intelligence. Recently, multiple datasets and algorithms have been created as proxy tasks towards this goal, with visual question answering (VQA) being the most widely studied. In VQA, an algorithm needs to produce an answer to a natural language question about an image. However, our survey of datasets and algorithms for VQA uncovered several sources of dataset bias and sub-optimal evaluation metrics that allowed algorithms to perform well by merely exploiting superficial statistical patterns. In this dissertation, we describe new algorithms and datasets that address these issues. We developed two new datasets and evaluation metrics that enable a more accurate measurement of the abilities of a VQA model, and also expand VQA to include new abilities, such as reading text, handling out-of-vocabulary words, and understanding data visualizations. We also created new algorithms for VQA that have helped advance the state of the art, including an algorithm that surpasses humans on two different chart question answering datasets about bar charts, line graphs, and pie charts. Finally, we provide a holistic overview of several yet-unsolved challenges in not only VQA but vision and language research at large. Despite enormous progress, we find that a robust understanding and integration of vision and language is still an elusive goal, and much of the progress may be misleading due to dataset bias, superficial correlations, and flaws in standard evaluation metrics. We carefully study and categorize these issues for several vision and language tasks and outline several possible paths towards the development of safe, robust, and trustworthy AI for language-grounded visual understanding.
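Because the abstract repeatedly points to flawed evaluation metrics, it may help to see the standard VQA accuracy metric most of this work builds on: each question comes with ten human answers, and a prediction scores min(matches/3, 1). This is the widely used benchmark metric, not necessarily the improved metrics the dissertation proposes, and the simplified version below skips the official averaging over annotator subsets.

```python
def vqa_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Simplified standard VQA accuracy: an answer counts as fully
    correct if at least 3 of the (typically 10) annotators gave it.

        Acc(a) = min(#annotators who answered a / 3, 1)
    """
    matches = sum(1 for ans in human_answers if ans == predicted)
    return min(matches / 3.0, 1.0)

# Example: 4 of 10 annotators said "blue", so "blue" scores 1.0,
# while "navy" (given by only 1 annotator) scores about 0.33.
answers = ["blue"] * 4 + ["dark blue"] * 5 + ["navy"]
assert vqa_accuracy("blue", answers) == 1.0
```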
Introducing Implicit Bias: Why this Book Matters
Written by a diverse range of scholars, this accessible introductory volume asks: What is implicit bias? How does implicit bias compromise our knowledge of others and social reality? How does implicit bias affect us, as individuals and participants in larger social and political institutions, and what can we do to combat biases? An interdisciplinary enterprise, the volume brings together the philosophical perspective of the humanities with the perspective of the social sciences to develop rich lines of inquiry. It is written in a non-technical style, using relatable examples that help readers understand what implicit bias is, its significance, and the controversies surrounding it. Each chapter includes discussion questions and additional reading suggestions. A companion webpage contains teaching resources. The volume will be an invaluable resource for students and researchers seeking to understand criticisms surrounding implicit bias, as well as how one might answer them by adopting a more nuanced understanding of bias and its role in maintaining social injustice.
Chatbot-Based Natural Language Interfaces for Data Visualisation: A Scoping Review
Rapid growth in the generation of data from various sources has made data visualisation a valuable tool for analysing data. However, visual analysis can be a challenging task, not only due to intricate dashboards but also when dealing with complex and multidimensional data. In this context, advances in Natural Language Processing technologies have led to the development of Visualisation-oriented Natural Language Interfaces (V-NLIs). In this paper, we carry out a scoping review that analyses synergies between the fields of Data Visualisation and Natural Language Interaction. Specifically, we focus on chatbot-based V-NLI approaches and explore and discuss three research questions. The first two research questions study how chatbot-based V-NLIs contribute to interactions with the Data and Visual Spaces of the visualisation pipeline, while the third examines how chatbot-based V-NLIs enhance users' interaction with visualisations. Our findings show that the works in the literature put a strong focus on exploring tabular data with basic visualisations, with visual mapping primarily reliant on fixed layouts. Moreover, V-NLIs provide users with restricted guidance strategies, and few of them support high-level and follow-up queries. We identify challenges and possible research opportunities for the V-NLI community, such as supporting high-level queries with complex data; integrating V-NLIs with more advanced systems such as Augmented Reality (AR) or Virtual Reality (VR), particularly for advanced visualisations; expanding guidance strategies beyond current limitations; adopting intelligent visual mapping techniques; and incorporating more sophisticated interaction methods.
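As a concrete illustration of the Data Space to Visual Space mapping these systems perform, here is a toy sketch that turns a natural-language query into a declarative chart specification. The rule-based parsing and the Vega-Lite-style output are assumptions chosen for illustration, not a system from the review; real V-NLIs use far richer NLP.

```python
def query_to_chart_spec(query: str, columns: list[str]) -> dict:
    """Toy V-NLI: map a natural-language query over tabular data to a
    Vega-Lite-style chart spec (illustrative rules, not a real system)."""
    q = query.lower()
    # Pick a mark type from trigger words in the query.
    if "trend" in q or "over time" in q:
        mark = "line"
    elif "compare" in q or "distribution" in q:
        mark = "bar"
    else:
        mark = "point"
    # Naively bind the first two columns mentioned in the query to the
    # x and y channels, falling back to the table's column order.
    mentioned = [c for c in columns if c.lower() in q]
    x, y = (mentioned + [c for c in columns if c not in mentioned])[:2]
    return {"mark": mark, "encoding": {"x": {"field": x}, "y": {"field": y}}}

# Example usage:
spec = query_to_chart_spec("show the trend of sales by date", ["date", "sales", "region"])
# -> {'mark': 'line', 'encoding': {'x': {'field': 'date'}, 'y': {'field': 'sales'}}}
```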
Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs
Building cross-modal intelligence that can understand charts and communicate the salient information hidden behind them is an appealing challenge in the vision and language (V+L) community. The capability to uncover the underlying table data of chart figures is key to automatic chart understanding. We introduce ChartT5, a V+L model that learns how to interpret table information from chart images via cross-modal pre-training on plot-table pairs. Specifically, we propose two novel pre-training objectives, Masked Header Prediction (MHP) and Masked Value Prediction (MVP), to equip the model with different skills for interpreting table information. We have conducted extensive experiments on chart question answering and chart summarization to verify the effectiveness of the proposed pre-training strategies. In particular, on the ChartQA benchmark, our ChartT5 outperforms state-of-the-art non-pretraining methods by over 8%.
Comment: Accepted by Findings of ACL 2023
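To convey the flavour of these objectives, the sketch below masks headers and cell values in a chart's underlying data table to form prediction targets. The masking scheme, rate, and data layout are illustrative assumptions; the paper's exact MHP/MVP formulation may differ.

```python
import random

MASK = "[MASK]"

def mask_table(columns: list[tuple[str, list[str]]], p: float = 0.15):
    """Build masked-table pre-training targets in the spirit of
    ChartT5's Masked Header Prediction (MHP) and Masked Value
    Prediction (MVP). Illustrative assumption, not the paper's recipe.

    `columns` is a list of (header, cell_values) pairs, e.g. the table
    underlying a chart. Returns the masked columns and the original
    strings the model must reconstruct.
    """
    masked, targets = [], []
    for header, values in columns:
        # MHP: occasionally mask the column header itself.
        if random.random() < p:
            targets.append(header)
            header = MASK
        # MVP: independently mask individual cell values.
        new_values = []
        for value in values:
            if random.random() < p:
                targets.append(value)
                new_values.append(MASK)
            else:
                new_values.append(value)
        masked.append((header, new_values))
    return masked, targets

# Example: a small table recovered from a bar chart.
cols = [("year", ["2019", "2020", "2021"]), ("sales", ["12", "18", "25"])]
masked_cols, targets = mask_table(cols, p=0.3)
```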