7 research outputs found

Overview of ImageCLEF 2018: Challenges, Datasets and Evaluation

Author: Andrearczyk Vincent
Dang-Nguyen Duc-Tien
Dicente Cid Yashin
Eickhoff Carsten
Farri Oladimeji
Garcia Seco De Herrera Alba
Gurrin Cathal
Hasan Sadid A
Ionescu Bogdan
Kovalev Vassili
Liauchuk Vitali
Ling Yuan
Liu Joey
Lungren Matthew
Lux Mathias
Müller Henning
Piras Luca
Riegler Michael
Villegas Mauricio
Zhou Liting
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

This paper presents an overview of the ImageCLEF 2018 evaluation campaign, an event that was organized as part of the CLEF (Conference and Labs of the Evaluation Forum) Labs 2018. ImageCLEF is an ongoing initiative (it started in 2003) that promotes the evaluation of technologies for annotation, indexing and retrieval with the aim of providing information access to collections of images in various usage scenarios and domains. In 2018, the 16th edition of ImageCLEF ran three main tasks and a pilot task: (1) a caption prediction task that aims at predicting the caption of a figure from the biomedical literature based only on the figure image; (2) a tuberculosis task that aims at detecting the tuberculosis type, severity and drug resistance from CT (Computed Tomography) volumes of the lung; (3) a LifeLog task (videos, images and other sources) about daily activity understanding and moment retrieval; and (4) a pilot task on visual question answering where systems are tasked with answering medical questions. The strong participation, with over 100 research groups registering and 31 submitting results for the tasks, shows an increasing interest in this benchmarking campaign.

University of Essex Research Repository

Crossref

Hes-so: ArODES Open Archive (University of Applied Sciences and Arts Western Switzerland / Haute école spécialisée de Suisse occidentale / FH Westschweiz)

Archivio istituzionale della ricerca - Università di Cagliari

DCU Online Research Access Service

Overcoming Data Limitation in Medical Visual Question Answering

Author: Do Thanh-Toan
Do Tuong
Nguyen Binh D
Nguyen Binh X
Tjiputra Erman
Tran Quang D
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/09/2019

Traditional approaches for Visual Question Answering (VQA) require a large amount of labeled data for training. Unfortunately, such large-scale data is usually not available for the medical domain. In this paper, we propose a novel medical VQA framework that overcomes the labeled data limitation. The proposed framework explores the use of the unsupervised Denoising Auto-Encoder (DAE) and supervised Meta-Learning. The advantage of DAE is that it leverages the large amount of unlabeled images, while the advantage of Meta-Learning is that it learns meta-weights that quickly adapt to the VQA problem with limited labeled data. Leveraging the advantages of these techniques allows the proposed framework to be efficiently trained using a small labeled training set. The experimental results show that our proposed method significantly outperforms the state-of-the-art medical VQA. The source code is available at https://github.com/aioz-ai/MICCAI19-MedVQA

arXiv.org e-Print Archive

University of Liverpool Repository

Vision-Language Transformer for Interpretable Pathology Visual Question Answering

Author: Khushi M
Kim J
Naseem U
Publication venue: IEEE
Publication date: 31/03/2022

Pathology visual question answering (PathVQA) attempts to answer a medical question posed by pathology images. Despite its great potential in healthcare, it is not widely adopted because it requires interactions on both the image (vision) and question (language) to generate an answer. Existing methods focused on treating vision and language features independently, and so were unable to capture the high- and low-level interactions that are required for VQA. Further, these methods offered no way to interpret the retrieved answers, which remain obscure to humans; the models’ interpretability to justify the retrieved answers has remained largely unexplored. Motivated by these limitations, we introduce a vision-language transformer that embeds vision (images) and language (questions) features for an interpretable PathVQA. We present an interpretable transformer-based PathVQA (TraP-VQA), where we embed transformers’ encoder layers with vision and language features extracted using a pre-trained CNN and a domain-specific language model (LM), respectively. A decoder layer is then embedded to upsample the encoded features for the final prediction for PathVQA. Our experiments showed that our TraP-VQA outperformed the state-of-the-art comparative methods on the public PathVQA dataset. Our experiments validated the robustness of our model on another medical VQA dataset, and the ablation study demonstrated the capability of our integrated transformer-based vision-language model for PathVQA. Finally, we present the visualization results of both text and images, which explain the reason for a retrieved answer in PathVQA.

ARC (Grant Number: DP200103748)

Brunel University Research Archive

K-PathVQA: Knowledge-Aware Multimodal Representation for Pathology Visual Question Answering

Author: Dunn AG
Khushi M
Kim J
Naseem U
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/07/2023

ARC (Grant Number: DP200103748)

Brunel University Research Archive

Multiple Meta-model Quantifying for Medical Visual Question Answering

Author: Do Tuong
Nguyen Anh
Nguyen Binh X
Tjiputra Erman
Tran Minh
Tran Quang D
Publication venue: Springer International Publishing
Publication date: 01/01/2021

Transfer learning is an important step to extract meaningful features and overcome the data limitation in the medical Visual Question Answering (VQA) task. However, most of the existing medical VQA methods rely on external data for transfer learning, while the meta-data within the dataset is not fully utilized. In this paper, we present a new multiple meta-model quantifying method that effectively learns meta-annotation and leverages meaningful features for the medical VQA task. Our proposed method is designed to increase meta-data by auto-annotation, deal with noisy labels, and output meta-models which provide robust features for medical VQA tasks. Extensive experimental results on two public medical VQA datasets show that our approach achieves superior accuracy in comparison with other state-of-the-art methods, while not requiring external data to train meta-models. Source code available at: https://github.com/aioz-ai/MICCAI21_MMQ

arXiv.org e-Print Archive

University of Liverpool Repository

Geographic information extraction from texts

Author: Hu Xuke
Hu Yingjie
Kersten Jens
Resch Bernd
Publication date: 05/12/2023

A large volume of unstructured texts, containing valuable geographic information, is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although substantial progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction.

Institute of Transport Research:Publications