5 research outputs found

    Robust Visual Question Answering: Datasets, Methods, and Future Challenges

    Visual question answering (VQA) requires a system to provide an accurate natural language answer given an image and a natural language question. However, it is widely recognized that previous generic VQA methods tend to memorize biases present in the training data rather than learn proper behaviors, such as grounding images before predicting answers. As a result, these methods usually achieve high in-distribution but poor out-of-distribution performance. In recent years, various datasets and debiasing methods have been proposed to evaluate and enhance VQA robustness, respectively. This paper provides the first comprehensive survey focused on this emerging topic. Specifically, we first provide an overview of the development of datasets from in-distribution and out-of-distribution perspectives. Second, we examine the evaluation metrics employed by these datasets. Third, we propose a typology that presents the development process, similarities and differences, robustness comparison, and technical features of existing debiasing methods. Furthermore, we analyze and discuss the robustness of representative vision-and-language pre-training models on VQA. Finally, through a thorough review of the available literature and experimental analysis, we discuss the key areas for future research from various viewpoints. Comment: IEEE TPAMI (Under Review)

    Adaptive loose optimization for robust question answering

    Question answering methods are well known for exploiting data bias, such as the language prior in visual question answering and the position bias in machine reading comprehension (extractive question answering). Current debiasing methods often sacrifice significant in-distribution performance to achieve favorable out-of-distribution generalizability, while non-debiasing methods sacrifice a considerable amount of out-of-distribution performance to obtain high in-distribution performance. It is therefore challenging for either kind of method to handle complicated, changing real-world situations. In this paper, we propose a simple yet effective loss function with adaptive loose optimization, which seeks to make the best of both worlds for question answering. Our main technical contribution is to reduce the loss adaptively according to the ratio between the previous and current optimization states on mini-batch training data. This loose optimization can prevent non-debiasing methods from overlearning data bias while enabling debiasing methods to maintain slight bias learning. Experiments on the visual question answering datasets VQA v2, VQA-CP v1, VQA-CP v2, and GQA-OOD, and on the extractive question answering dataset SQuAD, demonstrate that our approach enables QA methods to obtain state-of-the-art in- and out-of-distribution performance in most cases. The source code has been released publicly at https://github.com/reml-group/ALO. Comment: 13 pages, 8 figures

    SPatiotemporal-ENcoded acoustic radiation force imaging of focused ultrasound

    Neuromodulation technology has provided novel therapeutic approaches for diseases caused by neural circuit dysfunction. Transcranial focused ultrasound (FU) is an emerging neuromodulation approach that combines noninvasiveness with a relatively sharp focus, even in deep brain regions. It offers high precision and good safety, and it allows modulation of both the peripheral and central nervous systems. To ensure accurate treatment targeting in FU neuromodulation, a magnetic resonance acoustic radiation force imaging (MR-ARFI) sequence is crucial for visualizing the focal point. Currently, the commonly used 2D spin echo ARFI (2D SE-ARFI) sequence suffers from a long acquisition time, while the echo planar imaging ARFI (EPI-ARFI) sequence, which has a shorter acquisition time, is vulnerable to magnetic field inhomogeneities. To address these problems, we propose a spatiotemporal-encoded acoustic radiation force imaging sequence (i.e., SE-SPEN-ARFI, shortened to SPEN-ARFI) in this study. The displacement obtained at the focal spot was highly consistent with that of the SE-ARFI sequence. Our research shows that SPEN-ARFI allows rapid image acquisition and produces fewer image distortions even under large field inhomogeneities. Therefore, the SPEN-ARFI sequence is a practical alternative for treatment planning in ultrasound neuromodulation.

    CYP4B1 is a prognostic biomarker and potential therapeutic target in lung adenocarcinoma.

    CYP4B1 belongs to the mammalian CYP4 enzyme family and is predominantly expressed in the lungs of humans. It is responsible for the oxidative metabolism of a wide range of endogenous compounds and xenobiotics. In this study, using data from The Cancer Genome Atlas (TCGA) project and the Gene Expression Omnibus (GEO) database, a secondary analysis was performed to explore the expression profile of CYP4B1, as well as its prognostic value in patients with lung adenocarcinoma (LUAD). CYP4B1 expression was significantly decreased in patients with LUAD compared with their normal counterparts (p < 0.05), and the decreased expression was linked to age younger than 65 years (p = 0.0041), history of pharmaceutical (p = 0.0127) and radiation (p = 0.0340) therapy, mutations in KRAS/EGFR/ALK (p = 0.0239), and a living status of deceased (p = 0.0026). Survival analysis indicated that low CYP4B1 expression was an independent prognostic indicator of shorter overall survival (OS) and recurrence-free survival (RFS) in patients with LUAD. Copy number alterations (CNAs) and hypermethylation at the cg23440155 and cg23414387 sites might contribute to the decreased CYP4B1 expression. Gene set enrichment analysis (GSEA) suggested that CYP4B1 might act as an oncogene in LUAD by preventing biological metabolism pathways of exogenous and endogenous compounds and enhancing DNA replication and cell cycle activities. In conclusion, CYP4B1 expression may serve as a valuable independent prognostic biomarker and a potential therapeutic target in patients with LUAD.