
    Projected Spatiotemporal Dynamics of Drought under Global Warming in Central Asia

    Drought is among the most common natural disasters and has one of the greatest impacts on human society, yet it remains extremely challenging to assess and predict accurately. With global warming, accurate drought prediction and assessment have become even more important. In this study, based on climate model data provided by the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP), we used the Palmer Drought Severity Index (PDSI) to analyze and project drought characteristics and their trends in Central Asia under two global warming scenarios, 1.5 °C and 2.0 °C. The results showed a marked decline in the PDSI in Central Asia under the influence of global warming, indicating that drought conditions in Central Asia would worsen further under both warming scenarios. Under the 1.5 °C warming scenario, the PDSI in Central Asia first decreased and then increased, with the turning point around 2080, whereas under the 2.0 °C warming scenario the PDSI declined continuously after 2025. Under both warming scenarios, the spatial pattern of dry and wet areas in Central Asia is projected to change significantly in the future. In the 1.5 °C warming scenario, the frequency of drought and the proportion of arid areas in Central Asia were significantly higher than those under the 2.0 °C warming scenario. Calculating the PDSI with the Thornthwaite (TH) formula overestimated drought, and the Penman–Monteith (PM) formula is therefore recommended for calculating the index.
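
    The difference between the two PDSI variants mentioned above comes down to the inputs of the potential-evapotranspiration (PET) term in the index's water balance: the Thornthwaite formula uses temperature alone, while Penman–Monteith also accounts for radiation, humidity and wind, which is consistent with the study's finding that the Thornthwaite-based PDSI overestimates drought under warming. As a minimal illustration (not code from the study), the sketch below implements the classical Thornthwaite monthly PET formula; the example temperatures are hypothetical, and the high-temperature branch and latitude-dependent day-length correction of the full method are simplified away.

```python
import numpy as np

def thornthwaite_pet(monthly_temp_c, day_length_factor=None):
    """Unadjusted Thornthwaite potential evapotranspiration (mm/month).

    monthly_temp_c: 12 mean monthly air temperatures (deg C).
    day_length_factor: optional 12 latitude/day-length correction factors.
    """
    t = np.asarray(monthly_temp_c, dtype=float)
    t_pos = np.clip(t, 0.0, None)                    # months below 0 deg C contribute no PET
    heat_index = np.sum((t_pos / 5.0) ** 1.514)      # annual heat index I
    a = (6.75e-7 * heat_index**3 - 7.71e-5 * heat_index**2
         + 1.792e-2 * heat_index + 0.49239)          # Thornthwaite exponent
    pet = 16.0 * (10.0 * t_pos / heat_index) ** a    # mm per standard 30-day month
    if day_length_factor is not None:
        pet = pet * np.asarray(day_length_factor)
    return pet

# Hypothetical continental temperature climatology (not data from the paper)
temps = [-8, -5, 3, 12, 18, 24, 26, 25, 18, 10, 2, -5]
print(thornthwaite_pet(temps).round(1))
```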

    This Looks Like Those: Illuminating Prototypical Concepts Using Multiple Visualizations

    We present ProtoConcepts, a method for interpretable image classification combining deep learning and case-based reasoning using prototypical parts. Existing work in prototype-based image classification uses a "this looks like that" reasoning process, which dissects a test image by finding prototypical parts and combining evidence from these prototypes to make a final classification. However, all of the existing prototypical part-based image classifiers provide only one-to-one comparisons, where a single training image patch serves as a prototype to compare with a part of our test image. With these single-image comparisons, it can often be difficult to identify the underlying concept being compared (e.g., "is it comparing the color or the shape?"). Our proposed method modifies the architecture of prototype-based networks to instead learn prototypical concepts which are visualized using multiple image patches. Having multiple visualizations of the same prototype allows us to more easily identify the concept captured by that prototype (e.g., "the test image and the related training patches are all the same shade of blue"), and allows our model to create richer, more interpretable visual explanations. Our experiments show that our "this looks like those" reasoning process can be applied as a modification to a wide range of existing prototypical image classification networks while achieving comparable accuracy on benchmark datasets.
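
    For context on the "this looks like that" family the abstract modifies, the following is a hedged PyTorch sketch of a generic prototypical-part scoring layer, not the ProtoConcepts architecture itself: patches of a CNN feature map are compared against learned prototype vectors, the best-matching patch provides the evidence for each prototype, and a linear layer turns that evidence into class logits. The layer sizes and the distance-to-similarity transform are assumptions.

```python
import torch
import torch.nn as nn

class PrototypeLayer(nn.Module):
    """Generic prototypical-part scoring layer (illustrative sketch)."""

    def __init__(self, num_classes=200, protos_per_class=10, dim=128):
        super().__init__()
        n = num_classes * protos_per_class
        self.prototypes = nn.Parameter(torch.randn(n, dim))      # learned prototype vectors
        self.classifier = nn.Linear(n, num_classes, bias=False)  # combines prototype evidence

    def forward(self, feats):                  # feats: (B, dim, H, W) from a CNN backbone
        B, D, H, W = feats.shape
        patches = feats.flatten(2).transpose(1, 2)                        # (B, H*W, dim)
        # Squared L2 distance between every spatial patch and every prototype
        dists = ((patches.unsqueeze(2) - self.prototypes[None, None]) ** 2).sum(-1)
        sims = torch.log((dists + 1.0) / (dists + 1e-4))                  # distance -> similarity
        evidence = sims.max(dim=1).values                                 # best-matching patch per prototype
        return self.classifier(evidence)                                  # class logits
```

    As described above, ProtoConcepts keeps this kind of reasoning but learns prototypical concepts that are visualized with multiple training patches per prototype, so the shared concept is easier for a human to identify.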

    Transforming the Interactive Segmentation for Medical Imaging

    The goal of this paper is to interactively refine the automatic segmentation of challenging structures on which models fall behind human performance, either due to the scarcity of available annotations or the difficult nature of the problem itself, for example, segmenting cancer or small organs. Specifically, we propose a novel Transformer-based architecture for Interactive Segmentation (TIS) that treats the refinement task as a procedure for grouping pixels with features similar to the clicks given by the end users. Our proposed architecture is composed of Transformer decoder variants, which naturally perform feature comparison through attention mechanisms. In contrast to existing approaches, our proposed TIS is not limited to binary segmentation and allows the user to edit masks for an arbitrary number of categories. To validate the proposed approach, we conduct extensive experiments on three challenging datasets and demonstrate superior performance over existing state-of-the-art methods. The project page is: https://wtliu7.github.io/tis/. Comment: Accepted to MICCAI 202
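
    To make the idea of "grouping pixels with features similar to the clicks" concrete, here is a minimal PyTorch sketch in that spirit, not the TIS implementation: each user click becomes a query initialized from the feature at the clicked location, a standard Transformer decoder lets the queries cross-attend to all pixel features, and each refined query produces a soft mask by similarity with every pixel. The query initialization and mask head are assumptions; clicks sharing a category label could be merged downstream (e.g. by a per-pixel maximum of their masks) to support an arbitrary number of categories.

```python
import torch
import torch.nn as nn

class ClickRefiner(nn.Module):
    """Click-driven mask refinement with a Transformer decoder (illustrative sketch)."""

    def __init__(self, dim=256, heads=8, layers=3):
        super().__init__()
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=layers)

    def forward(self, pixel_feats, click_xy):
        # pixel_feats: (B, dim, H, W) backbone features; click_xy: (B, K, 2) integer (x, y) coords
        B, D, H, W = pixel_feats.shape
        memory = pixel_feats.flatten(2).transpose(1, 2)              # (B, H*W, dim)
        idx = (click_xy[..., 1] * W + click_xy[..., 0]).long()       # flatten (x, y) -> pixel index
        queries = torch.gather(memory, 1, idx.unsqueeze(-1).expand(-1, -1, D))
        queries = self.decoder(queries, memory)                      # clicks cross-attend to all pixels
        masks = torch.einsum('bkd,bnd->bkn', queries, memory)        # pixel-to-click similarity
        return masks.view(B, -1, H, W).sigmoid()                     # (B, K, H, W) soft masks, one per click
```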

    Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models

    When trained at a sufficient scale, self-supervised learning has exhibited a notable ability to solve a wide range of visual and language understanding tasks. In this paper, we investigate simple yet effective approaches for adapting pre-trained foundation models to the downstream task of interest, namely, open-vocabulary semantic segmentation. To this end, we make the following contributions: (i) we introduce Fusioner, with a lightweight, transformer-based fusion module, that pairs the frozen visual representation with language concepts using only a handful of image segmentation data; as a consequence, the model gains the capability of zero-shot transfer to segment novel categories; (ii) without loss of generality, we experiment on a broad range of self-supervised models pre-trained with different schemes, e.g. visual-only models (MoCo v3, DINO), a language-only model (BERT), and a visual-language model (CLIP), and show that the proposed fusion approach is effective for any pair of visual and language models, even those pre-trained on uni-modal data; (iii) we conduct thorough ablation studies to analyze the critical components of the proposed Fusioner; when evaluated on standard benchmarks, e.g. PASCAL-5i and COCO-20i, it surpasses existing state-of-the-art models by a large margin despite only being trained on frozen visual and language features; (iv) to measure the model's robustness in learning visual-language correspondence, we further evaluate on a synthetic dataset, named Mosaic-4, where images are constructed by mosaicking samples from FSS-1000. Fusioner demonstrates superior performance over previous models. Comment: BMVC 2022 Oral
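
    One way to picture the fusion module is a small transformer that self-attends jointly over frozen visual patch tokens and frozen class-name embeddings, then reads out one mask per class by matching each pixel against the fused text tokens. The sketch below is a loose illustration of that design rather than the paper's Fusioner module; the projection layers, depth and mask head are assumptions.

```python
import torch
import torch.nn as nn

class FusionSegmenter(nn.Module):
    """Lightweight fusion over frozen visual and language features (illustrative sketch)."""

    def __init__(self, visual_dim=768, text_dim=768, dim=256, layers=2, heads=8):
        super().__init__()
        self.proj_v = nn.Linear(visual_dim, dim)   # project frozen visual patch tokens
        self.proj_t = nn.Linear(text_dim, dim)     # project frozen class-name embeddings
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.fusion = nn.TransformerEncoder(block, num_layers=layers)

    def forward(self, visual_tokens, text_tokens, hw):
        # visual_tokens: (B, N, visual_dim) from a frozen backbone (e.g. DINO, MoCo v3, CLIP)
        # text_tokens:   (B, C, text_dim) frozen class-name features (e.g. BERT, CLIP text)
        H, W = hw
        v, t = self.proj_v(visual_tokens), self.proj_t(text_tokens)
        fused = self.fusion(torch.cat([t, v], dim=1))         # joint attention over both modalities
        t_out, v_out = fused[:, :t.shape[1]], fused[:, t.shape[1]:]
        # Each pixel is scored against every (fused) class-name token
        logits = torch.einsum('bcd,bnd->bcn', t_out, v_out)
        return logits.view(v.shape[0], -1, H, W)              # (B, num_classes, H, W) mask logits
```

    Because only the projection and fusion layers would be trained in such a setup, the module can be attached to different frozen visual and language encoders, which matches the abstract's point that the approach applies to any pair of visual and language models.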

    Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation

    The goal of the audio-visual segmentation (AVS) task is to segment the sounding objects in video frames using audio cues. However, current fusion-based methods have performance limitations due to the small receptive field of convolution and inadequate fusion of audio-visual features. To overcome these issues, we propose a novel Audio-aware query-enhanced TRansformer (AuTR) to tackle the task. Unlike existing methods, our approach introduces a multimodal transformer architecture that enables deep fusion and aggregation of audio-visual features. Furthermore, we devise an audio-aware query-enhanced transformer decoder that explicitly helps the model focus on segmenting the pinpointed sounding objects based on audio signals, while disregarding silent yet salient objects. Experimental results show that our method outperforms previous methods and demonstrates better generalization ability in multi-sound and open-set scenarios. Comment: arXiv admin note: text overlap with arXiv:2305.1101
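
    A rough way to picture the audio-aware query enhancement: clip-level audio features are injected into a set of learnable object queries before a transformer decoder attends over per-frame visual features, biasing the queries toward sounding objects. The sketch below follows that intuition only; the additive conditioning and the 128-dimensional audio embedding (e.g. a VGGish-style feature) are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class AudioAwareDecoder(nn.Module):
    """Audio-conditioned object queries for audio-visual segmentation (illustrative sketch)."""

    def __init__(self, dim=256, num_queries=20, heads=8, layers=3):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim))  # learnable object queries
        self.audio_proj = nn.Linear(128, dim)                       # assumed 128-d audio embedding -> dim
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=layers)

    def forward(self, visual_feats, audio_feat):
        # visual_feats: (B, dim, H, W) per-frame features; audio_feat: (B, 128) clip-level audio
        B, D, H, W = visual_feats.shape
        memory = visual_feats.flatten(2).transpose(1, 2)             # (B, H*W, dim)
        audio = self.audio_proj(audio_feat).unsqueeze(1)             # (B, 1, dim)
        # Inject the audio cue into every query so attention favors sounding objects
        queries = self.queries.unsqueeze(0).expand(B, -1, -1) + audio
        queries = self.decoder(queries, memory)
        masks = torch.einsum('bqd,bnd->bqn', queries, memory)        # pixel-to-query similarity
        return masks.view(B, -1, H, W).sigmoid()                     # per-query masks of sounding objects
```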