
    Noise-Tolerant Unsupervised Adapter for Vision-Language Models

    Recent advances in large-scale vision-language models have achieved impressive performance in various zero-shot image classification tasks. While prior studies have demonstrated significant improvements by introducing few-shot labelled target samples, they still require labelling of target samples, which limits their scalability when handling diverse visual recognition tasks. We design NtUA, a Noise-tolerant Unsupervised Adapter that enables learning superior target models with few-shot unlabelled target samples. NtUA works as a key-value cache that formulates the visual features and predicted pseudo-labels of the few-shot unlabelled target samples as key-value pairs. It consists of two complementary designs. The first is adaptive cache formation, which combats pseudo-label noise by weighting the key-value pairs according to their prediction confidence. The second is pseudo-label rectification, which corrects both pair values (i.e., pseudo-labels) and cache weights by leveraging knowledge distillation from large-scale vision-language models. Extensive experiments show that NtUA achieves superior performance consistently across multiple widely adopted benchmarks.
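The confidence-weighted cache described above can be sketched roughly as follows. This is a minimal illustration, not NtUA's actual implementation: the function names, the exponential affinity kernel, and the `beta` parameter are assumptions for the sketch, and the pseudo-label rectification step is omitted.

```python
import numpy as np

def build_weighted_cache(features, probs):
    """Hypothetical sketch of adaptive cache formation: keys are normalised
    visual features, values are one-hot pseudo-labels, and each key-value
    pair is weighted by the model's prediction confidence."""
    keys = features / np.linalg.norm(features, axis=1, keepdims=True)
    pseudo = probs.argmax(axis=1)             # predicted pseudo-labels
    values = np.eye(probs.shape[1])[pseudo]   # one-hot value vectors
    weights = probs.max(axis=1)               # confidence as pair weight
    return keys, values, weights

def cache_logits(query, keys, values, weights, beta=5.0):
    """Retrieve cache logits for a query feature via confidence-weighted
    affinity between the query and the cached keys."""
    q = query / np.linalg.norm(query)
    affinity = np.exp(-beta * (1.0 - keys @ q))  # similarity kernel (assumed form)
    return (weights * affinity) @ values
```

In this sketch, low-confidence pairs contribute proportionally less to the retrieved logits, which is the mechanism the abstract describes for combating pseudo-label noise.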

    MLAN: Multi-Level Adversarial Network for Domain Adaptive Semantic Segmentation

    Recent progress in domain adaptive semantic segmentation demonstrates the effectiveness of adversarial learning (AL) in unsupervised domain adaptation. However, most adversarial-learning-based methods align source and target distributions at a global image level but neglect inconsistency around local image regions. This paper presents a novel multi-level adversarial network (MLAN) that aims to address inter-domain inconsistency at both the global image level and the local region level optimally. MLAN has two novel designs, namely, region-level adversarial learning (RL-AL) and co-regularized adversarial learning (CR-AL). Specifically, RL-AL models prototypical regional context-relations explicitly in the feature space of a labelled source domain and transfers them to an unlabelled target domain via adversarial learning. CR-AL fuses region-level AL and image-level AL optimally via mutual regularization. In addition, we design a multi-level consistency map that can guide domain adaptation in both input space (i.e., image-to-image translation) and output space (i.e., self-training) effectively. Extensive experiments show that MLAN outperforms the state-of-the-art by a large margin consistently across multiple datasets.
    Comment: Submitted to P
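The fusion of image-level and region-level adversarial objectives can be sketched as below. This is a simplified illustration under stated assumptions: the function names and the fixed weighting `lam` are hypothetical, and the paper's mutual-regularization scheme is more involved than a fixed weighted sum.

```python
import numpy as np

def _bce_logits(logits, target):
    """Numerically plain binary cross-entropy with logits."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return float(np.mean(-target * np.log(p + 1e-12)
                         - (1.0 - target) * np.log(1.0 - p + 1e-12)))

def multilevel_adv_loss(img_logits, region_logits, is_source, lam=0.5):
    """Hypothetical sketch: combine an image-level discriminator loss
    (global alignment) with a region-level discriminator loss (local
    alignment), weighted by `lam` (an assumed hyperparameter)."""
    t = 1.0 if is_source else 0.0  # domain label for the discriminator
    return (_bce_logits(img_logits, t)
            + lam * _bce_logits(region_logits, t))
```

The point of the two-level loss is that the region-level term penalizes local inconsistencies that a single global discriminator would average away.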

    Efficient Test-Time Adaptation of Vision-Language Models

    Test-time adaptation with pre-trained vision-language models has attracted increasing attention for tackling distribution shifts at test time. Though prior studies have achieved very promising performance, they involve intensive computation that is ill-suited to test-time adaptation. We design TDA, a training-free dynamic adapter that enables effective and efficient test-time adaptation with vision-language models. TDA works with a lightweight key-value cache that maintains a dynamic queue with few-shot pseudo labels as values and the corresponding test-sample features as keys. Leveraging the key-value cache, TDA adapts to test data gradually via progressive pseudo-label refinement, which is highly efficient and incurs no backpropagation. In addition, we introduce negative pseudo labeling, which alleviates the adverse impact of pseudo-label noise by assigning pseudo labels to certain negative classes when the model is uncertain about its pseudo-label predictions. Extensive experiments over two benchmarks demonstrate TDA's superior effectiveness and efficiency as compared with the state-of-the-art. The code has been released at https://kdiaaa.github.io/tda/.
    Comment: Accepted to CVPR 2024.
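The training-free dynamic cache can be sketched as a per-class queue that keeps only the most confident test samples seen so far. This is a minimal sketch under assumptions: the class and method names, the entropy-based ranking, the `capacity` and `beta` parameters are all hypothetical, and TDA's negative pseudo labeling branch is omitted for brevity.

```python
import numpy as np

class DynamicCache:
    """Hypothetical sketch of a training-free dynamic key-value cache.
    For each class, keep at most `capacity` test features, ranked by
    prediction entropy, so pseudo labels refine progressively as more
    confident samples arrive; no backpropagation is involved."""

    def __init__(self, num_classes, capacity=3):
        self.num_classes = num_classes
        self.capacity = capacity
        self.slots = {c: [] for c in range(num_classes)}  # (entropy, feature)

    def update(self, feature, probs):
        """Insert a test sample under its pseudo label, evicting the
        least confident (highest-entropy) entry when over capacity."""
        entropy = -float(np.sum(probs * np.log(probs + 1e-12)))
        c = int(probs.argmax())
        self.slots[c].append((entropy, feature))
        self.slots[c].sort(key=lambda t: t[0])
        self.slots[c] = self.slots[c][: self.capacity]

    def logits(self, query, beta=5.0):
        """Cache logits for a query feature via key affinity."""
        out = np.zeros(self.num_classes)
        q = query / np.linalg.norm(query)
        for c, entries in self.slots.items():
            for _, feat in entries:
                k = feat / np.linalg.norm(feat)
                out[c] += np.exp(-beta * (1.0 - k @ q))
        return out
```

Because the cache is updated by sorting and slicing a short list, each test sample costs only a few vector operations, which matches the abstract's emphasis on efficiency.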

    Nonlinear relativistic corrections to cosmological distances, redshift and gravitational lensing magnification. I - Key results

    The next generation of telescopes will usher in an era of precision cosmology, capable of determining the cosmological model to beyond the percent level. For this to be effective, the theoretical model must be understood to at least the same level of precision. A range of subtle relativistic effects remain to be explored theoretically, and offer the potential for probing general relativity in this new regime. We present the distance-redshift relation to second order in cosmological perturbation theory for a general dark energy model. This relation determines the magnification of sources at high precision, as well as redshift-space distortions in the mildly non-linear regime. We identify a range of new lensing effects, including: double-integrated and nonlinear integrated Sachs-Wolfe contributions, transverse Doppler effects, lensing from the induced vector mode and gravitational wave backgrounds, in addition to lensing from the second-order potential. Modifications to Doppler lensing from redshift-space distortions are identified. Finally, we find a new double-coupling between the density fluctuations integrated along the line of sight, and gradients in the density fluctuations coupled to transverse velocities along the line of sight. These can be large and thus offer important new probes of gravitational lensing and general relativity. This paper accompanies arXiv:1402.1933, where a comprehensive derivation is given.
    Comment: 7 pages. v2 has significant presentational changes. v3 has new discussion on the magnitude of the corrections, plus minor corrections, and is the version to appear in CQ
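For orientation, the zeroth-order relation that these second-order corrections perturb is the standard background distance-redshift relation; the sketch below states it for a spatially flat FLRW model, with the first-order correction indicated only schematically (the paper's second-order expression is far more involved and is not reproduced here).

```latex
% Background angular-diameter distance in spatially flat FLRW:
\bar{d}_A(z) = \frac{1}{1+z} \int_0^z \frac{\mathrm{d}z'}{H(z')}
% Schematically, the leading fractional perturbation is dominated by
% the lensing convergence \kappa:
\frac{\delta d_A}{\bar{d}_A} \supset -\kappa
% The effects listed in the abstract (double-integrated and nonlinear
% ISW terms, transverse Doppler, vector/tensor lensing) enter at
% second order in this expansion.
```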