
    Local Feature Matching Using Deep Learning: A Survey

    Local feature matching enjoys wide-ranging applications in the realm of computer vision, encompassing domains such as image retrieval, 3D reconstruction, and object recognition. However, challenges persist in improving the accuracy and robustness of matching due to factors like viewpoint and lighting variations. In recent years, the introduction of deep learning models has sparked widespread exploration into local feature matching techniques. The objective of this endeavor is to furnish a comprehensive overview of local feature matching methods. These methods are categorized into two key segments based on the presence of detectors. The Detector-based category encompasses models inclusive of Detect-then-Describe, Joint Detection and Description, Describe-then-Detect, as well as Graph Based techniques. In contrast, the Detector-free category comprises CNN Based, Transformer Based, and Patch Based methods. Our study extends beyond methodological analysis, incorporating evaluations of prevalent datasets and metrics to facilitate a quantitative comparison of state-of-the-art techniques. The paper also explores the practical application of local feature matching in diverse domains such as Structure from Motion, Remote Sensing Image Registration, and Medical Image Registration, underscoring its versatility and significance across various fields. Ultimately, we endeavor to outline the current challenges faced in this domain and furnish future research directions, thereby serving as a reference for researchers involved in local feature matching and its interconnected domains. A comprehensive list of studies in this survey is available at https://github.com/vignywang/Awesome-Local-Feature-Matching. Comment: Accepted by Information Fusion 2024. Project page: https://github.com/vignywang/Awesome-Local-Feature-Matchin

    Understanding Attention for Vision-and-Language Tasks

    The attention mechanism has been used as an important component across Vision-and-Language (VL) tasks in order to bridge the semantic gap between visual and textual features. While attention has been widely used in VL tasks, the capability of different attention alignment calculations to bridge the semantic gap between visual and textual clues has not been examined. In this research, we conduct a comprehensive analysis of the role of attention alignment by looking into the attention score calculation methods and checking how they actually represent the significance of visual regions and textual tokens for the global assessment. We also analyse the conditions under which an attention score calculation mechanism would be more (or less) interpretable, and which may impact the model performance on three different VL tasks: visual question answering, text-to-image generation, and text-and-image matching (both sentence and image retrieval). Our analysis is the first of its kind and provides useful insights into the importance of each attention alignment score calculation when applied at the training phase of VL tasks, which is commonly ignored in attention-based cross-modal models and/or pretrained models. Our code is available at: https://github.com/adlnlp/Attention_VL. Comment: Accepted in COLING 202

    The Development of LLMs for Embodied Navigation

    In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention due to their potential in a variety of practical applications. The application of LLMs with Embodied Intelligence has emerged as a significant area of focus. Among the myriad applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support, leveraging their robust language and image-processing capabilities. This article offers an exhaustive summary of the symbiosis between LLMs and embodied intelligence with a focus on navigation. It reviews state-of-the-art models and research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, the article elucidates the role of LLMs in embodied intelligence, based on current research, and forecasts future directions in the field. A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-E

    The emergence of global phase coherence from local pairing in underdoped cuprates

    In conventional metal superconductors such as aluminum, the large number of weakly bound Cooper pairs become phase coherent as soon as they start to form. The cuprate high critical temperature ($T_c$) superconductors, in contrast, belong to a distinctively different category. To account for the high $T_c$, the attractive pairing interaction is expected to be strong and the coherence length is short. Being doped Mott insulators, the cuprates are known to have low superfluid density, and are thus susceptible to phase fluctuations. It has been proposed that pairing and phase coherence may occur separately in cuprates, and $T_c$ corresponds to the phase coherence temperature controlled by the superfluid density. To elucidate the microscopic processes of pairing and phase ordering in cuprates, here we use scanning tunneling microscopy to image the evolution of electronic states in underdoped $\rm Bi_2La_xSr_{2-x}CuO_{6+\delta}$. Even in the insulating sample, we observe a smooth crossover from the Mott insulator to superconductor-type spectra on small islands with chequerboard order and emerging quasiparticle interference patterns following the octet model. Each chequerboard plaquette contains approximately two holes, and exhibits a stripy internal structure that has a strong influence on the superconducting features. Across the insulator to superconductor boundary, the local spectra remain qualitatively the same while the quasiparticle interferences become long-ranged. These results suggest that the chequerboard plaquette with internal stripes plays a crucial role in local pairing in cuprates, and the global phase coherence is established once its spatial occupation exceeds a threshold.

    Emergent normal fluid in the superconducting ground state of overdoped cuprates

    The microscopic mechanism for the disappearance of superconductivity in overdoped cuprates is still under heated debate. Here we use scanning tunneling spectroscopy to investigate the evolution of quasiparticle interference phenomenon in $\rm Bi_2Sr_2CuO_{6+\delta}$ over a wide range of hole densities. We find that when the system enters the overdoped regime, a peculiar quasiparticle interference wavevector with quarter-circle pattern starts to emerge even at zero bias, and its intensity grows with increasing doping level. Its energy dispersion is incompatible with the octet model for d-wave superconductivity, but is highly consistent with the scattering interference of gapless normal carriers. The weight of the gapless quasiparticle interference is mainly located at the antinodes and is independent of temperature. We propose that the normal fluid emerges from the pair-breaking scattering between flat antinodal bands in the quantum ground state, which is the primary cause for the reduction of superfluid density and suppression of superconductivity in overdoped cuprates.

    Catalytic removal of 1,2-dichloroethane over LaSrMnCoO6/H-ZSM-5 composite: insights into synergistic effect and pollutant-destruction mechanism

    LaxSr2−xMnCoO6 materials with different Sr contents were prepared by a coprecipitation method, with LaSrMnCoO6 found to be the best catalyst for 1,2-dichloroethane (DCE) destruction (T90 = 509 °C). As such, a series of LaSrMnCoO6/H-ZSM-5 composite materials were rationally synthesized to further improve the catalytic activity of LaSrMnCoO6. As expected, the introduction of H-ZSM-5 could remarkably enlarge the surface area, increase the number of Lewis acid sites, and enhance the mobility of the surface adsorbed oxygen species, which consequently improved the catalytic activity of LaSrMnCoO6. Among all the composite materials, 10 wt% LaSrMnCoO6/H-ZSM-5 possessed the highest catalytic activity, with 90% of 1,2-DCE destroyed at 337 °C, which is a temperature reduction of more than 70 °C and 170 °C compared with that of H-ZSM-5 (T90 = 411 °C) and LaSrMnCoO6 (T90 = 509 °C), respectively. Online product analysis revealed that CO2, CO, HCl, and Cl2 were the primary products in the oxidation of 1,2-DCE, while several unfavorable reaction by-products, such as vinyl chloride, 1,1,2-trichloroethane, trichloroethylene, perchloroethylene, and acetaldehyde, were also formed via dechlorination and dehydrochlorination processes. Based on the above results, the reaction path and mechanism of 1,2-DCE decomposition are proposed.