61 research outputs found

    SimSC: a simple framework for semantic correspondence with temperature learning

    We propose SimSC, a remarkably simple framework that addresses the problem of semantic matching based only on the feature backbone. We discover that when fine-tuning an ImageNet-pretrained backbone on the semantic matching task, L2 normalization of the feature map, a standard procedure in feature matching, produces an overly smooth matching distribution and significantly hinders the fine-tuning process. By setting an appropriate temperature for the softmax, this over-smoothness can be alleviated and the quality of the features can be substantially improved. We employ a learning module to predict the optimal temperature for fine-tuning feature backbones. This module is trained together with the backbone, and the temperature is updated online. We evaluate our method on three public datasets and demonstrate that we can achieve accuracy on par with state-of-the-art methods under the same backbone without using a learned matching head. Our method is versatile and works on various types of backbones. We show that the accuracy of our framework can be easily improved by coupling it with more powerful backbones.
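    The over-smoothness described in this abstract can be illustrated with a minimal sketch. It is not the SimSC implementation: the similarity values and the two temperatures are made-up assumptions, and the temperature is fixed by hand rather than predicted by a learned module. Because L2-normalized features give cosine similarities in [-1, 1], a softmax at temperature 1 stays close to uniform, while a small temperature sharpens it.

```python
import math


def softmax(scores, temperature=1.0):
    """Softmax over a list of scores after dividing by a temperature."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cosine similarities between one source feature and five
# candidate target features. L2 normalization bounds them to [-1, 1].
similarities = [0.9, 0.5, 0.3, 0.1, -0.2]

# Temperature 1.0: the bounded scores yield a flat matching distribution.
smooth = softmax(similarities, temperature=1.0)

# Assumed small temperature (0.05): the best match dominates.
sharp = softmax(similarities, temperature=0.05)

print("peak probability at T=1.0: ", max(smooth))
print("peak probability at T=0.05:", max(sharp))
```

    With these assumed values, the best match receives only a modest share of the probability mass at temperature 1, which is the flat distribution the abstract attributes to L2 normalization.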

    When LLMs step into the 3D world: a survey and meta-analysis of 3D tasks via multi-modal Large Language Models

    As large language models (LLMs) evolve, their integration with 3D spatial data (3D-LLMs) has seen rapid progress, offering unprecedented capabilities for understanding and interacting with physical spaces. This survey provides a comprehensive overview of the methodologies enabling LLMs to process, understand, and generate 3D data. Highlighting the unique advantages of LLMs, such as in-context learning, step-by-step reasoning, open-vocabulary capabilities, and extensive world knowledge, we underscore their potential to significantly advance spatial comprehension and interaction within embodied Artificial Intelligence (AI) systems. Our investigation spans various 3D data representations, from point clouds to Neural Radiance Fields (NeRFs). It examines their integration with LLMs for tasks such as 3D scene understanding, captioning, question-answering, and dialogue, as well as LLM-based agents for spatial reasoning, planning, and navigation. The paper also includes a brief review of other methods that integrate 3D and language. The meta-analysis presented in this paper reveals significant progress yet underscores the necessity for novel approaches to harness the full potential of 3D-LLMs. Hence, with this paper, we aim to chart a course for future research that explores and expands the capabilities of 3D-LLMs in understanding and interacting with the complex 3D world. To support this survey, we have established a project page where papers related to our topic are organized and listed: https://github.com/ActiveVisionLab/Awesome-LLM-3D

    Ultrathin Oxide Films by Atomic Layer Deposition on Graphene

    In this paper, a method is presented to create and characterize mechanically robust, free-standing, ultrathin oxide films with controlled, nanometer-scale thickness using Atomic Layer Deposition (ALD) on graphene. Aluminum oxide films were deposited onto suspended graphene membranes using ALD. Subsequent etching of the graphene left pure aluminum oxide films only a few atoms in thickness. A pressurized blister test was used to determine that these ultrathin films have a Young's modulus of 154 ± 13 GPa. This Young's modulus is comparable to that of much thicker alumina ALD films. This behavior indicates that these ultrathin two-dimensional films have excellent mechanical integrity. The films are also impermeable to standard gases, suggesting they are pinhole-free. These continuous ultrathin films are expected to enable new applications in fields such as thin film coatings, membranes and flexible electronics. Comment: Nano Letters (just accepted)

    Thermal oxidation of Ni films for p-type thin-film transistors

    p-Type nanocrystal NiO-based thin-film transistors (TFTs) are fabricated by simply oxidizing thin Ni films at temperatures as low as 400 °C. The highest field-effect mobility in the linear region and the current on–off ratio are found to be 5.2 cm² V⁻¹ s⁻¹ and 2.2 × 10³, respectively. X-ray diffraction, transmission electron microscopy and the electrical performance of the TFTs with “top contact” and “bottom contact” channels suggest that the upper parts of the Ni films are clearly oxidized. In contrast, the lower parts in contact with the gate dielectric are partially oxidized to form a quasi-discontinuous Ni layer, which does not fully shield the gate electric field but still conducts the source and drain current. This simple method for producing p-type TFTs may be promising for next-generation oxide-based electronic applications.

    Impact of various dopant elements on the electronic structure of Cu₂ZnSnS₄ (CZTS) thin films: a DFT study

    New structures based on Cu₂ZnSnS₄ (CZTS), obtained by substitution with Cr, Ti, V, and Mo species, were investigated via density functional theory. The total substitution of Zn by Cr and V leads to the vanishing of the bandgap, while n-type conductivity with a low bandgap of 0.19 eV was predicted in the case of Ti. In addition, overlap of the conduction band minimum and valence band maximum was observed for a Mo/Sn ratio of 1/3. Therefore, our study suggests that even a low content of alternative cations in CZTS allows its band alignment to be controlled. The obtained results can be helpful for designing CZTS-based intermediate layers to improve the quality of the back interface of CZTS thin-film solar cells.