1,211 research outputs found

    FE-Fusion-VPR: Attention-based Multi-Scale Network Architecture for Visual Place Recognition by Fusing Frames and Events

    Traditional visual place recognition (VPR), which usually relies on standard cameras, is prone to failure under glare or high-speed motion. By contrast, event cameras offer low latency, high temporal resolution, and high dynamic range, which can address these issues. Nevertheless, event cameras are prone to failure in weakly textured or motionless scenes, while standard cameras can still provide appearance information in such cases. Thus, exploiting the complementarity of standard cameras and event cameras can effectively improve the performance of VPR algorithms. In this paper, we propose FE-Fusion-VPR, an attention-based multi-scale network architecture for VPR by fusing frames and events. First, the intensity frame and event volume are fed into the two-stream feature extraction network for shallow feature fusion. Next, the three-scale features are obtained through the multi-scale fusion network and aggregated into three sub-descriptors using the VLAD layer. Finally, the weight of each sub-descriptor is learned through the descriptor re-weighting network to obtain the final refined descriptor. Experimental results show that on the Brisbane-Event-VPR and DDD20 datasets, the Recall@1 of our FE-Fusion-VPR is 29.26% and 33.59% higher than Event-VPR and Ensemble-EventVPR, and is 7.00% and 14.15% higher than MultiRes-NetVLAD and NetVLAD. To our knowledge, this is the first end-to-end network that goes beyond the existing event-based and frame-based SOTA methods to fuse frames and events directly for VPR.

    Spike-EVPR: Deep Spiking Residual Network with Cross-Representation Aggregation for Event-Based Visual Place Recognition

    Event cameras have been successfully applied to visual place recognition (VPR) tasks by using deep artificial neural networks (ANNs) in recent years. However, previously proposed deep ANN architectures are often unable to harness the abundant temporal information present in event streams. In contrast, deep spiking networks exhibit more intricate spatiotemporal dynamics and are inherently well-suited to process sparse asynchronous event streams. Unfortunately, directly inputting temporal-dense event volumes into the spiking network introduces excessive time steps, resulting in prohibitively high training costs for large-scale VPR tasks. To address the aforementioned issues, we propose a novel deep spiking network architecture called Spike-EVPR for event-based VPR tasks. First, we introduce two novel event representations tailored for SNNs to fully exploit the spatiotemporal information from the event streams and to reduce video memory occupation during training as much as possible. Then, to exploit the full potential of these two representations, we construct a Bifurcated Spike Residual Encoder (BSR-Encoder) with powerful representational capabilities to better extract the high-level features from the two event representations. Next, we introduce a Shared & Specific Descriptor Extractor (SSD-Extractor). This module is designed to extract features shared between the two representations and features specific to each. Finally, we propose a Cross-Descriptor Aggregation Module (CDA-Module) that fuses the above three features to generate a refined, robust global descriptor of the scene. Our experimental results indicate the superior performance of our Spike-EVPR compared to several existing EVPR pipelines on the Brisbane-Event-VPR and DDD20 datasets, with the average Recall@1 increased by 7.61% on Brisbane and 13.20% on DDD20. Comment: 14 pages, 10 figures

    Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

    Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences. To address this, we explore using strong LLMs as judges to evaluate these models on more open-ended questions. We examine the usage and limitations of LLM-as-a-judge, such as position and verbosity biases and limited reasoning ability, and propose solutions to mitigate some of them. We then verify the agreement between LLM judges and human preferences by introducing two benchmarks: MT-bench, a multi-turn question set; and Chatbot Arena, a crowdsourced battle platform. Our results reveal that strong LLM judges like GPT-4 can match both controlled and crowdsourced human preferences well, achieving over 80% agreement, the same level of agreement between humans. Hence, LLM-as-a-judge is a scalable and explainable way to approximate human preferences, which are otherwise very expensive to obtain. Additionally, we show our benchmark and traditional benchmarks complement each other by evaluating several variants of LLaMA/Vicuna. We will publicly release 80 MT-bench questions, 3K expert votes, and 30K conversations with human preferences from Chatbot Arena.

    LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

    Studying how people interact with large language models (LLMs) in real-world scenarios is increasingly important due to their widespread use in various applications. In this paper, we introduce LMSYS-Chat-1M, a large-scale dataset containing one million real-world conversations with 25 state-of-the-art LLMs. This dataset is collected from 210K unique IP addresses in the wild on our Vicuna demo and Chatbot Arena website. We offer an overview of the dataset's content, including its curation process, basic statistics, and topic distribution, highlighting its diversity, originality, and scale. We demonstrate its versatility through four use cases: developing content moderation models that perform similarly to GPT-4, building a safety benchmark, training instruction-following models that perform similarly to Vicuna, and creating challenging benchmark questions. We believe that this dataset will serve as a valuable resource for understanding and advancing LLM capabilities. The dataset is publicly available at https://huggingface.co/datasets/lmsys/lmsys-chat-1m

    Highly efficient and stable planar heterojunction solar cell based on sputtered and post-selenized Sb2Se3 thin film

    Antimony selenide (Sb2Se3) is regarded as one of the key alternative absorber materials for conventional thin film solar cells due to its excellent optical and electrical properties. Here, we propose a Sb2Se3 thin film solar cell fabricated using a two-step process, magnetron sputtering followed by a post-selenization treatment, which enabled us to optimize the quality of both the Sb2Se3 thin film and the Sb2Se3/CdS heterojunction interface. By tuning the selenization parameters, a Sb2Se3 thin film solar cell with a high efficiency of 6.06% was achieved, the highest reported power conversion efficiency of sputtered Sb2Se3 planar heterojunction solar cells. Moreover, our device presented an outstanding open circuit voltage (VOC) of 494 mV, which is superior to those of previously reported Sb2Se3 solar cells. Analysis of the state and density of defects showed that a proper selenization temperature could effectively passivate deep defects in the films and thus improve the device performance.

    G-quadruplex formation in human telomeric (TTAGGG)4 sequence with complementary strand in close vicinity under molecularly crowded condition

    Chromosomes in vertebrates are protected at both ends by telomere DNA composed of tandem (TTAGGG)n repeats. DNA replication produces a blunt-ended leading strand telomere and a lagging strand telomere carrying a single-stranded G-rich overhang at its end. The G-rich strand can form G-quadruplex structure in the presence of K+ or Na+. At present, it is not clear whether quadruplex can form in the double-stranded telomere region where the two complementary strands are constrained in close vicinity and quadruplex formation, if possible, has to compete with the formation of the conventional Watson–Crick duplex. In this work, we studied quadruplex formation in oligonucleotides and double-stranded DNA containing both the G- and C-rich sequences to better mimic the in vivo situation. Under such competitive condition only duplex was observed in dilute solution containing physiological concentration of K+. However, quadruplex could preferentially form and dominate over duplex structure under molecular crowding condition created by PEG as a result of significant quadruplex stabilization and duplex destabilization. This observation suggests quadruplex may potentially form or be induced at the blunt end of a telomere, which may present a possible alternative form of structures at telomere ends

    Simultaneous Formation of CH₃NH₃PbI₃ and electron transport layers using antisolvent method for efficient perovskite solar cells

    A new antisolvent method was developed to prepare CH₃NH₃PbI₃ and electron transport layers for making efficient hybrid perovskite solar cells. By directly using [6,6]-phenyl-C61-butyric acid methyl ester in chlorobenzene solution as antisolvent, CH₃NH₃PbI₃ and electron transport layers were simultaneously formed in the films. This method not only simplifies the fabrication process of devices, but also produces uniform perovskite films and improves the interfacial structures between CH₃NH₃PbI₃ and electron transport layers. Large perovskite grains were observed in these films, with the average grain size of >1 μm. The so-formed CH₃NH₃PbI₃/electron transport layers demonstrated good optical and charge transport properties. And perovskite solar cells fabricated using these simultaneously-formed layers achieved a higher power conversion efficiency of 16.58% compared to conventional antisolvent method (14.92%). This method reduces nearly 80% usage of chlorobenzene during the fabrication, offering a more facile and environment-friendly approach to fabricate efficient perovskite solar cells than the conventional antisolvent method