
    BEVStereo++: Accurate Depth Estimation in Multi-view 3D Object Detection via Dynamic Temporal Stereo

    Limited by the inherent ambiguity of monocular depth perception, contemporary multi-view 3D object detection methods have hit a performance bottleneck. Intuitively, temporal multi-view stereo (MVS) is a natural tool for resolving this ambiguity. However, traditional MVS has two limitations when applied to 3D object detection scenes: 1) measuring affinity across all views incurs a high computational cost; 2) it struggles with outdoor scenarios, where objects are often moving. To this end, we propose BEVStereo++: by introducing a dynamic temporal stereo strategy, BEVStereo++ reduces the harm that temporal stereo introduces in these two scenarios. Going one step further, we equip BEVStereo++ with a Motion Compensation Module and long-sequence Frame Fusion, which yield further performance gains and error reduction. Without bells and whistles, BEVStereo++ achieves state-of-the-art (SOTA) results on both the Waymo and nuScenes datasets.
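
    The common ingredient of temporal-stereo pipelines like this one is a plane-sweep cost volume that correlates current-frame features with features from an earlier frame warped onto a set of depth hypotheses. A minimal PyTorch sketch of that generic ingredient only, not BEVStereo++'s dynamic strategy; all names, shapes, and the dot-product similarity are assumptions:

    import torch
    import torch.nn.functional as F

    def plane_sweep_cost(feat_cur, feat_prev, K, T_prev_cur, depths):
        """Build a (D, H, W) cost volume: for each hypothesized depth,
        warp previous-frame features into the current view and measure
        per-pixel similarity. feat_* are (C, H, W) feature maps, K is the
        3x3 camera intrinsic matrix, T_prev_cur the 4x4 pose of the
        current frame expressed in the previous frame's coordinates."""
        C, H, W = feat_cur.shape
        ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                                torch.arange(W, dtype=torch.float32),
                                indexing="ij")
        pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).reshape(3, -1)
        K_inv = torch.inverse(K)
        costs = []
        for d in depths:
            # Un-project current pixels to 3D at depth d, move them into
            # the previous frame, and re-project with the same intrinsics.
            pts = K_inv @ pix * d
            pts = T_prev_cur[:3, :3] @ pts + T_prev_cur[:3, 3:4]
            proj = K @ pts
            uv = proj[:2] / proj[2:].clamp(min=1e-6)
            # Normalize pixel coordinates to [-1, 1] for grid_sample.
            grid = torch.stack([uv[0] / (W - 1) * 2 - 1,
                                uv[1] / (H - 1) * 2 - 1], -1).reshape(1, H, W, 2)
            warped = F.grid_sample(feat_prev[None], grid, align_corners=True)[0]
            costs.append((feat_cur * warped).sum(0))  # dot-product similarity
        return torch.stack(costs)  # (D, H, W)

    Given the volume, depths[cost.argmax(0)] (with depths as a tensor) yields a per-pixel depth estimate; the moving-object and cost issues the abstract mentions arise exactly because this warp assumes a static scene and sweeps every hypothesis.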

    Coupling interaction between the power coupler and the third harmonic superconducting cavity

    Fermilab has developed a third harmonic superconducting cavity operating at a frequency of 3.9 GHz to improve beam performance for the FLASH user facility at DESY. It is of interest to investigate the coupling interaction between the SRF cavity and the power coupler, with or without beam loading. The coupling of the power coupler to the cavity must be set so as to minimize power consumption and guarantee the best performance for a given beam current. In this paper, we build and analyze an equivalent circuit model consisting of a series of lumped elements that represents the resonant system. An analytic expression for the power required from the generator as a function of the system parameters is also derived with the aid of a vector diagram.
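
    The paper's own analytic solution is not reproduced in the abstract; for orientation, a commonly quoted steady-state expression for the generator power of a beam-loaded cavity, in standard RF-superconductivity notation (assumed here rather than taken from the paper), is

        P_g = \frac{V_c^2}{4\,(R/Q)\,Q_L}\left[\left(1 + \frac{(R/Q)\,Q_L\,I_b\cos\varphi_b}{V_c}\right)^2 + \left(2\,Q_L\,\frac{\Delta f}{f_0} + \frac{(R/Q)\,Q_L\,I_b\sin\varphi_b}{V_c}\right)^2\right]

    where V_c is the cavity voltage, Q_L the loaded quality factor, R/Q the shunt-impedance ratio, I_b the beam current at synchronous phase \varphi_b, and \Delta f the cavity detuning from the drive frequency f_0. Minimizing P_g over Q_L (and \Delta f) gives the optimal coupler setting the abstract refers to.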

    Cross Modal Transformer: Towards Fast and Robust 3D Object Detection

    In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point-cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: it achieves 74.1% NDS (state of the art with a single model) on the nuScenes test set while maintaining fast inference speed. Moreover, CMT remains strongly robust even if the LiDAR input is missing. Code is released at https://github.com/junjie18/CMT.
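
    A minimal PyTorch sketch of this kind of token-level fusion, with all module names, the feature dimension, and the box parameterization assumed for illustration (the authors' actual implementation is in the linked repository):

    import torch
    import torch.nn as nn

    class CrossModalDetector(nn.Module):
        """Toy cross-modal detector: image and point-cloud tokens share one
        transformer decoder; learned object queries attend to both streams
        and regress 3D boxes directly, with no explicit view transform."""
        def __init__(self, dim=256, num_queries=100, num_classes=10):
            super().__init__()
            self.queries = nn.Embedding(num_queries, dim)
            layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8,
                                               batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, num_layers=6)
            self.pos_img = nn.Linear(3, dim)  # 3D position encoding, image tokens
            self.pos_pts = nn.Linear(3, dim)  # 3D position encoding, LiDAR tokens
            self.box_head = nn.Linear(dim, 10)  # assumed (x,y,z,w,l,h,sin,cos,vx,vy)
            self.cls_head = nn.Linear(dim, num_classes)

        def forward(self, img_tokens, img_xyz, pts_tokens, pts_xyz):
            # Spatial alignment: add an encoding of each token's 3D
            # coordinates, then concatenate both modalities into one
            # memory sequence for the decoder.
            mem = torch.cat([img_tokens + self.pos_img(img_xyz),
                             pts_tokens + self.pos_pts(pts_xyz)], dim=1)
            q = self.queries.weight[None].expand(mem.size(0), -1, -1)
            h = self.decoder(q, mem)
            return self.box_head(h), self.cls_head(h)

    Concatenating position-encoded tokens from both sensors into one sequence is what makes such a design tolerant to a missing modality: if the LiDAR stream is absent, the memory sequence simply gets shorter and the same decoder still runs.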