631 research outputs found

    SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction

    3D occupancy prediction is an important task for the robustness of vision-centric autonomous driving, which aims to predict whether each point is occupied in the surrounding 3D space. Existing methods usually require 3D occupancy labels to produce meaningful results. However, it is very laborious to annotate the occupancy status of each voxel. In this paper, we propose SelfOcc to explore a self-supervised way to learn 3D occupancy using only video sequences. We first transform the images into the 3D space (e.g., bird's eye view) to obtain a 3D representation of the scene. We directly impose constraints on the 3D representations by treating them as signed distance fields (SDFs). We can then render 2D images of previous and future frames as self-supervision signals to learn the 3D representations. We propose an MVS-embedded strategy to directly optimize the SDF-induced weights with multiple depth proposals. Our SelfOcc outperforms the previous best method SceneRF by 58.7% using a single frame as input on SemanticKITTI and is the first self-supervised work that produces reasonable 3D occupancy for surround cameras on nuScenes. SelfOcc produces high-quality depth and achieves state-of-the-art results on novel depth synthesis, monocular depth estimation, and surround-view depth estimation on SemanticKITTI, KITTI-2015, and nuScenes, respectively. Code: https://github.com/huang-yh/SelfOcc.

    Ultra-high-linearity integrated lithium niobate electro-optic modulators

    Integrated lithium niobate (LN) photonics is a promising platform for future chip-scale microwave photonics systems owing to its unique electro-optic properties, low optical loss and excellent scalability. A key enabler for such systems is a highly linear electro-optic modulator that can faithfully convert analog electrical signals into optical signals. In this work, we demonstrate a monolithic integrated LN modulator with an ultrahigh spurious-free dynamic range (SFDR) of 120.04 dB·Hz^(4/5) at 1 GHz, using a ring-assisted Mach-Zehnder interferometer configuration. The excellent synergy between the intrinsically linear electro-optic response of LN and an optimized linearization strategy allows us to fully suppress the cubic terms of third-order intermodulation distortions (IMD3) without active feedback controls, leading to ~20 dB improvement over previous results in the thin-film LN platform. Our ultra-high-linearity LN modulators could become a core building block for future large-scale functional microwave photonic integrated circuits, by further integration with other high-performance components like low-loss delay lines, tunable filters and phase shifters available on the LN platform.

    Exploring Unified Perspective For Fast Shapley Value Estimation

    Shapley values have emerged as a widely accepted and trustworthy tool, grounded in theoretical axioms, for addressing challenges posed by black-box models like deep neural networks. However, computing Shapley values exactly has a cost that grows exponentially with the number of features. Various approaches, including ApproSemivalue, KernelSHAP, and FastSHAP, have been explored to expedite the computation. We analyze the consistency of existing works and conclude that stochastic estimators can be unified as the linear transformation of importance sampling of feature subsets. Based on this, we investigate the possibility of designing simple amortized estimators and propose a straightforward and efficient one, SimSHAP, by eliminating redundant techniques. Extensive experiments conducted on tabular and image datasets validate the effectiveness of SimSHAP, which significantly accelerates the computation of accurate Shapley values.
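To make the exponential cost mentioned in this abstract concrete, the sketch below compares exact Shapley values (enumerating every feature subset) with a plain permutation-sampling Monte Carlo estimator. This is a generic textbook illustration, not SimSHAP or any other method from the paper. The function names and the three-player `toy_value` game are hypothetical and chosen only for this example.

```python
import itertools
import math
import random


def exact_shapley(value, n):
    """Exact Shapley values by enumerating all subsets of the other players.

    Needs on the order of 2^n evaluations of `value`, which is why exact
    computation becomes infeasible for many features.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition that excludes i
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def sampled_shapley(value, n, num_permutations, seed=0):
    """Monte Carlo estimate: average marginal contributions over random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        coalition = frozenset()
        prev = value(coalition)
        for i in order:
            coalition = coalition | {i}
            cur = value(coalition)
            phi[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    return [p / num_permutations for p in phi]


def toy_value(coalition):
    """Hypothetical game: additive payoffs plus a bonus when players 0 and 1 cooperate."""
    payoffs = [1.0, 2.0, 3.0]
    total = sum(payoffs[i] for i in coalition)
    if 0 in coalition and 1 in coalition:
        total += 1.0
    return total


exact = exact_shapley(toy_value, 3)
approx = sampled_shapley(toy_value, 3, num_permutations=2000)
```

Both estimators distribute the full payoff `toy_value({0, 1, 2}) - toy_value(set())` across the players; the sampled version trades the exponential enumeration for a number of value evaluations that is linear in `num_permutations`, at the price of sampling noise.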

    Control Strategy for Microgrid Inverter under Unbalanced Grid Voltage Conditions

    OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving

    Understanding how the 3D scene evolves is vital for making decisions in autonomous driving. Most existing methods achieve this by predicting the movements of object boxes, which cannot capture more fine-grained scene information. In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes. We propose to learn a world model based on 3D occupancy rather than 3D bounding boxes and segmentation maps for three reasons: 1) expressiveness: 3D occupancy can describe the more fine-grained 3D structure of the scene; 2) efficiency: 3D occupancy is more economical to obtain (e.g., from sparse LiDAR points); 3) versatility: 3D occupancy can adapt to both vision and LiDAR. To facilitate the modeling of the world evolution, we learn a reconstruction-based scene tokenizer on the 3D occupancy to obtain discrete scene tokens to describe the surrounding scenes. We then adopt a GPT-like spatial-temporal generative transformer to generate subsequent scene and ego tokens to decode the future occupancy and ego trajectory. Extensive experiments on the widely used nuScenes benchmark demonstrate the ability of OccWorld to effectively model the evolution of the driving scenes. OccWorld also produces competitive planning results without using instance and map supervision. Code: https://github.com/wzzheng/OccWorld.

    Flexible Control Strategy for Grid-Connected Inverter under Unbalanced Grid Faults without PLL
