
    Photonic integrated circuit design in a foundry+fabless ecosystem

    A foundry-based photonic ecosystem is expected to become necessary with the increasing demand for and adoption of photonics in commercial products. To make foundry-enabled photonics a real success, the photonic circuit design flow should adopt known concepts from analog and mixed-signal electronics. Based on the similarities and differences between the existing photonic and the standardized electronics design flow, we project the needs and evolution of the photonic design flow, such as schematic-driven design, accurate behavioral models, and yield prediction in the presence of fabrication variability.
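
    As a loose illustration of the kind of yield prediction under fabrication variability mentioned above, the Monte Carlo sketch below samples a waveguide-width distribution and evaluates a hypothetical behavioral model against an assumed insertion-loss specification; the model, the 3 nm width spread, and the 3 dB spec are all placeholder assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def insertion_loss_db(width_nm):
    """Hypothetical behavioral model: insertion loss of a filter whose
    response degrades quadratically with waveguide-width error."""
    width_error_nm = width_nm - 450.0          # assumed nominal width: 450 nm
    return 0.5 + 2.0 * width_error_nm ** 2     # assumed 0.5 dB loss floor

# Assumed fabrication variability: 3 nm standard deviation on the width.
widths = rng.normal(loc=450.0, scale=3.0, size=100_000)
losses = insertion_loss_db(widths)

# Yield = fraction of Monte Carlo samples meeting an assumed 3 dB spec.
print(f"Predicted yield: {np.mean(losses < 3.0):.1%}")
```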

    Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips

    We tackle the task of reconstructing hand-object interactions from short video clips. Given an input video, our approach casts 3D inference as a per-video optimization and recovers a neural 3D representation of the object shape, as well as the time-varying motion and hand articulation. While the input video naturally provides some multi-view cues to guide 3D inference, these are insufficient on their own due to occlusions and limited viewpoint variations. To obtain accurate 3D, we augment the multi-view signals with generic data-driven priors to guide reconstruction. Specifically, we learn a diffusion network to model the conditional distribution of (geometric) renderings of objects conditioned on hand configuration and category label, and leverage it as a prior to guide the novel-view renderings of the reconstructed scene. We empirically evaluate our approach on egocentric videos across 6 object categories, and observe significant improvements over prior single-view and multi-view methods. Finally, we demonstrate our system's ability to reconstruct arbitrary clips from YouTube, showing both 1st and 3rd person interactions. Comment: Accepted to ICCV23 (Oral). Project Page: https://judyye.github.io/diffhoi-www
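
    The per-video optimization guided by a learned prior can be sketched, very loosely, as the toy loop below: a differentiable scene is fitted to the observed views while a frozen denoiser nudges novel-view renderings toward its learned distribution. The modules here are tiny stand-ins, and the conditioning on hand configuration and category label is omitted, so this is only a schematic of the idea, not the authors' implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class ToyScene(nn.Module):
    """Toy stand-in for a neural 3D scene: maps a camera index to a tiny
    16x16 'rendering'. Purely illustrative."""
    def __init__(self, n_cameras=8):
        super().__init__()
        self.codes = nn.Parameter(torch.randn(n_cameras, 32) * 0.01)
        self.decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(),
                                     nn.Linear(128, 16 * 16))

    def render(self, cam_ids):
        return self.decoder(self.codes[cam_ids]).view(-1, 16, 16)

class ToyDenoiser(nn.Module):
    """Toy stand-in for the frozen diffusion prior over renderings."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16 * 16, 128), nn.ReLU(),
                                 nn.Linear(128, 16 * 16))

    def forward(self, noisy):
        return self.net(noisy.view(noisy.shape[0], -1)).view_as(noisy)

scene, prior = ToyScene(), ToyDenoiser()
for p in prior.parameters():
    p.requires_grad_(False)

observed = torch.rand(4, 16, 16)          # fake "video frames"
observed_ids = torch.arange(4)
optimizer = torch.optim.Adam(scene.parameters(), lr=1e-2)

for step in range(200):
    # Reconstruction term: match renderings at the observed viewpoints.
    recon = ((scene.render(observed_ids) - observed) ** 2).mean()

    # Prior term (score-distillation flavour): perturb a novel-view
    # rendering, ask the frozen denoiser for its estimate, and pull the
    # rendering toward that estimate without updating the prior.
    novel = scene.render(torch.randint(4, 8, (2,)))
    with torch.no_grad():
        target = prior(novel + 0.1 * torch.randn_like(novel))
    prior_loss = ((novel - target) ** 2).mean()

    loss = recon + 0.1 * prior_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```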

    Fast and accurate time-domain simulation of passive photonic systems

    Numerical modeling of a linear photonic system for accurate and efficient time-domain simulations

    In this paper, a novel modeling and simulation method for general linear, time-invariant, passive photonic devices and circuits is proposed. This technique, starting from the scattering parameters of the photonic system under study, builds a baseband equivalent state-space model that splits the optical carrier frequency and operates at baseband, thereby significantly reducing the modeling and simulation complexity without losing accuracy. Indeed, it is possible to analytically reconstruct the port signals of the photonic system under study starting from the time-domain simulation of the corresponding baseband equivalent model. However, such equivalent models are complex-valued systems and, in this scenario, the conventional passivity constraints are not applicable anymore. Hence, the passivity constraints for scattering parameters and state-space models of baseband equivalent systems are presented, which are essential for time-domain simulations. Three suitable examples demonstrate the feasibility, accuracy, and efficiency of the proposed method. (C) 2018 Chinese Laser Press.
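
    To illustrate the baseband-equivalent idea, the sketch below time-steps a placeholder complex-valued single-pole state-space model and analytically reconstructs the optical-frequency output afterwards; the model coefficients, carrier frequency, input pulse, and forward-Euler integration are arbitrary illustrative choices, not the fitting procedure or devices studied in the paper.

```python
import numpy as np

# Optical carrier (~193.4 THz for 1550 nm light) and time grid.
f_c = 193.4e12
dt = 1e-15                       # 1 fs step
t = np.arange(0, 20e-12, dt)     # 20 ps simulation window

# Placeholder complex-valued baseband state-space model (single pole).
# In the paper this would be fitted from the device's scattering parameters.
A = np.array([[-2e11 + 1j * 5e11]])
B = np.array([[2e11 + 0j]])
C = np.array([[1.0 + 0j]])
D = np.array([[0.0 + 0j]])

# Baseband input: complex envelope of a Gaussian pulse on the carrier.
u = np.exp(-((t - 5e-12) / 1e-12) ** 2).astype(complex)

# Time-domain simulation of the baseband model (forward Euler for brevity).
x = np.zeros((len(t), A.shape[0]), dtype=complex)
y = np.zeros(len(t), dtype=complex)
for k in range(len(t)):
    u_k = u[k:k + 1]                        # input sample as a length-1 vector
    y[k] = (C @ x[k] + D @ u_k)[0]
    if k + 1 < len(t):
        x[k + 1] = x[k] + dt * (A @ x[k] + B @ u_k)

# Analytic reconstruction of the passband (optical-frequency) output signal.
y_passband = np.real(y * np.exp(2j * np.pi * f_c * t))
```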

    LPFormer: LiDAR Pose Estimation Transformer with Multi-Task Network

    In this technical report, we present the 1st place solution for the 2023 Waymo Open Dataset Pose Estimation challenge. Due to the difficulty of acquiring large-scale 3D human keypoint annotations, previous methods have commonly relied on 2D image features and 2D sequential annotations for 3D human pose estimation. In contrast, our proposed method, named LPFormer, uses only LiDAR as its input along with its corresponding 3D annotations. LPFormer consists of two stages: the first stage detects the human bounding box and extracts multi-level feature representations, while the second stage employs a transformer-based network to regress the human keypoints from these features. Experimental results on the Waymo Open Dataset demonstrate top performance, with improvements even over previous multi-modal solutions. Comment: Technical report of the top solution for the Waymo Open Dataset Challenges 2023 - Pose Estimation. CVPR 2023 Workshop on Autonomous Driving.
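
    A rough sketch of the second-stage idea, attending over the LiDAR point features inside a detected box with learned keypoint queries and regressing 3D keypoints, might look like the following; the feature dimension, number of keypoints, and single-encoder layout are guesses for illustration, not the released LPFormer architecture.

```python
import torch
import torch.nn as nn

class KeypointRegressionHead(nn.Module):
    """Illustrative second stage: attend over the point features inside a
    detected human box and regress 3D keypoints."""
    def __init__(self, feat_dim=128, num_keypoints=14):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=3)
        # One learned query token per keypoint, prepended to the point tokens.
        self.keypoint_queries = nn.Parameter(torch.randn(num_keypoints, feat_dim))
        self.head = nn.Linear(feat_dim, 3)   # (x, y, z) per keypoint

    def forward(self, point_feats):
        # point_feats: (batch, num_points, feat_dim) from the first stage.
        b = point_feats.shape[0]
        queries = self.keypoint_queries.unsqueeze(0).expand(b, -1, -1)
        tokens = self.encoder(torch.cat([queries, point_feats], dim=1))
        return self.head(tokens[:, : self.keypoint_queries.shape[0]])

# Toy usage: 2 boxes, 256 points each, 128-dim features from stage one.
head = KeypointRegressionHead()
keypoints = head(torch.randn(2, 256, 128))   # -> (2, 14, 3)
```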

    LidarMultiNet: Towards a Unified Multi-task Network for LiDAR Perception

    LiDAR-based 3D object detection, semantic segmentation, and panoptic segmentation are usually implemented in specialized networks with distinctive architectures that are difficult to adapt to each other. This paper presents LidarMultiNet, a LiDAR-based multi-task network that unifies these three major LiDAR perception tasks. Among its many benefits, a multi-task network can reduce the overall cost by sharing weights and computation among multiple tasks. However, it typically underperforms compared to independently combined single-task models. The proposed LidarMultiNet aims to bridge the performance gap between the multi-task network and multiple single-task networks. At the core of LidarMultiNet is a strong 3D voxel-based encoder-decoder architecture with a Global Context Pooling (GCP) module extracting global contextual features from a LiDAR frame. Task-specific heads are added on top of the network to perform the three LiDAR perception tasks. More tasks can be implemented simply by adding new task-specific heads while introducing little additional cost. A second stage is also proposed to refine the first-stage segmentation and generate accurate panoptic segmentation results. LidarMultiNet is extensively tested on both the Waymo Open Dataset and the nuScenes dataset, demonstrating for the first time that major LiDAR perception tasks can be unified in a single strong network that is trained end-to-end and achieves state-of-the-art performance. Notably, LidarMultiNet reaches the official 1st place in the Waymo Open Dataset 3D semantic segmentation challenge 2022 with the highest mIoU and the best accuracy for most of the 22 classes on the test set, using only LiDAR points as input. It also sets the new state-of-the-art for a single model on the Waymo 3D object detection benchmark and three nuScenes benchmarks. Comment: Full-length paper extending our previous technical report of the 1st place solution of the 2022 Waymo Open Dataset 3D Semantic Segmentation challenge, including evaluations on 5 major benchmarks. arXiv admin note: text overlap with arXiv:2206.1142
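
    The spirit of a BEV-level global context module can be sketched loosely as follows: collapse the voxel features along the vertical axis, mix them with 2D convolutions, and broadcast the result back onto the 3D volume. The dense-tensor layout, mean pooling, and channel count here are simplifications for illustration, not the published GCP module.

```python
import torch
import torch.nn as nn

class GlobalContextPooling(nn.Module):
    """Loose sketch of a BEV-level global context module: pool the 3D voxel
    features over height, mix them in 2D, and add the result back to every
    voxel column. Purely illustrative."""
    def __init__(self, channels=64):
        super().__init__()
        self.bev_net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, voxel_feats):
        # voxel_feats: (batch, channels, depth, height, width) dense voxels.
        bev = voxel_feats.mean(dim=2)              # pool over the vertical axis
        context = self.bev_net(bev)                # 2D contextual mixing
        return voxel_feats + context.unsqueeze(2)  # broadcast back to 3D

gcp = GlobalContextPooling()
out = gcp(torch.randn(1, 64, 8, 128, 128))         # -> same shape as input
```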

    LiDARFormer: A Unified Transformer-based Multi-task Network for LiDAR Perception

    There is a recent trend in the LiDAR perception field towards unifying multiple tasks in a single strong network with improved performance, as opposed to using separate networks for each task. In this paper, we introduce a new LiDAR multi-task learning paradigm based on the transformer. The proposed LiDARFormer utilizes cross-space global contextual feature information and exploits cross-task synergy to boost the performance of LiDAR perception tasks across multiple large-scale datasets and benchmarks. Our novel transformer-based framework includes a cross-space transformer module that learns attentive features between the 2D dense Bird's Eye View (BEV) and 3D sparse voxel feature maps. Additionally, we propose a transformer decoder for the segmentation task to dynamically adjust the learned features by leveraging the categorical feature representations. Furthermore, we combine the segmentation and detection features in a shared transformer decoder with cross-task attention layers to enhance and integrate the object-level and class-level features. LiDARFormer is evaluated on the large-scale nuScenes and Waymo Open datasets for both 3D detection and semantic segmentation tasks, and it outperforms all previously published methods on both tasks. Notably, LiDARFormer achieves state-of-the-art performance of 76.4% L2 mAPH and 74.3% NDS on the challenging Waymo and nuScenes detection benchmarks for a single-model, LiDAR-only method. Comment: ICRA 202
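
    A minimal sketch of cross-space attention, in which flattened dense BEV tokens query the features of the occupied sparse voxels, is shown below; the shapes, single attention block, and residual layout are assumptions for illustration rather than the published LiDARFormer module.

```python
import torch
import torch.nn as nn

class CrossSpaceAttention(nn.Module):
    """Illustrative cross-space block: dense BEV tokens attend to a set of
    sparse voxel tokens via multi-head cross-attention."""
    def __init__(self, dim=128, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, bev_tokens, voxel_tokens):
        # bev_tokens:   (batch, H*W, dim)  flattened dense BEV features
        # voxel_tokens: (batch, N, dim)    features of the N occupied voxels
        attended, _ = self.attn(query=bev_tokens, key=voxel_tokens,
                                value=voxel_tokens)
        return self.norm(bev_tokens + attended)

# Toy usage: a 32x32 BEV grid attending to 2000 occupied voxels.
block = CrossSpaceAttention()
bev = block(torch.randn(1, 32 * 32, 128), torch.randn(1, 2000, 128))
```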