72 research outputs found

Spectral Representation Learning for Conditional Moment Models

Author: Li Yueru
Luo Yucen
Schölkopf Bernhard
Wang Ziyu
Zhu Jun
Publication date: 28/12/2022

Many problems in causal inference and economics can be formulated in the framework of conditional moment models, which characterize the target function through a collection of conditional moment restrictions. For nonparametric conditional moment models, efficient estimation often relies on preimposed conditions on various measures of ill-posedness of the hypothesis space, which are hard to validate when flexible models are used. In this work, we address this issue by proposing a procedure that automatically learns representations with controlled measures of ill-posedness. Our method approximates a linear representation defined by the spectral decomposition of a conditional expectation operator, which can be used for kernelized estimators and is known to facilitate minimax optimal estimation in certain settings. We show this representation can be efficiently estimated from data, and establish L2 consistency for the resulting estimator. We evaluate the proposed method on proximal causal inference tasks, exhibiting promising performance on high-dimensional, semi-synthetic data.

arXiv.org e-Print Archive
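
As a rough illustration of the "spectral decomposition of a conditional expectation operator" mentioned in the abstract above, the following sketch handles only a toy case in which X and Z each take finitely many values and the joint distribution is known exactly. This setting, the function name `conditional_expectation_svd`, and the example probabilities are assumptions made for illustration; the paper itself estimates the representation from data, and this is not its procedure. For finite variables, the singular values of the operator f ↦ E[f(X) | Z] coincide with those of the matrix P(z, x) / sqrt(P(z) P(x)).

```python
import numpy as np


def conditional_expectation_svd(joint):
    """Singular system of f -> E[f(X) | Z] for finite X and Z.

    joint[z, x] holds P(Z=z, X=x). Returns the singular values s, the
    right singular functions phi_x (one row per function, indexed by x)
    and the left singular functions psi_z (one row per function,
    indexed by z), so that E[phi_i(X) | Z=z] = s[i] * psi_i(z).
    """
    joint = np.asarray(joint, dtype=float)
    p_z = joint.sum(axis=1)  # marginal of Z
    p_x = joint.sum(axis=0)  # marginal of X
    if np.any(p_z <= 0) or np.any(p_x <= 0):
        raise ValueError("every value of Z and X needs positive probability")

    # Matrix of the operator in orthonormal bases of L2(P_X) and L2(P_Z).
    q = joint / np.sqrt(np.outer(p_z, p_x))
    u, s, vt = np.linalg.svd(q, full_matrices=False)

    # Undo the square-root weighting to get functions of x and of z.
    phi_x = vt / np.sqrt(p_x)
    psi_z = u.T / np.sqrt(p_z)
    return s, phi_x, psi_z


# Illustrative joint distribution (made-up numbers, rows index Z).
joint = np.array([[0.3, 0.1],
                  [0.1, 0.5]])
s, phi_x, psi_z = conditional_expectation_svd(joint)

# Check the defining identity E[phi_i(X) | Z] = s_i * psi_i(Z).
cond = joint / joint.sum(axis=1, keepdims=True)  # P(X=x | Z=z)
print(s)
print(np.allclose(cond @ phi_x.T, (s[:, None] * psi_z).T))
```

The leading singular value is always 1, with constant singular functions; how quickly the remaining values decay is one way to quantify how ill-posed it is to recover a function of X from its conditional expectation given Z.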

Rapid Identification and Multiple Susceptibility Testing of Pathogens from Positive-Culture Sterile Body Fluids by a Combined MALDI-TOF Mass Spectrometry and Vitek Susceptibility System

Author: Bei Wang
Bing Zheng
Min Li
Yong Lin
Yueru Tian
Publication venue: Frontiers Media SA
Publication date: 01/01/2016

Frontiers - Publisher Connector

LATR: 3D Lane Detection from Monocular Images with Transformer

Author: Cui Shuguang
Kun Tang
Li Zhen
Luo Yueru
Yan Xu
Zheng Chao
Zheng Chaoda
Publication date: 20/08/2023

3D lane detection from monocular images is a fundamental yet challenging task in autonomous driving. Recent advances primarily rely on structural 3D surrogates (e.g., bird's eye view) built from front-view image features and camera parameters. However, the depth ambiguity in monocular images inevitably causes misalignment between the constructed surrogate feature map and the original image, posing a great challenge for accurate lane detection. To address this issue, we present a novel LATR model, an end-to-end 3D lane detector that uses 3D-aware front-view features without a transformed view representation. Specifically, LATR detects 3D lanes via cross-attention based on query and key-value pairs, constructed using our lane-aware query generator and dynamic 3D ground positional embedding. On the one hand, each query is generated based on 2D lane-aware features and adopts a hybrid embedding to enhance lane information. On the other hand, 3D space information is injected as positional embedding from an iteratively updated 3D ground plane. LATR outperforms previous state-of-the-art methods on the synthetic Apollo as well as the realistic OpenLane and ONCE-3DLanes datasets by large margins (e.g., an 11.4 gain in F1 score on OpenLane). Code will be released at https://github.com/JMoonr/LATR . Comment: Accepted by ICCV2023 (Oral)

arXiv.org e-Print Archive
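
The LATR abstract above describes detection "via cross-attention based on query and key-value pairs". The following is a minimal single-head sketch of generic scaled dot-product cross-attention, not the released LATR code. The function name `cross_attention`, the array shapes, and the choice to add a positional embedding to the keys only are assumptions made for illustration; the random arrays stand in for learned lane queries, front-view image features, and the 3D ground positional embedding.

```python
import numpy as np


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    queries: (num_queries, d), keys: (num_tokens, d),
    values: (num_tokens, d_v). Returns the attended features
    (num_queries, d_v) and the attention weights
    (num_queries, num_tokens).
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    # Subtract the row maximum before exponentiating for stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights


rng = np.random.default_rng(0)
lane_queries = rng.normal(size=(4, 8))    # hypothetical lane queries
image_feats = rng.normal(size=(30, 8))    # flattened front-view features
pos_embed = rng.normal(size=(30, 8))      # stand-in positional embedding

# Assumption: position information enters through the keys only.
attended, weights = cross_attention(
    lane_queries, image_feats + pos_embed, image_feats
)
print(attended.shape, weights.shape)
```

Each row of `weights` sums to 1, so every query receives a convex combination of the image features.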

M^2-3DLaneNet: Multi-Modal 3D Lane Detection

Author: Cui Shuguang
Kun Tang
Li Zhen
Luo Yueru
Mei Shuqi
Yan Xu
Zheng Chao
Zheng Chaoda
Publication date: 20/09/2022

Estimating accurate lane lines in 3D space remains challenging due to their sparse and slim nature. In this work, we propose M^2-3DLaneNet, a Multi-Modal framework for effective 3D lane detection. Aiming to integrate complementary information from multiple sensors, M^2-3DLaneNet first extracts multi-modal features with modal-specific backbones, then fuses them in a unified Bird's-Eye View (BEV) space. Specifically, our method consists of two core components. 1) To achieve accurate 2D-3D mapping, we propose top-down BEV generation. Within it, a Line-Restricted Deform-Attention (LRDA) module is used to enhance image features in a top-down manner, fully capturing the slenderness of lanes. After that, it casts the 2D pyramidal features into 3D space using depth-aware lifting and generates BEV features through pillarization. 2) We further propose bottom-up BEV fusion, which aggregates multi-modal features through multi-scale cascaded attention, integrating complementary information from camera and LiDAR sensors. Extensive experiments demonstrate the effectiveness of M^2-3DLaneNet, which outperforms previous state-of-the-art methods by a large margin, i.e., a 12.1% F1-score improvement on the OpenLane dataset.

arXiv.org e-Print Archive
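
The M^2-3DLaneNet abstract above mentions generating BEV features "through pillarization". As a rough sketch of that general idea, the code below bins points that already carry features into a ground-plane grid and averages the features in each cell. The function name `pillarize`, the mean pooling, and the grid parameters are assumptions for illustration; the abstract does not specify how the paper pools features within a pillar.

```python
import numpy as np


def pillarize(points_xy, feats, x_range, y_range, cell):
    """Average per-point features into a BEV grid.

    points_xy: (n, 2) ground-plane coordinates, feats: (n, c).
    x_range and y_range are (min, max) extents and cell is the side
    length of a square pillar. Returns an array of shape (ny, nx, c);
    cells that receive no points stay zero.
    """
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    ix = np.floor((points_xy[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points_xy[:, 1] - y_range[0]) / cell).astype(int)

    # Drop points that fall outside the grid.
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    ix, iy, kept_feats = ix[keep], iy[keep], feats[keep]

    bev = np.zeros((ny, nx, feats.shape[1]))
    count = np.zeros((ny, nx))
    # np.add.at accumulates correctly when several points share a cell.
    np.add.at(bev, (iy, ix), kept_feats)
    np.add.at(count, (iy, ix), 1)
    return bev / np.maximum(count, 1)[..., None]


points = np.array([[0.5, 0.5], [0.2, 0.9], [1.5, 0.5], [5.0, 5.0]])
features = np.array([[1.0, 3.0], [3.0, 5.0], [7.0, 7.0], [9.0, 9.0]])
bev = pillarize(points, features, x_range=(0.0, 2.0), y_range=(0.0, 2.0), cell=1.0)
print(bev.shape)
```

In this example the first two points share a cell and are averaged, while the last point lies outside the grid and is discarded.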