Multi-View 3D Object Detection Network for Autonomous Driving
This paper aims at high-accuracy 3D object detection in autonomous driving
scenarios. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework
that takes both LIDAR point cloud and RGB images as input and predicts oriented
3D bounding boxes. We encode the sparse 3D point cloud with a compact
multi-view representation. The network is composed of two subnetworks: one for
3D object proposal generation and another for multi-view feature fusion. The
proposal network generates 3D candidate boxes efficiently from the bird's eye
view representation of the 3D point cloud. We design a deep fusion scheme to
combine region-wise features from multiple views and enable interactions
between intermediate layers of different paths. Experiments on the challenging
KITTI benchmark show that our approach outperforms the state-of-the-art by
around 25% and 30% AP on the tasks of 3D localization and 3D detection. In
addition, for 2D detection, our approach obtains 10.3% higher AP than the
state-of-the-art on the hard data among the LIDAR-based methods.
Comment: To appear in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
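The bird's eye view encoding the proposal network operates on can be sketched roughly as follows. This is only an illustrative discretization, not the paper's exact configuration: the grid ranges, resolution, and the three channels (max height, intensity of the highest point, log point density) are assumptions, and MV3D itself uses multiple height slices.

```python
import numpy as np

def bev_encode(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), res=0.1):
    """Discretize a LIDAR point cloud of shape (N, 4) with columns
    (x, y, z, intensity) into a bird's eye view grid with height,
    intensity, and density channels. A rough sketch of this style of
    multi-view encoding; channel layout is a simplifying assumption."""
    nx = int((x_range[1] - x_range[0]) / res)
    ny = int((y_range[1] - y_range[0]) / res)
    height = np.full((nx, ny), -np.inf)
    intensity = np.zeros((nx, ny))
    density = np.zeros((nx, ny))
    for x, y, z, r in points:
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        i = int((x - x_range[0]) / res)
        j = int((y - y_range[0]) / res)
        if z > height[i, j]:          # keep the max height per cell
            height[i, j] = z
            intensity[i, j] = r       # intensity of the highest point
        density[i, j] += 1
    height[np.isinf(height)] = 0.0    # empty cells get height 0
    density = np.log1p(density)       # log-normalized point count
    return np.stack([height, intensity, density], axis=0)

grid = bev_encode(np.array([[10.0, 0.0, 1.2, 0.5],
                            [10.01, 0.02, 0.8, 0.9]]))
print(grid.shape)  # (3, 700, 800)
```

Both sample points fall in the same cell, so the cell keeps the higher point's height (1.2) and intensity (0.5) while the density channel counts both.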
Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving
Multi-view depth estimation has achieved impressive performance over various
benchmarks. However, almost all current multi-view systems rely on given ideal
camera poses, which are unavailable in many real-world scenarios, such as
autonomous driving. In this work, we propose a new robustness benchmark to
evaluate depth estimation systems under various noisy pose settings.
Surprisingly, we find that current multi-view depth estimation methods and
single-view/multi-view fusion methods fail when given noisy pose
settings. To address this challenge, we propose a single-view and multi-view
fused depth estimation system, which adaptively integrates high-confidence
multi-view and single-view results for both robust and accurate depth
estimations. The adaptive fusion module performs fusion by dynamically
selecting high-confidence regions between the two branches based on a warping
confidence map. Thus, the system tends to choose the more reliable branch when
facing textureless scenes, inaccurate calibration, dynamic objects, and other
degradation or challenging conditions. Our method outperforms state-of-the-art
multi-view and fusion methods under robustness testing. Furthermore, we achieve
state-of-the-art performance on challenging benchmarks (KITTI and DDAD) when
given accurate pose estimations. Project website:
https://github.com/Junda24/AFNet/
Comment: Accepted to CVPR 2024
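The adaptive selection between the two branches can be illustrated with a minimal per-pixel sketch. This is a simplification under stated assumptions: the system described above learns its fusion, whereas here the confidence map is taken as given and the selection is a hard threshold; the function name and threshold are invented for illustration.

```python
import numpy as np

def adaptive_fuse(depth_mv, depth_sv, conf_mv, threshold=0.5):
    """Per-pixel fusion of multi-view and single-view depth maps.
    Where the multi-view confidence is high, keep the multi-view
    estimate; elsewhere fall back to the single-view branch. A hard
    selection stand-in for a learned adaptive fusion module."""
    mask = conf_mv >= threshold
    return np.where(mask, depth_mv, depth_sv)

depth_mv = np.array([[10.0, 12.0], [8.0, 9.0]])
depth_sv = np.array([[10.5, 11.0], [8.2, 9.5]])
conf_mv  = np.array([[0.9, 0.2], [0.8, 0.1]])  # low values: dynamic or textureless regions
fused = adaptive_fuse(depth_mv, depth_sv, conf_mv)
print(fused)  # multi-view kept where conf >= 0.5, single-view elsewhere
```

In the toy example the two low-confidence pixels fall back to the single-view estimates, which is exactly the behavior wanted under inaccurate calibration or dynamic objects.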
READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises
For many real-world applications, user-generated inputs usually contain
noise from speech recognition errors caused by linguistic variation or from
typographical errors (typos). Thus, it is crucial to test model
performance on data with realistic input noises to ensure robustness and
fairness. However, little study has been done to construct such benchmarks for
Chinese, where various language-specific input noises happen in the real world.
In order to fill this important gap, we construct READIN: a Chinese multi-task
benchmark with REalistic And Diverse Input Noises. READIN contains four diverse
tasks and asks annotators to re-enter the original test data using two
commonly used Chinese input methods: Pinyin input and speech input. We designed
our annotation pipeline to maximize diversity, for example by instructing the
annotators to use diverse input method editors (IMEs) for keyboard noises and
recruiting speakers from diverse dialectical groups for speech noises. We
experiment with a series of strong pretrained language models as well as robust
training methods, and find that these models often suffer significant
performance drops on READIN even with robustness techniques such as data
augmentation. As the first large-scale attempt in creating a benchmark with
noises geared towards user-generated inputs, we believe that READIN serves as
an important complement to existing Chinese NLP benchmarks. The source code and
dataset can be obtained from https://github.com/thunlp/READIN.
Comment: Preprint
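The data-augmentation baselines mentioned above typically inject synthetic noise at training time. The sketch below is only a crude stand-in for that idea: READIN's test noise comes from human re-entry via Pinyin and speech input, not from a script, and the function name, swap rate, and adjacent-swap scheme here are invented for illustration.

```python
import random

def inject_typo_noise(text, rate=0.1, seed=0):
    """Toy character-level noise injector: randomly swaps adjacent
    characters to simulate typos, with a seeded RNG so augmented
    training data is reproducible. A hypothetical augmentation
    helper, not part of the READIN benchmark itself."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2               # skip past the swapped pair
        else:
            i += 1
    return "".join(chars)

print(inject_typo_noise("robustness benchmark", rate=0.3))
```

Because the noise only permutes characters, the multiset of characters is preserved, which makes this kind of augmentation easy to sanity-check.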
Characterization of Soybean Protein Adhesives Modified by Xanthan Gum
The aim of this study was to provide a basis for the preparation of medical adhesives from soybean protein sources. Soybean protein (SP) adhesives mixed with different concentrations of xanthan gum (XG) were prepared, and their adhesive features were evaluated by physicochemical parameters and an in vitro bone adhesion assay. The results showed that the maximal adhesion strength was achieved in 5% SP adhesive with 0.5% XG addition, which was 2.6-fold higher than that of SP alone. The addition of XG significantly increased hydrogen bonding and viscosity, and increased the β-sheet content while decreasing the α-helix content in the secondary structure of the protein. X-ray diffraction data showed significant interactions between SP molecules and XG. Scanning electron microscopy observations showed that the surface of the XG-modified SP adhesive was more viscous and compact, which was favorable for adhesion between the adhesive and bone. In summary, XG modification increased the hydrogen bonding and zero-shear viscosity of SP adhesives, leading to a significant increase in the bond strength of SP adhesives on porcine bone.
Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes
Multi-frame depth estimation generally achieves high accuracy by relying on
multi-view geometric consistency. When applied in dynamic scenes, e.g.,
autonomous driving, this consistency is usually violated in the dynamic areas,
leading to corrupted estimations. Many multi-frame methods handle dynamic areas
by identifying them with explicit masks and compensating the multi-view cues
with monocular cues represented as local monocular depth or features. The
improvements are limited due to the uncontrolled quality of the masks and the
underutilized benefits of the fusion of the two types of cues. In this paper,
we propose a novel method to learn to fuse the multi-view and monocular cues
encoded as volumes without needing the heuristically crafted masks. As unveiled
in our analyses, the multi-view cues capture more accurate geometric
information in static areas, and the monocular cues capture more useful
contexts in dynamic areas. To let the geometric perception learned from
multi-view cues in static areas propagate to the monocular representation in
dynamic areas and let monocular cues enhance the representation of multi-view
cost volume, we propose a cross-cue fusion (CCF) module, which includes the
cross-cue attention (CCA) to encode the spatially non-local relative
intra-relations from each source to enhance the representation of the other.
Experiments on real-world datasets prove the significant effectiveness and
generalization ability of the proposed method.
Comment: Accepted by CVPR 2023. Code and models are available at:
https://github.com/ruili3/dynamic-multiframe-depth
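The cross-cue idea of letting one cue's features attend to the other's can be sketched in a bare-bones way as follows. This is an assumption-laden simplification: the paper's CCA uses learned projections and operates on monocular and multi-view volumes, while here plain dot-product attention over two flattened feature maps stands in for it, and all shapes are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_cue_attention(feat_a, feat_b):
    """Enhance feature map feat_a (HW, C) with non-local context from
    feat_b (HW, C): attention weights are computed between the two cues
    and values are drawn from the other cue, then added residually.
    A minimal sketch of cross-cue fusion, not the paper's CCA module."""
    scale = np.sqrt(feat_a.shape[1])
    attn = softmax(feat_a @ feat_b.T / scale, axis=-1)  # (HW, HW)
    return feat_a + attn @ feat_b                       # residual fusion

rng = np.random.default_rng(0)
mono = rng.standard_normal((16, 8))   # e.g. a 4x4 grid with 8 channels
mv   = rng.standard_normal((16, 8))
fused = cross_cue_attention(mono, mv)
print(fused.shape)  # (16, 8)
```

Running the module in both directions (monocular attending to multi-view and vice versa) mirrors the symmetric "each source enhances the other" design described above.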
Corticosteroid Activation of Atlantic Sea Lamprey Corticoid Receptor: Allosteric Regulation by the N-terminal Domain
Lampreys are jawless fish that evolved about 550 million years ago at the
base of the vertebrate line. Modern lampreys contain a corticoid receptor (CR),
the common ancestor of the glucocorticoid receptor (GR) and mineralocorticoid
receptor (MR), which first appear in cartilaginous fish, such as sharks. Until
recently, 344 amino acids at the amino terminus of adult lamprey CR were not
present in the lamprey CR sequence in GenBank. A search of the recently
sequenced lamprey germline genome identified two CR sequences, CR1 and CR2,
containing the 344 previously unidentified amino acids at the amino terminus.
CR1 also contains a novel four amino acid insertion in the DNA-binding domain
(DBD). We studied corticosteroid activation of CR1 and CR2 and found their
strongest response was to 11-deoxycorticosterone and 11-deoxycortisol, the two
circulating corticosteroids in lamprey. Based on steroid specificity, both CRs
are close to elephant shark MR and distant from elephant shark GR. HEK293 cells
transfected with full-length CR1 or CR2 and the MMTV promoter have about 3-fold
higher steroid-mediated activation compared to HEK293 cells transfected with
these CRs and the TAT3 promoter. Deletion of the amino-terminal domain (NTD) of
lamprey CR1 and CR2 to form truncated CRs decreased transcriptional activation
by about 70% in HEK293 cells transfected with MMTV, but increased transcription
by about 6-fold in cells transfected with TAT3, indicating that the promoter
has an important effect on NTD regulation of CR transcription by
corticosteroids.
Comment: 27 pages, 6 figures