    DevNet: Self-supervised Monocular Depth Learning via Density Volume Construction

    Self-supervised depth learning from monocular images normally relies on the 2D pixel-wise photometric relation between temporally adjacent image frames. However, such approaches neither fully exploit 3D point-wise geometric correspondences nor effectively tackle the ambiguities in photometric warping caused by occlusions or illumination inconsistency. To address these problems, this work proposes the Density Volume Construction Network (DevNet), a novel self-supervised monocular depth learning framework that considers 3D spatial information and exploits stronger geometric constraints among adjacent camera frustums. Instead of directly regressing the pixel value from a single image, our DevNet divides the camera frustum into multiple parallel planes and predicts the pointwise occlusion probability density on each plane. The final depth map is generated by integrating the density along corresponding rays. During the training process, novel regularization strategies and loss functions are introduced to mitigate photometric ambiguities and overfitting. Without notably enlarging model size or running time, DevNet outperforms several representative baselines on both the KITTI-2015 outdoor dataset and the NYU-V2 indoor dataset. In particular, DevNet reduces the root-mean-square deviation by around 4% on both KITTI-2015 and NYU-V2 in the task of depth estimation. Code is available at https://github.com/gitkaichenzhou/DevNet. Accepted by the European Conference on Computer Vision 2022 (ECCV 2022).
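
    A minimal NumPy sketch of the depth-from-density integration described above, under volume-rendering-style assumptions. This is our illustration, not the authors' code; the function name and the exact weighting scheme are hypothetical.

```python
import numpy as np

def integrate_depth(density, plane_depths):
    """density: (H, W, N) non-negative per-plane densities for each pixel.
    plane_depths: (N,) depths of the parallel frustum planes, ascending.
    Returns an (H, W) expected-depth map from volume-rendering-style weights."""
    deltas = np.diff(plane_depths, append=plane_depths[-1] + 1e-3)   # (N,)
    alpha = 1.0 - np.exp(-density * deltas)                # per-plane opacity
    trans = np.cumprod(1.0 - alpha + 1e-10, axis=-1)       # transmittance
    trans = np.concatenate([np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    weights = alpha * trans                                # (H, W, N)
    return (weights * plane_depths).sum(-1) / (weights.sum(-1) + 1e-10)

# Toy usage: a single density spike near 5 m yields a depth of ~5 m.
depths = np.linspace(1.0, 10.0, 64)
dens = np.zeros((2, 2, 64))
dens[..., np.argmin(np.abs(depths - 5.0))] = 50.0
print(integrate_depth(dens, depths))   # ~5.0 everywhere
```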

    DynPoint: Dynamic Neural Point For View Synthesis

    The introduction of neural radiance fields has greatly improved the effectiveness of view synthesis for monocular videos. However, existing algorithms face difficulties when dealing with uncontrolled or lengthy scenarios, and require extensive training time specific to each new scenario. To tackle these limitations, we propose DynPoint, an algorithm designed to facilitate the rapid synthesis of novel views for unconstrained monocular videos. Rather than encoding the entirety of the scenario information into a latent representation, DynPoint concentrates on predicting the explicit 3D correspondence between neighboring frames to realize information aggregation. Specifically, this correspondence prediction is achieved through the estimation of consistent depth and scene flow information across frames. Subsequently, the acquired correspondence is used to aggregate information from multiple reference frames to a target frame by constructing hierarchical neural point clouds. The resulting framework enables swift and accurate view synthesis for desired views of target frames. Our experimental results demonstrate that the proposed method accelerates training considerably, typically by an order of magnitude, while yielding outcomes comparable to prior approaches. Furthermore, our method exhibits strong robustness in handling long-duration videos without learning a canonical representation of video content.
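
    The correspondence step can be pictured as follows: unproject reference-frame pixels using their estimated depth, displace them by the estimated scene flow, and project them into the target view. A hedged NumPy sketch, where the intrinsics K, the relative pose T_ref_to_tgt, and the function itself are illustrative assumptions rather than DynPoint's actual interface:

```python
import numpy as np

def warp_reference_to_target(depth_ref, scene_flow, K, T_ref_to_tgt):
    """depth_ref: (H, W) estimated depth of the reference frame.
    scene_flow: (H, W, 3) per-pixel 3D flow in the reference camera frame.
    K: (3, 3) camera intrinsics; T_ref_to_tgt: (4, 4) relative camera pose.
    Returns (H, W, 2) pixel coordinates of correspondences in the target view."""
    H, W = depth_ref.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    pts = (pix @ np.linalg.inv(K).T) * depth_ref[..., None]   # unproject to 3D
    pts = pts + scene_flow                                    # displace by flow
    pts_h = np.concatenate([pts, np.ones_like(pts[..., :1])], axis=-1)
    pts_tgt = (pts_h @ T_ref_to_tgt.T)[..., :3]               # into target frame
    proj = pts_tgt @ K.T                                      # project to pixels
    return proj[..., :2] / (proj[..., 2:3] + 1e-10)
```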

    Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation

    A truly generalizable approach to rigid segmentation and motion estimation is fundamental to the 3D understanding of articulated objects and moving scenes. In view of the tightly coupled relationship between segmentation and motion estimates, we present an SE(3) equivariant architecture and a training strategy to tackle this task in an unsupervised manner. Our architecture comprises two lightweight and inter-connected heads that predict segmentation masks using point-level invariant features and motion estimates from SE(3) equivariant features, without requiring category information. Our unified training strategy can be performed online while jointly optimizing the two predictions by exploiting the interrelations among scene flow, segmentation mask, and rigid transformations. Experiments on four datasets demonstrate the superiority of our method in both model performance and computational efficiency, with only 0.25M parameters and 0.92G FLOPs. To the best of our knowledge, this is the first work designed for category-agnostic part-level SE(3) equivariance in dynamic point clouds.
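
    The interrelation the joint training exploits can be illustrated by one constraint: a soft segmentation plus per-part rigid transforms determines a scene flow. A minimal NumPy sketch of that composition (our illustration under a soft-mask assumption, not the paper's code):

```python
import numpy as np

def flow_from_rigid_parts(points, seg_probs, rotations, translations):
    """points: (N, 3); seg_probs: (N, K) soft masks summing to 1 per point;
    rotations: (K, 3, 3); translations: (K, 3).
    Returns the (N, 3) scene flow implied by the K rigid part motions."""
    # Apply every part's rigid motion to all points: (K, N, 3).
    moved = np.einsum('kij,nj->kni', rotations, points) + translations[:, None, :]
    per_part_flow = moved - points[None]
    # Blend per-part flows with the soft segmentation weights.
    return np.einsum('nk,kni->ni', seg_probs, per_part_flow)

# Toy check: one part with a pure translation gives that translation as flow.
pts = np.random.default_rng(0).normal(size=(5, 3))
print(flow_from_rigid_parts(pts, np.ones((5, 1)), np.eye(3)[None],
                            np.array([[0.0, 0.0, 1.0]])))   # all rows [0, 0, 1]
```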

    Cytotoxic necrotizing factor 1 promotes bladder cancer angiogenesis through activating RhoC

    Uropathogenic Escherichia coli (UPEC), a leading cause of urinary tract infections, is associated with prostate and bladder cancers. Cytotoxic necrotizing factor 1 (CNF1) is a key UPEC toxin; however, its role in bladder cancer is unknown. In the present study, we found that CNF1 induced bladder cancer cells to secrete vascular endothelial growth factor (VEGF) through activating Ras homolog family member C (RhoC), leading to subsequent angiogenesis in the bladder cancer microenvironment. We further found that CNF1-mediated RhoC activation modulated the stabilization of hypoxia-inducible factor 1α (HIF1α) to upregulate VEGF. We demonstrated in vitro that active RhoC increased heat shock factor 1 (HSF1) phosphorylation, which induced heat shock protein 90α (HSP90α) expression, leading to stabilization of HIF1α. Active RhoC elevated HSP90α, HIF1α, and VEGF expression and angiogenesis in human bladder cancer xenografts. In addition, HSP90α, HIF1α, and VEGF expression were positively correlated with human bladder cancer development. These results provide a potential mechanism through which UPEC contributes to bladder cancer progression, and may provide potential therapeutic targets for bladder cancer.

    Diagnostic Value of the Fimbriae Distribution Pattern in Localization of Urinary Tract Infection

    Urinary tract infections (UTIs) are among the most common infectious diseases. UTIs are mainly caused by uropathogenic Escherichia coli (UPEC) and are classified as upper or lower according to the infection site. Fimbriae are necessary for UPEC to adhere to the host uroepithelium, and they are abundant and diverse across UPEC strains. Although great progress has been made in determining the roles of different types of fimbriae in UPEC colonization, the contributions of multiple fimbriae to site-specific attachment also need to be considered. Therefore, the distribution patterns of 22 fimbrial genes in 90 UPEC strains from patients diagnosed with upper or lower UTIs were analyzed using PCR. The distribution patterns correlated with the infection sites; an XGBoost model with a mean accuracy of 83.33% and a mean area under the receiver operating characteristic (ROC) curve of 0.92 demonstrated that fimbrial gene distribution patterns can predict the localization of upper and lower UTIs.
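
    A hedged sketch of the analysis pipeline described above: each strain becomes a binary presence/absence vector over the 22 fimbrial genes, and an XGBoost classifier predicts the infection site. The data below are synthetic stand-ins (the reported 83.33% accuracy and 0.92 AUC come from the study's real cohort), and the xgboost and scikit-learn packages are assumed:

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(90, 22))   # 90 strains x 22 fimbrial genes (0/1)
y = rng.integers(0, 2, size=90)         # 0 = lower UTI, 1 = upper UTI (synthetic)

clf = XGBClassifier(n_estimators=100, max_depth=3)
acc = cross_val_score(clf, X, y, cv=5, scoring='accuracy').mean()
auc = cross_val_score(clf, X, y, cv=5, scoring='roc_auc').mean()
print(f"mean accuracy={acc:.3f}, mean ROC AUC={auc:.3f}")  # ~chance on random labels
```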

    Attitude Heading Reference System Using MEMS Inertial Sensors with Dual-Axis Rotation

    This paper proposes a low-cost, small-size attitude and heading reference system (AHRS) based on MEMS inertial sensors. A dual-axis rotation structure with a proper rotary scheme, designed according to the stated principles, is applied in the system to compensate for the attitude and heading drift caused by the large gyroscope biases. An optimization algorithm compensates for the installation angle error between the body frame and the rotation table's frame. Simulations and experiments are carried out to evaluate the performance of the AHRS. The results show that proper rotation significantly reduces the attitude and heading drifts. Moreover, the new AHRS is not affected by magnetic interference. With rotation applied, the attitude and heading errors oscillate within a bounded range rather than drifting: the attitude error is about 3° and the heading error is less than 3°, at least 5 times better than in the non-rotating condition.
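
    Why rotation suppresses bias drift can be seen in a simplified simulation: a constant body-frame gyro bias, carried through a steadily rotating frame, projects onto the navigation axes as a sinusoid whose integral stays bounded. The sketch below uses a single rotation axis for brevity (dual-axis rotation similarly modulates the remaining axis); it illustrates the principle and is not the paper's compensation algorithm:

```python
import numpy as np

dt, t_end = 0.01, 600.0                  # 10 minutes sampled at 100 Hz
t = np.arange(0.0, t_end, dt)
bias = np.array([0.01, 0.02, 0.0])       # constant gyro bias in deg/s (x, y, z)
omega = 2.0 * np.pi / 60.0               # table rate: one revolution per minute

# Without rotation: the bias integrates directly into attitude error.
drift_static = np.cumsum(np.tile(bias, (len(t), 1)) * dt, axis=0)

# With rotation about z: the x/y bias components are modulated into sinusoids.
c, s = np.cos(omega * t), np.sin(omega * t)
modulated = np.stack([c * bias[0] - s * bias[1],
                      s * bias[0] + c * bias[1],
                      np.full_like(t, bias[2])], axis=1)
drift_rot = np.cumsum(modulated * dt, axis=0)

print("static x-drift after 10 min:  %.2f deg" % drift_static[-1, 0])   # ~6 deg
print("rotated x-drift after 10 min: %.3f deg" % drift_rot[-1, 0])      # ~0 deg
```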

    VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization

    Full text link
    Recent learning-based approaches have achieved impressive results in the field of single-shot camera localization. However, how best to fuse multiple modalities (e.g., image and depth) and to deal with degraded or missing input is less well studied. In particular, we note that previous approaches to deep fusion do not perform significantly better than models employing a single modality. We conjecture that this is because of naive feature-space fusion through summation or concatenation, which does not take into account the different strengths of each modality. To address this, we propose an end-to-end framework, termed VMLoc, that fuses different sensor inputs into a common latent space through a variational Product-of-Experts (PoE) followed by attention-based fusion. Unlike previous multimodal variational works that directly adapt the objective function of the vanilla variational autoencoder, we show how camera localization can be accurately estimated through an unbiased objective function based on importance weighting. Our model is extensively evaluated on RGB-D datasets and the results demonstrate its efficacy. The source code is available at https://github.com/Zalex97/VMLoc.
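
    The Product-of-Experts fusion named above has a closed form under the usual Gaussian-experts assumption: precisions add, and means are precision-weighted. A minimal sketch illustrating the PoE rule (not VMLoc's full attention-based pipeline; poe_fuse is a hypothetical helper):

```python
import numpy as np

def poe_fuse(means, logvars):
    """means, logvars: lists of (D,) arrays, one pair per available modality.
    Returns the fused Gaussian's mean and log-variance. Missing modalities are
    simply left out of the lists, which makes PoE robust to dropped inputs."""
    precisions = [np.exp(-lv) for lv in logvars]          # 1 / sigma^2 per expert
    prec = sum(precisions)                                # precisions add
    mu = sum(p * m for p, m in zip(precisions, means)) / prec
    return mu, -np.log(prec)

# Toy usage: a confident image expert dominates an uncertain depth expert.
mu, logvar = poe_fuse([np.array([1.0]), np.array([3.0])],
                      [np.array([-2.0]), np.array([2.0])])
print(mu, logvar)   # mean pulled strongly toward 1.0
```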
