
    Interpretable Transformations with Encoder-Decoder Networks

    Deep feature spaces have the capacity to encode complex transformations of their input data. However, understanding the relative feature-space relationship between two transformed encoded images is difficult. For instance, what is the relative feature-space relationship between two rotated images? What is decoded when we interpolate in feature space? Ideally, we want to disentangle confounding factors, such as pose, appearance, and illumination, from object identity. Disentangling these is difficult because they interact in very nonlinear ways. We propose a simple method to construct a deep feature space with explicitly disentangled representations of several known transformations. A person or algorithm can then manipulate the disentangled representation, for example, to re-render an image with explicit control over parameterized degrees of freedom. The feature space is constructed using a transforming encoder-decoder network with a custom feature transform layer, acting on the hidden representations. We demonstrate the advantages of explicit disentangling on a variety of datasets and transformations, and as an aid for traditional tasks, such as classification. Comment: Accepted at ICCV 2017
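
    As a rough illustration of the feature transform layer idea, the sketch below (in PyTorch, which the abstract does not mandate) rotates a latent code pairwise by a known angle, so that a rotation of the input corresponds to a simple linear rotation in feature space. The function name and the 2D-pairing scheme are assumptions for illustration, not the authors' exact layer.

        import torch

        def feature_transform(z: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
            """Rotate a latent code z of shape (B, 2K) pairwise by angles theta (B,)."""
            b, d = z.shape
            assert d % 2 == 0, "latent dimension must be even to form 2D pairs"
            pairs = z.view(b, d // 2, 2)                    # (B, K, 2) feature pairs
            c, s = torch.cos(theta), torch.sin(theta)       # (B,)
            rot = torch.stack([torch.stack([c, -s], dim=-1),
                               torch.stack([s,  c], dim=-1)], dim=-2)  # (B, 2, 2)
            out = torch.einsum('bij,bkj->bki', rot, pairs)  # rotate every pair
            return out.reshape(b, d)

        # Usage: once trained this way, decoding feature_transform(encoder(x), theta)
        # should approximate the image x transformed by the angle theta.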

    Unsupervised Training for 3D Morphable Model Regression

    We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on-the-fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch distribution loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the network can correctly reinterpret its own output, and a multi-view identity loss that compares the features of the predicted 3D face and the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results. Comment: CVPR 2018 version with supplemental material (http://openaccess.thecvf.com/content_cvpr_2018/html/Genova_Unsupervised_Training_for_CVPR_2018_paper.html)
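
    The three objectives lend themselves to a compact sketch. The following hedged PyTorch code shows one plausible way to combine them; `regressor`, `render_face`, `face_features`, `morphable_prior`, and the viewing angles are all stand-ins, not the paper's actual interfaces.

        import torch
        import torch.nn.functional as F

        def training_loss(photos, regressor, render_face, face_features, morphable_prior):
            coords = regressor(photos)                     # predicted 3DMM coordinates
            # Batch distribution loss: match the first two moments of the
            # morphable-model prior (assumed to expose .mean and .std).
            dist_loss = ((coords.mean(0) - morphable_prior.mean) ** 2).sum() \
                      + ((coords.std(0) - morphable_prior.std) ** 2).sum()
            # Loopback loss: the network should reinterpret its own rendering.
            rendered = render_face(coords)                 # differentiable renderer
            loop_loss = F.mse_loss(regressor(rendered), coords.detach())
            # Multi-view identity loss: recognition features of the predicted
            # face should match the photograph across several viewpoints.
            photo_feat = face_features(photos)
            id_loss = 0.0
            for yaw in (-30.0, 0.0, 30.0):                 # assumed viewing angles
                view_feat = face_features(render_face(coords, yaw=yaw))
                id_loss = id_loss + (1.0 - F.cosine_similarity(view_feat, photo_feat).mean())
            return dist_loss + loop_loss + id_loss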

    Computational Re-Photography

    Rephotographers aim to recapture an existing photograph from the same viewpoint. A historical photograph paired with a well-aligned modern rephotograph can serve as a remarkable visualization of the passage of time. However, the task of rephotography is tedious and often imprecise, because reproducing the viewpoint of the original photograph is challenging. The rephotographer must disambiguate between the six degrees of freedom of 3D translation and rotation, and the confounding similarity between the effects of camera zoom and dolly. We present a real-time estimation and visualization technique for rephotography that helps users reach a desired viewpoint during capture. The input to our technique is a reference image taken from the desired viewpoint. The user moves through the scene with a camera and follows our visualization to reach the desired viewpoint. We employ computer vision techniques to compute the relative viewpoint difference. We guide 3D movement using two 2D arrows. We demonstrate the success of our technique by rephotographing historical images and conducting user studies.
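
    The relative-viewpoint computation can be approximated with standard tools. Below is a minimal sketch using OpenCV's essential-matrix routines; the paper's real-time pipeline is more involved, and this function is only an assumption about one reasonable two-view implementation.

        import cv2
        import numpy as np

        def relative_viewpoint(pts_ref: np.ndarray, pts_cur: np.ndarray, K: np.ndarray):
            """Estimate rotation R and translation direction t between the current
            frame and the reference image from matched points (N, 2), given the
            camera intrinsics K (3, 3)."""
            E, inliers = cv2.findEssentialMat(pts_cur, pts_ref, K,
                                              method=cv2.RANSAC, threshold=1.0)
            _, R, t, _ = cv2.recoverPose(E, pts_cur, pts_ref, K, mask=inliers)
            return R, t  # t is recoverable only up to scale from two views

    The recovered R and t are what a guidance UI (such as the paper's two 2D arrows) would visualize for the user.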

    NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations

    Recent advances in neural reconstruction enable high-quality 3D object reconstruction from casually captured image collections. Current techniques mostly analyze their progress on relatively simple image collections where Structure-from-Motion (SfM) techniques can provide ground-truth (GT) camera poses. We note that SfM techniques tend to fail on in-the-wild image collections such as image search results with varying backgrounds and illuminations. To enable systematic research progress on 3D reconstruction from casual image captures, we propose NAVI: a new dataset of category-agnostic image collections of objects with high-quality 3D scans along with per-image 2D-3D alignments providing near-perfect GT camera parameters. These 2D-3D alignments allow us to extract accurate derivative annotations such as dense pixel correspondences, depth and segmentation maps. We demonstrate the use of NAVI image collections on different problem settings and show that NAVI enables more thorough evaluations that were not possible with existing datasets. We believe NAVI is beneficial for systematic research progress on 3D reconstruction and correspondence estimation. Project page: https://navidataset.github.io Comment: NeurIPS 2023 camera ready. Project page: https://navidataset.github.io
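
    Given near-perfect camera parameters and an aligned scan, derivative annotations such as depth maps follow by projection. Below is a minimal numpy sketch, assuming a pinhole camera model and vertex-level depth only; a real pipeline would rasterize the mesh triangles rather than splat vertices.

        import numpy as np

        def vertex_depth_map(verts, R, t, K, h, w):
            """Project mesh vertices (N, 3) into an (h, w) depth buffer.
            R (3, 3) and t (3,) map world to camera; K (3, 3) is the intrinsics."""
            cam = verts @ R.T + t                 # world -> camera coordinates
            z = cam[:, 2]
            uv = cam @ K.T                        # perspective projection
            uv = uv[:, :2] / uv[:, 2:3]
            u = np.round(uv[:, 0]).astype(int)
            v = np.round(uv[:, 1]).astype(int)
            ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
            depth = np.full((h, w), np.inf)
            # keep the nearest surface point per pixel (a crude z-buffer)
            np.minimum.at(depth, (v[ok], u[ok]), z[ok])
            return depth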

    Tibial internal rotation in combined anterior cruciate ligament and high-grade anterolateral ligament injury and its influence on ACL length

    BACKGROUND Assessment of combined anterolateral ligament (ALL) and anterior cruciate ligament (ACL) injury remains challenging but of high importance, as the ALL is a contributing stabilizer of tibial internal rotation. The effect of preoperative static tibial internal rotation on ACL length remains unknown. The aim of the study was to analyze the effect of tibial internal rotation on ACL length in single-bundle ACL reconstructions and to quantify tibial internal rotation in combined ACL and ALL injuries. METHODS The effect of tibial internal rotation on ACL length was computed in a three-dimensional (3D) model of 10 healthy knees with 5° increments of tibial internal rotation from 0 to 30°, resulting in 70 simulations. For each step, ACL length was measured. ALL injury severity was graded by a blinded musculoskeletal radiologist in a retrospective analysis of 61 patients who underwent single-bundle ACL reconstruction. Preoperative tibial internal rotation was measured on magnetic resonance imaging (MRI) and its diagnostic performance was analyzed. RESULTS ACL length increased linearly by 0.7 ± 0.1 mm (2.1 ± 0.5% of initial length) per 5° of tibial internal rotation from 0 to 30° in each patient. Seventeen patients (27.9%) had an intact ALL (grade 0), 10 (16.4%) a grade 1, 21 (34.4%) a grade 2, and 13 (21.3%) a grade 3 injury of the ALL. Patients with a combined ACL and grade 3 ALL injury had a median static tibial internal rotation of 8.8° (interquartile range (IQR): 8.3) compared to 5.6° (IQR: 6.6) in patients without a high-grade ALL injury (grade 0-2) (p = 0.03). A cut-off > 13.3° of tibial internal rotation predicted a high-grade ALL injury with a specificity of 92%, a sensitivity of 30%, an accuracy of 79%, and an area under the curve (AUC) of 0.70 (95% CI: 0.54-0.85; p = 0.03). CONCLUSION ACL length increases linearly with tibial internal rotation from 0 to 30°. A combined ACL and high-grade ALL injury was associated with greater preoperative tibial internal rotation. This potentially contributes to unintentional graft laxity in ACL-reconstructed patients, in particular those with concomitant high-grade ALL tears. STUDY DESIGN Cohort study; Level of evidence, 3.
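
    The reported relation is simple enough to apply by hand. The sketch below is purely illustrative back-of-envelope arithmetic using only the numbers above (0.7 mm per 5°, cut-off > 13.3°); the function names are assumptions, and this is not a clinical tool.

        def acl_elongation_mm(rotation_deg: float) -> float:
            """Predicted ACL lengthening over 0-30° of tibial internal rotation,
            using the reported 0.7 mm per 5° linear relation."""
            assert 0.0 <= rotation_deg <= 30.0, "linearity reported only for 0-30°"
            return 0.7 * rotation_deg / 5.0

        def suggests_high_grade_all(rotation_deg: float) -> bool:
            """Apply the study's cut-off (specificity 92%, sensitivity 30%)."""
            return rotation_deg > 13.3

        print(acl_elongation_mm(30.0))  # 4.2 mm over the full simulated range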

    Category-Specific Object Reconstruction from a Single Image

    Object reconstruction from a single image -- in the wild -- is a problem where we can make progress and get meaningful results today. This is the main message of this paper, which introduces an automated pipeline with pixels as inputs and 3D surfaces of various rigid categories as outputs in images of realistic scenes. At the core of our approach are deformable 3D models that can be learned from 2D annotations available in existing object detection datasets, driven by noisy automatic object segmentations, and complemented with a bottom-up module for recovering high-frequency shape details. We perform a comprehensive quantitative analysis and ablation study of our approach using the recently introduced PASCAL 3D+ dataset and show very encouraging automatic reconstructions on PASCAL VOC. Comment: First two authors contributed equally. To appear at CVPR 2015
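
    A category-level deformable model of this kind is often parameterized as a mean shape plus a linear deformation basis with per-instance coefficients. The sketch below uses placeholder (random) data rather than a basis actually learned from PASCAL annotations; it shows only the parameterization, not the learning procedure.

        import numpy as np

        rng = np.random.default_rng(0)
        n_verts, n_basis = 1000, 10
        mean_shape = rng.standard_normal((n_verts, 3))           # placeholder mean shape
        basis = rng.standard_normal((n_basis, n_verts, 3)) * 0.1 # placeholder deformations

        def instance_shape(coeffs: np.ndarray) -> np.ndarray:
            """Reconstruct one object instance from basis coefficients (n_basis,)."""
            return mean_shape + np.tensordot(coeffs, basis, axes=1)

        shape = instance_shape(rng.standard_normal(n_basis))     # (n_verts, 3) surface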

    Robust Change Detection Based on Neural Descriptor Fields

    The ability to reason about changes in the environment is crucial for robots operating over extended periods of time. Agents are expected to capture changes during operation so that appropriate actions can be taken to ensure a smooth progression of the working session. However, varying viewing angles and accumulated localization errors make it easy for robots to falsely detect changes in the surrounding world due to low observation overlap and drifted object associations. In this paper, based on the recently proposed category-level Neural Descriptor Fields (NDFs), we develop an object-level online change detection approach that is robust to partially overlapping observations and noisy localization results. Utilizing the shape completion capability and SE(3)-equivariance of NDFs, we represent objects with compact shape codes encoding full object shapes from partial observations. The objects are then organized in a spatial tree structure based on object centers recovered from NDFs for fast queries of object neighborhoods. By associating objects via shape code similarity and comparing local object-neighbor spatial layouts, our proposed approach demonstrates robustness to low observation overlap and localization noise. We conduct experiments on both synthetic and real-world sequences and achieve improved change detection results compared to multiple baseline methods. Project webpage: https://yilundu.github.io/ndf_change Comment: 8 pages, 8 figures, and 2 tables. Accepted to IROS 2022. Project webpage: https://yilundu.github.io/ndf_change
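
    The association step described above can be sketched with a KD-tree over recovered object centers plus cosine similarity between shape codes. The code below is a hedged sketch with assumed inputs (centers and codes as numpy arrays), not the authors' implementation.

        import numpy as np
        from scipy.spatial import cKDTree

        def associate(centers_a, codes_a, centers_b, codes_b,
                      radius=0.5, min_sim=0.8):
            """Match objects in observation A to nearby objects in observation B
            whose shape codes are similar; unmatched objects are change candidates."""
            tree = cKDTree(centers_b)
            matches = {}
            for i, (c, z) in enumerate(zip(centers_a, codes_a)):
                for j in tree.query_ball_point(c, r=radius):   # spatial gating
                    sim = np.dot(z, codes_b[j]) / (
                        np.linalg.norm(z) * np.linalg.norm(codes_b[j]))
                    if sim >= min_sim:                         # shape-code similarity
                        matches[i] = j
                        break
            return matches  # objects absent from `matches` are potential changes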