Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification.
Comment: Accepted at ICCV 2017
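The feature transform layer lends itself to a compact illustration. Below is a
minimal sketch, assuming PyTorch and a latent code split into 2-D sub-vectors
that are each rotated by the known transformation angle; the class name,
dimensions, and pairing scheme are illustrative assumptions, not the authors'
exact architecture.

```python
# Minimal sketch of a feature transform layer: the latent code is split
# into 2-D sub-vectors and each is rotated by the known transformation
# angle, so a rotation of the input corresponds to an explicit rotation
# in feature space. Illustrative, not the paper's exact architecture.
import torch
import torch.nn as nn

class FeatureTransformLayer(nn.Module):
    def forward(self, z, theta):
        # z: (batch, d) latent code with d even; theta: (batch,) angles
        b, d = z.shape
        pairs = z.view(b, d // 2, 2)                  # d/2 sub-vectors of dim 2
        cos, sin = torch.cos(theta), torch.sin(theta)
        rot = torch.stack([torch.stack([cos, -sin], dim=-1),
                           torch.stack([sin,  cos], dim=-1)], dim=-2)  # (b, 2, 2)
        # Rotate every 2-D sub-vector by the same per-sample matrix.
        return torch.einsum('bij,bkj->bki', rot, pairs).reshape(b, d)
```

Training would then push encode(rotate(x, theta)) toward
FeatureTransformLayer(encode(x), theta), so the known transformation acts
linearly and interpretably on the disentangled part of the code.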
Unsupervised Training for 3D Morphable Model Regression
We present a method for training a regression network from image pixels to 3D
morphable model coordinates using only unlabeled photographs. The training loss
is based on features from a facial recognition network, computed on-the-fly by
rendering the predicted faces with a differentiable renderer. To make training
from features feasible and avoid network fooling effects, we introduce three
objectives: a batch distribution loss that encourages the output distribution
to match the distribution of the morphable model, a loopback loss that ensures
the network can correctly reinterpret its own output, and a multi-view identity
loss that compares the features of the predicted 3D face and the input
photograph from multiple viewing angles. We train a regression network using
these objectives, a set of unlabeled photographs, and the morphable model
itself, and demonstrate state-of-the-art results.
Comment: CVPR 2018 version with supplemental material
(http://openaccess.thecvf.com/content_cvpr_2018/html/Genova_Unsupervised_Training_for_CVPR_2018_paper.html)
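As a rough illustration of how the three objectives might combine, here is a
hedged sketch; `regressor`, `render`, and `face_features` are stand-ins for the
regression network, the differentiable renderer, and the frozen recognition
network, and the loss forms and unit weights are simplified assumptions rather
than the paper's exact formulation.

```python
# Hedged sketch of combining the three training objectives. The callables
# below are assumptions, not the paper's API.
import torch

def training_loss(regressor, render, face_features, photos, views):
    coeffs = regressor(photos)              # predicted 3D morphable model coords

    # Batch distribution loss: 3DMM coefficients follow a standard normal,
    # so push the batch mean toward 0 and the batch variance toward 1.
    dist_loss = coeffs.mean(0).pow(2).mean() + (coeffs.var(0) - 1).pow(2).mean()

    # Loopback loss: the network should recover its own output when fed a
    # rendering of the predicted face.
    rerendered = render(coeffs, views[0])
    loop_loss = (regressor(rerendered) - coeffs.detach()).pow(2).mean()

    # Multi-view identity loss: recognition features of the predicted face,
    # rendered from several viewpoints, should match the input photo's.
    photo_id = face_features(photos)
    id_loss = sum((face_features(render(coeffs, v)) - photo_id).pow(2).mean()
                  for v in views)

    return dist_loss + loop_loss + id_loss
```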
Computational Re-Photography
Rephotographers aim to recapture an existing photograph from the same viewpoint. A historical photograph paired with a well-aligned modern rephotograph can serve as a remarkable visualization of the passage of time. However, the task of rephotography is tedious and often imprecise, because reproducing the viewpoint of the original photograph is challenging. The rephotographer must disambiguate between the six degrees of freedom of 3D translation and rotation, and the confounding similarity between the effects of camera zoom and dolly. We present a real-time estimation and visualization technique for rephotography that helps users reach a desired viewpoint during capture. The input to our technique is a reference image taken from the desired viewpoint. The user moves through the scene with a camera and follows our visualization to reach the desired viewpoint. We employ computer vision techniques to compute the relative viewpoint difference. We guide 3D movement using two 2D arrows. We demonstrate the success of our technique by rephotographing historical images and conducting user studies.
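The relative viewpoint computation can be approximated with standard two-view
geometry. The sketch below, assuming calibrated intrinsics K and OpenCV,
estimates the rotation and (scale-ambiguous) translation between the reference
and current frames; the paper's actual pipeline, including its handling of the
zoom/dolly ambiguity and the arrow visualization, is considerably richer.

```python
# Rough sketch of estimating the relative viewpoint difference between the
# current camera frame and the reference photograph via feature matching
# and the essential matrix. Standard OpenCV calls; not the paper's code.
import cv2
import numpy as np

def relative_pose(ref_gray, cur_gray, K):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(ref_gray, None)
    k2, d2 = sift.detectAndCompute(cur_gray, None)
    matches = cv2.BFMatcher().knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # ratio test
    p1 = np.float32([k1[m.queryIdx].pt for m in good])
    p2 = np.float32([k2[m.trainIdx].pt for m in good])
    E, mask = cv2.findEssentialMat(p1, p2, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    return R, t  # rotation and unit-scale translation toward the reference view
```

The translation from the essential matrix is recoverable only up to scale,
which is one reason guidance toward a viewpoint is harder than it looks.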
NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations
Recent advances in neural reconstruction enable high-quality 3D object
reconstruction from casually captured image collections. Current techniques
mostly analyze their progress on relatively simple image collections where
Structure-from-Motion (SfM) techniques can provide ground-truth (GT) camera
poses. We note that SfM techniques tend to fail on in-the-wild image
collections such as image search results with varying backgrounds and
illuminations. To enable systematic research progress on 3D reconstruction from
casual image captures, we propose NAVI: a new dataset of category-agnostic
image collections of objects with high-quality 3D scans along with per-image
2D-3D alignments providing near-perfect GT camera parameters. These 2D-3D
alignments allow us to extract accurate derivative annotations such as dense
pixel correspondences, depth and segmentation maps. We demonstrate the use of
NAVI image collections on different problem settings and show that NAVI enables
more thorough evaluations that were not possible with existing datasets. We
believe NAVI is beneficial for systematic research progress on 3D
reconstruction and correspondence estimation. Project page:
https://navidataset.github.io
Comment: NeurIPS 2023 camera ready. Project page: https://navidataset.github.io
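To give a sense of how derivative annotations follow from the 2D-3D
alignments, here is a sketch of extracting a depth map by projecting an
aligned scan's points with the GT camera. The simple per-vertex z-buffer
stands in for a proper mesh rasterizer, and all names are illustrative.

```python
# Sketch: with an aligned 3D scan and ground-truth camera parameters,
# projecting the scan's points yields per-pixel depth. Illustrative only.
import numpy as np

def depth_from_alignment(verts, R, t, K, h, w):
    # verts: (n, 3) scan points in object frame; R, t, K: GT camera.
    cam = verts @ R.T + t                     # object frame -> camera frame
    z = cam[:, 2]
    uv = cam @ K.T                            # homogeneous pixel coordinates
    uv = uv[:, :2] / uv[:, 2:3]
    depth = np.full((h, w), np.inf)
    for (u, v), d in zip(uv.astype(int), z):
        if 0 <= v < h and 0 <= u < w and d > 0:
            depth[v, u] = min(depth[v, u], d)  # z-buffer: keep nearest surface
    return depth
```

Dense pixel correspondences follow the same way: two images that see the same
scan point are linked through its two projections.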
Tibial internal rotation in combined anterior cruciate ligament and high-grade anterolateral ligament injury and its influence on ACL length
BACKGROUND
Assessment of combined anterolateral ligament (ALL) and anterior cruciate ligament (ACL) injury remains challenging but of high importance, as the ALL is a contributing stabilizer of tibial internal rotation. The effect of preoperative static tibial internal rotation on ACL length remains unknown. The aim of the study was to analyze the effect of tibial internal rotation on ACL length in single-bundle ACL reconstructions and to quantify tibial internal rotation in combined ACL and ALL injuries.
METHODS
The effect of tibial internal rotation on ACL length was computed in a three-dimensional (3D) model of 10 healthy knees with 5° increments of tibial internal rotation from 0° to 30°, resulting in 70 simulations. For each step, ACL length was measured. ALL injury severity was graded by a blinded musculoskeletal radiologist in a retrospective analysis of 61 patients who underwent single-bundle ACL reconstruction. Preoperative tibial internal rotation was measured on magnetic resonance imaging (MRI), and its diagnostic performance was analyzed.
RESULTS
ACL length increased linearly by 0.7 ± 0.1 mm (2.1 ± 0.5% of initial length) per 5° of tibial internal rotation from 0° to 30° in each patient. Seventeen patients (27.9%) had an intact ALL (grade 0), 10 (16.4%) a grade 1, 21 (34.4%) a grade 2, and 13 (21.3%) a grade 3 injury of the ALL. Patients with a combined ACL and grade 3 ALL injury had a median static tibial internal rotation of 8.8° (interquartile range (IQR): 8.3) compared to 5.6° (IQR: 6.6) in patients with grade 0-2 ALL status (p = 0.03). A cut-off > 13.3° of tibial internal rotation predicted a high-grade ALL injury with a specificity of 92%, a sensitivity of 30%, an area under the curve (AUC) of 0.70 (95% CI: 0.54-0.85; p = 0.03), and an accuracy of 79%.
CONCLUSION
ACL length increases linearly with tibial internal rotation from 0° to 30°. A combined ACL and high-grade ALL injury was associated with greater preoperative tibial internal rotation. This potentially contributes to unintentional graft laxity in ACL-reconstructed patients, particularly those with concomitant high-grade ALL tears.
STUDY DESIGN
Cohort study; Level of evidence, 3
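For readers who want the reported relationships in computable form, the sketch
below encodes the summary statistics: roughly 0.7 mm of ACL lengthening per 5°
of internal rotation over 0-30°, and the > 13.3° cut-off for flagging a
high-grade ALL injury. These are the paper's aggregate numbers, not a fitted
or validated model.

```python
# Sketch encoding the study's reported summary statistics; illustrative
# only, not a clinical or fitted model.

def acl_elongation_mm(rotation_deg):
    """Approximate ACL lengthening at a given internal rotation (0-30 deg)."""
    assert 0 <= rotation_deg <= 30, "linear relation only reported for 0-30 deg"
    return 0.7 * rotation_deg / 5.0          # 0.7 mm per 5 deg of rotation

def flags_high_grade_all(rotation_deg, cutoff=13.3):
    """Reported cut-off: specificity 92%, sensitivity 30%, AUC 0.70."""
    return rotation_deg > cutoff

print(acl_elongation_mm(10))                 # ~1.4 mm at 10 deg of rotation
```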
Category-Specific Object Reconstruction from a Single Image
Object reconstruction from a single image -- in the wild -- is a problem
where we can make progress and get meaningful results today. This is the main
message of this paper, which introduces an automated pipeline with pixels as
inputs and 3D surfaces of various rigid categories as outputs in images of
realistic scenes. At the core of our approach are deformable 3D models that can
be learned from 2D annotations available in existing object detection datasets,
that can be driven by noisy automatic object segmentations and which we
complement with a bottom-up module for recovering high-frequency shape details.
We perform a comprehensive quantitative analysis and ablation study of our
approach using the recently introduced PASCAL 3D+ dataset and show very
encouraging automatic reconstructions on PASCAL VOC.
Comment: First two authors contributed equally. To appear at CVPR 2015
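The deformable-model idea at the core of the pipeline can be sketched as a
mean shape plus a linear deformation basis, with coefficients fit so that
projected model keypoints match 2D annotations. The orthographic projection
and least-squares fit below are illustrative simplifications of the paper's
optimization, and all names are assumptions.

```python
# Sketch of a category-level deformable 3D model: shape = mean + linear
# combination of learned deformation bases. Illustrative simplification.
import numpy as np

def reconstruct(mean_shape, basis, alpha):
    # mean_shape: (n, 3); basis: (k, n, 3); alpha: (k,) coefficients
    return mean_shape + np.tensordot(alpha, basis, axes=1)

def fit_to_keypoints(mean_shape, basis, idx, targets2d, P):
    # Least-squares for alpha so projected model keypoints match the 2D
    # annotations; P is a (2, 3) orthographic projection, idx selects the
    # model vertices corresponding to the annotated keypoints.
    k = basis.shape[0]
    A = np.stack([(basis[i][idx] @ P.T).ravel() for i in range(k)], axis=1)
    b = (targets2d - mean_shape[idx] @ P.T).ravel()
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return alpha
```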
Robust Change Detection Based on Neural Descriptor Fields
The ability to reason about changes in the environment is crucial for robots
operating over extended periods of time. Agents are expected to capture changes
during operation so that follow-up actions can be taken to ensure a smooth progression
of the working session. However, varying viewing angles and accumulated
localization errors make it easy for robots to falsely detect changes in the
surrounding world due to low observation overlap and drifted object
associations. In this paper, based on the recently proposed category-level
Neural Descriptor Fields (NDFs), we develop an object-level online change
detection approach that is robust to partially overlapping observations and
noisy localization results. Utilizing the shape completion capability and
SE(3)-equivariance of NDFs, we represent objects with compact shape codes
encoding full object shapes from partial observations. The objects are then
organized in a spatial tree structure based on object centers recovered from
NDFs for fast queries of object neighborhoods. By associating objects via shape
code similarity and comparing local object-neighbor spatial layout, our
proposed approach demonstrates robustness to low observation overlap and
localization noises. We conduct experiments on both synthetic and real-world
sequences and achieve improved change detection results compared to multiple
baseline methods. Project webpage: https://yilundu.github.io/ndf_change
Comment: 8 pages, 8 figures, and 2 tables. Accepted to IROS 2022. Project webpage: https://yilundu.github.io/ndf_change
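The association step admits a compact sketch: index objects by their recovered
centers in a spatial tree, query neighborhoods, and match by shape-code
similarity. A KD-tree and cosine similarity are assumed stand-ins for the
paper's exact data structure and metric, and the thresholds are arbitrary.

```python
# Sketch of object association for change detection: spatial-tree
# neighborhood queries plus shape-code similarity. Illustrative only.
import numpy as np
from scipy.spatial import cKDTree

def associate(centers_a, codes_a, centers_b, codes_b, radius=0.5, sim_thresh=0.8):
    tree = cKDTree(centers_b)                          # spatial index on centers
    matches, changed = [], []
    for i, (c, z) in enumerate(zip(centers_a, codes_a)):
        cand = tree.query_ball_point(c, r=radius)      # nearby objects only
        sims = [np.dot(z, codes_b[j]) /
                (np.linalg.norm(z) * np.linalg.norm(codes_b[j])) for j in cand]
        if sims and max(sims) > sim_thresh:
            matches.append((i, cand[int(np.argmax(sims))]))
        else:
            changed.append(i)                          # moved, removed, or new
    return matches, changed
```

Because the shape codes complete full object geometry from partial views,
matching on them tolerates the low observation overlap the abstract describes.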