ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems
In this paper we present ActiveStereoNet, the first deep learning solution
for active stereo systems. Due to the lack of ground truth, our method is fully
self-supervised, yet it produces precise depth with subpixel precision;
it does not suffer from the common over-smoothing issues;
it preserves the edges; and it explicitly handles occlusions. We introduce a
novel reconstruction loss that is more robust to noise and texture-less
patches, and is invariant to illumination changes. The proposed loss is
optimized using a window-based cost aggregation with an adaptive support weight
scheme. This cost aggregation is edge-preserving and smooths the loss function,
which is key to allowing the network to reach compelling results. Finally, we show
how the task of predicting invalid regions, such as occlusions, can be trained
end-to-end without ground-truth. This component is crucial to reduce blur and
particularly improves predictions along depth discontinuities. Extensive
quantitative and qualitative evaluations on real and synthetic data
demonstrate state-of-the-art results in many challenging scenes.
Comment: Accepted by ECCV2018, Oral Presentation, Main paper + Supplementary
Material
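The window-based cost aggregation with adaptive support weights mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `aggregate_cost`, the intensity-only weight `exp(-|I_neighbor - I_center| / sigma)`, and the default `radius` and `sigma` values are all assumptions chosen for illustration.

```python
import numpy as np

def aggregate_cost(cost, image, radius=2, sigma=2.0):
    """Average a per-pixel cost over a square window, weighting each
    neighbor by how similar its intensity is to the center pixel.

    Hypothetical sketch: weights depend on intensity difference only.
    """
    cost = np.asarray(cost, dtype=np.float64)
    image = np.asarray(image, dtype=np.float64)
    h, w = cost.shape
    # Replicate borders so every pixel has a full window.
    pad_c = np.pad(cost, radius, mode="edge")
    pad_i = np.pad(image, radius, mode="edge")
    num = np.zeros((h, w), dtype=np.float64)
    den = np.zeros((h, w), dtype=np.float64)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            neigh_i = pad_i[dy:dy + h, dx:dx + w]
            neigh_c = pad_c[dy:dy + h, dx:dx + w]
            # Neighbors across an intensity edge get a weight near zero.
            wgt = np.exp(-np.abs(neigh_i - image) / sigma)
            num += wgt * neigh_c
            den += wgt
    # The center pixel always contributes weight 1, so den >= 1.
    return num / den
```

Because neighbors on the far side of a strong intensity edge receive almost no weight, the aggregated cost is smoothed within regions but not across edges, which is the edge-preserving behavior the abstract describes.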
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
Combination of correlation measures for dense stereo matching
In the context of dense pixel matching, we study the fusion of different correlation measures. Building on previous work, we use the most representative measures from five families: cross-correlation, classical measures, gradient similarity, non-parametric statistics, and robust statistics. More precisely, our study highlights the possibility of improving matching by combining different correlation measures that are intended to be complementary. In particular, we demonstrate the superiority of combining the following correlation measures: Gradient Correlation (GC) and the Smooth Median Absolute Deviation measure (SMAD). Finally, we introduce a fusion algorithm that automatically combines these two measures.
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher accuracy rate and lower computational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared.
Comment: Published 201
Robust Visual Correspondence: Theory and Applications
Visual correspondence represents one of the most important tasks in computer vision. Given two sets of pixels (i.e. two images), it aims at finding corresponding pixel pairs belonging to the two sets (homologous pixels). As a matter of fact, visual correspondence is commonly employed in fields such as stereo correspondence, change detection, image registration, motion estimation, pattern matching, and image vector quantization. The visual correspondence task can be extremely challenging in the presence of disturbance factors which typically affect images. A common source of disturbances can be related to photometric distortions between the images under comparison. These can be ascribed to the camera sensors employed in the image acquisition process (due to dynamic variations of camera parameters such as auto-exposure and auto-gain, or to the use of different cameras), or can be induced by external factors such as changes in the amount of light emitted by the sources or viewing of non-Lambertian surfaces at different angles. All of these factors tend to produce brightness changes in corresponding pixels of the two images that cannot be neglected in real applications involving visual correspondence between images acquired from different spatial points (e.g. stereo vision) and/or different time instants (e.g. pattern matching, change detection). In addition to photometric distortions, differences between corresponding pixels can also be due to the noise introduced by camera sensors. Finally, the acquisition of images from different spatial points or different time instants can also induce occlusions. Evaluation assessments have also been proposed which compared visual correspondence approaches for tasks such as stereo correspondence (Chambon & Crouzil, 2003), image registration (Zitova & Flusser, 2003) and image motion (Giachetti, 2000).
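To make the effect of photometric distortions concrete, the sketch below compares two common patch-matching scores under a gain and offset change, as might result from auto-gain or auto-exposure. It is an illustration only and is not taken from the work above; the function names `ssd` and `zncc` and the distortion values are assumptions.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: sensitive to brightness changes."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sum((a - b) ** 2))

def zncc(a, b, eps=1e-12):
    """Zero-mean normalized cross-correlation.

    Subtracting the mean cancels an additive offset, and dividing by the
    norms cancels a positive multiplicative gain.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a0 = a - a.mean()
    b0 = b - b.mean()
    return float(a0 @ b0 / (np.linalg.norm(a0) * np.linalg.norm(b0) + eps))

rng = np.random.default_rng(0)
patch = rng.random((7, 7))
# Hypothetical distortion: gain of 1.5 and offset of 20 on the same patch.
distorted = 1.5 * patch + 20.0

print("SSD: ", ssd(patch, distorted))   # large, although the content is identical
print("ZNCC:", zncc(patch, distorted))  # close to 1
```

A score such as SSD treats the distorted patch as a poor match, whereas a normalized score still recognizes it. Noise and occlusions, the other disturbance factors named above, are not addressed by this normalization.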
Multi-Scale 3D Scene Flow from Binocular Stereo Sequences
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization – two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.
National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108