13 research outputs found
Confidence-based cost modulation for stereo matching
We present a novel operator to be applied to raw
matching costs in low-level vision tasks
such as stereo matching or optical flow. It aims to improve
matching reliability by efficiently modulating
pixel-wise pairing costs, injecting a confidence-backed
bias before the aggregation step. It works by analyzing a
noisy estimate of the correspondences in order to favor
or prune potential matches. We test the operator by
developing a local, real-time stereo matching algorithm
and showing that our solution can drastically clean the
resulting depth map while also reducing border bleeding.
Its performance is also evaluated quantitatively
by testing the algorithm against the popular Middlebury
benchmark, where our local greedy implementation
obtains results comparable to those of
naïve global approaches.
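The general idea of biasing raw costs with a confidence-weighted prior before aggregation can be sketched for a single pixel as follows. This is an illustration only: the function names, the linear distance penalty and the `strength` parameter are assumptions, since the abstract does not specify the actual form of the operator.

```python
def modulate_costs(costs, prior_disparity, confidence, strength=1.0):
    """Bias the raw matching costs of one pixel toward a noisy prior disparity.

    costs: raw costs, one per disparity candidate (lower is better).
    prior_disparity: noisy disparity estimate for this pixel.
    confidence: value in [0, 1]; 0 leaves the costs unchanged.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    modulated = []
    for d, cost in enumerate(costs):
        # Hypothetical penalty: grows linearly with distance from the prior.
        penalty = 1.0 + strength * confidence * abs(d - prior_disparity)
        modulated.append(cost * penalty)
    return modulated


def winner_takes_all(costs):
    """Return the disparity index with the lowest cost."""
    return min(range(len(costs)), key=costs.__getitem__)


# Ambiguous raw costs: candidates 1, 2 and 4 are nearly tied.
raw = [5.0, 3.0, 3.2, 6.0, 2.9]
biased = modulate_costs(raw, prior_disparity=2, confidence=0.8)
```

With these made-up numbers, the raw costs select candidate 4, while the modulated costs select candidate 2, which agrees with the prior. A confidence of 0 leaves the costs untouched, so an unreliable prior has no effect.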
Meaningful Matches in Stereovision
This paper introduces a statistical method to decide whether two blocks in a
pair of images match reliably. The method ensures that the selected block
matches are unlikely to have occurred "just by chance." The new approach is
based on the definition of a simple but faithful statistical "background model"
for image blocks learned from the image itself. A theorem guarantees that under
this model not more than a fixed number of wrong matches occurs (on average)
for the whole image. This fixed number (the number of false alarms) is the only
method parameter. Furthermore, the number of false alarms associated with each
match measures its reliability. This "a contrario" block-matching method,
however, cannot rule out false matches due to the presence of periodic objects
in the images. But it is successfully complemented by a parameterless
"self-similarity threshold." Experimental evidence shows that the proposed
method also detects occlusions and incoherent motions due to vehicles and
pedestrians in non-simultaneous stereo.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence 99, Preprints (2011) 1-1
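The acceptance rule of a generic a contrario test can be sketched as below. This assumes the usual definition of the number of false alarms (number of tests multiplied by the probability of the observed match under the background model). The background probability is taken as an input because the abstract does not describe how the paper computes it, and the function names are hypothetical.

```python
def number_of_false_alarms(num_tests, prob_under_background):
    """NFA = number of tested block pairs x probability of a match this good
    arising by chance under the background model."""
    if num_tests < 0 or not 0.0 <= prob_under_background <= 1.0:
        raise ValueError("invalid number of tests or probability")
    return num_tests * prob_under_background


def is_meaningful(num_tests, prob_under_background, epsilon=1.0):
    """Accept a match only if its NFA does not exceed epsilon, the single
    parameter bounding the expected number of wrong matches per image."""
    return number_of_false_alarms(num_tests, prob_under_background) <= epsilon


# Made-up numbers: one million tested block pairs.
rare_match = is_meaningful(1_000_000, 1e-8)    # NFA is about 0.01
common_match = is_meaningful(1_000_000, 1e-5)  # NFA is about 10
```

A smaller NFA indicates a more reliable match, which is how the same quantity serves both as a threshold and as a reliability score.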
Image registration using Optical Flow method
Registration of medical images is a technique that is developing hand in hand with new hybrid diagnostic imaging modalities. Today, registration is performed on both monomodal and multimodal image data. The first part of this master's thesis covers the fundamentals of image registration, specifically image transformations, interpolation, criterion functions and optimization. The next part presents the implemented Optical Flow method and the "Demon" algorithm used for image registration. The thesis then presents the resulting program and its GUI. In the last part, the program is tested on ordinary images and on real CT images.
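For orientation, one update step of a demons-style registration can be sketched on 1-D signals. This is a minimal sketch of the commonly cited Thirion-style force, not the thesis's implementation: it omits the smoothing and iteration of a full method, and the sign convention (the update `u` is chosen so that `moving(x + u)` approximates `fixed(x)`) is an assumption, since conventions differ between formulations.

```python
def demons_step_1d(fixed, moving):
    """Return one demons-style displacement estimate per sample.

    u = (fixed - moving) * grad / (grad**2 + (fixed - moving)**2),
    where grad is the spatial gradient of the fixed signal.
    """
    n = len(fixed)
    if n != len(moving) or n < 2:
        raise ValueError("signals must have equal length of at least 2")
    update = []
    for i in range(n):
        # Central difference inside the signal, one-sided at the borders.
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        grad = (fixed[hi] - fixed[lo]) / (hi - lo)
        diff = fixed[i] - moving[i]
        denom = grad * grad + diff * diff
        # Where there is neither gradient nor intensity difference, do not move.
        update.append(0.0 if denom == 0 else diff * grad / denom)
    return update


fixed = [0.0, 1.0, 2.0, 3.0, 4.0]           # linear ramp
moving = [v + 0.5 for v in fixed]           # the same ramp shifted by half a sample
step = demons_step_1d(fixed, moving)
```

On this ramp the exact displacement is -0.5 and a single step returns -0.4 at every sample. The `diff**2` term in the denominator damps the update, which is why practical implementations iterate and regularize the displacement field between steps.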
On the confidence of stereo matching in a deep-learning era: a quantitative evaluation
Stereo matching is one of the most popular techniques to estimate dense depth
maps by finding the disparity between matching pixels on two synchronized and
rectified images. Alongside the development of more accurate algorithms,
the research community has focused on finding good strategies to estimate the
reliability, i.e. the confidence, of estimated disparity maps. This information
proves to be a powerful cue to naively find wrong matches as well as to improve
the overall effectiveness of a variety of stereo algorithms according to
different strategies. In this paper, we review more than ten years of
developments in the field of confidence estimation for stereo matching. We
extensively discuss and evaluate existing confidence measures and their
variants, from hand-crafted ones to the most recent, state-of-the-art learning
based methods. We study the different behaviors of each measure when applied to
a pool of different stereo algorithms and, for the first time in the literature,
when paired with a state-of-the-art deep stereo network. Our experiments,
carried out on five different standard datasets, provide a comprehensive
overview of the field, highlighting in particular both strengths and
limitations of learning-based strategies.
Comment: TPAMI final version
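Two widely used hand-crafted confidence cues, a naive peak ratio and the left-right consistency check, can be sketched as follows. These are given as generic examples of the hand-crafted family. The abstract does not list the measures the paper evaluates, and the function names and the `eps` guard are assumptions.

```python
def peak_ratio_naive(costs, eps=1e-9):
    """Ratio of the second-lowest to the lowest matching cost of one pixel.

    Values near 1 indicate an ambiguous minimum; larger values indicate a
    more distinctive, and therefore more trustworthy, match.
    """
    if len(costs) < 2:
        raise ValueError("need at least two disparity candidates")
    ordered = sorted(costs)
    return (ordered[1] + eps) / (ordered[0] + eps)


def left_right_difference(disp_left, disp_right, x):
    """Absolute disagreement between the left disparity at column x and the
    right disparity at the matched column x - d (one rectified scanline).

    Returns None when the match falls outside the right scanline.
    """
    d = disp_left[x]
    x_right = x - d
    if not 0 <= x_right < len(disp_right):
        return None
    return abs(d - disp_right[x_right])


distinct = peak_ratio_naive([2.0, 1.0, 4.0])    # clear minimum
ambiguous = peak_ratio_naive([1.0, 1.0, 4.0])   # two tied minima
consistent = left_right_difference([0, 1, 1, 2], [0, 1, 1, 2], 2)
inconsistent = left_right_difference([0, 1, 1, 2], [0, 1, 1, 2], 1)
```

A disparity is typically flagged as unreliable when the peak ratio is close to 1 or when the left-right difference exceeds a small threshold such as one pixel.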
Occlusion handling in correlation-based matching
In binocular stereovision, the accuracy of the 3D reconstruction depends on the accuracy of the matching results,
which makes matching an important task. Our first goal is to survey the state of the art in matching methods. We
define a generic and complete algorithm, built from essential components, that describes most matching methods.
Occlusions are among the most significant difficulties, so we also survey the state of the art in methods that deal
with them. Finally, we propose matching methods that use two correlation measures to take occlusions into account. The
results show that the best method is the one that merges two disparity maps obtained with two different measures.
In binocular stereovision, matching is a crucial step in the 3D reconstruction of the scene, and a great many
publications address this problem. The first objective is therefore to survey the state of the art in matching
methods. We summarize this study by presenting a complete generic algorithm whose constituent elements
describe the successive steps of the correspondence search. One of the greatest difficulties during
matching comes from occlusions, which is why the second objective is to survey the state of the art in
methods that take this difficulty into account. Finally, the last objective is to present new
hybrid methods within the framework of local correlation-based methods. They rely on
two correlation measures in order to handle occlusions better.
The results show that the best method is the one that merges two
disparity maps obtained with different measures.
Learning-based stereo matching for 3D reconstruction
Stereo matching has been widely adopted for 3D reconstruction of real world
scenes and has numerous applications in the fields of Computer Graphics, Vision,
and Robotics. Being an ill-posed problem, estimating accurate disparity maps is a
challenging task. However, humans rely on binocular vision to perceive 3D environments
and can estimate 3D information more rapidly and robustly than many active
and passive sensors that have been developed. One of the reasons is that human brains
can utilize prior knowledge to understand the scene and to infer the most reasonable
depth hypothesis even when the visual cues are lacking. Recent advances in machine
learning have shown that the brain's discrimination power can be mimicked using deep
convolutional neural networks. Hence, it is worth investigating how learning-based
techniques can be used to enhance stereo matching for 3D reconstruction.
Toward this goal, a sequence of techniques was developed in this thesis: a novel
disparity filtering approach that selects accurate disparity values through analyzing
the corresponding cost volumes using 3D neural networks; a robust semi-dense stereo
matching algorithm that utilizes two neural networks for computing matching cost
and performing confidence-based filtering; a novel network structure that learns global
smoothness constraints and directly performs multi-view stereo matching based on
global information; and finally a point cloud consolidation method that uses a neural
network to reproject noisy data generated by multi-view stereo matching under
different viewpoints. Qualitative and quantitative comparisons with existing works
demonstrate the respective merits of these presented techniques
Advances in 3D reconstruction
The thesis tackles the problem of 3D reconstruction of scenes from unstructured picture datasets.
The state of the art is advanced on several fronts. The first contribution is a robust formulation of the structure and motion problem based on a hierarchical approach, as opposed to the sequential one prevalent in the literature. This methodology reduces the total computational cost by one order of magnitude, is inherently parallelizable, minimizes the error accumulation that causes drift, and eliminates the crucial dependency on the choice of the initial pair of views that is common to all competing approaches. The second contribution is a novel self-calibration procedure that is very robust and tailored to the structure and motion task. The proposed solution is a closed-form procedure for the recovery of the plane at infinity given a rough estimate of the focal parameters of at least two cameras. This method is employed for the exhaustive search of internal parameters, whose space is inherently bounded by the finiteness of acquisition devices. Finally, we investigated how to visualize the obtained reconstruction results in an efficient and compelling way: to this end, several algorithms for the computation of stereo disparity are presented. Along with procedures for the automatic extraction of support planes, they have been employed to obtain a faithful, compact and semantically significant representation of the scene as a collection of textured planes, optionally augmented by depth information encoded in relief maps. Every result has been verified by a rigorous experimental validation, comprising both qualitative and quantitative comparisons.
Machine Learning techniques applied to stereo vision
Stereo is a popular technique enabling fast and dense depth estimation from two or more images.
Its success is mainly due to its ease of deployment: it requires only two or more synchronized image sensors, accurately calibrated, to solve the matching problem between pixels in one of the images (named reference) and the other (named target). The absence of active technologies (e.g. pattern projection, laser scanners, etc.) makes this solution deployable in almost every scenario. Despite the wide literature concerning stereo, it still represents an open problem because of very challenging conditions such as poor illumination, reflective surfaces, occlusions and other elements occurring in real environments.
Two main trends in stereo vision have gained popularity in recent years: confidence estimation and machine learning. Both have proved to be very effective, pushing forward the state of the art in dense disparity estimation.
In this thesis, we combine these two trends to improve both confidence estimation and disparity inference, by defining confidence measures that are more effective and easier to deploy and by proposing new approaches that leverage them for more accurate depth prediction.
All the experiments are validated on three popular datasets, KITTI 2012, KITTI 2015 and Middlebury v3, following the commonly adopted methodologies and protocols to compare our proposals with previous works representing the state of the art in stereo vision.
Mise en correspondance stéréoscopique d'images couleur en présence d'occultations
This work deals with stereovision, and more precisely with the matching of pixels using correlation measures. Matching is an important task in computer vision, since the accuracy of the three-dimensional reconstruction depends on the accuracy of the matching. The difficulties of matching are intensity distortions, noise, untextured areas, foreshortening and occlusions. Our research concerns the matching of color images and takes the problem of occlusions into account.
First, we distinguish the different elements that can compose a matching algorithm. This description allows us to classify matching methods into four families: local methods, global methods, mixed methods and multi-pass methods.
Second, we set up an evaluation and comparison protocol based on fourteen image pairs, five evaluation areas and ten criteria. This protocol also provides disparity, ambiguity, inaccuracy and correct disparity maps, and it enables us to study the behavior of the methods we propose.
Third, forty correlation measures are classified into five families: cross-correlation-based measures, classical statistics-based measures, derivative-based measures, non-parametric measures and robust measures. We also propose six new measures based on robust statistics. The results show that the most robust measures near occlusions are the robust measures, including the six new ones.
Fourth, we generalize dense correlation-based matching to color by choosing a color system and by generalizing the correlation measures to color. Ten color systems are evaluated and three methods are compared: computing the correlation on each color component and then merging the results; performing a principal component analysis and then computing the correlation on the first principal component; and computing the correlation directly on colors. We conclude that the fusion method is the best.
Finally, in order to take occlusions into account, we present new algorithms that use two correlation measures: a classic measure in non-occluded areas and a robust measure in the whole occlusion area. We introduce four kinds of methods: edge detection methods, weighted correlation methods, post-detection methods and a fusion method. The latter is the most efficient.
This thesis falls within the field of computer vision and concerns more precisely the pixel-matching step in binocular stereovision. This step consists in finding the homologous pixels in two images of the same scene taken from two different viewpoints. One way to perform the matching is to use correlation measures. The algorithms then face the following difficulties: illumination changes, noise, foreshortening, weakly textured areas and occlusions. The work carried out is a study of correlation-based methods that takes into account the problem of occlusions and the use of color images.
In the first chapter, we survey the state of the art in pixel-matching methods. We give a generic model of these methods based on the definition of constituent elements, and we distinguish four categories of methods: local methods, global methods, mixed methods and multi-pass methods. The second chapter addresses the evaluation of pixel-matching methods. After surveying the existing protocols, we propose an evaluation and comparison protocol that uses images with ground truth and distinguishes different occlusion areas.
In the third chapter, we propose a taxonomy of correlation measures grouped into five families: cross-correlation measures, measures using classical statistical tools, measures using image derivatives, measures based on non-parametric statistics and measures based on robust statistics. Within this last family, we propose seventeen measures. The results obtained with our protocol show that these measures perform best in occlusion areas. The fourth chapter concerns the generalization of correlation-based matching methods to color. After presenting the color representation systems that we test, we address this generalization by adapting the correlation measures to color. We propose three different methods: merging the results obtained on each component, using a principal component analysis, and using a color correlation measure. The results obtained with our protocol show that the best method is the one that merges the correlation scores. In the last chapter, to take occlusions into account, we propose hybrid methods that rely on two correlation measures: a classic measure in areas without occlusions and a robust measure in occlusion areas. We distinguish four types of methods, based on edge detection, weighted correlation, post-detection of occlusions and fusion of disparity maps. The results obtained with our protocol show that the most effective method is the one that merges two disparity maps.
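As a concrete instance of the cross-correlation family named above, zero-mean normalized cross-correlation (ZNCC) between two flattened windows can be sketched as follows. ZNCC is a standard measure chosen here for illustration. The abstract does not list the forty measures it classifies, and returning 0.0 for a textureless window is an assumption of this sketch.

```python
import math


def zncc(window_a, window_b):
    """Zero-mean normalized cross-correlation of two equally sized windows.

    Returns a score in [-1, 1]; 1 means the windows are identical up to an
    affine change of intensity (gain and offset).
    """
    n = len(window_a)
    if n == 0 or n != len(window_b):
        raise ValueError("windows must be non-empty and of equal size")
    mean_a = sum(window_a) / n
    mean_b = sum(window_b) / n
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(window_a, window_b))
    norm_a = math.sqrt(sum((a - mean_a) ** 2 for a in window_a))
    norm_b = math.sqrt(sum((b - mean_b) ** 2 for b in window_b))
    if norm_a == 0 or norm_b == 0:
        # Textureless window: correlation is undefined, report no evidence.
        return 0.0
    return num / (norm_a * norm_b)


left = [1.0, 2.0, 3.0, 4.0]
brighter = [2 * v + 10 for v in left]   # same pattern, different gain and offset
same_pattern = zncc(left, brighter)
reversed_pattern = zncc(left, left[::-1])
flat = zncc(left, [5.0, 5.0, 5.0, 5.0])
```

Because the mean is removed and the result is normalized, the score is insensitive to the intensity distortions listed among the matching difficulties. It gives no protection against occlusions, which is the case the robust measures are meant to address.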