10 research outputs found

    Volumetric reconstruction of rigid objects from image sequences.

    Thesis (M.Sc.Eng.), University of KwaZulu-Natal, Durban, 2012. Live video communication over bandwidth-constrained ad-hoc radio networks necessitates high compression rates. To this end, a model-based video communication system that incorporates flexible and accurate 3D modelling and reconstruction is proposed in part. Model-based video coding (MBVC) is known to provide the highest compression rates, but usually compromises photorealism and object detail. High compression ratios are achieved at the encoder by extracting and transmitting only the parameters that describe changes to object orientation and motion within the scene. The decoder uses the received parameters to animate reconstructed objects within the synthesised scene; this amounts to scene understanding rather than conventional video compression. 3D reconstruction of the objects and scenes present at the encoder is the focus of this research. Reconstruction is accomplished using the Patch-based Multi-view Stereo (PMVS) framework of Yasutaka Furukawa and Jean Ponce. Surface geometry is initially represented as a sparse set of oriented rectangular patches obtained by matching feature correspondences in the input images. To increase reconstruction density, these patches are iteratively expanded and filtered using visibility constraints to remove outliers. Depending on the availability of segmentation information, there are two methods for initialising a mesh model from the reconstructed patches. The first initialises the mesh from the object's visual hull; the second initialises it directly from the reconstructed patches. The resulting mesh is then refined by enforcing patch reconstruction consistency and regularization constraints at each vertex. To improve robustness to outliers, two enhancements to the above framework are proposed. The first uses photometric consistency during feature matching to increase the probability that the correct matching point is selected first.
The second approach estimates the orientation of each patch so that its photometric discrepancy score across its visible images is minimised prior to optimisation. The overall reconstruction algorithm is shown to be flexible and robust in that it can reconstruct 3D models of both objects and scenes, automatically detect and discard outliers, and be initialised from simple visual hulls. The demonstrated ability to account for the surface orientation of the patches during photometric consistency computations is a key performance criterion. Final results show that the algorithm can accurately reconstruct objects containing fine surface details, deep concavities, and regions without salient textures.
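The photometric consistency score at the heart of PMVS-style patch filtering can be sketched as follows; this is a minimal illustration using plain normalized cross-correlation (NCC) over a patch's projections into its visible images, with hypothetical function names, not the thesis's exact implementation:

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equally sized
    image patches; 1.0 means a perfect photometric match."""
    a = patch_a.astype(float).ravel()
    b = patch_b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0
    return float(np.dot(a, b) / denom)

def photometric_discrepancy(projections):
    """Average pairwise (1 - NCC) over the projections of one surface
    patch into its visible images; PMVS-style filtering discards
    patches whose discrepancy exceeds a threshold."""
    scores = []
    for i in range(len(projections)):
        for j in range(i + 1, len(projections)):
            scores.append(1.0 - ncc(projections[i], projections[j]))
    return sum(scores) / len(scores)
```

A patch whose projections agree across views scores near zero; occluded or mismatched patches score high and are filtered out.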

    Imagerie 3D de l'anatomie interne d'une souris par dynamique de fluorescence [3D imaging of the internal anatomy of a mouse via fluorescence dynamics]

    Small-animal medical imaging is of great value in preclinical research, because it allows the inside of the animal to be imaged in vivo and in 3D. This supports the development of new drugs and the monitoring of the progression of certain pathologies. Indeed, imaging techniques remove the need to sacrifice the animals, which makes it possible to follow biomolecular processes in a single individual and to obtain statistically more significant data. However, the molecular information collected is generally of low spatial resolution, notably in optical imaging because of light scattering, and is therefore difficult to localise within the animal's body. Pairing complementary imaging modalities therefore yields superimposed anatomical and molecular images, but proves relatively costly. The project presented here aims to improve a low-cost, all-optical 2D imaging technique that provides an approximate 3D map of a mouse's internal organs. This technique should allow the automatic spatial registration of molecular information acquired on the same device, although this has not yet been demonstrated. The improvement contributed by the project consists in obtaining 3D, rather than 2D, anatomical images by using a rotating camera and stereo computer vision techniques. To do so, the existing technique is first reproduced. It consists in injecting an anaesthetised mouse with ICG, a non-specific fluorescent marker that remains confined to the vascular network once injected. Because of their distinct metabolisms and the time ICG takes to reach each of them, the fluorescence dynamics vary between organs but remain relatively uniform within a given organ. Certain organs can therefore be segmented with appropriate signal-processing techniques, such as principal component analysis and non-negative least-squares regression. A rotating-camera imaging system such as Quidd's QOS® provides segmented 2D images of the animal's internal anatomy from several viewpoints. These viewpoints are used to reconstruct the anatomical information in 3D with computer vision techniques. The procedure could be repeated with one or more functionalised fluorescent markers in order to obtain 3D molecular images of the same animal and superimpose them on the 3D anatomical images. The technique developed should thus provide, at low cost and in an all-optical manner, automatically spatially registered 3D anatomical and molecular images.
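The organ-separation step rests on the fact that pixels of the same organ share similar ICG time curves. A minimal sketch of the principal-component-analysis part, assuming an image stack of shape (frames, height, width) and hypothetical function names:

```python
import numpy as np

def pca_time_curves(stack, n_components=3):
    """Project per-pixel fluorescence time curves onto their top
    principal components. `stack` has shape (T, H, W): T frames of an
    H x W image sequence. Pixels belonging to the same organ share
    similar ICG dynamics, so they cluster in this low-dimensional
    space, where they can then be segmented."""
    t, h, w = stack.shape
    x = stack.reshape(t, h * w).T            # one time curve per pixel
    x = x - x.mean(axis=0, keepdims=True)    # subtract the mean curve
    # SVD of the centred data gives the principal temporal directions
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    scores = x @ vt[:n_components].T         # (H*W, n_components)
    return scores.reshape(h, w, n_components)
```

A clustering or thresholding step on the returned score maps would then yield the per-organ segmentation; the non-negative least-squares variant mentioned in the abstract would instead fit each pixel curve as a non-negative mixture of reference organ curves.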

    Specialised global methods for binocular and trinocular stereo matching

    The problem of estimating depth from two or more images is a fundamental problem in computer vision, commonly referred to as stereo matching. Its applications range from 3D reconstruction to autonomous robot navigation. Stereo matching is particularly attractive for real-life applications because of its simplicity and low cost, especially compared to costly laser range finders/scanners, as in the case of 3D reconstruction. However, stereo matching has its own challenges, such as convergence issues in the optimisation methods and the difficulty of finding accurate matches under changing lighting conditions, occluded areas, noisy images, etc. It is precisely because of these challenges that stereo matching continues to be a very active field of research. In this thesis we develop a binocular stereo matching algorithm that works with rectified images (i.e. scan lines in the two images are aligned) to find a real-valued displacement (i.e. disparity) that best matches two pixels. To accomplish this, our research has developed techniques to efficiently explore a 3D space and compare potential matches, and an inference algorithm to assign the optimal disparity to each pixel in the image. The proposed approach is also extended to the trinocular case. In particular, the trinocular extension deals with a binocular pair of images captured at the same time and a third image displaced in time. This approach is referred to as t+1 trinocular stereo matching, and poses the challenge of recovering camera motion, which is addressed by a novel technique we call baseline recovery. We have extensively validated our binocular and trinocular algorithms using the well-known KITTI and Middlebury data sets. The performance of our algorithms is consistent across data sets, and they rank among the top performers on the KITTI and Middlebury benchmarks.
    The time-stamped results of our algorithms as reported in this thesis can be found at:
    • LCU on Middlebury V2 (https://web.archive.org/web/20150106200339/http://vision.middlebury.edu/stereo/eval/)
    • LCU on Middlebury V3 (https://web.archive.org/web/20150510133811/http://vision.middlebury.edu/stereo/eval3/)
    • LPU on Middlebury V3 (https://web.archive.org/web/20161210064827/http://vision.middlebury.edu/stereo/eval3/)
    • LPU on KITTI 2012 (https://web.archive.org/web/20161106202908/http://cvlibs.net/datasets/kitti/eval_stereo_flow.php?benchmark=stereo)
    • LPU on KITTI 2015 (https://web.archive.org/web/20161010184245/http://cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo)
    • TBR on KITTI 2012 (https://web.archive.org/web/20161230052942/http://cvlibs.net/datasets/kitti/eval_stereo_flow.php?benchmark=stereo)
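The core matching problem on rectified images can be illustrated with the simplest possible baseline: brute-force block matching along scan lines. This is a sketch for intuition only (the thesis's LCU/LPU algorithms are global inference methods producing real-valued disparities, not this local integer search):

```python
import numpy as np

def block_match_disparity(left, right, max_disp=16, radius=2):
    """Brute-force SAD block matching on a rectified pair: for each
    left-image pixel, keep the disparity d that minimises the sum of
    absolute differences against the window shifted left by d along
    the same scan line (integer disparities only)."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for y in range(radius, h - radius):
        for x in range(radius + max_disp, w - radius):
            ref = left[y - radius:y + radius + 1, x - radius:x + radius + 1]
            best, best_d = np.inf, 0
            for d in range(max_disp):
                cand = right[y - radius:y + radius + 1,
                             x - d - radius:x - d + radius + 1]
                cost = np.abs(ref.astype(int) - cand.astype(int)).sum()
                if cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

Global methods like those in the thesis replace the per-pixel argmin with an energy minimisation that couples neighbouring pixels, which is what makes them robust in textureless and occluded regions.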

    Material Recognition Meets 3D Reconstruction: Novel Tools for Efficient, Automatic Acquisition Systems

    For decades, the accurate acquisition of geometry and reflectance properties has represented one of the major objectives in computer vision and computer graphics with many applications in industry, entertainment and cultural heritage. Reproducing even the finest details of surface geometry and surface reflectance has become a ubiquitous prerequisite in visual prototyping, advertisement or digital preservation of objects. However, today's acquisition methods are typically designed for only a rather small range of material types. Furthermore, there is still a lack of accurate reconstruction methods for objects with a more complex surface reflectance behavior beyond diffuse reflectance. In addition to accurate acquisition techniques, the demand for creating large quantities of digital contents also pushes the focus towards fully automatic and highly efficient solutions that allow for masses of objects to be acquired as fast as possible. This thesis is dedicated to the investigation of basic components that allow an efficient, automatic acquisition process. We argue that such an efficient, automatic acquisition can be realized when material recognition "meets" 3D reconstruction and we will demonstrate that reliably recognizing the materials of the considered object allows a more efficient geometry acquisition. Therefore, the main objectives of this thesis are given by the development of novel, robust geometry acquisition techniques for surface materials beyond diffuse surface reflectance, and the development of novel, robust techniques for material recognition. In the context of 3D geometry acquisition, we introduce an improvement of structured light systems, which are capable of robustly acquiring objects ranging from diffuse surface reflectance to even specular surface reflectance with a sufficient diffuse component. 
We demonstrate that the resolution of the reconstruction can be increased significantly for multi-camera, multi-projector structured light systems by exploiting overlaps between patterns projected under different projector poses. As the reconstructions obtained with such triangulation-based techniques still contain high-frequency noise due to inaccurately localized correspondences established for images acquired under different viewpoints, we furthermore introduce a novel geometry acquisition technique that complements the structured light system with additional photometric normals and results in significantly more accurate reconstructions. In addition, we present a novel method to acquire the 3D shape of mirroring objects with complex surface geometry. The aforementioned investigations on 3D reconstruction are accompanied by the development of novel tools for reliable material recognition, which can be used in an initial step to recognize the surface materials present and, hence, to efficiently select the appropriate acquisition techniques based on these classified materials. In the scope of this thesis, we therefore focus on material recognition for scenarios with controlled illumination, as given in lab environments, as well as scenarios with natural illumination, as given in photographs of typical daily-life scenes. Finally, based on the techniques developed in this thesis, we provide novel concepts towards efficient, automatic acquisition systems.
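The "photometric normals" that complement the structured-light geometry are classically obtained by Lambertian photometric stereo. A minimal sketch under that standard assumption (known light directions, diffuse reflectance; function name and setup are illustrative, not the thesis's pipeline):

```python
import numpy as np

def photometric_normals(images, lights):
    """Classic Lambertian photometric stereo: given K images of a
    static scene lit from K known directions, solve I = L . n per
    pixel in the least-squares sense for the albedo-scaled normal,
    then normalise. `images` has shape (K, H, W); `lights` is (K, 3)."""
    k, h, w = images.shape
    i = images.reshape(k, h * w)                     # K intensities per pixel
    g, *_ = np.linalg.lstsq(lights, i, rcond=None)   # (3, H*W) scaled normals
    norm = np.linalg.norm(g, axis=0)
    n = g / np.where(norm == 0, 1, norm)             # unit normals
    return n.T.reshape(h, w, 3)
```

Fusing such normal maps with triangulated depth is what suppresses the high-frequency noise the abstract mentions: triangulation fixes the low-frequency shape while the normals refine fine surface detail.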

    Contribució als mètodes d'obtenció i representació de vistes d'objectes reals per aplicacions interactives [Contribution to methods for obtaining and representing views of real objects for interactive applications]

    This thesis presents a series of experiments to identify, characterise and compare several methods for obtaining views of real objects for interactive applications such as augmented reality, telepresence, or others that may be devised in the future. While developing these methods, of diverse nature, difficulties arose that led this work deeper into the geometry of view synthesis, the reconstruction of the three-dimensional structure of objects, the hardware acceleration of certain algorithms, and the portability of the data across the network. Specifically, three methods have been identified that can satisfy the stated requirements. The first, access to object views compressed in files, rests on the data organisation presented, the compression capability of the algorithms, and hardware support for the decompression task. The second method, three-dimensional reconstruction and projection using the graphics coprocessor, exploits the high performance of GPUs, driven by market demands. The third, selection of a representative set of views and interpolation between them, exploits the properties of three-view rectification and the accuracy of view interpolation when a sufficiently dense disparity map is available. This last requirement links the third method to the second, which supplies the reconstructed three-dimensional model, since the two expressions of the information are equivalent. Three criteria were used to compare the results of the methods studied. The first, obviously, is the quality of the views obtained of the object; the main sources of error in the processes had to be identified and evaluators of this error sought, both numerical and subjective, since the final recipient of the views is a human being. The second is the time needed to obtain a view (important for interactivity), projected onto existing or foreseeable technological platforms. The third is the amount of data required by each method, which limits the portability of object visualisation. In the course of this thesis several contributions were made, most of them already published, which can be summarised as: the design of a methodology for representing objects from sets of views and synthesis methods, including a protocol for acquiring and organising the data, ideas for selecting the minimum set of views, a criterion for recording the minimum necessary information, aids for obtaining the required three-dimensional scene information, and a fast, general view-synthesis algorithm; the removal of the geometric restrictions of the three-view-rectification synthesis method, generalising the placement of the virtual camera and optimising the distance to the reprojection plane to maximise the area of the interpolated view; a complete specification of the view-synthesis algorithm for the three-view-rectification method, so that it can be implemented with DSP-type processors or the specific instruction sets of CISC processors, meeting the needs of interactive applications; a method for refining three-dimensional models obtained by space carving using stereovision, combining two well-known computer vision techniques to achieve a better three-dimensional reconstruction; and the acceleration of the voxel-projection three-dimensional reconstruction method using distance maps, tree structures, and the graphics coprocessor present in personal computers. The results obtained in this thesis were adapted for use in a project on simulating road-driving situations with augmented reality, developed by the UPC and the University of Toronto, and a second project on the remote display of views of archaeological objects, developed by the UPC, the UB, and a group of foreign universities.
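The disparity-driven view interpolation that underpins the third method can be sketched with a simple forward warp between two rectified views; this is a minimal illustration under assumed inputs (a left image and a dense disparity map), not the thesis's optimised DSP implementation:

```python
import numpy as np

def interpolate_views(left, disparity, t):
    """Disparity-driven view interpolation between two rectified
    views: each left-image pixel is forward-warped by a fraction t of
    its disparity (t = 0 reproduces the left view, t = 1 approximates
    the right view). Holes caused by occlusions are left at zero;
    practical systems fill them from the other view."""
    h, w = left.shape[:2]
    out = np.zeros_like(left)
    xs = np.arange(w)
    for y in range(h):
        xt = np.clip((xs - t * disparity[y]).round().astype(int), 0, w - 1)
        out[y, xt] = left[y, xs]            # forward warp along the scan line
    return out
```

The accuracy of this interpolation degrades directly with the density and quality of the disparity map, which is why the abstract ties this method to the three-dimensional reconstruction of the second one.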

    Learning and recovering 3D surface deformations

    Recovering the 3D deformations of a non-rigid surface from a single viewpoint has applications in many domains such as sports, entertainment, and medical imaging. Unfortunately, without any knowledge of the possible deformations that the object of interest can undergo, it is severely under-constrained, and extremely different shapes can have very similar appearances when reprojected onto an image plane. In this thesis, we first exhibit the ambiguities of the reconstruction problem when relying on correspondences between a reference image for which we know the shape and an input image. We then propose several approaches to overcoming these ambiguities. The core idea is that some a priori knowledge about how a surface can deform must be introduced to solve them. We therefore present different ways to formulate that knowledge that range from very generic constraints to models specifically designed for a particular object or material. First, we propose generally applicable constraints formulated as motion models. Such models simply link the deformations of the surface from one image to the next in a video sequence. The obvious advantage is that they can be used independently of the physical properties of the object of interest. However, to be effective, they require the presence of texture over the whole surface, and, additionally, do not prevent error accumulation from frame to frame. To overcome these weaknesses, we propose to introduce statistical learning techniques that let us build a model from a large set of training examples, that is, in our case, known 3D deformations. The resulting model then essentially performs linear or non-linear interpolation between the training examples. Following this approach, we first propose a linear global representation that models the behavior of the whole surface. 
As is the case with all statistical learning techniques, the applicability of this representation is limited by the fact that acquiring training data is far from trivial. A large surface can undergo many subtle deformations, and thus a large amount of training data must be available to build an accurate model. We therefore propose an automatic way of generating such training examples in the case of inextensible surfaces. Furthermore, we show that the resulting linear global models can be incorporated into a closed-form solution to the shape recovery problem. This lets us not only track deformations from frame to frame, but also reconstruct surfaces from individual images. The major drawback of global representations is that they can only model the behavior of a specific surface, which forces us to re-train a new model for every new shape, even if it is made of a material observed before. To overcome this issue, and simultaneously reduce the amount of required training data, we propose local deformation models. Such models describe the behavior of small portions of a surface and can be combined to form arbitrary global shapes. For this purpose, we study both linear and non-linear statistical learning methods, and show that, whereas the latter are better suited for tracking deformations from frame to frame, the former can also be used for reconstruction from a single image.
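A linear global deformation model of the kind described above is, in essence, a PCA basis over flattened vertex coordinates. A minimal sketch, with hypothetical function names and random stand-in data rather than real training meshes:

```python
import numpy as np

def learn_linear_model(training_shapes, n_modes=5):
    """Build a linear global deformation model from training meshes:
    each shape is a flattened vector of 3D vertex coordinates, and new
    shapes are expressed as the mean plus a small number of principal
    deformation modes."""
    x = np.asarray(training_shapes)            # (N, 3V) training matrix
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_modes]                  # modes: (n_modes, 3V)

def synthesize(mean, modes, coeffs):
    """Interpolate between training examples: shape = mean + c . modes."""
    return mean + np.asarray(coeffs) @ modes
```

Shape recovery then reduces to estimating the low-dimensional coefficient vector from image correspondences, which is what makes the closed-form solution mentioned in the abstract tractable; the local models partition the surface into patches and learn one such basis per patch.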

    Calibration of non-conventional imaging systems


    Spatial integration in computer-augmented realities

    In contrast to virtual reality, which immerses the user in a wholly computer-generated perceptual environment, augmented reality systems superimpose virtual entities on the user's view of the real world. This concept promises to enable new applications in a wide range of fields, but there are some challenging issues to be resolved. One issue relates to achieving accurate registration of virtual and real worlds. Accurate spatial registration is required not only with respect to lateral positioning, but also in depth. A limiting problem with existing optical-see-through displays, typically used for augmenting reality, is that they are incapable of displaying a full range of depth cues. Most significantly, they are unable to occlude the real background and hence cannot produce interposition depth cueing. Neither are they able to modify the real-world view in the ways required to produce convincing common-illumination effects such as virtual shadows across real surfaces. Also, at present, there are no wholly satisfactory ways of determining suitable common illumination models with which to determine the real-virtual light interactions necessary for producing such depth cues. This thesis establishes that interposition is essential for appropriate estimation of depth in augmented realities, and that the presence of shadows provides an important refining cue. It also extends the concept of a transparency alpha-channel to allow optical-see-through systems to display appropriate depth cues. The generalised theory of the approach is described mathematically and algorithms are developed to automate the generation of display-surface images. Three practical physical display strategies are presented: using a transmissive mask, selective lighting using digital projection, and selective reflection using digital micromirror devices. With respect to obtaining a common illumination model, all current approaches require either prior knowledge of the light sources illuminating the real scene, or the insertion of some kind of probe into the scene with which to determine real light-source position, shape, and intensity. This thesis presents an alternative approach that infers a plausible illumination from a limited view of the scene.
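The transparency alpha-channel idea can be illustrated with standard per-pixel alpha compositing; this sketch shows the intent (occlusion and shadow darkening of the real view), not the optical display hardware that realises it:

```python
import numpy as np

def composite_with_occlusion(real, virtual, alpha):
    """Per-pixel transparency alpha-channel for a see-through display:
    alpha = 1 keeps the virtual object opaque (it occludes the real
    background and produces interposition cueing), alpha = 0 lets the
    real world through unchanged, and intermediate values attenuate
    real surfaces, e.g. to render virtual shadows across them.
    `real` and `virtual` are (H, W, 3) images; `alpha` is (H, W)."""
    alpha = alpha[..., None]                 # broadcast over RGB channels
    return alpha * virtual + (1 - alpha) * real
```

In an optical-see-through system this computation is realised physically, e.g. by a transmissive mask or micromirror array that blocks real-world light where alpha is high, rather than by framebuffer blending.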

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    No abstract available