Sistemas automáticos de informação e segurança para apoio na condução de veículos (Automatic information and safety systems for driver assistance)
Doutoramento em Engenharia Mecânica (PhD in Mechanical Engineering)

The main object of this thesis is the study of algorithms for automatic information
processing and representation, in particular information provided
by onboard sensors (2D and 3D), to be used in the context of
driving assistance. The work focuses on some of the problems facing
today's Autonomous Driving (AD) systems and Advanced Driver Assistance
Systems (ADAS). The document is composed of two parts.
The first part describes the design, construction and development of
three robotic prototypes, including remarks about onboard sensors, algorithms
and software architectures. These robots were used as test
beds for validating the developed techniques; additionally,
they have participated in several autonomous driving competitions with
very good results. The second part of this document presents several
algorithms for generating intermediate representations of the raw
sensor data. They can be used to enhance existing pattern recognition,
detection or navigation techniques, and may thus benefit future
AD or ADAS applications. Since vehicles often contain a large number
of sensors of different natures, intermediate representations are particularly
advantageous; they can be used for tackling problems related
with the diverse nature of the data (2D, 3D, photometric, etc.), with the
asynchrony of the data (multiple sensors streaming data at different
frequencies), or with the alignment of the data (calibration issues, different
sensors providing different measurements of the same object).
Within this scope, novel techniques are proposed for computing a multi-camera,
multi-modal inverse perspective mapping representation, for performing
color correction between images to obtain high-quality mosaics, and
for producing a scene representation based on polygonal primitives that
can cope with very large amounts of 3D and 2D data, with the
ability to refine the representation as new information is continuously
received.
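The inverse perspective mapping (IPM) representation mentioned above can be illustrated with a minimal single-camera sketch. Assuming a pinhole camera with known intrinsics K and extrinsics (R, t), and a flat road at world z = 0 (both assumptions are for illustration only; the thesis computes a multi-camera, multi-modal version), the pixel-to-ground mapping reduces to a plane homography:

```python
import numpy as np

def ipm_homography(K, R, t):
    """Homography mapping image pixels to ground-plane (z = 0) coordinates.

    Assumes a pinhole camera with intrinsics K and extrinsics (R, t) and a
    flat road at world z = 0; the first two columns of R together with t
    give the plane-to-image homography, which we invert.
    """
    H_plane_to_img = K @ np.column_stack((R[:, 0], R[:, 1], t))
    return np.linalg.inv(H_plane_to_img)

def pixel_to_ground(H_img_to_ground, u, v):
    """Back-project pixel (u, v) onto the ground plane."""
    p = H_img_to_ground @ np.array([u, v, 1.0])
    return p[:2] / p[2]
```

With a camera 5 units above the road looking straight down (R = I, t = (0, 0, 5), K = I), the pixel (0.2, 0.4) maps back to the ground point (1, 2).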
SHREC'20: Shape correspondence with non-isometric deformations
Estimating correspondence between two shapes continues to be a challenging problem in geometry processing. Most current methods assume the deformation to be near-isometric; however, this is often not the case. For this paper, a collection of shapes of different animals has been curated, in which parts of the animals (e.g., mouths, tails, and ears) correspond yet are naturally non-isometric. Ground-truth correspondences were established by asking three specialists to independently label corresponding points on each of the models with respect to a previously labelled reference model. We employ an algorithmic strategy to select, for each correspondence, a single point that is representative of the proposed labels. A novel technique that characterises the sparsity and distribution of correspondences is employed to measure the performance of ten shape correspondence methods.
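A simple proxy for this kind of benchmark evaluation can be sketched as follows; `correspondence_accuracy` is a hypothetical helper, and plain Euclidean distance stands in for the geodesic distance typically used in such evaluations:

```python
import numpy as np

def correspondence_accuracy(pred_pts, gt_pts, thresholds):
    """Fraction of predicted correspondences within each error threshold.

    pred_pts, gt_pts: (N, 3) arrays of matched point locations on the
    target shape. Euclidean distance is used here as a simple stand-in
    for the geodesic distance normally used in benchmarks.
    """
    err = np.linalg.norm(pred_pts - gt_pts, axis=1)
    return np.array([np.mean(err <= t) for t in thresholds])
```

Sweeping the threshold produces the familiar cumulative accuracy curve used to rank correspondence methods.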
Detailed and Practical 3D Reconstruction with Advanced Photometric Stereo Modelling
Object 3D reconstruction has always been one of the main objectives of computer vision. After many decades of research, most techniques are still unsuccessful at recovering high-resolution surfaces, especially for objects with limited surface texture. Moreover, most shiny materials are particularly hard to reconstruct.
Photometric Stereo (PS), which operates by capturing multiple images under changing illumination, has traditionally been one of the most successful techniques at recovering a large amount of surface detail, by exploiting the relationship between shading and local shape. However, using PS has been highly impractical because most approaches are only applicable in a very controlled lab setting and limited to objects exhibiting diffuse reflection.
Nevertheless, recent advances in differential modelling have made complicated Photometric Stereo models tractable, and variational optimisations for these kinds of models show remarkable resilience to real-world imperfections such as non-Gaussian noise and other outliers. Thus, a highly accurate, photometric-based reconstruction system is now possible.
The contribution of this thesis is threefold. First, the Photometric Stereo model is extended to deal with arbitrary ambient lighting. This is a step towards acquisition outside a fully controlled lab setting. Second, the need for a priori knowledge of the light-source brightness and attenuation characteristics is relaxed: an alternating optimisation procedure is proposed that estimates these parameters. This extension allows for quick acquisition with inexpensive LEDs that exhibit unpredictable illumination characteristics (flickering, etc.). Finally, a volumetric parameterisation is proposed which allows one to tackle the multi-view Photometric Stereo problem in a similar manner, in a simple unified differential model. This final extension allows for complete object reconstruction, merging information from multiple images taken from multiple viewpoints under variable illumination.
The theoretical work in this thesis is experimentally evaluated in a number of challenging real-world experiments, with data captured by custom-made hardware. In addition, the generality of the proposed models is demonstrated by presenting a differential model for the shape-from-polarisation problem, which leads to a unified optimisation problem fusing information from both methods. This allows for the acquisition of geometrical information about objects such as semi-transparent glass, hitherto hard to deal with.
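As background for the extensions described above, the classical calibrated Lambertian Photometric Stereo baseline can be sketched in a few lines: with m ≥ 3 known light directions, per-pixel albedo and surface normal follow from a linear least-squares solve. This is the textbook baseline only, not the thesis's extended differential model:

```python
import numpy as np

def lambertian_ps(I, L):
    """Classical calibrated Lambertian Photometric Stereo.

    Model: I_k = rho * (l_k . n) per pixel, with m >= 3 known unit light
    directions. I: (m, n_pix) stacked intensities; L: (m, 3) light matrix.
    Solves L @ (rho * n) = I in the least-squares sense, then factors the
    result into albedo rho and unit normal n.
    """
    G, *_ = np.linalg.lstsq(L, I, rcond=None)   # G = rho * n, shape (3, n_pix)
    rho = np.linalg.norm(G, axis=0)
    n = G / np.maximum(rho, 1e-12)              # avoid division by zero
    return rho, n
```

For a single pixel with albedo 0.8 and normal pointing along +z, three non-coplanar unit lights recover both quantities exactly.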
State of the Art in Dense Monocular Non-Rigid 3D Reconstruction
3D reconstruction of deformable (or non-rigid) scenes from a set of monocular
2D image observations is a long-standing and actively researched area of
computer vision and graphics. It is an ill-posed inverse problem,
since--without additional prior assumptions--it permits infinitely many
solutions leading to accurate projection to the input 2D images. Non-rigid
reconstruction is a foundational building block for downstream applications
like robotics, AR/VR, or visual content creation. The key advantage of using
monocular cameras is their omnipresence and availability to the end users as
well as their ease of use compared to more sophisticated camera set-ups such as
stereo or multi-view systems. This survey focuses on state-of-the-art methods
for dense non-rigid 3D reconstruction of various deformable objects and
composite scenes from monocular videos or sets of monocular views. It reviews
the fundamentals of 3D reconstruction and deformation modeling from 2D image
observations. We then start from general methods--that handle arbitrary scenes
and make only a few prior assumptions--and proceed towards techniques making
stronger assumptions about the observed objects and types of deformations (e.g.
human faces, bodies, hands, and animals). A significant part of this STAR is
also devoted to classification and a high-level comparison of the methods, as
well as an overview of the datasets for training and evaluation of the
discussed techniques. We conclude by discussing open challenges in the field
and the social aspects associated with the usage of the reviewed methods.
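The ill-posedness noted above can be made concrete with a toy example: under orthographic projection, shapes that differ only in depth produce identical 2D observations, so infinitely many 3D interpretations fit the same images unless prior assumptions constrain the solution:

```python
import numpy as np

# Under orthographic projection (z is simply dropped), adding any depth
# offset yields a different 3D shape with identical 2D observations --
# a toy illustration of why monocular non-rigid reconstruction is
# ill-posed without priors.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])                   # orthographic projection
shape_a = np.array([[0.0, 0.0, 1.0],
                    [1.0, 0.0, 2.0],
                    [0.0, 1.0, 3.0]])
shape_b = shape_a + np.array([0.0, 0.0, 5.0])     # same x, y; different depths
proj_a = shape_a @ P.T
proj_b = shape_b @ P.T                            # identical projections
```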
Modelling appearance and geometry from images
Acquisition of realistic and relightable 3D models of large outdoor structures, such as buildings, requires the modelling of detailed geometry and visual appearance. Recovering these material characteristics can be very time-consuming and needs specially dedicated equipment. Alternatively, surface detail can be conveyed by textures recovered from images, whose appearance is only valid under the originally photographed viewing and lighting conditions. Methods to easily capture locally detailed geometry, such as cracks in stone walls, and visual appearance require control of lighting conditions, and are usually restricted to small portions of surfaces captured at close range. This thesis investigates the acquisition of high-quality models from images, using simple photographic equipment and modest user intervention. The main focus of this investigation is on approximating detailed local depth information and visual appearance, obtained using a new image-based approach, and combining this with gross-scale 3D geometry. This is achieved by capturing these surface characteristics in small accessible regions and transferring them to the complete façade. This approach yields high-quality models, imparting the illusion of measured reflectance. In this thesis, we first present two novel algorithms for surface detail and visual appearance transfer, where these material properties are captured for small exemplars using an image-based technique. Second, we develop an interactive solution to the problems of performing the transfer over both a large change in scale and to the different materials contained in a complete façade. Aiming to completely automate this process, a novel algorithm to differentiate between materials in the façade and associate them with the correct exemplars is introduced, with promising results.
Third, we present a new method for texture reconstruction from multiple images that optimises texture quality by choosing the best view for every point and minimising seams. Material properties are transferred from the exemplars to the texture map, approximating reflectance and meso-structure. The combination of these techniques results in a complete working system capable of producing realistic, relightable models of full building façades, containing high-resolution geometry and plausible visual appearance.
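The "best view for every point" criterion can be sketched with a simple frontalness score; `best_view_per_point` is a hypothetical helper that scores each camera by how directly it faces the surface, one common proxy for view quality in texture reconstruction:

```python
import numpy as np

def best_view_per_point(normals, cam_positions, points):
    """For each surface point, pick the camera whose viewing direction is
    most frontal to the surface (largest n . v) -- a simple proxy for the
    'best view' criterion used in texture reconstruction.

    normals: (N, 3) unit normals; points: (N, 3); cam_positions: (C, 3).
    Returns (N,) array of camera indices.
    """
    # Unit view directions from each point towards each camera: (C, N, 3)
    v = cam_positions[:, None, :] - points[None, :, :]
    v = v / np.linalg.norm(v, axis=2, keepdims=True)
    score = np.einsum('cnk,nk->cn', v, normals)   # cosine of viewing angle
    return np.argmax(score, axis=0)
```

A full system would also weight resolution and occlusion, and then minimise seams between regions assigned to different views.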
3D facial performance capture from monocular RGB video.
3D facial performance capture is an essential technique for animation production in feature films, video gaming, human-computer interaction, VR/AR asset creation and digital heritage, all of which have a huge impact on our daily life. Traditionally, dedicated hardware such as depth sensors, laser scanners and camera arrays has been developed to acquire depth information for this purpose. However, such sophisticated instruments can only be operated by trained professionals. In recent years, the widespread availability of mobile devices, and the increased interest of casual untrained users in applications such as image and video editing and virtual and facial model creation, have sparked interest in 3D facial reconstruction from 2D RGB input. Due to depth ambiguity and facial appearance variation, 3D facial performance capture and modelling from 2D images are inherently ill-posed problems. However, with strong prior knowledge of the human face, it is possible to accurately infer the true 3D facial shape and performance from multiple observations captured at different viewing angles. Various 3D-from-2D methods have been proposed and proven to work well in controlled environments. Nevertheless, there are still many unexplored issues in uncontrolled, in-the-wild environments. To achieve the same level of performance as in controlled environments, interfering factors such as varying illumination, partial occlusion and facial variation not captured by prior knowledge require the development of new techniques. This thesis addresses existing challenges and proposes novel methods involving 2D landmark detection, 3D facial reconstruction and 3D performance tracking, which are validated through theoretical research and experimental studies. 3D facial performance tracking is a multidisciplinary problem involving many areas such as computer vision, computer graphics and machine learning.
To deal with the large variations within a single image, we present new machine-learning techniques for facial landmark detection, based on our observation of facial features in challenging scenarios, to increase robustness. To take advantage of the evidence aggregated from multiple observations, we present new robust and efficient optimisation techniques that impose consistency constraints to help filter out outliers. To exploit person-specific model generation and the temporal and spatial coherence of continuous video input, we present new methods to improve performance via optimisation. In order to track the 3D facial performance, the fundamental prerequisite for good results is an accurate underlying 3D model of the actor. In this thesis, we present new methods targeted at 3D facial geometry reconstruction, which are more efficient than existing generic 3D geometry reconstruction methods. Evaluation and validation were obtained and analysed from substantial experiments, which show that the proposed methods outperform the state-of-the-art methods and enable us to generate high-quality results with fewer constraints.
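The "strong prior knowledge of the human face" used throughout such work is often encoded as a linear (3DMM-style) shape model; a minimal regularized least-squares fit of its coefficients, with all names hypothetical, might look like this:

```python
import numpy as np

def fit_linear_face_model(mean, basis, observed, lam=1e-3):
    """Least-squares fit of linear (3DMM-style) shape coefficients c so
    that mean + basis @ c approximates observed landmark positions, with
    Tikhonov regularization lam acting as a simple statistical prior.

    mean, observed: (3N,) stacked x, y, z coordinates; basis: (3N, K).
    """
    A = basis.T @ basis + lam * np.eye(basis.shape[1])
    b = basis.T @ (observed - mean)
    return np.linalg.solve(A, b)
```

In practice the observations are 2D landmarks seen through a camera projection, which turns this into a (still well-studied) nonlinear fit; the linear version above conveys the role of the prior.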
Deformable shape matching
Deformable shape matching has become an important building block in academia as well as in industry. Given two three-dimensional shapes A and B, the deformation function f aligning A with B has to be found. The function is discretized by a set of corresponding point pairs. Unfortunately, the computational cost of a brute-force search for correspondences is exponential. Additionally, to be of any practical use, the algorithm has to be able to deal with data coming directly from 3D scanner devices, which suffer from acquisition problems such as noise, holes, and missing topology information. This dissertation presents novel solutions to the shape matching problem. First, an algorithm estimating correspondences using a randomized search strategy is shown, incorporating a planning step that dramatically reduces the matching costs. Building on ideas from both contributions, a method for matching multiple shapes at once is presented; it facilitates the reconstruction of shape and motion from noisy data acquired with dynamic 3D scanners. Considering shape matching from another perspective, a solution using Markov Random Fields (MRF) is shown. Formulated as an MRF, partial as well as full matches of a shape can be found; belief propagation is utilized for inference in the MRF. Finally, an approach significantly reducing the space-time complexity of belief propagation for a wide spectrum of computer vision tasks is presented.
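The discretization of the alignment by corresponding point pairs can be illustrated with the simplest (greedy, non-randomized) strategy: match each point of A to its nearest neighbour in B. This sketches the generic building block only, not the dissertation's randomized search:

```python
import numpy as np

def closest_point_correspondences(A, B):
    """Discretize an alignment of shape A to shape B by point pairs:
    each point of A is matched to its nearest point of B (a common
    greedy building block, far cheaper than exhaustive search over
    all pairings). A: (n, 3), B: (m, 3) -> (n,) indices into B.
    """
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d2, axis=1)
```

Greedy nearest-neighbour matching is sensitive to noise and partial overlap, which is exactly what motivates the randomized and MRF-based strategies described in the abstract.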