336 research outputs found

    Automatic 3D Building Detection and Modeling from Airborne LiDAR Point Clouds

    Get PDF
    Urban reconstruction, with an emphasis on man-made structure modeling, is an active research area with broad impact on several potential applications. Urban reconstruction combines photogrammetry, remote sensing, computer vision, and computer graphics. Even though there is a huge volume of work that has been done, many problems still remain unsolved. Automation is one of the key focus areas in this research. In this work, a fast, completely automated method to create 3D watertight building models from airborne LiDAR (Light Detection and Ranging) point clouds is presented. The developed method analyzes the scene content and produces multi-layer rooftops, with complex rigorous boundaries and vertical walls, that connect rooftops to the ground. The graph cuts algorithm is used to separate vegetative elements from the rest of the scene content, which is based on the local analysis about the properties of the local implicit surface patch. The ground terrain and building rooftop footprints are then extracted, utilizing the developed strategy, a two-step hierarchical Euclidean clustering. The method presented here adopts a divide-and-conquer scheme. Once the building footprints are segmented from the terrain and vegetative areas, the whole scene is divided into individual pendent processing units which represent potential points on the rooftop. For each individual building region, significant features on the rooftop are further detected using a specifically designed region-growing algorithm with surface smoothness constraints. The principal orientation of each building rooftop feature is calculated using a minimum bounding box fitting technique, and is used to guide the refinement of shapes and boundaries of the rooftop parts. Boundaries for all of these features are refined for the purpose of producing strict description. Once the description of the rooftops is achieved, polygonal mesh models are generated by creating surface patches with outlines defined by detected vertices to produce triangulated mesh models. These triangulated mesh models are suitable for many applications, such as 3D mapping, urban planning and augmented reality

    Wavelet based similarity measurement algorithm for seafloor morphology

    Get PDF
    Thesis (S.M. in Naval Architecture and Marine Engineering and S.M. in Mechanical Engineering)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2006.Includes bibliographical references (leaves 71-73).The recent expansion of systematic seafloor exploration programs such as geophysical research, seafloor mapping, search and survey, resource assessment and other scientific, commercial and military applications has created a need for rapid and robust methods of processing seafloor imagery. Given the existence of a large library of seafloor images, a fast automated image classifier algorithm is needed to determine changes in seabed morphology over time. The focus of this work is the development of a robust Similarity Measurement (SM) algorithm to address the above problem. Our work uses a side-scan sonar image library for experimentation and testing. Variations of an underwater vehicle's height above the sea floor and of its pitch and roll angles cause distortion in the data obtained, such that transformations to align the data should include rotation, translation, anisotropic scaling and skew. In order to deal with these problems, we propose to use the Wavelet transform for similarity detection. Wavelets have been widely used during the last three decades in image processing. Since the Wavelet transform allows a multi-resolution decomposition, it is easier to identify the similarities between two images by examining the energy distribution at each decomposition level.(cont.) The energy distribution in the frequency domain at the output of the high pass and low pass filter banks identifies the texture discrimination. Our approach uses a statistical framework, involving fitting the Wavelet coefficients into a generalized Gaussian density distribution. The next step involves use of the Kullback-Leibner entropy metric to measure the distance between Wavelet coefficient distributions. To select the top N most likely matching images, the database images are ranked based on the minimum Kullback-Leibner distance. The statistical approach is effective in eliminating rotation, mis-registration and skew problems by working in the Wavelet domain. It's recommended that further work focuses on choosing the best Wavelet packet to increase the robustness of the algorithm developed in this thesis.by Ilkay Darilmaz.S.M.in Naval Architecture and Marine Engineering and S.M.in Mechanical Engineerin

    Synthetic Aperture Radar (SAR) Meets Deep Learning

    Get PDF
    This reprint focuses on the application of the combination of synthetic aperture radars and depth learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor, whose all-day and all-weather working capacity give it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecast, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning represented by convolution neural networks has promoted significant progress in the computer vision community, e.g., in face recognition, the driverless field and Internet of things (IoT). Deep learning can enable computational models with multiple processing layers to learn data representations with multiple-level abstractions. This can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews and technical reports

    Resolution enhancement of thermal infrared images via high resolution class-map and statistical methods

    Get PDF
    Remote sensing from long stand-off distances offers numerous advantages. As our ability to extract information from data has increased, so has the need for high spatial resolution. Such results are often not available due to technological or financial limitations on the detectors which scan the scene and produce the imagery. Therefore, for many years to come, spatial resolution enhancement using additional data from a variety of sources shall remain popular and cost-effective. Work has been on-going at the Rochester Institute of Technology\u27s Center for Imaging Science in the spatial resolution enhancement of thermal infrared imagery. Background thermal imaging theory is presented and the most recent work by the Digital Imaging and Remote Sensing group is reviewed. A literature search of materials published on the topic since 1985 is included: numerous methods and techniques are presented. Based upon these concepts several areas of study were carried out. All investigations undertaken were confined to cases that ensure radiometric fidelity across image processing operations, since derivation of accurate temperature or emissivity maps necessitate this requirement. Given a low spatial resolution thermal band, these methods produced a high resolution estimate thereof based on enhancement using: (1) a single panchromatic band, (2) a high resolution class-map derived from multi-spectral bands and (3) a statistically based combination of multi-spectral bands

    Proceedings of the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    Get PDF
    This book is a collection of 15 reviewed technical reports summarizing the presentations at the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory. The covered topics include image processing, optical signal processing, visual inspection, pattern recognition and classification, human-machine interaction, world and situation modeling, autonomous system localization and mapping, information fusion, and trust propagation in sensor networks

    From plain visualisation to vibration sensing: using a camera to control the flexibilities in the ITER remote handling equipment

    Get PDF
    Thermonuclear fusion is expected to play a key role in the energy market during the second half of this century, reaching 20% of the electricity generation by 2100. For many years, fusion scientists and engineers have been developing the various technologies required to build nuclear power stations allowing a sustained fusion reaction. To the maximum possible extent, maintenance operations in fusion reactors are performed manually by qualified workers in full accordance with the "as low as reasonably achievable" (ALARA) principle. However, the option of hands-on maintenance becomes impractical, difficult or simply impossible in many circumstances, such as high biological dose rates. In this case, maintenance tasks will be performed with remote handling (RH) techniques. The International Thermonuclear Experimental Reactor ITER, to be commissioned in southern France around 2025, will be the first fusion experiment producing more power from fusion than energy necessary to heat the plasma. Its main objective is “to demonstrate the scientific and technological feasibility of fusion power for peaceful purposes”. However ITER represents an unequalled challenge in terms of RH system design, since it will be much more demanding and complex than any other remote maintenance system previously designed. The introduction of man-in-the-loop capabilities in the robotic systems designed for ITER maintenance would provide useful assistance during inspection, i.e. by providing the operator the ability and flexibility to locate and examine unplanned targets, or during handling operations, i.e. by making peg-in-hole tasks easier. Unfortunately, most transmission technologies able to withstand the very specific and extreme environmental conditions existing inside a fusion reactor are based on gears, screws, cables and chains, which make the whole system very flexible and subject to vibrations. This effect is further increased as structural parts of the maintenance equipment are generally lightweight and slender structures due to the size and the arduous accessibility to the reactor. Several methodologies aiming at avoiding or limiting the effects of vibrations on RH system performance have been investigated over the past decade. These methods often rely on the use of vibration sensors such as accelerometers. However, reviewing market shows that there is no commercial off-the-shelf (COTS) accelerometer that meets the very specific requirements for vibration sensing in the ITER in-vessel RH equipment (resilience to high total integrated dose, high sensitivity). The customisation and qualification of existing products or investigation of new concepts might be considered. However, these options would inevitably involve high development costs. While an extensive amount of work has been published on the modelling and control of flexible manipulators in the 1980s and 1990s, the possibility to use vision devices to stabilise an oscillating robotic arm has only been considered very recently and this promising solution has not been discussed at length. In parallel, recent developments on machine vision systems in nuclear environment have been very encouraging. Although they do not deal directly with vibration sensing, they open up new prospects in the use of radiation tolerant cameras. This thesis aims to demonstrate that vibration control of remote maintenance equipment operating in harsh environments such as ITER can be achieved without considering any extra sensor besides the embarked rad-hardened cameras that will inevitably be used to provide real-time visual feedback to the operators. In other words it is proposed to consider the radiation-tolerant vision devices as full sensors providing quantitative data that can be processed by the control scheme and not only as plain video feedback providing qualitative information. The work conducted within the present thesis has confirmed that methods based on the tracking of visual features from an unknown environment are effective candidates for the real-time control of vibrations. Oscillations induced at the end effector are estimated by exploiting a simple physical model of the manipulator. Using a camera mounted in an eye-in-hand configuration, this model is adjusted using direct measurement of the tip oscillations with respect to the static environment. The primary contribution of this thesis consists of implementing a markerless tracker to determine the velocity of a tip-mounted camera in an untrimmed environment in order to stabilise an oscillating long-reach robotic arm. In particular, this method implies modifying an existing online interaction matrix estimator to make it self-adjustable and deriving a multimode dynamic model of a flexible rotating beam. An innovative vision-based method using sinusoidal regression to sense low-frequency oscillations is also proposed and tested. Finally, the problem of online estimation of the image capture delay for visual servoing applications with high dynamics is addressed and an original approach based on the concept of cross-correlation is presented and experimentally validated

    Real Time Sequential Non Rigid Structure from motion using a single camera

    Get PDF
    En la actualidad las aplicaciones que basan su funcionamiento en una correcta localización y reconstrucción dentro de un entorno real en 3D han experimentado un gran interés en los últimos años, tanto por la comunidad investigadora como por la industrial. Estas aplicaciones varían desde la realidad aumentada, la robótica, la simulación, los videojuegos, etc. Dependiendo de la aplicación y del nivel de detalle de la reconstrucción, se emplean diversos dispositivos, algunos específicos, más complejos y caros como las cámaras estéreo, cámara y profundidad (RGBD) con Luz estructurada y Time of Flight (ToF), así como láser y otros más avanzados. Para aplicaciones sencillas es suficiente con dispositivos de uso común, como los smartphones, en los que aplicando técnicas de visión artificial, se pueden obtener modelos 3D del entorno para, en el caso de la realidad aumentada, mostrar información aumentada en la ubicación seleccionada.En robótica, la localización y generación simultáneas de un mapa del entorno en 3D es una tarea fundamental para conseguir la navegación autónoma. Este problema se conoce en el estado del arte como Simultaneous Localization And Mapping (SLAM) o Structure from Motion (SfM). Para la aplicación de estas técnicas, el objeto no ha de cambiar su forma a lo largo del tiempo. La reconstrucción es unívoca salvo factor de escala en captura monocular sin referencia. Si la condición de rigidez no se cumple, es porque la forma del objeto cambia a lo largo del tiempo. El problema sería equivalente a realizar una reconstrucción por fotograma, lo cual no se puede hacer de manera directa, puesto que diferentes formas, combinadas con diferentes poses de cámara pueden dar proyecciones similares. Es por esto que el campo de la reconstrucción de objetos deformables es todavía un área en desarrollo. Los métodos de SfM se han adaptado aplicando modelos físicos, restricciones temporales, espaciales, geométricas o de otros tipos para reducir la ambigüedad en las soluciones, naciendo así las técnicas conocidas como Non-Rigid SfM (NRSfM).En esta tesis se propone partir de una técnica de reconstrucción rígida bien conocida en el estado del arte como es PTAM (Parallel Tracking and Mapping) y adaptarla para incluir técnicas de NRSfM, basadas en modelo de bases lineales para estimar las deformaciones del objeto modelado dinámicamente y aplicar restricciones temporales y espaciales para mejorar las reconstrucciones, además de ir adaptándose a cambios de deformación que se presenten en la secuencia. Para ello, hay que realizar cambios de manera que cada uno de sus hilos de ejecución procesen datos no rígidos.El hilo encargado del seguimiento ya realizaba seguimiento basado en un mapa de puntos 3D, proporcionado a priori. La modificación más importante aquí es la integración de un modelo de deformación lineal para que se realice el cálculo de la deformación del objeto en tiempo real, asumiendo fijas las formas básicas de deformación. El cálculo de la pose de la cámara está basado en el sistema de estimación rígido, por lo que la estimación de pose y coeficientes de deformación se hace de manera alternada usando el algoritmo E-M (Expectation-Maximization). También, se imponen restricciones temporales y de forma para restringir las ambigüedades inherentes en las soluciones y mejorar la calidad de la estimación 3D.Respecto al hilo que gestiona el mapa, se actualiza en función del tiempo para que sea capaz de mejorar las bases de deformación cuando éstas no son capaces de explicar las formas que se ven en las imágenes actuales. Para ello, se sustituye la optimización de modelo rígido incluida en este hilo por un método de procesamiento exhaustivo NRSfM, para mejorar las bases acorde a las imágenes con gran error de reconstrucción desde el hilo de seguimiento. Con esto, el modelo se consigue adaptar a nuevas deformaciones, permitiendo al sistema evolucionar y ser estable a largo plazo.A diferencia de una gran parte de los métodos de la literatura, el sistema propuesto aborda el problema de la proyección perspectiva de forma nativa, minimizando los problemas de ambigüedad y de distancia al objeto existente en la proyección ortográfica. El sistema propuesto maneja centenares de puntos y está preparado para cumplir con restricciones de tiempo real para su aplicación en sistemas con recursos hardware limitados

    Recalage/Fusion d'images multimodales à l'aide de graphes d'ordres supérieurs

    Get PDF
    The main objective of this thesis is the exploration of higher order Markov Random Fields for image registration, specifically to encode the knowledge of global transformations, like rigid transformations, into the graph structure. Our main framework applies to 2D-2D or 3D-3D registration and use a hierarchical grid-based Markov Random Field model where the hidden variables are the displacements vectors of the control points of the grid.We first present the construction of a graph that allows to perform linear registration, which means here that we can perform affine registration, rigid registration, or similarity registration with the same graph while changing only one potential. Our framework is thus modular regarding the sought transformation and the metric used. Inference is performed with Dual Decomposition, which allows to handle the higher order hyperedges and which ensures the global optimum of the function is reached if we have an agreement among the slaves. A similar structure is also used to perform 2D-3D registration.Second, we fuse our former graph with another structure able to perform deformable registration. The resulting graph is more complex and another optimisation algorithm, called Alternating Direction Method of Multipliers is needed to obtain a better solution within reasonable time. It is an improvement of Dual Decomposition which speeds up the convergence. This framework is able to solve simultaneously both linear and deformable registration which allows to remove a potential bias created by the standard approach of consecutive registrations.L’objectif principal de cette thèse est l’exploration du recalage d’images à l’aide de champs aléatoires de Markov d’ordres supérieurs, et plus spécifiquement d’intégrer la connaissance de transformations globales comme une transformation rigide, dans la structure du graphe. Notre cadre principal s’applique au recalage 2D-2D ou 3D-3D et utilise une approche hiérarchique d’un modèle de champ de Markov dont le graphe est une grille régulière. Les variables cachées sont les vecteurs de déplacements des points de contrôle de la grille.Tout d’abord nous expliciterons la construction du graphe qui permet de recaler des images en cherchant entre elles une transformation affine, rigide, ou une similarité, tout en ne changeant qu’un potentiel sur l’ensemble du graphe, ce qui assure une flexibilité lors du recalage. Le choix de la métrique est également laissée à l’utilisateur et ne modifie pas le fonctionnement de notre algorithme. Nous utilisons l’algorithme d’optimisation de décomposition duale qui permet de gérer les hyper-arêtes du graphe et qui garantit l’obtention du minimum exact de la fonction pourvu que l’on ait un accord entre les esclaves. Un graphe similaire est utilisé pour réaliser du recalage 2D-3D.Ensuite, nous fusionnons le graphe précédent avec un autre graphe construit pour réaliser le recalage déformable. Le graphe résultant de cette fusion est plus complexe et, afin d’obtenir un résultat en un temps raisonnable, nous utilisons une méthode d’optimisation appelée ADMM (Alternating Direction Method of Multipliers) qui a pour but d’accélérer la convergence de la décomposition duale. Nous pouvons alors résoudre simultanément recalage affine et déformable, ce qui nous débarrasse du biais potentiel issu de l’approche classique qui consiste à recaler affinement puis de manière déformable
    corecore