
    Robust Modular Feature-Based Terrain-Aided Visual Navigation and Mapping

    The visual feature-based Terrain-Aided Navigation (TAN) system presented in this thesis addresses the problem of constraining the inertial drift introduced into the location estimate of Unmanned Aerial Vehicles (UAVs) in GPS-denied environments. The TAN system uses salient visual features representing semantic or human-interpretable objects (roads, forest and water boundaries) from onboard aerial imagery and associates them with a database of reference features created a priori by applying the same feature detection algorithms to satellite imagery. Correlating the detected features with the reference features through a series of robust data association steps yields a localisation solution with a finite absolute bound on precision, defined by the certainty of the reference dataset. The feature-based Visual Navigation System (VNS) presented in this thesis was originally developed for a navigation application using simulated multi-year satellite image datasets. The extension of the system into the mapping domain, in turn, was based on real (not simulated) flight data and imagery. The mapping study demonstrated the full potential of the system as a versatile tool for enhancing the accuracy of information derived from aerial imagery. Visual features such as road networks, shorelines and water bodies were used not only to obtain a position 'fix' but also in reverse, to map vehicles detected on the roads into an inertial space with improved precision. Combined correction of the geo-coding errors and improved aircraft localisation formed a robust solution to the defense mapping application. A system of the proposed design will provide a complete independent navigation solution to an autonomous UAV and additionally give it object tracking capability.

    Segmentation and Characterization of Small Retinal Vessels in Fundus Images Using the Tensor Voting Approach

    SUMMARY The retina makes it easy to visualize part of the human vascular network. It thus offers a direct view of the development and outcome of certain diseases linked to the vascular network as a whole. Every complication visible on the retina can affect the patient's visual capacity. The smallest blood vessels are among the first anatomical structures affected as a disease progresses, so being able to analyze them is crucial. Changes in the state, appearance, morphology, functionality, or even growth of the small vessels indicate the severity of disease. Diabetes is a metabolic disease that affects millions of people around the world. It alters blood glucose levels and causes pathological changes in different organs of the human body. Diabetic retinopathy describes the set of conditions and consequences of diabetes at the level of the retina. Small vessels play a role in the onset, development, and consequences of retinopathy. In the last stages of the disease, the growth of new small vessels, called neovascularization, carries a significant risk of causing blindness. It is therefore crucial to detect all the changes occurring in the small retinal vessels in order to characterize healthy and abnormal vessels. The characterization itself can facilitate the local detection of a specific retinopathy. Automatic segmentation of anatomical structures such as the vascular network is a crucial step. This information can be provided to a physician to be considered in the diagnosis. In computer-aided diagnosis systems, the role of small vessels is significant.
Failing to detect them automatically can lead to an increase in the false positive rate for red lesions in later steps. Research efforts have so far focused on the precise localization of medium-sized vessels. Existing models have much more difficulty extracting small blood vessels. They are not robust to the large variance in vessel appearance or to interference with the background. Models in the existing literature assume a general shape that is not sufficient to fit the narrow width and the curvature that characterize small blood vessels. In addition, contrast with the background in small-vessel regions is very low. Segmentation or tracking methods produce fragmented or discontinuous results. Moreover, segmentation of small vessels generally comes at the expense of noise amplification. Deformable models are inadequate for segmenting small vessels. The forces used are not flexible enough to compensate for the low contrast, the width, and the variance of the vessels. Finally, machine learning approaches require training with a labelled database. Such databases are very difficult to obtain for small vessels. This thesis extends previous research by providing a new method for segmenting small retinal vessels. Multi-scale line detection (MSLD) is a recent method that shows good segmentation performance on retinal images, while tensor voting is a method proposed for reconnecting pixels. An approach combining a line detection algorithm with tensor voting is proposed.
The application of line detectors has proved effective for segmenting medium-sized vessels. In addition, perceptual organization approaches such as tensor voting have shown better robustness by combining neighboring information hierarchically. The tensor voting method is closer to human perception than other standard models. As shown in this manuscript, it is a more powerful tool for segmenting small vessels than existing methods. This specific combination allows us to overcome the fragmentation challenges that deformable-model methods face at the level of small vessels. We also propose using an adaptive threshold on the response of the line detection algorithm to be more robust to non-uniform images. We further illustrate how a combination of the two individual methods, at several scales, can reconnect vessels over variable distances. A vessel reconstruction algorithm is also proposed. This last step is necessary because complete geometric information is required to use the segmentation in a computer-aided diagnosis system. The segmentation was validated on a database of high-resolution fundus images. This database contains images showing diabetic retinopathy. The segmentation is evaluated with standard discrepancy measures as well as perception-based measures. Considering only the small vessels in the database images, our method improves the sensitivity rate by 6.47% over the standard multi-scale line detection method. Using the perception-based measures, the improvement is 7.8%. In a second part of the manuscript, we also propose a method for characterizing retinas as healthy or abnormal.
Some images contain neovascularization. Characterizing vessels as healthy or abnormal is an essential step in developing a computer-aided diagnosis system. Beyond the challenges posed by small healthy vessels, neovessels show an even higher degree of complexity. They form networks of vessels with complex and unusual morphology, often thin and strongly curved. Existing work is limited to using first-order features extracted from the segmented small vessels. Our contribution is to use tensor voting to isolate vascular junctions and to use these junctions as points of interest. We then use a second-order spatial statistic computed on the junctions to characterize vessels as healthy or pathological. Our method improves characterization sensitivity by 9.09% over a state-of-the-art method. The developed method proved effective for segmenting retinal vessels. Higher-order tensors, together with an implementation of tensor voting via steerable filtering, could be studied to further reduce execution time and to resolve the challenges that remain at vascular junctions. In addition, the characterization could be improved for detecting proliferative retinopathy by using supervised learning that includes cases of non-proliferative diabetic retinopathy or other pathologies.
Finally, incorporating the proposed methods into computer-aided diagnosis systems could promote regular screening for the early detection of retinopathies and other ocular pathologies, with the aim of reducing blindness in the population.
ABSTRACT As an easily accessible site for the direct observation of the circulatory system, the human retina can offer unique insight into the development or outcome of diseases. Retinal vessels are representative of the general condition of the whole systemic circulation, and thus can act as a "window" onto the status of the vascular network in the whole body. Each complication on the retina can have an adverse impact on the patient's sight. Small vessels are highly relevant here, as they are among the first anatomical structures affected as diseases progress. Moreover, changes in the small vessels' state, appearance, morphology, functionality, or even growth indicate the severity of the diseases. This thesis focuses on the retinal lesions due to diabetes, a serious metabolic disease affecting millions of people around the world. This disorder disturbs natural blood glucose levels, causing various pathophysiological changes in different systems across the human body. Diabetic retinopathy is the medical term for the condition in which the fundus and the retinal vessels are affected by diabetes. As in other diseases, small vessels play a crucial role in the onset, development, and outcome of the retinopathy. More importantly, at the latest stage, the growth of new small vessels, or neovascularization, constitutes a significant risk factor for blindness. Therefore, there is a need to detect all the changes that occur in the small retinal vessels, with the aim of characterizing the vessels as healthy or abnormal.
The characterization, in turn, can facilitate the local detection of a specific retinopathy, such as the sight-threatening proliferative diabetic retinopathy. Segmentation techniques can automatically isolate important anatomical structures like the vessels and provide this information to the physician to assist in the final decision. In comprehensive systems for automating diabetic retinopathy (DR) detection, the role of small vessels is significant, as missing them early in a computer-aided diagnosis (CAD) pipeline might lead to an increase in the false positive rate of red lesions in subsequent steps. So far, efforts have concentrated mostly on the accurate localization of medium-range vessels. In contrast, the existing models are weak in the case of small vessels. The generalization required to adapt an existing model does not allow the approaches to be flexible yet robust enough to compensate for the increased variability in appearance and the interference with the background. The current template models (matched filtering, line detection, and morphological processing) assume a general shape for the vessels that is not enough to approximate the narrow, curved characteristics of the small vessels. Additionally, due to the weak contrast in small-vessel regions, current segmentation and tracking methods produce fragmented or discontinuous results. Alternatively, small-vessel segmentation can be accomplished at the expense of background noise magnification when thresholding or image-derivative methods are used. Furthermore, the proposed deformable models are not able to propagate a contour to the full extent of the vasculature in order to enclose all the small vessels. The external forces of the deformable model are ineffective at compensating for the low contrast, the small width, the high variability in small-vessel appearance, and the discontinuities.
Internal forces, likewise, are not able to impose a global shape constraint on the contour that could approximate the variability in the appearance of the vasculature across different categories of vessels. Finally, machine learning approaches require training a classifier on a labelled set. Such sets are difficult to obtain, especially for the smallest vessels. In the case of unsupervised methods, the user has to predefine the number of clusters and initialize the cluster centers effectively in order to converge to the global minimum. This dissertation expands on previous research and provides a new segmentation method for the smallest retinal vessels. Multi-scale line detection (MSLD) is a recent method that demonstrates good segmentation performance on retinal images, while tensor voting is a method first proposed for reconnecting pixels. For the first time, we combined line detection with the tensor voting framework. Line detectors have proved an effective way to segment medium-sized vessels. Additionally, perceptual organization approaches like tensor voting demonstrate increased robustness by combining information from the neighborhood in a hierarchical way. Tensor voting is closer than standard models to the way human perception functions. As we show, it is a more powerful tool for segmenting small vessels than the existing methods. This specific combination allows us to overcome the apparent fragmentation challenge of the template methods at the smallest vessels. Moreover, we thresholded the line detection response adaptively to compensate for non-uniform images. We also combined the two individual methods in a multi-scale scheme in order to reconnect vessels at variable distances.
Finally, we reconstructed the vessels from their extracted centerlines based on pixel painting, as complete geometric information is required to use the segmentation in a CAD system. The segmentation was validated on a high-resolution fundus image database that includes diabetic retinopathy images of varying stages, using standard discrepancy measures as well as perceptual-based measures. When only the smallest vessels are considered, the improvement in the sensitivity rate for the database over the standard multi-scale line detection method is 6.47%. For the perceptual-based measure, the improvement is 7.8% over the basic method. The second objective of the thesis was to implement a method for characterizing isolated retinal areas as healthy or abnormal. Some of the original images from which these patches are extracted contain neovascularizations. Investigating image features for characterizing vessels as healthy or abnormal is an essential step toward developing a CAD system for automating DR screening. Given that the amount of data will increase significantly under CAD systems, focusing on this category of vessels can facilitate the referral of sight-threatening cases to early treatment. In addition to the challenges that small healthy vessels pose, neovessels demonstrate an even higher degree of complexity, as they form networks of convoluted, twisted, looped thin vessels. The existing work is limited to first-order characteristics extracted from the small segmented vessels, which limits the study of patterns. Our contribution is, first, to use the tensor voting framework to isolate the retinal vascular junctions and, in turn, to use those junctions as points of interest. Second, we exploited second-order statistics computed on the junction spatial distribution to characterize the vessels as healthy or neovascularizations.
In fact, the second-order spatial statistics extracted from the junction distribution are combined with widely used features to improve the characterization sensitivity by 9.09% over the state of the art. The developed method proved effective for the segmentation of the retinal vessels. Higher-order tensors, along with an implementation of tensor voting via steerable filtering, could be employed to further reduce the execution time and resolve the challenges at vascular junctions. Moreover, the characterization could be advanced to the detection of proliferative retinopathy by extending the supervised learning to include non-proliferative diabetic retinopathy cases or other pathologies. Ultimately, incorporating the methods into CAD systems could facilitate screening for the effective reduction of vision-threatening diabetic retinopathy rates, or the early detection of pathologies other than ocular ones.

    Doctor of Philosophy

    Neuroscientists are developing new imaging techniques and generating large volumes of data in an effort to understand the complex structure of the nervous system. The complexity and size of these data make human interpretation a labor-intensive task. To aid in the analysis, new segmentation techniques for identifying neurons in these feature-rich datasets are required. However, the extremely anisotropic resolution of the data makes segmentation and tracking across slices difficult. Furthermore, the thickness of the slices can make the membranes of the neurons hard to identify. Similarly, structures can change significantly from one section to the next because of slice thickness, which makes tracking difficult. This thesis presents a complete method for segmenting many neurons at once in two-dimensional (2D) electron microscopy images and reconstructing and visualizing them in three dimensions (3D). First, we present an advanced method for identifying neuron membranes in 2D, necessary for whole-neuron segmentation, using a machine learning approach. The method uses a series of artificial neural networks (ANNs) in a framework combined with a feature vector composed of image and context intensities sampled over a stencil neighborhood. Several ANNs are applied in series, allowing each ANN to use the classification context provided by the previous network to improve detection accuracy. To improve the membrane detection further, we use information from a nonlinear alignment of sequential learned membrane images in a final ANN that improves membrane detection in each section. The final output, the detected membranes, is used to obtain 2D segmentations of all the neurons in an image.
We also present a method that constructs 3D neuron representations by formulating the problem of finding paths through sets of sections as an optimal path computation, which applies a cost function to the identification of a cell from one section to the next and solves this optimization problem using Dijkstra's algorithm. This basic formulation accounts for variability or inconsistencies between sections and prioritizes cells based on the evidence of their connectivity. Finally, we present a tool that combines these techniques with a visual user interface that enables users to quickly segment whole neurons in large volumes.
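The optimal path formulation described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the layered graph, the `link_cost` function (Euclidean distance between cell centroids), and the names `best_path` and `sections` are all assumptions made for illustration. Each cell in one section is linked to every cell in the next, and Dijkstra's algorithm picks the cheapest chain of cells from a seed cell in the first section to the last section.

```python
import heapq
import math


def link_cost(a, b):
    """Hypothetical cost of linking two cells: distance between 2D centroids."""
    return math.dist(a, b)


def best_path(sections, start):
    """Run Dijkstra's algorithm over a layered graph of cells.

    sections: list of sections, each a list of (x, y) cell centroids.
    start: index of the seed cell in sections[0].
    Returns (total_cost, [chosen cell index in each section]).
    """
    last = len(sections) - 1
    dist = {(0, start): 0.0}
    prev = {}
    heap = [(0.0, 0, start)]
    while heap:
        d, s, i = heapq.heappop(heap)
        if d > dist.get((s, i), math.inf):
            continue  # stale queue entry, a cheaper route was already found
        if s == last:
            # The first last-section node popped is the cheapest endpoint.
            path = [i]
            node = (s, i)
            while node in prev:
                node = prev[node]
                path.append(node[1])
            return d, path[::-1]
        for j, centroid in enumerate(sections[s + 1]):
            nd = d + link_cost(sections[s][i], centroid)
            if nd < dist.get((s + 1, j), math.inf):
                dist[(s + 1, j)] = nd
                prev[(s + 1, j)] = (s, i)
                heapq.heappush(heap, (nd, s + 1, j))
    return math.inf, []


# Two cells tracked across three sections (made-up centroids).
sections = [[(0, 0), (10, 10)], [(1, 0), (9, 10)], [(1, 1), (10, 9)]]
cost, path = best_path(sections, start=0)
```

A real cost function would presumably draw on membrane evidence and region overlap rather than centroid distance alone; the sketch only shows how a per-link cost turns cell tracking into a shortest-path problem.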

    Doctor of Philosophy

    Electron microscopy can visualize synapses at nanometer resolution and can thereby capture the fine structure of these contacts. However, this imaging method lacks three key elements: temporal information, protein visualization, and large-volume reconstruction. For my dissertation, I developed three methods in electron microscopy that overcame these limitations. First, I developed a method to freeze neurons at any desired time point after a stimulus to study the synaptic vesicle cycle. Second, I developed a method to couple super-resolution fluorescence microscopy and electron microscopy to pinpoint the location of proteins in electron micrographs at nanometer resolution. Third, I collaborated with computer scientists to develop methods for semi-automated reconstruction of the nervous system. I applied these techniques to answer two fundamental questions in synaptic biology. Which vesicles fuse in response to a stimulus? How are synaptic vesicles recovered at synapses after fusion? Only vesicles that are in direct contact with the plasma membrane fuse upon stimulation. The active zone in C. elegans is broad, but primed vesicles are concentrated around the dense projection. Following exocytosis of synaptic vesicles, synaptic vesicle membrane was recovered rapidly at two distinct locations at a synapse: the dense projection and adherens junctions. These studies suggest that there may be a novel form of ultrafast endocytosis.

    Robust perceptual organization techniques for analysis of color images

    This thesis addresses the development of new robust image analysis techniques closely related to the behavior of the human visual system. One of the pillars of the thesis is tensor voting, a robust technique that propagates and aggregates information encoded in tensors through a convolution-like process. Its robustness and adaptability have been key to its use in this thesis. Both properties have been verified in three new applications of tensor voting: structure estimation, edge detection, and segmentation of images acquired through stereo vision. The main problem of tensor voting is its high computational cost. In this line, this thesis proposes two new efficient implementations of tensor voting derived from an in-depth analysis of the technique. Despite its adaptability, this thesis shows that the original formulation of tensor voting (hereafter, classical tensor voting) is not suitable for some applications, since the hypotheses on which it is based do not hold for all of them. This occurs particularly in color image filtering. Thus, this thesis shows that, more than a method, tensor voting is a methodology in which the encoding and the voting process can be adapted specifically to each application while maintaining the spirit of tensor voting. In this line, this thesis proposes a unified framework in which image filtering and robust edge detection are performed at the same time. This framework is an extension of classical tensor voting in which color and the probability of finding an edge at each pixel are encoded by tensors, and in which the voting process is based on a set of perceptual criteria related to the way the human visual system processes information.
Recent advances in color perception have been essential in designing this voting process. This new approach has been effective, obtaining excellent results in both applications. Specifically, the new method applied to image filtering performs better than state-of-the-art methods for real noise. This makes it more suitable for real applications, where filtering algorithms are indispensable. In addition, the method applied to edge detection produces more robust results than state-of-the-art techniques and performs competitively with respect to recall, discriminability, precision, and false alarm rejection. Furthermore, this thesis demonstrates that this new framework can be combined with other techniques to solve the problem of robust image segmentation. The tensors obtained with the new method are used to classify pixels as probably homogeneous or non-homogeneous. Both types of pixels are then segmented by means of a variant of an efficient graph-based image segmentation algorithm. Experiments show that the proposed algorithm obtains better results on three of the five evaluation metrics applied, compared with state-of-the-art techniques, at a competitive computational cost. The thesis also proposes new evaluation techniques in the field of image processing. Specifically, two image filtering metrics are proposed to measure the degree to which a method is able to preserve edges and avoid introducing defects. Likewise, a new methodology is proposed for evaluating edge detectors that avoids possible biases introduced by post-processing. This methodology is based on five metrics for estimating recall, discriminability, precision, false alarm rejection, and robustness.
Finally, two new non-parametric metrics are proposed to estimate the degree of over- and undersegmentation produced by image segmentation algorithms.
This thesis focuses on the development of new robust image analysis techniques more closely related to the way the human visual system behaves. One of the pillars of the thesis is the so-called tensor voting technique, a robust perceptual organization technique that propagates and aggregates information encoded by means of tensors through a convolution-like process. Its robustness and adaptability have been key reasons for using tensor voting in this thesis. These two properties are verified in the thesis by applying tensor voting to three applications where it had not been applied so far: image structure estimation, edge detection, and segmentation of images acquired through stereo vision. The most important drawback of tensor voting is that its usual implementations are highly time-consuming. In this line, this thesis proposes two new efficient implementations of tensor voting, both derived from an in-depth analysis of this technique. Despite its adaptability, this thesis shows that the original formulation of tensor voting (hereafter, classical tensor voting) is not adequate for some applications, since the hypotheses on which it is based are not suitable for all applications. This is particularly true for color image denoising.
Thus, this thesis shows that, more than a method, tensor voting can be thought of as a methodology in which the encoding and voting process can be tailored to every specific application while maintaining the tensor voting spirit. Following this reasoning, this thesis proposes a unified framework for both image denoising and robust edge detection. This framework is an extension of classical tensor voting in which both color and edginess (the likelihood of finding an edge at every pixel of the image) are encoded through tensors, and where the voting process takes into account a set of plausible perceptual criteria related to the way the human visual system processes visual information. Recent advances in the perception of color have been essential for designing such a voting process. This new approach has been found effective, since it yields excellent results for both applications. In particular, the new method applied to image denoising performs better than other state-of-the-art methods for real noise. This makes it more adequate for real applications, in which an image denoiser is indeed required. In addition, the method applied to edge detection yields more robust results than the state-of-the-art techniques and has competitive performance in recall, discriminability, precision, and false alarm rejection. Moreover, this thesis shows how the results of this new framework can be combined with other techniques to tackle the problem of robust color image segmentation. The tensors obtained by applying the new framework are used to classify pixels as likely homogeneous or likely inhomogeneous. Those pixels are then sequentially segmented through a variation of an efficient graph-based image segmentation algorithm.
Experiments show that the proposed segmentation algorithm yields better scores in three of the five applied evaluation metrics when compared to the state-of-the-art techniques, with a competitive computational cost. This thesis also proposes new evaluation techniques in the scope of image processing. First, two new metrics are proposed in the field of image denoising: one to measure how well an algorithm preserves edges, and the second to measure how well a method avoids introducing undesirable artifacts. Second, a new methodology for assessing edge detectors that avoids possible bias introduced by post-processing is proposed. It consists of five new metrics for assessing recall, discriminability, precision, false alarm rejection, and robustness. Finally, two new non-parametric metrics are proposed for estimating the degree of over- and undersegmentation yielded by image segmentation algorithms.

    Computational models for image contour grouping

    Contours are one-dimensional curves that may correspond to meaningful entities such as object boundaries. Accurate contour detection simplifies many vision tasks such as object detection and image recognition. Because of the large variety of image content and contour topology, contours are often first detected as edge fragments, followed by a second step known as "contour grouping" to connect them. Because of ambiguities in local image patches, contour grouping is essential for constructing a globally coherent contour representation. This thesis aims to group contours so that they are consistent with human perception. We draw inspiration from Gestalt principles, which describe the perceptual grouping ability of the human vision system. In particular, our work is most relevant to the principles of closure, similarity, and past experience. The first part of our contribution is a new computational model for contour closure. Most existing contour grouping methods have focused on pixel-wise detection accuracy and ignored the psychological evidence for topological correctness. This chapter proposes a higher-order CRF model to achieve contour closure in the contour domain. We also propose an efficient inference method that is guaranteed to find integer solutions. Tested on the BSDS benchmark, our method achieves superior contour grouping performance, comparable precision-recall curves, and more visually pleasing results. Our work makes progress towards a better computational model of human perceptual grouping. The second part is an energy minimization framework for the salient contour detection problem. Region cues such as color/texture homogeneity and contour cues such as local contrast are both useful for this task. In order to capture both kinds of cues in a joint energy function, topological consistency between region and contour labels must be satisfied. Our technique makes use of the topological concept of winding numbers.
By using a fast method for winding number computation, we find that a small number of linear constraints are sufficient for label consistency. Our method is instantiated by ratio-based energy functions. Due to cue integration, our method obtains improved results. User interaction can also be incorporated to further improve the results. The third part of our contribution is an efficient category-level image contour detector. The objective is to detect contours which most likely belong to a prescribed category. Our method, which is based on three levels of shape representation and non-parametric Bayesian learning, shows flexibility in learning from either human labeled edge images or unlabelled raw images. In both cases, our experiments obtain better contour detection results than competing methods. In addition, our training process is robust even with a considerable size of training samples. In contrast, state-of-the-art methods require more training samples, and often human interventions are required for new category training. Last but not least, in Chapter 7 we also show how to leverage contour information for symmetry detection. Our method is simple yet effective for detecting the symmetric axes of bilaterally symmetric objects in unsegmented natural scene images. Compared with methods based on feature points, our model can often produce better results for the images containing limited texture

    Multimodal Three Dimensional Scene Reconstruction, The Gaussian Fields Framework

    The focus of this research is on building 3D representations of real-world scenes and objects using different imaging sensors: primarily range acquisition devices (such as laser scanners and stereo systems), which allow the recovery of 3D geometry, and multi-spectral image sequences, including visual and thermal IR images, which provide additional scene characteristics. The crucial technical challenge that we addressed is the automatic point-set registration task. In this context, our main contribution is the development of an optimization-based method at the core of which lies a unified criterion that solves simultaneously for the dense point correspondence and transformation recovery problems. The new criterion has a straightforward expression in terms of the datasets and the alignment parameters and was used primarily for 3D rigid registration of point-sets. However, it also proved useful for feature-based multimodal image alignment. We derived our method from simple Boolean matching principles by approximation and relaxation. One of the main advantages of the proposed approach, compared to the widely used class of Iterative Closest Point (ICP) algorithms, is convexity in the neighborhood of the registration parameters and continuous differentiability, which allows the use of standard gradient-based optimization techniques. Physically, the criterion is interpreted in terms of a Gaussian Force Field exerted by one point-set on the other. This formulation proved useful for controlling and increasing the region of convergence, and hence for allowing more autonomy in correspondence tasks. Furthermore, the criterion can be computed with linear complexity using recently developed Fast Gauss Transform numerical techniques. In addition, we introduced a new local feature descriptor, derived from visual saliency principles, which significantly enhanced the performance of the registration algorithm. 
The resulting technique was subjected to a thorough experimental analysis that highlighted its strengths and showed its limitations. Our current applications are in the field of 3D modeling for inspection, surveillance, and biometrics. However, since this matching framework can be applied to any type of data that can be represented as N-dimensional point-sets, the scope of the method extends to many more pattern analysis applications.
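
The abstract above does not state the registration criterion explicitly. As a rough sketch, assume it is a sum of Gaussians of the distances between every transformed point of one set and every point of the other, E = sum_i sum_j exp(-||T(p_i) - q_j||^2 / sigma^2). The Python below evaluates that assumed form directly for 2D rigid transforms. The function name, the 2D restriction, and the parameter `sigma` are illustrative assumptions. The double loop is quadratic, whereas the abstract reports linear complexity through the Fast Gauss Transform, which is not reproduced here.

```python
import math


def gaussian_field_criterion(moving, fixed, angle, tx, ty, sigma=1.0):
    """Assumed sum-of-Gaussians alignment score between two 2D point-sets.

    moving is rotated by `angle` (radians) and translated by (tx, ty)
    before being compared with fixed. Higher values mean more overlap.
    """
    c, s = math.cos(angle), math.sin(angle)
    total = 0.0
    for x, y in moving:
        xr = c * x - s * y + tx
        yr = s * x + c * y + ty
        for u, v in fixed:
            d2 = (xr - u) ** 2 + (yr - v) ** 2
            total += math.exp(-d2 / sigma ** 2)
    return total


# Hypothetical data: `moving` is `fixed` shifted by (-0.5, +0.3).
fixed = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
moving = [(x - 0.5, y + 0.3) for x, y in fixed]

score_unaligned = gaussian_field_criterion(moving, fixed, 0.0, 0.0, 0.0)
score_aligned = gaussian_field_criterion(moving, fixed, 0.0, 0.5, -0.3)
```

Because every point pair contributes smoothly, a score of this form is differentiable in the transform parameters, in contrast to the hard nearest-neighbor assignment used by ICP. A larger `sigma` widens each Gaussian, which is one way to read the abstract's remark about enlarging the region of convergence.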

    Automated Extraction of Road Information from Mobile Laser Scanning Data

    Effective planning and management of transportation infrastructure requires adequate geospatial data. Existing geospatial data acquisition techniques based on conventional route surveys are very time-consuming, labor-intensive, and costly. Mobile laser scanning (MLS) technology enables the rapid collection of enormous volumes of highly dense, irregularly distributed, accurate, geo-referenced data in the form of three-dimensional (3D) point clouds. Today, more and more commercial MLS systems are available for transportation applications. However, many transportation engineers neither have an interest in the 3D point cloud data nor know how to transform such data into geometric road information in their computer-aided design (CAD) formats. Therefore, automated methods and software tools for the rapid and accurate extraction of 2D/3D road information from MLS data are urgently needed. This doctoral dissertation deals with the development and implementation of a novel strategy for the automated extraction of road information from MLS data. The main features of this strategy include: (1) the extraction of road surfaces from large volumes of MLS point clouds, (2) the generation of 2D geo-referenced feature (GRF) images from the road-surface data, (3) the exploration of the point density and intensity of MLS data for road-marking extraction, and (4) the extension of tensor voting (TV) to curvilinear pavement crack extraction. In accordance with this strategy, a RoadModeler prototype with three computerized algorithms was developed: (1) road-surface extraction, (2) road-marking extraction, and (3) pavement-crack extraction. The four main contributions of this development can be summarized as follows. Firstly, a curb-based approach to road-surface extraction, assisted by the vehicle’s trajectory, is proposed and implemented. 
The vehicle’s trajectory and the function of curbs, which separate road surfaces from sidewalks, are used to efficiently separate road-surface points from large volumes of MLS data. The accuracy of the extracted road surfaces is validated against manually selected reference points. Secondly, the extracted road surface enables accurate detection of road markings and cracks for transportation-related applications in road traffic safety. To further improve computational efficiency, the extracted 3D road data are converted into 2D image data, termed a GRF image. The GRF image of the extracted road serves as the input to both an automated road-marking extraction algorithm and an automated crack detection algorithm. Thirdly, the automated road-marking extraction algorithm applies a point-density-dependent, multi-threshold segmentation to the GRF image to overcome the unevenly distributed intensity caused by the scanning range, the incidence angle, and the surface characteristics of the illuminated object. A morphological operation is then applied to deal with noise and with incompleteness in the extracted road markings. Fourthly, the automated crack extraction algorithm applies an iterative tensor voting (ITV) algorithm to the GRF image for crack enhancement. Tensor voting, a perceptual organization method capable of extracting curvilinear structures from a noisy and corrupted background, is explored and extended to the field of crack detection. The successful development of the three algorithms suggests that the RoadModeler strategy offers a solution to the automated extraction of road information from MLS data. Recommendations are given for future research and development to ensure that this progress goes beyond the prototype stage and towards everyday use.
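
The conversion of 3D road points into a 2D image described above can be sketched as a simple rasterization. The abstract does not say how the GRF image is interpolated, so the Python below assumes the plainest option: each grid cell stores the mean laser intensity of the points that fall into it, together with a per-cell point count as a stand-in for point density. The function name `rasterize_intensity` and the `cell_size` parameter are hypothetical.

```python
def rasterize_intensity(points, cell_size):
    """Project (x, y, intensity) points onto a regular 2D grid.

    Returns (image, counts, origin): image holds the mean intensity per
    cell (0.0 where no point fell), counts holds the number of points
    per cell, and origin is the geo-referenced (min_x, min_y) corner.
    """
    pts = list(points)
    min_x = min(p[0] for p in pts)
    min_y = min(p[1] for p in pts)
    max_x = max(p[0] for p in pts)
    max_y = max(p[1] for p in pts)
    cols = int((max_x - min_x) / cell_size) + 1
    rows = int((max_y - min_y) / cell_size) + 1

    sums = [[0.0] * cols for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for x, y, intensity in pts:
        c = int((x - min_x) / cell_size)
        r = int((y - min_y) / cell_size)
        sums[r][c] += intensity
        counts[r][c] += 1

    image = [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(cols)]
        for r in range(rows)
    ]
    return image, counts, (min_x, min_y)


# Hypothetical road-surface points: (x, y, intensity), in metres.
road_points = [(0.0, 0.0, 10), (0.2, 0.1, 30), (1.5, 0.0, 50), (0.0, 1.2, 70)]
image, counts, origin = rasterize_intensity(road_points, cell_size=1.0)
```

Keeping the `origin` and `cell_size` is what makes the image geo-referenced: any pixel can be mapped back to ground coordinates. The per-cell counts give a local density estimate of the kind a density-dependent threshold could use.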

    New contributions in overcomplete image representations inspired from the functional architecture of the primary visual cortex = Nuevas contribuciones en representaciones sobrecompletas de imágenes inspiradas por la arquitectura funcional de la corteza visual primaria

    The present thesis investigates parallels between the functional architecture of the primary visual areas and image processing methods. A first objective is to refine existing models of biological vision on the basis of information theory, and a second is to develop original image processing solutions inspired by natural vision. The available data on visual systems comprise physiological and psychophysical studies, Gestalt psychology, and statistics of natural images. The thesis is mostly centered on overcomplete representations (i.e. representations that increase the dimensionality of the data) for several reasons: first, because they overcome existing drawbacks of critically sampled transforms; second, because biological vision models appear to be overcomplete; and third, because building efficient overcomplete representations raises challenging and topical mathematical problems, in particular the problem of sparse approximation. The thesis first proposes a self-invertible log-Gabor wavelet transform inspired by the receptive fields and multiresolution arrangement of the simple cells in the primary visual cortex (V1). This transform shows promising abilities for noise elimination. Second, the interactions observed between V1 cells, consisting of lateral inhibition and of facilitation between aligned cells, are shown to be effective for extracting the edges of natural images. Third, the redundancy introduced by overcompleteness is reduced by a dedicated sparse approximation algorithm, which builds a sparse representation of images based on their edge content. For additional decorrelation of the image information and for improved image compression performance, edges arranged along continuous contours are coded predictively through chains of coefficients. This yields an efficient representation of contours. 
Fourth, a study on contour completion using the tensor voting framework, based on Gestalt psychology, is presented. There, the use of iterations and of curvature information improves the robustness and the perceptual quality of the existing method.
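
For readers unfamiliar with log-Gabor filters, the Python sketch below evaluates the standard radial log-Gabor transfer function, a Gaussian on a logarithmic frequency axis. This is the textbook form, not the specific self-invertible transform proposed in the thesis. The function name, the centre frequency `f0`, and the bandwidth parameter `sigma_ratio` are assumptions chosen for illustration.

```python
import math


def log_gabor_radial(f, f0, sigma_ratio=0.55):
    """Standard radial log-Gabor response at frequency f.

    f0 is the centre frequency and sigma_ratio (between 0 and 1) sets
    the bandwidth. The response is defined as 0 at f <= 0, so the
    filter has no DC component.
    """
    if f <= 0:
        return 0.0
    return math.exp(-(math.log(f / f0) ** 2) / (2 * math.log(sigma_ratio) ** 2))


# Responses one octave below and one octave above the centre frequency.
peak = log_gabor_radial(0.1, f0=0.1)
below = log_gabor_radial(0.05, f0=0.1)
above = log_gabor_radial(0.2, f0=0.1)
```

The response peaks at `f0` and is symmetric in octaves around it, so `below` and `above` are equal. Filters of this shape tile the frequency axis evenly on a log scale, which suits a multiresolution arrangement.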