
5 research outputs found

Automated Hierarchical Image Segmentation Based on Merging of Quadrilaterals

Author: Chen Z
Chin FYL
Chung HY
Publication venue: WSEAS.
Publication date: 01/01/2006

Proceedings of the 6th WSEAS International Conference on Signal Processing, Computational Geometry & Artificial Vision (ISCGAV'06), Crete, Greece, August 2006, p. 135-140 (postprint). This paper proposes a quadrilateral-based, automated hierarchical segmentation method: quadrilaterals are first constructed from an edge map, and neighboring quadrilaterals with similar features of interest are then merged hierarchically to form regions. When evaluated qualitatively and quantitatively, the proposed method outperforms three traditional and commonly used techniques, namely K-means clustering, seeded region growing and quadrilateral-based segmentation. Experimental results show that the proposed method is robust both in recovering missed important regions and in preventing unnecessary over-segmentation, and that it offers an efficient description of the segmented objects conducive to content-based applications.
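The merging step described in this abstract can be illustrated with a generic greedy region-merging sketch. This is not the authors' algorithm: the scalar mean feature, the fixed `threshold`, and the function name `merge_regions` are all assumptions made for illustration, and the construction of quadrilaterals from an edge map is not shown.

```python
def merge_regions(features, sizes, adjacency, threshold):
    """Greedily merge adjacent regions with similar mean features.

    features:  dict region_id -> mean feature value (hypothetical scalar feature)
    sizes:     dict region_id -> number of pixels in the region
    adjacency: list of (a, b) pairs of neighboring region ids
    threshold: maximum feature difference allowed for a merge (assumed parameter)
    Returns a dict mapping every original region id to its final label.
    """
    parent = {r: r for r in features}
    feats = dict(features)
    sz = dict(sizes)

    def find(r):
        # Follow parent links to the representative, compressing the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    while True:
        best = None
        for a, b in adjacency:
            ra, rb = find(a), find(b)
            if ra == rb:
                continue
            diff = abs(feats[ra] - feats[rb])
            if diff < threshold and (best is None or diff < best[0]):
                best = (diff, ra, rb)
        if best is None:
            break  # no remaining pair is similar enough
        _, ra, rb = best
        total = sz[ra] + sz[rb]
        # The merged region takes the size-weighted mean feature.
        feats[ra] = (feats[ra] * sz[ra] + feats[rb] * sz[rb]) / total
        sz[ra] = total
        parent[rb] = ra

    return {r: find(r) for r in features}


labels = merge_regions(
    features={0: 10.0, 1: 12.0, 2: 50.0, 3: 52.0},
    sizes={0: 1, 1: 1, 2: 1, 3: 1},
    adjacency=[(0, 1), (1, 2), (2, 3)],
    threshold=5.0,
)
```

Merging the most similar pair first and recomputing the merged feature gives the hierarchy its bottom-up character; the stopping rule here is a plain threshold, which the paper may well replace with something more elaborate.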

HKU Scholars Hub

Autonomous navigation for guide following in crowded indoor environments

Author: Ballantyne James
Publication venue: Surgery and Cancer, Imperial College London
Publication date: 01/02/2011

The requirements for assisted living are rapidly changing as the number of elderly patients over the age of 60 continues to increase. This rise places a high level of stress on nurse practitioners, who must care for more patients than they are able to. As this trend is expected to continue, new technology will be required to help care for patients. Mobile robots present an opportunity to help alleviate the stress on nurse practitioners by monitoring and performing remedial tasks for elderly patients. In order to produce mobile robots with the ability to perform these tasks, however, many challenges must be overcome. The hospital environment requires a high level of safety to prevent patient injury. Any facility that uses mobile robots, therefore, must be able to ensure that no harm will come to patients whilst in a care environment. This requires the robot to build a high level of understanding about the environment and the people in close proximity to the robot. Hitherto, most mobile robots have used vision-based sensors or 2D laser range finders. 3D time-of-flight sensors have recently been introduced and provide dense 3D point clouds of the environment at real-time frame rates. This provides mobile robots with previously unavailable dense information in real time. In this thesis, I investigate the use of time-of-flight cameras for mobile robot navigation in crowded environments. A unified framework to allow the robot to follow a guide through an indoor environment safely and efficiently is presented. Each component of the framework is analyzed in detail, with real-world scenarios illustrating its practical use. Time-of-flight cameras are relatively new sensors and, therefore, have inherent problems that must be overcome to receive consistent and accurate data. I propose a novel and practical probabilistic framework to overcome many of these inherent problems.
The framework fuses multiple depth maps with color information, forming a reliable and consistent view of the world. In order for the robot to interact with the environment, contextual information is required. To this end, I propose a region-growing segmentation algorithm to group points based on two surface characteristics: surface normal and surface curvature. The segmentation process creates a distinct set of surfaces; however, only a limited amount of contextual information is available to allow for interaction. Therefore, a novel classifier is proposed using spherical harmonics to differentiate people from all other objects. The added ability to identify people allows the robot to find potential candidates to follow. However, for safe navigation, the robot must continuously track all visible objects to obtain positional and velocity information. A multi-object tracking system is investigated to track visible objects reliably using multiple cues, namely shape and color. The tracking system allows the robot to react to the dynamic nature of people by building an estimate of the motion flow. This flow provides the robot with the necessary information to determine where and at what speeds it is safe to drive. In addition, a novel search strategy is proposed to allow the robot to recover a guide who has left the field of view. To achieve this, a search map is constructed with areas of the environment ranked according to how likely they are to reveal the guide’s true location. Then, the robot can approach the most likely search area to recover the guide. Finally, all components presented are joined to follow a guide through an indoor environment. The results achieved demonstrate the efficacy of the proposed components.
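The region-growing idea mentioned in this abstract can be sketched as follows. This is a minimal illustration rather than the thesis's algorithm: it assumes an organized H x W grid of unit surface normals (as a depth camera might yield after normal estimation), uses only the normal cue and ignores curvature, and the name `grow_regions` and the 10-degree default are hypothetical.

```python
from collections import deque

import numpy as np


def grow_regions(normals, max_angle_deg=10.0):
    """Label an H x W x 3 grid of unit normals by 4-connected region growing.

    Two neighboring points join the same region when the angle between
    their normals is at most max_angle_deg (assumed smoothness criterion).
    """
    h, w, _ = normals.shape
    cos_thresh = np.cos(np.radians(max_angle_deg))
    labels = -np.ones((h, w), dtype=int)  # -1 marks unvisited points
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            # Start a new region from this unvisited seed.
            labels[sy, sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1:
                        # Compare each candidate with the point that reached it.
                        if float(np.dot(normals[y, x], normals[ny, nx])) >= cos_thresh:
                            labels[ny, nx] = next_label
                            queue.append((ny, nx))
            next_label += 1
    return labels


# Two planar patches: the left half faces +z, the right half faces +x.
grid = np.zeros((2, 4, 3))
grid[:, :2] = (0.0, 0.0, 1.0)
grid[:, 2:] = (1.0, 0.0, 0.0)
region_labels = grow_regions(grid)
```

Comparing each candidate with its already-labeled neighbor, rather than with the seed, lets a region follow gently curving surfaces; a curvature test would normally be added to stop growth at sharp creases.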

Spiral - Imperial College Digital Repository

Superquadric representation of scenes from multi-view range data

Author: Zhang Yan
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/01/2003

Object representation denotes representing three-dimensional (3D) real-world objects with known graphic or mathematical primitives recognizable to computers. This research has numerous applications for object-related tasks in areas including computer vision, computer graphics, and reverse engineering. Superquadrics, as volumetric and parametric models, have been selected as the representation primitives throughout this research. Superquadrics are able to represent a large family of solid shapes by a single equation with only a few parameters. This dissertation addresses superquadric representation of multi-part objects and multi-object scenes. Two issues motivate this research. First, superquadric representation of multi-part objects or multi-object scenes has been an unsolved problem due to the complex geometry of objects. Second, superquadrics recovered from single-view range data tend to have low confidence and accuracy due to partially scanned object surfaces caused by inherent occlusions. To address these two problems, this dissertation proposes a multi-view superquadric representation algorithm. By incorporating both part decomposition and multi-view range data, the proposed algorithm is able not only to represent multi-part objects or multi-object scenes, but also to achieve high confidence and accuracy of recovered superquadrics. The multi-view superquadric representation algorithm consists of (i) initial superquadric model recovery from single-view range data, (ii) pairwise view registration based on recovered superquadric models, (iii) view integration, (iv) part decomposition, and (v) final superquadric fitting for each decomposed part. Within the multi-view superquadric representation framework, this dissertation proposes a 3D part decomposition algorithm to automatically decompose multi-part objects or multi-object scenes into their constituent single parts consistent with human visual perception.
Superquadrics can then be recovered for each decomposed single-part object. The proposed part decomposition algorithm is based on curvature analysis, and includes (i) Gaussian curvature estimation, (ii) boundary labeling, (iii) part growing and labeling, and (iv) post-processing. In addition, this dissertation proposes an extended view registration algorithm based on superquadrics. The proposed view registration algorithm is able to handle deformable superquadrics as well as 3D unstructured data sets. For superquadric fitting, two objective functions primarily used in the literature have been comprehensively investigated with respect to noise, viewpoints, sample resolutions, and other factors. The objective function that proved to perform better has been used throughout this dissertation. In summary, the three algorithms (contributions) proposed in this dissertation are generic and flexible in the sense of handling triangle meshes, which are standard surface primitives in computer vision and graphics. For each proposed algorithm, the dissertation presents both theory and experimental results. The results demonstrate the efficiency of the algorithms using both synthetic and real range data of a large variety of objects and scenes. In addition, the experimental results include comparisons with previous methods from the literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art in superquadric representation, and presents possible future extensions to this research.
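The "single equation with only a few parameters" can be made concrete with the standard superquadric inside-outside function in the object's own coordinate frame. The sketch below assumes that textbook form; the abstract does not state the dissertation's exact parameterization or which objective functions it compares, and the function name is hypothetical.

```python
def superquadric_inside_outside(x, y, z, a1, a2, a3, e1, e2):
    """Standard superquadric inside-outside function F(x, y, z).

    a1, a2, a3: semi-axis lengths along x, y and z
    e1, e2:     shape exponents (e1 = e2 = 1 gives an ellipsoid)
    F < 1 inside the surface, F == 1 on it, F > 1 outside.
    """
    fx = abs(x / a1) ** (2.0 / e2)
    fy = abs(y / a2) ** (2.0 / e2)
    fz = abs(z / a3) ** (2.0 / e1)
    return (fx + fy) ** (e2 / e1) + fz


# Ellipsoid with semi-axes 1, 2, 3: the point (1, 0, 0) lies on the surface.
on_surface = superquadric_inside_outside(1.0, 0.0, 0.0, 1.0, 2.0, 3.0, 1.0, 1.0)
inside = superquadric_inside_outside(0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 1.0, 1.0)
outside = superquadric_inside_outside(2.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0)
```

Fitting a superquadric to range data typically means minimizing some function of F over the data points; together with a pose (rotation and translation), the five parameters shown here make up the usual parameter set.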

University of Tennessee, Knoxville: Trace

CiteSeerX

Restauration adaptative des contours par une approche inspirée de la prédiction des performances

Author: Rousseau Kami
Publication venue: Université de Sherbrooke
Publication date: 01/01/2008

In remote sensing, edge maps can be used for, among other things, geometric restitution, the search for linear features, and segmentation. These maps are created relatively early in the processing chain of an image. To ensure the quality of subsequent operations, an accurate edge map must be obtained. Our research question is whether the time lost in choosing an algorithm and its parameters can be reduced by automatically correcting the edge map. We therefore concentrate on developing an adaptive edge detection/restoration method. Our method is inspired by a technique for predicting the performance of low-level algorithms. It consists of integrating neural-network processing into a "classical" edge detection method. More precisely, we propose combining the performance map with the gradient map to allow more accurate decisions. This study led to the development of software comprising a neural network trained to predict the presence of edges. This neural network improves the decisions of edge detectors by reducing the number of false-alarm pixels and missed edge pixels. The first step of this work consists of choosing a performance evaluation method for edge maps. Once this choice is made, it becomes possible to compare maps with one another. It is therefore easier to determine the best edge detection for each image. The literature review carried out in parallel made it possible to select a group of promising indicators for edge restoration. These indicators were used to calibrate and train a neural network to model edges.
The information provided by this network was then combined, by arithmetic multiplication, with the magnitude maps of "classical" detectors to produce new gradient-magnitude maps. Thresholding these maps yields "optimized" edge maps. On the aerial images of the South Florida data set, the median F-measure for the Sobel algorithm rises from 51.3% before fusion to 56.4% after. The median F-measure for the improved Kirsch algorithm is 56.3%, and that for the improved Frei-Chen algorithm is 56.3%. For the Sobel algorithm with adaptive thresholding, the median F-measure is 52.3% before fusion and 57.2% after fusion. By way of comparison, the median F-measure for the Moon detector, which is mathematically optimal for "ramp" edges, is 53.3%, and that of the Canny algorithm is 61.1%. The applicability of our algorithm is limited to images that, after filtering, have a signal-to-noise ratio greater than or equal to 20. On the ground-level photos of the South Florida data set, the results are comparable to those obtained on the aerial images. On the Berkeley data set, however, the results were inconclusive. On an IKONOS image chip of the Université de Sherbrooke campus, for the Sobel algorithm, the F-measure is 45.7% ± 0.9% before fusion and 50.8% after. On an IKONOS image chip of the Canadian Space Agency, for the Sobel algorithm with adaptive thresholding, the F-measure is 35.4% ± 0.9% before fusion and 42.2% after. On the same image, the Argyle algorithm (Canny without post-processing) has an F-measure of 35.1% ± 0.9% before fusion and 39.5% after. Our work improved Chalmond's bank of indicators, making preprocessing possible before the gradient map is thresholded. At each step, we propose a choice of parameters that allows the proposed method to be used effectively.
The corrected edges are thinner, more complete and better localized than the original edges. A sensitivity study was carried out and gives a better understanding of the contribution of each indicator. The effectiveness of the tool developed is comparable to that of other edge detection methods, which makes it an attractive choice for edge detection. The differences in quality observed between our method and Canny's appear to be due to the use, or not, of post-processing. Thanks to the software developed, the methodology can be reused; the latter made it possible to operationalize the proposed method. The ability to reuse the filter without retraining is attractive. The simplicity of parameter setting during use is also an advantage. These two factors meet a need to reduce the time spent using the software.
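The fusion step reported in this abstract (multiplying a network's edge map by a classical detector's gradient magnitude, then thresholding) and the F-measure used to score it can be sketched as below. This is an illustration under assumptions: the arrays are toy values, the network output is taken to be a per-pixel value in [0, 1], the F-measure is a strict pixel-wise version with no localization tolerance, and the function names are hypothetical.

```python
import numpy as np


def fuse_and_threshold(gradient_mag, edge_prob, threshold):
    """Multiply a gradient-magnitude map by an edge-probability map, then threshold."""
    fused = np.asarray(gradient_mag, dtype=float) * np.asarray(edge_prob, dtype=float)
    return fused >= threshold


def f_measure(detected, truth):
    """Pixel-wise F-measure: harmonic mean of precision and recall."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(detected, truth).sum()   # correctly detected edge pixels
    fp = np.logical_and(detected, ~truth).sum()  # false alarms
    fn = np.logical_and(~detected, truth).sum()  # missed edge pixels
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))


# Toy 2 x 2 example (made-up numbers, not data from the thesis).
gradient = np.array([[0.9, 0.8], [0.2, 0.7]])
probability = np.array([[1.0, 0.1], [1.0, 1.0]])
truth = np.array([[True, False], [False, True]])

f_before = f_measure(gradient >= 0.5, truth)
f_after = f_measure(fuse_and_threshold(gradient, probability, 0.5), truth)
```

In this toy case the low probability at the top-right pixel suppresses a false alarm that plain thresholding of the gradient would keep, which is the kind of correction the abstract describes.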

Savoirs UdeS

Robust perceptual organization techniques for analysis of color images

Author: Moreno Serrano Rodrigo
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2010

This thesis focuses on the development of new robust image analysis techniques more closely related to the way the human visual system behaves. One of the pillars of the thesis is the so-called tensor voting technique, a robust perceptual organization technique that propagates and aggregates information encoded by means of tensors through a convolution-like process. Its robustness and adaptability have been key reasons for using tensor voting in this thesis. These two properties are verified in the thesis by applying tensor voting to three applications where it had not previously been applied: image structure estimation, edge detection, and segmentation of images acquired through stereo vision. The most important drawback of tensor voting is that its usual implementations are highly time-consuming. To address this, the thesis proposes two new efficient implementations of tensor voting, both derived from an in-depth analysis of the technique. Despite its adaptability, this thesis shows that the original formulation of tensor voting (hereafter, classical tensor voting) is not adequate for some applications, since the hypotheses on which it is based are not suitable for all of them. This is particularly true for color image denoising.
Thus, this thesis shows that, more than a method, tensor voting can be thought of as a methodology in which the encoding and voting process can be tailored for every specific application, while maintaining the tensor voting spirit. Following this reasoning, this thesis proposes a unified framework for both image denoising and robust edge detection. This framework is an extension of classical tensor voting in which both color and edginess (the likelihood of finding an edge at every pixel of the image) are encoded through tensors, and where the voting process takes into account a set of plausible perceptual criteria related to the way the human visual system processes visual information. Recent advances in the perception of color have been essential for designing such a voting process. This new approach has been found effective, since it yields excellent results for both applications. In particular, the new method applied to image denoising performs better than other state-of-the-art methods for real noise. This makes it more suitable for real applications, in which an image denoiser is indeed required. In addition, the method applied to edge detection yields more robust results than state-of-the-art techniques and has a competitive performance in recall, discriminability, precision, and false alarm rejection. Moreover, this thesis shows how the results of this new framework can be combined with other techniques to tackle the problem of robust color image segmentation. The tensors obtained by applying the new framework are used to classify pixels as likely homogeneous or likely inhomogeneous. Those pixels are then sequentially segmented through a variation of an efficient graph-based image segmentation algorithm.
Experiments show that the proposed segmentation algorithm yields better scores in three of the five applied evaluation metrics when compared to state-of-the-art techniques, with a competitive computational cost. This thesis also proposes new evaluation techniques in the scope of image processing. First, two new metrics are proposed in the field of image denoising: one to measure how well an algorithm preserves edges, and a second to measure how well a method avoids introducing undesirable artifacts. Second, a new methodology for assessing edge detectors that avoids possible bias introduced by post-processing is proposed. It consists of five new metrics for assessing recall, discriminability, precision, false alarm rejection and robustness. Finally, two new non-parametric metrics are proposed for estimating the degree of over- and undersegmentation yielded by image segmentation algorithms.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Secretaría de Estado de Cultura