205 research outputs found

    Partial shape matching using CCP map and weighted graph transformation matching

    Get PDF
    La détection de la similarité ou de la différence entre les images et leur mise en correspondance sont des problèmes fondamentaux dans le traitement de l'image. Pour résoudre ces problèmes, on utilise, dans la littérature, différents algorithmes d'appariement. Malgré leur nouveauté, ces algorithmes sont pour la plupart inefficaces et ne peuvent pas fonctionner correctement dans les situations d’images bruitées. Dans ce mémoire, nous résolvons la plupart des problèmes de ces méthodes en utilisant un algorithme fiable pour segmenter la carte des contours image, appelée carte des CCPs, et une nouvelle méthode d'appariement. Dans notre algorithme, nous utilisons un descripteur local qui est rapide à calculer, est invariant aux transformations affines et est fiable pour des objets non rigides et des situations d’occultation. Après avoir trouvé le meilleur appariement pour chaque contour, nous devons vérifier si ces derniers sont correctement appariés. Pour ce faire, nous utilisons l'approche « Weighted Graph Transformation Matching » (WGTM), qui est capable d'éliminer les appariements aberrants en fonction de leur proximité et de leurs relations géométriques. WGTM fonctionne correctement pour les objets à la fois rigides et non rigides et est robuste aux distorsions importantes. Pour évaluer notre méthode, le jeu de données ETHZ comportant cinq classes différentes d'objets (bouteilles, cygnes, tasses, girafes, logos Apple) est utilisé. Enfin, notre méthode est comparée à plusieurs méthodes célèbres proposées par d'autres chercheurs dans la littérature. Bien que notre méthode donne un résultat comparable à celui des méthodes de référence en termes du rappel et de la précision de localisation des frontières, elle améliore significativement la précision moyenne pour toutes les catégories du jeu de données ETHZ.Matching and detecting similarity or dissimilarity between images is a fundamental problem in image processing. Different matching algorithms are used in literature to solve this fundamental problem. Despite their novelty, these algorithms are mostly inefficient and cannot perform properly in noisy situations. In this thesis, we solve most of the problems of previous methods by using a reliable algorithm for segmenting image contour map, called CCP Map, and a new matching method. In our algorithm, we use a local shape descriptor that is very fast, invariant to affine transform, and robust for dealing with non-rigid objects and occlusion. After finding the best match for the contours, we need to verify if they are correctly matched. For this matter, we use the Weighted Graph Transformation Matching (WGTM) approach, which is capable of removing outliers based on their adjacency and geometrical relationships. WGTM works properly for both rigid and non-rigid objects and is robust to high order distortions. For evaluating our method, the ETHZ dataset including five diverse classes of objects (bottles, swans, mugs, giraffes, apple-logos) is used. Finally, our method is compared to several famous methods proposed by other researchers in the literature. While our method shows a comparable result to other benchmarks in terms of recall and the precision of boundary localization, it significantly improves the average precision for all of the categories in the ETHZ dataset

    Multi-scale active shape description in medical imaging

    Get PDF
    Shape description in medical imaging has become an increasingly important research field in recent years. Fast and high-resolution image acquisition methods like Magnetic Resonance (MR) imaging produce very detailed cross-sectional images of the human body - shape description is then a post-processing operation which abstracts quantitative descriptions of anatomically relevant object shapes. This task is usually performed by clinicians and other experts by first segmenting the shapes of interest, and then making volumetric and other quantitative measurements. High demand on expert time and inter- and intra-observer variability impose a clinical need of automating this process. Furthermore, recent studies in clinical neurology on the correspondence between disease status and degree of shape deformations necessitate the use of more sophisticated, higher-level shape description techniques. In this work a new hierarchical tool for shape description has been developed, combining two recently developed and powerful techniques in image processing: differential invariants in scale-space, and active contour models. This tool enables quantitative and qualitative shape studies at multiple levels of image detail, exploring the extra image scale degree of freedom. Using scale-space continuity, the global object shape can be detected at a coarse level of image detail, and finer shape characteristics can be found at higher levels of detail or scales. New methods for active shape evolution and focusing have been developed for the extraction of shapes at a large set of scales using an active contour model whose energy function is regularized with respect to scale and geometric differential image invariants. The resulting set of shapes is formulated as a multiscale shape stack which is analysed and described for each scale level with a large set of shape descriptors to obtain and analyse shape changes across scales. This shape stack leads naturally to several questions in regard to variable sampling and appropriate levels of detail to investigate an image. The relationship between active contour sampling precision and scale-space is addressed. After a thorough review of modem shape description, multi-scale image processing and active contour model techniques, the novel framework for multi-scale active shape description is presented and tested on synthetic images and medical images. An interesting result is the recovery of the fractal dimension of a known fractal boundary using this framework. Medical applications addressed are grey-matter deformations occurring for patients with epilepsy, spinal cord atrophy for patients with Multiple Sclerosis, and cortical impairment for neonates. Extensions to non-linear scale-spaces, comparisons to binary curve and curvature evolution schemes as well as other hierarchical shape descriptors are discussed

    Statistical/Geometric Techniques for Object Representation and Recognition

    Get PDF
    Object modeling and recognition are key areas of research in computer vision and graphics with wide range of applications. Though research in these areas is not new, traditionally most of it has focused on analyzing problems under controlled environments. The challenges posed by real life applications demand for more general and robust solutions. The wide variety of objects with large intra-class variability makes the task very challenging. The difficulty in modeling and matching objects also vary depending on the input modality. In addition, the easy availability of sensors and storage have resulted in tremendous increase in the amount of data that needs to be processed which requires efficient algorithms suitable for large-size databases. In this dissertation, we address some of the challenges involved in modeling and matching of objects in realistic scenarios. Object matching in images require accounting for large variability in the appearance due to changes in illumination and view point. Any real world object is characterized by its underlying shape and albedo, which unlike the image intensity are insensitive to changes in illumination conditions. We propose a stochastic filtering framework for estimating object albedo from a single intensity image by formulating the albedo estimation as an image estimation problem. We also show how this albedo estimate can be used for illumination insensitive object matching and for more accurate shape recovery from a single image using standard shape from shading formulation. We start with the simpler problem where the pose of the object is known and only the illumination varies. We then extend the proposed approach to handle unknown pose in addition to illumination variations. We also use the estimated albedo maps for another important application, which is recognizing faces across age progression. Many approaches which address the problem of modeling and recognizing objects from images assume that the underlying objects are of diffused texture. But most real world objects exhibit a combination of diffused and specular properties. We propose an approach for separating the diffused and specular reflectance from a given color image so that the algorithms proposed for objects of diffused texture become applicable to a much wider range of real world objects. Representing and matching the 2D and 3D geometry of objects is also an integral part of object matching with applications in gesture recognition, activity classification, trademark and logo recognition, etc. The challenge in matching 2D/3D shapes lies in accounting for the different rigid and non-rigid deformations, large intra-class variability, noise and outliers. In addition, since shapes are usually represented as a collection of landmark points, the shape matching algorithm also has to deal with the challenges of missing or unknown correspondence across these data points. We propose an efficient shape indexing approach where the different feature vectors representing the shape are mapped to a hash table. For a query shape, we show how the similar shapes in the database can be efficiently retrieved without the need for establishing correspondence making the algorithm extremely fast and scalable. We also propose an approach for matching and registration of 3D point cloud data across unknown or missing correspondence using an implicit surface representation. Finally, we discuss possible future directions of this research

    Contributions to the content-based image retrieval using pictorial queries

    Get PDF
    Descripció del recurs: el 02 de novembre de 2010L'accés massiu a les càmeres digitals, els ordinadors personals i a Internet, ha propiciat la creació de grans volums de dades en format digital. En aquest context, cada vegada adquireixen major rellevància totes aquelles eines dissenyades per organitzar la informació i facilitar la seva cerca. Les imatges són un cas particular de dades que requereixen tècniques específiques de descripció i indexació. L'àrea de la visió per computador encarregada de l'estudi d'aquestes tècniques rep el nom de Recuperació d'Imatges per Contingut, en anglès Content-Based Image Retrieval (CBIR). Els sistemes de CBIR no utilitzen descripcions basades en text sinó que es basen en característiques extretes de les pròpies imatges. En contrast a les més de 6000 llengües parlades en el món, les descripcions basades en característiques visuals representen una via d'expressió universal. La intensa recerca en el camp dels sistemes de CBIR s'ha aplicat en àrees de coneixement molt diverses. Així doncs s'han desenvolupat aplicacions de CBIR relacionades amb la medicina, la protecció de la propietat intel·lectual, el periodisme, el disseny gràfic, la cerca d'informació en Internet, la preservació dels patrimoni cultural, etc. Un dels punts importants d'una aplicació de CBIR resideix en el disseny de les funcions de l'usuari. L'usuari és l'encarregat de formular les consultes a partir de les quals es fa la cerca de les imatges. Nosaltres hem centrat l'atenció en aquells sistemes en què la consulta es formula a partir d'una representació pictòrica. Hem plantejat una taxonomia dels sistemes de consulta en composada per quatre paradigmes diferents: Consulta-segons-Selecció, Consulta-segons-Composició-Icònica, Consulta-segons-Esboç i Consulta-segons-Il·lustració. Cada paradigma incorpora un nivell diferent en el potencial expressiu de l'usuari. Des de la simple selecció d'una imatge, fins a la creació d'una il·lustració en color, l'usuari és qui pren el control de les dades d'entrada del sistema. Al llarg dels capítols d'aquesta tesi hem analitzat la influència que cada paradigma de consulta exerceix en els processos interns d'un sistema de CBIR. D'aquesta manera també hem proposat un conjunt de contribucions que hem exemplificat des d'un punt de vista pràctic mitjançant una aplicació final

    Contributions to the Content-Based Image Retrieval Using Pictorial Queris

    Get PDF
    L'accés massiu a les càmeres digitals, els ordinadors personals i a Internet, ha propiciat la creació de grans volums de dades en format digital. En aquest context, cada vegada adquireixen major rellevància totes aquelles eines dissenyades per organitzar la informació i facilitar la seva cerca.Les imatges són un cas particular de dades que requereixen tècniques específiques de descripció i indexació. L'àrea de la visió per computador encarregada de l'estudi d'aquestes tècniques rep el nom de Recuperació d'Imatges per Contingut, en anglès Content-Based Image Retrieval (CBIR). Els sistemes de CBIR no utilitzen descripcions basades en text sinó que es basen en característiques extretes de les pròpies imatges. En contrast a les més de 6000 llengües parlades en el món, les descripcions basades en característiques visuals representen una via d'expressió universal.La intensa recerca en el camp dels sistemes de CBIR s'ha aplicat en àrees de coneixement molt diverses. Així doncs s'han desenvolupat aplicacions de CBIR relacionades amb la medicina, la protecció de la propietat intel·lectual, el periodisme, el disseny gràfic, la cerca d'informació en Internet, la preservació dels patrimoni cultural, etc. Un dels punts importants d'una aplicació de CBIR resideix en el disseny de les funcions de l'usuari. L'usuari és l'encarregat de formular les consultes a partir de les quals es fa la cerca de les imatges. Nosaltres hem centrat l'atenció en aquells sistemes en què la consulta es formula a partir d'una representació pictòrica. Hem plantejat una taxonomia dels sistemes de consulta en composada per quatre paradigmes diferents: Consulta-segons-Selecció, Consulta-segons-Composició-Icònica, Consulta-segons-Esboç i Consulta-segons-Il·lustració. Cada paradigma incorpora un nivell diferent en el potencial expressiu de l'usuari. Des de la simple selecció d'una imatge, fins a la creació d'una il·lustració en color, l'usuari és qui pren el control de les dades d'entrada del sistema. Al llarg dels capítols d'aquesta tesi hem analitzat la influència que cada paradigma de consulta exerceix en els processos interns d'un sistema de CBIR. D'aquesta manera també hem proposat un conjunt de contribucions que hem exemplificat des d'un punt de vista pràctic mitjançant una aplicació final

    A Modular Approach to Lung Nodule Detection from Computed Tomography Images Using Artificial Neural Networks and Content Based Image Representation

    Get PDF
    Lung cancer is one of the most lethal cancer types. Research in computer aided detection (CAD) and diagnosis for lung cancer aims at providing effective tools to assist physicians in cancer diagnosis and treatment to save lives. In this dissertation, we focus on developing a CAD framework for automated lung cancer nodule detection from 3D lung computed tomography (CT) images. Nodule detection is a challenging task that no machine intelligence can surpass human capability to date. In contrast, human recognition power is limited by vision capacity and may suffer from work overload and fatigue, whereas automated nodule detection systems can complement expert’s efforts to achieve better detection performance. The proposed CAD framework encompasses several desirable properties such as mimicking physicians by means of geometric multi-perspective analysis, computational efficiency, and the most importantly producing high performance in detection accuracy. As the central part of the framework, we develop a novel hierarchical modular decision engine implemented by Artificial Neural Networks. One advantage of this decision engine is that it supports the combination of spatial-level and feature-level information analysis in an efficient way. Our methodology overcomes some of the limitations of current lung nodule detection techniques by combining geometric multi-perspective analysis with global and local feature analysis. The proposed modular decision engine design is flexible to modifications in the decision modules; the engine structure can adopt the modifications without having to re-design the entire system. The engine can easily accommodate multi-learning scheme and parallel implementation so that each information type can be processed (in parallel) by the most adequate learning technique of its own. We have also developed a novel shape representation technique that is invariant under rigid-body transformation and we derived new features based on this shape representation for nodule detection. We implemented a prototype nodule detection system as a demonstration of the proposed framework. Experiments have been conducted to assess the performance of the proposed methodologies using real-world lung CT data. Several performance measures for detection accuracy are used in the assessment. The results show that the decision engine is able to classify patterns efficiently with very good classification performance
    • …
    corecore