    Contributions to the content-based image retrieval using pictorial queries

    Resource description: 2 November 2010. Widespread access to digital cameras, personal computers and the Internet has led to the creation of large volumes of data in digital format. In this context, tools designed to organize information and make it easier to search are becoming increasingly important. Images are a particular kind of data that require specific description and indexing techniques. The area of computer vision devoted to the study of these techniques is called Content-Based Image Retrieval (CBIR). CBIR systems do not use text-based descriptions; they rely instead on features extracted from the images themselves. In contrast to the more than 6000 languages spoken in the world, descriptions based on visual features are a universal means of expression. The intense research on CBIR systems has been applied in very diverse areas of knowledge: CBIR applications have been developed for medicine, intellectual property protection, journalism, graphic design, Internet information search, the preservation of cultural heritage, etc. One of the important aspects of a CBIR application lies in the design of the user's functions. The user is in charge of formulating the queries from which the image search is carried out. We have focused on systems in which the query is formulated from a pictorial representation. We have proposed a taxonomy of query systems composed of four different paradigms: Query-by-Selection, Query-by-Iconic-Composition, Query-by-Sketch and Query-by-Illustration. Each paradigm gives the user a different level of expressive power. From the simple selection of an image to the creation of a color illustration, it is the user who takes control of the system's input data. Throughout the chapters of this thesis we have analyzed the influence that each query paradigm exerts on the internal processes of a CBIR system. We have also proposed a set of contributions, which we have exemplified from a practical point of view by means of a final application.

    Contributions to the Content-Based Image Retrieval Using Pictorial Queries

    Partial shape matching using CCP map and weighted graph transformation matching

    Matching images and detecting similarity or dissimilarity between them are fundamental problems in image processing. Different matching algorithms are used in the literature to solve these problems. Despite their novelty, these algorithms are mostly inefficient and cannot perform properly on noisy images. In this thesis, we solve most of the problems of previous methods by using a reliable algorithm for segmenting the image contour map, called the CCP map, together with a new matching method. Our algorithm uses a local shape descriptor that is fast to compute, invariant to affine transformations, and robust for non-rigid objects and occlusion. After finding the best match for each contour, we need to verify that the contours are correctly matched. For this purpose, we use the Weighted Graph Transformation Matching (WGTM) approach, which is capable of removing outlier matches based on their adjacency and geometrical relationships. WGTM works properly for both rigid and non-rigid objects and is robust to high-order distortions. To evaluate our method, we use the ETHZ dataset, which includes five diverse classes of objects (bottles, swans, mugs, giraffes, Apple logos). Finally, our method is compared to several well-known methods proposed by other researchers in the literature. While our method gives results comparable to those benchmarks in terms of recall and the precision of boundary localization, it significantly improves the average precision for all categories of the ETHZ dataset.
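
    The outlier-removal step can be illustrated with a simplified sketch. It is not WGTM itself: it is an unweighted graph-based filter in the same spirit, in which a k-nearest-neighbour graph is built over the matched points in each image and the match whose neighbourhood disagrees most between the two graphs is dropped, repeating until the graphs agree. The function names, the choice of k, and the toy coordinates are assumptions made for illustration; the weighting that WGTM adds is not reproduced here.

```python
from math import dist

def knn_graph(points, k):
    """Directed k-nearest-neighbour graph as a set of (i, j) edges."""
    edges = set()
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: (dist(p, points[j]), j))
        edges.update((i, j) for j in others[:k])
    return edges

def filter_matches(src, dst, k=2):
    """Return indices of matches kept after iterative outlier removal.

    src[i] and dst[i] are the two endpoints of match i (hypothetical input layout).
    """
    keep = list(range(len(src)))
    while len(keep) > k + 1:
        g_src = knn_graph([src[i] for i in keep], k)
        g_dst = knn_graph([dst[i] for i in keep], k)
        diff = g_src ^ g_dst            # edges present in only one graph
        if not diff:
            break                       # neighbourhood structure agrees
        votes = [0] * len(keep)
        for _, j in diff:
            votes[j] += 1               # count disagreeing edges pointing at j
        worst = max(range(len(keep)), key=votes.__getitem__)
        keep.pop(worst)
    return keep

# Toy data: matches 0-3 are a pure translation, match 4 is wrong.
src = [(0, 0), (2, 0), (0, 3), (5, 5), (6, 1)]
dst = [(10, 10), (12, 10), (10, 13), (15, 15), (10.5, 11)]
kept = filter_matches(src, dst, k=2)
```

    On this toy input the filter removes the fifth match and keeps the four consistent ones.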

    3D object reconstruction from line drawings.

    Cao Liangliang. Thesis (M.Phil.), Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 64-69). Abstracts in English and Chinese.
    Chapter 1: Introduction and Related Work
        1.1 Reconstruction from Single Line Drawings and the Applications
        1.2 Optimization-based Reconstruction
        1.3 Other Reconstruction Methods
            1.3.1 Line Labeling and Algebraic Methods
            1.3.2 CAD Reconstruction
            1.3.3 Modelling from Images
        1.4 Finding Faces of Line Drawings
        1.5 Generalized Cylinder
        1.6 Research Problems and Our Contribution
            1.6.1 A New Criteria
            1.6.2 Recover Objects from Line Drawings without Hidden Lines
            1.6.3 Reconstruction of Curved Objects
            1.6.4 Planar Limbs Assumption and the Derived Models
    Chapter 2: A New Criteria for Reconstruction
        2.1 Introduction
        2.2 Human Visual Perception and the Symmetry Measure
        2.3 Reconstruction Based on Symmetry and Planarity
            2.3.1 Finding Faces
            2.3.2 Constraint of Planarity
            2.3.3 Objective Function
            2.3.4 Reconstruction Algorithm
        2.4 Experimental Results
        2.5 Summary
    Chapter 3: Line Drawings without Hidden Lines: Inference and Reconstruction
        3.1 Introduction
        3.2 Terminology
        3.3 Theoretical Inference of the Hidden Topological Structure
            3.3.1 Assumptions
            3.3.2 Finding the Degrees and Ranks
            3.3.3 Constraints for the Inference
        3.4 An Algorithm to Recover the Hidden Topological Structure
            3.4.1 Outline of the Algorithm
            3.4.2 Constructing the Initial Hidden Structure
            3.4.3 Reducing Initial Hidden Structure
            3.4.4 Selecting the Most Plausible Structure
        3.5 Reconstruction of 3D Objects
        3.6 Experimental Results
        3.7 Summary
    Chapter 4: Curved Objects Reconstruction from 2D Line Drawings
        4.1 Introduction
        4.2 Related Work
            4.2.1 Face Identification
            4.2.2 3D Reconstruction of planar objects
        4.3 Reconstruction of Curved Objects
            4.3.1 Transformation of Line Drawings
            4.3.2 Finding 3D Bezier Curves
            4.3.3 Bezier Surface Patches and Boundaries
            4.3.4 Generating Bezier Surface Patches
        4.4 Results
        4.5 Summary
    Chapter 5: Planar Limbs and Degen Generalized Cylinders
        5.1 Introduction
        5.2 Planar Limbs and View Directions
        5.3 DGCs in Homogeneous Coordinates
            5.3.1 Homogeneous Coordinates
            5.3.2 Degen Surfaces
            5.3.3 DGCs
        5.4 Properties of DGCs
        5.5 Potential Applications
            5.5.1 Recovery of DGC Descriptions
            5.5.2 Deformable DGCs
        5.6 Summary
    Chapter 6: Conclusion and Future Work
    Bibliography

    Statistical part-based models for object detection in large 3D scans

    3D scanning technology has matured to a point where very large-scale acquisition of high-resolution geometry has become feasible. However, having large quantities of 3D data poses new technical challenges. Many applications of practical use require an understanding of the semantics of the acquired geometry. Consequently, scene understanding plays a key role in many applications. This thesis is concerned with two core topics: 3D object detection and semantic alignment. We address the problem of efficiently detecting large quantities of objects in 3D scans according to object categories learned from sparse user annotation. Objects are modeled by a collection of smaller sub-parts and a graph structure representing part dependencies. The thesis introduces two novel approaches: a part-based, chain-structured Markov model and a general part-based full-correlation model. Both models come with efficient detection schemes that allow for interactive run-times.
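
    To illustrate why a chain-structured part model admits efficient detection, the following sketch runs max-sum dynamic programming (Viterbi-style) over a chain of parts. It is a generic textbook construction under stated assumptions, not the thesis's actual model: the score tables, the `pairwise` compatibility function, and the toy values are all hypothetical.

```python
def detect_chain(unary, pairwise):
    """Find the best placement of parts along a chain.

    unary[i][a]       : score of placing part i at candidate location a (assumed given)
    pairwise(i, a, b) : compatibility of part i at a with part i + 1 at b (assumed given)
    Returns (best_total_score, list_of_chosen_candidates).
    """
    best = list(unary[0])           # best score of any partial chain ending at each candidate
    back = []                       # back-pointers, one list per transition
    for i in range(1, len(unary)):
        new_best, pointers = [], []
        for b, u in enumerate(unary[i]):
            scores = [best[a] + pairwise(i - 1, a, b) for a in range(len(best))]
            a_star = max(range(len(scores)), key=scores.__getitem__)
            new_best.append(scores[a_star] + u)
            pointers.append(a_star)
        best = new_best
        back.append(pointers)
    # Trace the optimal assignment backwards from the best final candidate.
    last = max(range(len(best)), key=best.__getitem__)
    assignment = [last]
    for pointers in reversed(back):
        assignment.append(pointers[assignment[-1]])
    assignment.reverse()
    return best[last], assignment

# Toy example: three parts, two candidate locations each;
# neighbouring parts are penalised for choosing different candidates.
unary = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
score, placement = detect_chain(unary, lambda i, a, b: 0.0 if a == b else -1.5)
```

    The cost grows linearly with the number of parts and quadratically with the number of candidates per part, instead of exponentially as in exhaustive search over all joint placements.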

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging at the intersection of technologies for effective visual features and the study of the human-brain cognition process. Effective visual features are made possible by rapid developments in sensor equipment, novel filter designs, and viable information processing architectures, while a better understanding of the human-brain cognition process broadens the ways in which computers can perform pattern recognition tasks. This book collects representative research from around the globe on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book present recent advances and new ideas that promote the techniques, technology and applications of pattern recognition.

    Improving Bags-of-Words model for object categorization

    In the past decade, Bags-of-Words (BOW) models have become popular for the task of object recognition, owing to their good performance and simplicity. Some of the most effective recent methods for computer-based object recognition work by detecting and extracting local image features, quantizing them according to a codebook rule such as k-means clustering, and classifying the result with conventional classifiers such as Support Vector Machines and Naive Bayes. In this thesis, a Spatial Object Recognition Framework is presented that consists of the four main contributions of the research. The first contribution, frequent keypoint pattern discovery, works by combining pairs and triplets of frequent keypoints in order to discover intermediate representations for object classes. Based on the same frequent keypoints principle, algorithms for locating the region of interest in training images are then discussed. Extensions to the successful Spatial Pyramid Matching scheme are then proposed in order to better capture spatial relationships: the pairs frequency histogram and the shapes frequency histogram capture more refined spatial information between local image features. Finally, alternative techniques to Spatial Pyramid Matching for capturing spatial information are presented. The proposed techniques, variations of binned log-polar histograms, divide the image into grids of different scales and orientations, and thus explicitly capture the distribution of image features in both distance and orientation. The framework is evaluated on several recent and popular datasets covering image retrieval, object recognition, and object categorization. Overall, while the effectiveness of the framework is limited on some of the datasets, the proposed contributions are nevertheless powerful improvements of the BOW model.
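
    The quantization step of a basic BOW pipeline can be sketched as follows. This is a minimal illustration, assuming a codebook has already been learned (for example by k-means) and that descriptors are plain coordinate tuples; the function names and the toy two-dimensional data are hypothetical, and the spatial extensions proposed in the thesis are not shown.

```python
from math import dist

def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))

def bow_histogram(descriptors, codebook):
    """Quantize descriptors against the codebook and return a normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Toy 2-D "descriptors"; real local features would be much higher-dimensional.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
histogram = bow_histogram(descriptors, codebook)   # one bin per visual word
```

    The resulting fixed-length histogram is what a conventional classifier would take as input. It discards where in the image each feature occurred, which is the limitation that spatial schemes such as Spatial Pyramid Matching address.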


    Symmetry has woven itself into almost every fabric of science and the arts, and has left an indelible imprint on our everyday lives. In the same manner, it has pervaded many areas of computer science, especially computer vision, and a copious literature has sought algorithmic ways to identify symmetry in digital data. Despite decades of effort toward an efficient system that can locate and recover the symmetry embedded in real-world images, it is still challenging to fully automate such tasks while maintaining a high level of efficiency. The subject of this thesis is the symmetry of imaged objects. Symmetry is one of the non-accidental features of shapes and has long been (maybe mistakenly) speculated to be a pre-attentive feature, which improves recognition of quickly presented objects and reconstruction of shapes from an incomplete set of measurements. While symmetry is known to provide rich and useful geometric cues to computer vision, it has rarely been used as a principal feature in applications, because working out how to represent and recognize the symmetries embedded in objects is a singularly difficult task, both for computer vision and for perceptual psychology. The three main problems addressed in the dissertation are: (i) finding approximate symmetry by identifying the most prominent axis of symmetry in an entire region, (ii) locating bilaterally symmetrical areas in a scene, and (iii) automating the process of symmetry recovery by solving the two problems above. Perfect symmetries are extremely rare in natural images, and human symmetry perception allows for qualification, so that symmetry can be graded by the degree of structural deformation or replacement error. Many approaches detect approximate symmetry by searching for an optimal solution, either through an exhaustive exploration of the parameter space or by estimating the center of mass. The algorithm set out in this thesis avoids these computationally intensive operations by using geometric constraints of symmetric images, and it assumes no prior knowledge of the barycenter. Results from an extensive set of evaluation experiments on metrics for symmetry distance are presented, along with a performance comparison between the method presented in this thesis and the state-of-the-art approach. Many biological vision systems employ a special computational strategy to locate regions of interest based on local image cues while viewing a compound visual scene. The method taken in this thesis is a bottom-up approach in which the observer favors stimuli based on their saliency, and it creates a feature map contingent on symmetry. With the help of summed area tables, the time complexity of the proposed algorithm is linear in the size of the image. The distinguished regions are then passed to the algorithm described above to uncover approximate symmetry.
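
    The role of summed area tables in the complexity claim can be shown with a short sketch. Only the table itself is illustrated, under the assumption that the image is a list of rows of numbers; how the thesis combines box sums into a symmetry-based saliency map is not specified in the abstract and is not reproduced here.

```python
def summed_area_table(image):
    """Build a (rows + 1) x (cols + 1) table where sat[r][c] is the sum of image[:r][:c]."""
    rows, cols = len(image), len(image[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat

def box_sum(sat, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1, in constant time."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
sat = summed_area_table(image)
whole = box_sum(sat, 0, 0, 3, 3)          # sum over the full image
lower_right = box_sum(sat, 1, 1, 3, 3)    # sum over the 2x2 block 5, 6, 8, 9
```

    Building the table takes one pass over the image, and every rectangular sum afterwards costs four lookups regardless of the rectangle's size, which is what makes an overall running time linear in the number of pixels plausible.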