1,876 research outputs found

    Learning Equivariant Representations

    Get PDF
    State-of-the-art deep learning systems often require large amounts of data and computation. For this reason, leveraging known or unknown structure of the data is paramount. Convolutional neural networks (CNNs) are successful examples of this principle, their defining characteristic being the shift-equivariance. By sliding a filter over the input, when the input shifts, the response shifts by the same amount, exploiting the structure of natural images where semantic content is independent of absolute pixel positions. This property is essential to the success of CNNs in audio, image and video recognition tasks. In this thesis, we extend equivariance to other kinds of transformations, such as rotation and scaling. We propose equivariant models for different transformations defined by groups of symmetries. The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. Applications include image classification, 3D shape classification and retrieval, panoramic image classification and segmentation, shape alignment and pose estimation. What these models have in common is that they leverage symmetries in the data to reduce sample and model complexity and improve generalization performance. The advantages are more significant on (but not limited to) challenging tasks where data is limited or input perturbations such as arbitrary rotations are present

    3D shape matching and registration : a probabilistic perspective

    Get PDF
    Dense correspondence is a key area in computer vision and medical image analysis. It has applications in registration and shape analysis. In this thesis, we develop a technique to recover dense correspondences between the surfaces of neuroanatomical objects over heterogeneous populations of individuals. We recover dense correspondences based on 3D shape matching. In this thesis, the 3D shape matching problem is formulated under the framework of Markov Random Fields (MRFs). We represent the surfaces of neuroanatomical objects as genus zero voxel-based meshes. The surface meshes are projected into a Markov random field space. The projection carries both geometric and topological information in terms of Gaussian curvature and mesh neighbourhood from the original space to the random field space. Gaussian curvature is projected to the nodes of the MRF, and the mesh neighbourhood structure is projected to the edges. 3D shape matching between two surface meshes is then performed by solving an energy function minimisation problem formulated with MRFs. The outcome of the 3D shape matching is dense point-to-point correspondences. However, the minimisation of the energy function is NP hard. In this thesis, we use belief propagation to perform the probabilistic inference for 3D shape matching. A sparse update loopy belief propagation algorithm adapted to the 3D shape matching is proposed to obtain an approximate global solution for the 3D shape matching problem. The sparse update loopy belief propagation algorithm demonstrates significant efficiency gain compared to standard belief propagation. The computational complexity and convergence property analysis for the sparse update loopy belief propagation algorithm are also conducted in the thesis. We also investigate randomised algorithms to minimise the energy function. In order to enhance the shape matching rate and increase the inlier support set, we propose a novel clamping technique. The clamping technique is realized by combining the loopy belief propagation message updating rule with the feedback from 3D rigid body registration. By using this clamping technique, the correct shape matching rate is increased significantly. Finally, we investigate 3D shape registration techniques based on the 3D shape matching result. Based on the point-to-point dense correspondences obtained from the 3D shape matching, a three-point based transformation estimation technique is combined with the RANdom SAmple Consensus (RANSAC) algorithm to obtain the inlier support set. The global registration approach is purely dependent on point-wise correspondences between two meshed surfaces. It has the advantage that the need for orientation initialisation is eliminated and that all shapes of spherical topology. The comparison of our MRF based 3D registration approach with a state-of-the-art registration algorithm, the first order ellipsoid template, is conducted in the experiments. These show dense correspondence for pairs of hippocampi from two different data sets, each of around 20 60+ year old healthy individuals

    Analysis of 3D objects at multiple scales (application to shape matching)

    Get PDF
    Depuis quelques années, l évolution des techniques d acquisition a entraîné une généralisation de l utilisation d objets 3D très dense, représentés par des nuages de points de plusieurs millions de sommets. Au vu de la complexité de ces données, il est souvent nécessaire de les analyser pour en extraire les structures les plus pertinentes, potentiellement définies à plusieurs échelles. Parmi les nombreuses méthodes traditionnellement utilisées pour analyser des signaux numériques, l analyse dite scale-space est aujourd hui un standard pour l étude des courbes et des images. Cependant, son adaptation aux données 3D pose des problèmes d instabilité et nécessite une information de connectivité, qui n est pas directement définie dans les cas des nuages de points. Dans cette thèse, nous présentons une suite d outils mathématiques pour l analyse des objets 3D, sous le nom de Growing Least Squares (GLS). Nous proposons de représenter la géométrie décrite par un nuage de points via une primitive du second ordre ajustée par une minimisation aux moindres carrés, et cela à pour plusieurs échelles. Cette description est ensuite derivée analytiquement pour extraire de manière continue les structures les plus pertinentes à la fois en espace et en échelle. Nous montrons par plusieurs exemples et comparaisons que cette représentation et les outils associés définissent une solution efficace pour l analyse des nuages de points à plusieurs échelles. Un défi intéressant est l analyse d objets 3D acquis dans le cadre de l étude du patrimoine culturel. Dans cette thèse, nous nous étudions les données générées par l acquisition des fragments des statues entourant par le passé le Phare d Alexandrie, Septième Merveille du Monde. Plus précisément, nous nous intéressons au réassemblage d objets fracturés en peu de fragments (une dizaine), mais avec de nombreuses parties manquantes ou fortement dégradées par l action du temps. Nous proposons un formalisme pour la conception de systèmes d assemblage virtuel semi-automatiques, permettant de combiner à la fois les connaissances des archéologues et la précision des algorithmes d assemblage. Nous présentons deux systèmes basés sur cette conception, et nous montrons leur efficacité dans des cas concrets.Over the last decades, the evolution of acquisition techniques yields the generalization of detailed 3D objects, represented as huge point sets composed of millions of vertices. The complexity of the involved data often requires to analyze them for the extraction and characterization of pertinent structures, which are potentially defined at multiple scales. Amongthe wide variety of methods proposed to analyze digital signals, the scale-space analysis istoday a standard for the study of 2D curves and images. However, its adaptation to 3D dataleads to instabilities and requires connectivity information, which is not directly availablewhen dealing with point sets.In this thesis, we present a new multi-scale analysis framework that we call the GrowingLeast Squares (GLS). It consists of a robust local geometric descriptor that can be evaluatedon point sets at multiple scales using an efficient second-order fitting procedure. We proposeto analytically differentiate this descriptor to extract continuously the pertinent structuresin scale-space. We show that this representation and the associated toolbox define an effi-cient way to analyze 3D objects represented as point sets at multiple scales. To this end, we demonstrate its relevance in various application scenarios.A challenging application is the analysis of acquired 3D objects coming from the CulturalHeritage field. In this thesis, we study a real-world dataset composed of the fragments ofthe statues that were surrounding the legendary Alexandria Lighthouse. In particular, wefocus on the problem of fractured object reassembly, consisting of few fragments (up to aboutten), but with missing parts due to erosion or deterioration. We propose a semi-automaticformalism to combine both the archaeologist s knowledge and the accuracy of geometricmatching algorithms during the reassembly process. We use it to design two systems, andwe show their efficiency in concrete cases.BORDEAUX1-Bib.electronique (335229901) / SudocSudocFranceF

    Analysis of 3D objects at multiple scales (application to shape matching)

    Get PDF
    Depuis quelques années, l évolution des techniques d acquisition a entraîné une généralisation de l utilisation d objets 3D très dense, représentés par des nuages de points de plusieurs millions de sommets. Au vu de la complexité de ces données, il est souvent nécessaire de les analyser pour en extraire les structures les plus pertinentes, potentiellement définies à plusieurs échelles. Parmi les nombreuses méthodes traditionnellement utilisées pour analyser des signaux numériques, l analyse dite scale-space est aujourd hui un standard pour l étude des courbes et des images. Cependant, son adaptation aux données 3D pose des problèmes d instabilité et nécessite une information de connectivité, qui n est pas directement définie dans les cas des nuages de points. Dans cette thèse, nous présentons une suite d outils mathématiques pour l analyse des objets 3D, sous le nom de Growing Least Squares (GLS). Nous proposons de représenter la géométrie décrite par un nuage de points via une primitive du second ordre ajustée par une minimisation aux moindres carrés, et cela à pour plusieurs échelles. Cette description est ensuite derivée analytiquement pour extraire de manière continue les structures les plus pertinentes à la fois en espace et en échelle. Nous montrons par plusieurs exemples et comparaisons que cette représentation et les outils associés définissent une solution efficace pour l analyse des nuages de points à plusieurs échelles. Un défi intéressant est l analyse d objets 3D acquis dans le cadre de l étude du patrimoine culturel. Dans cette thèse, nous nous étudions les données générées par l acquisition des fragments des statues entourant par le passé le Phare d Alexandrie, Septième Merveille du Monde. Plus précisément, nous nous intéressons au réassemblage d objets fracturés en peu de fragments (une dizaine), mais avec de nombreuses parties manquantes ou fortement dégradées par l action du temps. Nous proposons un formalisme pour la conception de systèmes d assemblage virtuel semi-automatiques, permettant de combiner à la fois les connaissances des archéologues et la précision des algorithmes d assemblage. Nous présentons deux systèmes basés sur cette conception, et nous montrons leur efficacité dans des cas concrets.Over the last decades, the evolution of acquisition techniques yields the generalization of detailed 3D objects, represented as huge point sets composed of millions of vertices. The complexity of the involved data often requires to analyze them for the extraction and characterization of pertinent structures, which are potentially defined at multiple scales. Amongthe wide variety of methods proposed to analyze digital signals, the scale-space analysis istoday a standard for the study of 2D curves and images. However, its adaptation to 3D dataleads to instabilities and requires connectivity information, which is not directly availablewhen dealing with point sets.In this thesis, we present a new multi-scale analysis framework that we call the GrowingLeast Squares (GLS). It consists of a robust local geometric descriptor that can be evaluatedon point sets at multiple scales using an efficient second-order fitting procedure. We proposeto analytically differentiate this descriptor to extract continuously the pertinent structuresin scale-space. We show that this representation and the associated toolbox define an effi-cient way to analyze 3D objects represented as point sets at multiple scales. To this end, we demonstrate its relevance in various application scenarios.A challenging application is the analysis of acquired 3D objects coming from the CulturalHeritage field. In this thesis, we study a real-world dataset composed of the fragments ofthe statues that were surrounding the legendary Alexandria Lighthouse. In particular, wefocus on the problem of fractured object reassembly, consisting of few fragments (up to aboutten), but with missing parts due to erosion or deterioration. We propose a semi-automaticformalism to combine both the archaeologist s knowledge and the accuracy of geometricmatching algorithms during the reassembly process. We use it to design two systems, andwe show their efficiency in concrete cases.BORDEAUX1-Bib.electronique (335229901) / SudocSudocFranceF

    Relating Multimodal Imagery Data in 3D

    Get PDF
    This research develops and improves the fundamental mathematical approaches and techniques required to relate imagery and imagery derived multimodal products in 3D. Image registration, in a 2D sense, will always be limited by the 3D effects of viewing geometry on the target. Therefore, effects such as occlusion, parallax, shadowing, and terrain/building elevation can often be mitigated with even a modest amounts of 3D target modeling. Additionally, the imaged scene may appear radically different based on the sensed modality of interest; this is evident from the differences in visible, infrared, polarimetric, and radar imagery of the same site. This thesis develops a `model-centric\u27 approach to relating multimodal imagery in a 3D environment. By correctly modeling a site of interest, both geometrically and physically, it is possible to remove/mitigate some of the most difficult challenges associated with multimodal image registration. In order to accomplish this feat, the mathematical framework necessary to relate imagery to geometric models is thoroughly examined. Since geometric models may need to be generated to apply this `model-centric\u27 approach, this research develops methods to derive 3D models from imagery and LIDAR data. Of critical note, is the implementation of complimentary techniques for relating multimodal imagery that utilize the geometric model in concert with physics based modeling to simulate scene appearance under diverse imaging scenarios. Finally, the often neglected final phase of mapping localized image registration results back to the world coordinate system model for final data archival are addressed. In short, once a target site is properly modeled, both geometrically and physically, it is possible to orient the 3D model to the same viewing perspective as a captured image to enable proper registration. If done accurately, the synthetic model\u27s physical appearance can simulate the imaged modality of interest while simultaneously removing the 3-D ambiguity between the model and the captured image. Once registered, the captured image can then be archived as a texture map on the geometric site model. In this way, the 3D information that was lost when the image was acquired can be regained and properly related with other datasets for data fusion and analysis

    Shape Retrieval Methods for Architectural 3D Models

    Get PDF
    This thesis introduces new methods for content-based retrieval of architecture-related 3D models. We thereby consider two different overall types of architectural 3D models. The first type consists of context objects that are used for detailed design and decoration of 3D building model drafts. This includes e.g. furnishing for interior design or barriers and fences for forming the exterior environment. The second type consists of actual building models. To enable efficient content-based retrieval for both model types that is tailored to the user requirements of the architectural domain, type-specific algorithms must be developed. On the one hand, context objects like furnishing that provide similar functions (e.g. seating furniture) often share a similar shape. Nevertheless they might be considered to belong to different object classes from an architectural point of view (e.g. armchair, elbow chair, swivel chair). The differentiation is due to small geometric details and is sometimes only obvious to an expert from the domain. Building models on the other hand are often distinguished according to the underlying floor- and room plans. Topological floor plan properties for example serve as a starting point for telling apart residential and commercial buildings. The first contribution of this thesis is a new meta descriptor for 3D retrieval that combines different types of local shape descriptors using a supervised learning approach. The approach enables the differentiation of object classes according to small geometric details and at the same time integrates expert knowledge from the field of architecture. We evaluate our approach using a database containing arbitrary 3D models as well as on one that only consists of models from the architectural domain. We then further extend our approach by adding a sophisticated shape descriptor localization strategy. Additionally, we exploit knowledge about the spatial relationship of object components to further enhance the retrieval performance. In the second part of the thesis we introduce attributed room connectivity graphs (RCGs) as a means to characterize a 3D building model according to the structure of its underlying floor plans. We first describe how RCGs are inferred from a given building model and discuss how substructures of this graph can be queried efficiently. We then introduce a new descriptor denoted as Bag-of-Attributed-Subgraphs that transforms attributed graphs into a vector-based representation using subgraph embeddings. We finally evaluate the retrieval performance of this new method on a database consisting of building models with different floor plan types. All methods presented in this thesis are aimed at an as automated as possible workflow for indexing and retrieval such that only minimum human interaction is required. Accordingly, only polygon soups are required as inputs which do not need to be manually repaired or structured. Human effort is only needed for offline groundtruth generation to enable supervised learning and for providing information about the orientation of building models and the unit of measurement used for modeling

    Mining Mid-level Features for Image Classification

    No full text
    International audienceMid-level or semi-local features learnt using class-level information are potentially more distinctive than the traditional low-level local features constructed in a purely bottom-up fashion. At the same time they preserve some of the robustness properties with respect to occlusions and image clutter. In this paper we propose a new and effective scheme for extracting mid-level features for image classification, based on relevant pattern mining. In par- ticular, we mine relevant patterns of local compositions of densely sampled low-level features. We refer to the new set of obtained patterns as Frequent Local Histograms or FLHs. During this process, we pay special attention to keeping all the local histogram information and to selecting the most relevant reduced set of FLH patterns for classification. The careful choice of the visual primitives and an extension to exploit both local and global spatial information allow us to build powerful bag-of-FLH-based image representations. We show that these bag-of-FLHs are more discriminative than traditional bag-of-words and yield state-of-the-art results on various image classification benchmarks, including Pascal VOC

    A 3D Digital Approach to the Stylistic and Typo-Technological Study of Small Figurines from Ayia Irini, Cyprus

    Get PDF
    The thesis aims to develop a 3D digital approach to the stylistic and typo-technological study of coroplastic, focusing on small figurines. The case study to test the method is a sample of terracotta statuettes from an assemblage of approximately 2000 statues and figurines found at the beginning of the 20th century in a rural open-air sanctuary at Ayia Irini (Cyprus) by the archaeologists of the Swedish Cyprus Expedition. The excavators identified continuity of worship at the sanctuary from the Late Cypriot III (circa 1200 BC) to the end of the Cypro-Archaic II period (ca. 475 BC). They attributed the small figurines to the Cypro-Archaic I-II. Although the excavation was one of the first performed through the newly established stratigraphic method, the archaeologists studied the site and its material following a traditional, merely qualitative approach. Theanalysis of the published results identified a classification of the material with no-clear-cut criteria, and their overlap between types highlights ambiguities in creating groups and classes. Similarly, stratigraphic arguments and different opinions among archaeologists highlight the need for revising. Moreover, pastlegislation allowed the excavators to export half of the excavated antiquities, creating a dispersion of the assemblage. Today, the assemblage is still partly exhibited at the Cyprus Museum in Nicosia and in four different museums in Sweden. Such a setting prevents to study, analyse and interpret the assemblageholistically. This research proposes a 3D chaîne opératoire methodology to study the collection’s small terracotta figurines, aiming to understand the context’s function and social role as reflected by the classification obtained with the 3D digital approach. The integration proposed in this research of traditional archaeological studies, and computer-assisted investigation based on quantitative criteria, identified and defined with 3D measurements and analytical investigations, is adopted as a solution to the biases of a solely qualitative approach. The 3D geometric analysis of the figurines focuses on the objects’ shape and components, mode of manufacture, level of expertise, specialisation or skills of the craftsman and production techniques. The analysis leads to the creation of classes of artefacts which allow archaeologists to formulate hypotheses on the production process, identify a common production (e.g., same hand, same workshop) and establish a relative chronological sequence. 3D reconstruction of the excavation’s area contributes to the virtual re-unification of the assemblage for its holistic study, the relative chronological dating of the figurines and the interpretation of their social and ritual purposes. The results obtained from the selected sample prove the efficacy of the proposed 3D approach and support the expansion of the analysis to the whole assemblage, and possibly initiate quantitative and systematic studies on Cypriot coroplastic production
    • …
    corecore