
    Multi-Surface Simplex Spine Segmentation for Spine Surgery Simulation and Planning

    This research proposes to develop a knowledge-based multi-surface simplex deformable model for the segmentation of healthy as well as pathological lumbar spine data. It aims to provide a more accurate and robust segmentation scheme for the identification of intervertebral disc pathologies, to assist with spine surgery planning. A robust technique that combines multi-surface and shape-statistics-aware variants of the deformable simplex model is presented. Statistical shape variation within the dataset is captured by principal component analysis and incorporated during the segmentation process to refine the results. Where shape statistics hinder detection of a pathological region, the user may disable the influence of the shape prior during deformation. Results have been validated against user-assisted expert segmentation.
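
    As a rough illustration of the shape-statistics component described above, the sketch below builds a PCA shape model from aligned training shapes and clamps a deformed shape's mode coefficients to plausible limits. The numpy-based functions and the +/- 3-sigma clamp are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def build_shape_model(shapes, n_modes=5):
    """shapes: (n_samples, 3*n_vertices) array of aligned training shapes."""
    mean = shapes.mean(axis=0)
    # PCA via SVD of the centered data matrix
    _, s, vt = np.linalg.svd(shapes - mean, full_matrices=False)
    modes = vt[:n_modes]                        # principal modes of variation
    stddev = s[:n_modes] / np.sqrt(max(len(shapes) - 1, 1))
    return mean, modes, stddev

def constrain_shape(shape, mean, modes, stddev, k=3.0):
    """Project a deformed shape onto the model and clamp each mode
    coefficient to +/- k standard deviations (the shape prior's influence)."""
    b = modes @ (shape - mean)                  # mode coefficients
    b = np.clip(b, -k * stddev, k * stddev)     # keep within plausible variation
    return mean + modes.T @ b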

    Regular Grids: An Irregular Approach to the 3D Modelling Pipeline

    The 3D modelling pipeline covers the process by which a physical object is scanned to create a set of points that lie on its surface. These data are then cleaned to remove outliers and noise, and the points are reconstructed into a digital representation of the original object. The aim of this thesis is to present novel grid-based methods and to provide several case studies of areas in the 3D modelling pipeline in which they may be effectively put to use. The first is a demonstration of how using a grid allows a significant reduction in the memory required to perform the reconstruction. The second is the detection of surface features (ridges, peaks, troughs, etc.) during the surface reconstruction process. The third contribution is the alignment of two meshes with zero prior knowledge, which is particularly suited to aligning two related, but not identical, models. The final contribution is the comparison of two similar meshes with support for both qualitative and quantitative outputs.
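
    To make the memory argument concrete, here is a minimal sketch of the kind of sparse regular grid such methods rest on: points are binned into integer-keyed cells, so neighbourhood queries touch only occupied cells rather than all points. The cell size and dictionary-based storage are assumptions for illustration, not the thesis's data structures.

```python
from collections import defaultdict
import numpy as np

def build_grid(points, cell_size):
    """Bin points into a sparse regular grid: only occupied cells are stored."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        key = tuple((p // cell_size).astype(int))  # integer cell coordinates
        grid[key].append(i)
    return grid

def neighbors(grid, point, cell_size):
    """Collect point indices from the 27 cells around a query point."""
    cx, cy, cz = (point // cell_size).astype(int)
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                out.extend(grid.get((cx + dx, cy + dy, cz + dz), []))
    return out
```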

    Man-made Surface Structures from Triangulated Point Clouds

    Photogrammetry aims at reconstructing the shape and dimensions of objects captured with cameras, 3D laser scanners or other spatial acquisition systems. While many acquisition techniques deliver triangulated point clouds with millions of vertices within seconds, the interpretation is usually left to the user. Especially when reconstructing man-made objects, one is interested in the underlying surface structure, which is not inherently present in the data. This includes the geometric shape of the object, e.g. cubical or cylindrical, as well as the corresponding surface parameters, e.g. width, height and radius. Applications are manifold and range from industrial production control to architectural on-site measurements to large-scale city models. The goal of this thesis is to automatically derive such surface structures from triangulated 3D point clouds of man-made objects. They are defined as a compound of planar or curved geometric primitives. Model knowledge about typical primitives and relations between adjacent pairs of them should affect the reconstruction positively. After formulating a parametrized model for man-made surface structures, we develop a reconstruction framework with three processing steps: During a fast pre-segmentation exploiting local surface properties, we divide the given surface mesh into planar regions. Making use of a model selection scheme based on minimizing the description length, this surface segmentation is free of control parameters and automatically yields an optimal number of segments. A subsequent refinement introduces a set of planar or curved geometric primitives and hierarchically merges adjacent regions based on their joint description length. A global classification and constrained parameter estimation combines the data-driven segmentation with high-level model knowledge. To this end, we represent the surface structure with a graphical model and formulate factors based on likelihood as well as prior knowledge about parameter distributions and class probabilities. We infer the most probable setting of surface and relation classes with belief propagation and estimate an optimal surface parametrization with constraints induced by inter-regional relations. The process is specifically designed to work on noisy data with outliers and a few exceptional freeform regions not describable with geometric primitives. It yields full 3D surface structures with watertightly connected surface primitives of different types. The performance of the proposed framework is experimentally evaluated on various data sets. On small synthetically generated meshes we analyze the accuracy of the estimated surface parameters, the sensitivity w.r.t. various properties of the input data and w.r.t. model assumptions, as well as the computational complexity. Additionally, we demonstrate the flexibility w.r.t. different acquisition techniques on real data sets. The proposed method turns out to be accurate, reasonably fast and largely insensitive to defects in the data or imprecise model assumptions.
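
    The description-length criterion used in the refinement step can be sketched as follows: two adjacent regions are merged under a joint primitive exactly when doing so shortens the total encoding. The greedy single-step function below is a hedged illustration; description_length is a placeholder for the thesis's encoding of primitive parameters plus residuals.

```python
def best_mdl_merge(regions, adjacency, description_length):
    """One greedy step: find the adjacent pair whose merge saves the most
    bits. regions: list of frozensets of triangle ids; adjacency: iterable
    of index pairs (i, j); description_length(region) is a placeholder for
    the model-plus-residual encoding cost."""
    best_pair, best_gain = None, 0.0
    for i, j in adjacency:
        gain = (description_length(regions[i]) + description_length(regions[j])
                - description_length(regions[i] | regions[j]))
        if gain > best_gain:
            best_pair, best_gain = (i, j), gain
    return best_pair  # None: no merge shortens the description, so stop
```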

    Characterisation and Classification of Hidden Conducting Security Threats using Magnetic Polarizability Tensors

    The early detection of terrorist threat objects, such as guns and knives, through improved metal detection has the potential to reduce the number of attacks and improve public safety and security. Walk-through metal detectors (WTMDs) are commonly deployed for security screening in applications where these attacks are of particular concern, such as airports, transport hubs, government buildings and concerts. However, there is scope to improve the identification of an object’s shape and its material properties. Using current techniques, any metallic object must often be inspected or scanned separately before a patron can be determined to pose no threat, making the process slow. This can lead to large queues of unscreened people waiting to be screened, which becomes another security threat in itself. To improve on the current method, there is considerable potential to use the fields applied and measured by a metal detector since, hidden within the field perturbation, is object characterisation information. The magnetic polarizability tensor (MPT) offers an economical characterisation of metallic objects, and its spectral signature provides additional object characterisation information. The MPT spectral signature can be determined from measurements of the induced voltage over a range of frequencies for a hidden object. With classification in mind, it can also be computed in advance for different threat and non-threat objects, producing a dataset of these objects from which a machine learning (ML) classifier can be trained. There is also potential to generate this dataset synthetically, via the application of a method based on finite elements (FE). This concept of training an ML classifier on a synthetic dataset of MPT-based characterisations is at the heart of this work. In this thesis, details for the production and use of a first-of-its-kind synthetic dataset of realistic object characterisations are presented. To achieve this, first a review of recent developments in MPT object characterisation is provided, motivating the use of MPT spectral signatures. A problem-specific, H(curl)-based, hp-finite element discretisation is presented, which allows for the development of a reduced order model (ROM), using a projection-based proper orthogonal decomposition (PODP), that benefits from a-posteriori error estimates. This allows for the rapid production of MPT spectral signatures whose accuracy is guaranteed. This methodology is then implemented in Python, using the NGSolve finite element package, where other problem-specific efficiencies are also included along with a series of additional outputs of interest. The software is packaged and released as the open-source MPT-Calculator. The methodology and software are then extensively tested by application to a series of illustrative examples. Using this software, MPT spectral signatures are produced for a series of realistic threat and non-threat objects, creating a first-of-its-kind synthetic dataset, which is also released as the open-source MPT-Library dataset. Lastly, a series of ML classifiers are documented and applied to several supervised classification problems using this new synthetic dataset. A series of challenging numerical examples is included to demonstrate the success of the proposed methodology.
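
    As a hedged sketch of the final classification stage: the eigenvalues of the real and imaginary parts of the MPT at each measured frequency form a rotation-invariant feature vector, on which a standard supervised classifier can be trained. The feature layout and the scikit-learn model are illustrative assumptions, not the specific pipeline of MPT-Calculator or the MPT-Library dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def mpt_features(tensors):
    """tensors: (n_freq, 3, 3) complex MPT spectral signature of one object.
    The eigenvalues of the (symmetric) real and imaginary parts over
    frequency form a rotation-invariant feature vector."""
    feats = []
    for m in tensors:
        feats.extend(np.linalg.eigvalsh(m.real))  # real-part eigenvalues
        feats.extend(np.linalg.eigvalsh(m.imag))  # imaginary-part eigenvalues
    return np.array(feats)

def train_classifier(X, y):
    """X: stacked feature vectors for synthetic threat/non-threat objects;
    y: class labels. Both would come from a dataset like MPT-Library."""
    clf = RandomForestClassifier(n_estimators=200)
    clf.fit(X, y)
    return clf
```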

    Algorithms for the reconstruction, analysis, repairing and enhancement of 3D urban models from multiple data sources

    Over the last few years, there has been notable growth in the field of digitization of 3D buildings and urban environments. The substantial improvement of both scanning hardware and reconstruction algorithms has led to the development of representations of buildings and cities that can be remotely transmitted and inspected in real-time. Among the applications that implement these technologies are several GPS navigators and virtual globes such as Google Earth or the tools provided by the Institut Cartogràfic i Geològic de Catalunya. In particular, in this thesis, we conceptualize cities as a collection of individual buildings. Hence, we focus on the individual processing of one structure at a time, rather than on the larger-scale processing of urban environments. Nowadays, there is a wide diversity of digitization technologies, and the choice of the appropriate one is key for each particular application. Roughly, these techniques can be grouped into three main families: - Time-of-flight (terrestrial and aerial LiDAR). - Photogrammetry (street-level, satellite, and aerial imagery). - Human-edited vector data (cadastre and other map sources). Each of these has its advantages in terms of covered area, data quality, economic cost, and processing effort. Aircraft- and car-mounted LiDAR devices are optimal for sweeping huge areas, but acquiring and calibrating such devices is not a trivial task. Moreover, the capturing process is done by scan lines, which need to be registered using GPS and inertial data. As an alternative, terrestrial LiDAR devices are more accessible but cover smaller areas, and their sampling strategy usually produces massive point clouds with over-represented planar regions. A more inexpensive option is street-level imagery. A dense set of images captured with a commodity camera can be fed to state-of-the-art multi-view stereo algorithms to produce realistic-enough reconstructions. Another advantage of this approach is the capture of high-quality color data, although the resulting geometric information is usually of lower quality. In this thesis, we analyze in depth some of the shortcomings of these data-acquisition methods and propose new ways to overcome them. Mainly, we focus on the technologies that allow high-quality digitization of individual buildings: terrestrial LiDAR for geometric information and street-level imagery for color information. Our main goal is the processing and completion of detailed 3D urban representations. For this, we work with multiple data sources and combine them when possible to produce models that can be inspected in real-time. Our research has focused on the following contributions: - Effective and feature-preserving simplification of massive point clouds. - Normal estimation algorithms explicitly designed for LiDAR data. - Low-stretch panoramic representation for point clouds. - Semantic analysis of street-level imagery for improved multi-view stereo reconstruction. - Color improvement through heuristic techniques and the registration of LiDAR and imagery data. - Efficient and faithful visualization of massive point clouds using image-based techniques.
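
    One of the listed contributions concerns normal estimation for LiDAR data. The classical baseline it builds on is PCA plane fitting per neighbourhood, oriented using the known scanner position that terrestrial LiDAR provides. The sketch below shows that baseline only, under assumed inputs, not the thesis's LiDAR-specific algorithm.

```python
import numpy as np

def estimate_normal(points, neighbor_idx, sensor_pos):
    """Baseline PCA normal for one point: the eigenvector of the local
    covariance with the smallest eigenvalue is the best-fit plane normal.
    Terrestrial LiDAR provides the scanner position, so the normal can be
    oriented consistently toward the sensor."""
    nbrs = points[neighbor_idx]               # (k, 3) local neighbourhood
    cov = np.cov(nbrs.T)                      # 3x3 covariance matrix
    _, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    normal = eigvecs[:, 0]                    # smallest-eigenvalue direction
    if np.dot(normal, sensor_pos - nbrs.mean(axis=0)) < 0:
        normal = -normal                      # flip toward the scanner
    return normal
```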

    Human perception-oriented segmentation for triangle meshes

    Mesh segmentation is an important research topic in computer graphics, in particular in geometric computing. This is so because mesh segmentation techniques find many applications in movies, computer animation, virtual reality, mesh compression, and games. In fact, triangle meshes are widely used in interactive applications, so that their segmentation into meaningful parts (also called human-perceptual, perceptive, or meaningful segmentation) is often seen as a way of speeding up user interaction, detecting collisions between mesh-covered objects in a 3D scene, and animating one or more meaningful parts (e.g., the head of a humanoid) independently of the other parts of a given object. It happens that there is no known technique capable of correctly segmenting arbitrary meshes into meaningful parts. Some techniques are more adequate for non-freeform objects (e.g., quadric mechanical parts), while others perform better in the domain of freeform objects. Only recently have some techniques been developed for the entire universe of freeform and non-freeform objects and shapes. Even worse is the fact that most segmentation techniques are not entirely automatic, in the sense that almost all of them require some sort of pre-requisites and user assistance. Summing up, these three challenges related to perceptual proximity, generality and automation are at the core of the work described in this thesis. In order to face these challenges, we have developed the first contour-based mesh segmentation algorithm in the literature, inspired by the edge-based segmentation techniques used in image analysis, as opposed to region-based segmentation techniques. Its leading idea is to first find the contour of each region, and then to identify and collect all of its inner triangles. The encountered mesh regions correspond to protrusions and recesses, which need not be strictly convex nor strictly concave, respectively. These regions, called relaxedly convex regions (or saliences) and relaxedly concave regions (or recesses), produce segmentations that are less sensitive to noise and, at the same time, more intuitive from the human point of view; hence the name human perception-oriented (HPO) segmentation. Besides, and unlike the current state-of-the-art in mesh segmentation, the existence of these relaxed regions makes the algorithm suited to both non-freeform and freeform objects. In this thesis, we have also tackled a fourth challenge, which is related to the fusion of mesh segmentation and multi-resolution. A plethora of segmentation techniques, as well as a number of multi-resolution techniques, for triangle meshes already exists in the literature. However, it is not so common to find algorithms and data structures that fuse these two concepts, multi-resolution and segmentation, into a symbiotic multi-resolution scheme for both plain and segmented meshes, where a plain mesh is understood as a mesh with a single segment. So, we introduce such a novel multi-resolution segmentation scheme, called the extended Ghost Cell (xGC) scheme. This scheme preserves the shape of the meshes in both global and local terms, i.e., mesh segments and their boundaries, as well as creases and apices, are preserved, no matter the level of resolution used for simplification/refinement of the mesh. Moreover, unlike other segmentation schemes, it allows adjacent segments to differ by two or more resolution levels. This is particularly useful in computer animation, mesh compression and transmission, geometric computing, scientific visualization, and computer graphics. In short, this thesis presents a fully automatic, general, and human perception-oriented scheme that symbiotically integrates the concepts of mesh segmentation and multi-resolution.
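
    The contour-first idea admits a compact sketch: mark contour edges with a convexity/concavity test, then flood-fill triangles that are not separated by a contour edge. Below, is_contour_edge stands in for the relaxed convexity/concavity criterion of HPO segmentation, and the adjacency format is an assumption.

```python
from collections import deque

def segment_by_contours(triangles, tri_adjacency, is_contour_edge):
    """Flood-fill triangles into regions bounded by contour edges.
    tri_adjacency[t] yields (neighbor_triangle, shared_edge) pairs;
    is_contour_edge(edge) is a placeholder for the relaxed
    convexity/concavity test."""
    label = {t: None for t in triangles}
    current = 0
    for seed in triangles:
        if label[seed] is not None:
            continue                       # already assigned to a region
        label[seed] = current
        queue = deque([seed])
        while queue:
            t = queue.popleft()
            for nbr, edge in tri_adjacency[t]:
                if label[nbr] is None and not is_contour_edge(edge):
                    label[nbr] = current   # inner triangle of the same region
                    queue.append(nbr)
        current += 1
    return label
```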

    Geometric data understanding: deriving case-specific features

    One tradition uses precise geometric modeling, where uncertainties in the data can be considered noise. Another tradition relies on the statistical nature of vast quantities of data, where geometric regularity is intrinsic to the data and statistical models usually grasp this level only indirectly. This work focuses on point cloud data of natural resources and on silhouette recognition from video input as two real-world examples of problems whose geometric content is intangible at the raw-data level. This content could be discovered and modeled to some degree by machine learning (ML) approaches such as deep learning, but either direct coverage of the geometry in the samples or the addition of a special geometry-invariant layer is necessary. Geometric content is central when there is a need for direct observation of spatial variables, or when one needs a mapping to a geometrically consistent data representation in which, e.g., outliers or noise can be easily discerned. In this thesis we consider the transformation of the original input data to a geometric feature space in two example problems. The first example is the curvature of surfaces, which has met renewed interest since the introduction of ubiquitous point cloud data and the maturation of discrete differential geometry. Curvature spectra can characterize a spatial sample rather well, and provide useful features for ML purposes. The second example involves projective methods applied to stereo video analysis in swimming analytics. The aim is to find meaningful local geometric representations for feature generation, which also facilitate additional analysis based on geometric understanding of the model. First, the features are associated directly with some geometric quantity, and this makes it easier to express geometric constraints in a natural way, as shown in the thesis. Second, visualization and further feature generation become much easier. Third, the approach provides sound baselines for more traditional ML approaches, e.g. neural network methods. Fourth, most ML methods can utilize the geometric features presented in this work as additional features.
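
    As an illustrative reading of the curvature-spectrum idea: per-point curvature estimates can be histogrammed into a fixed-length, count-normalized feature vector suitable for ML methods. The bin count and curvature range below are assumptions, not values from the thesis.

```python
import numpy as np

def curvature_spectrum(curvatures, n_bins=32, k_range=(-1.0, 1.0)):
    """Histogram per-point curvature estimates into a fixed-length
    feature vector, normalized by point count so samples of different
    sizes remain comparable."""
    hist, _ = np.histogram(curvatures, bins=n_bins, range=k_range)
    return hist / max(curvatures.size, 1)
```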

    Few-Shot Keypoint Detection as Task Adaptation via Latent Embeddings

    Dense object tracking, the ability to localize specific object points with pixel-level accuracy, is an important computer vision task with numerous downstream applications in robotics. Existing approaches either compute dense keypoint embeddings in a single forward pass, meaning the model is trained to track everything at once, or allocate their full capacity to a sparse predefined set of points, trading generality for accuracy. In this paper we explore a middle ground based on the observation that the number of relevant points at a given time is typically small, e.g. grasp points on a target object. Our main contribution is a novel architecture, inspired by few-shot task adaptation, which allows a sparse-style network to condition on a keypoint embedding that indicates which point to track. Our central finding is that this approach provides the generality of dense-embedding models while offering accuracy significantly closer to that of sparse-keypoint approaches. We present results illustrating this capacity vs. accuracy trade-off, and demonstrate the ability to zero-shot transfer to new object instances (within-class) using a real-robot pick-and-place task.
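
    A minimal sketch of the conditioning mechanism described above, assuming a backbone that yields a dense feature map: the keypoint's latent embedding is broadcast over the spatial grid and concatenated with the features, so one cheap head can be re-run per requested keypoint. Layer sizes and the concatenation scheme are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ConditionedKeypointHead(nn.Module):
    """Predict a heatmap for one keypoint, conditioned on its latent
    embedding. The backbone features are computed once; only this cheap
    head is re-run per keypoint, which is the capacity/generality middle
    ground the abstract describes."""
    def __init__(self, feat_dim=64, embed_dim=16):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(feat_dim + embed_dim, 64, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 1, 1),   # per-pixel keypoint logit
        )

    def forward(self, feats, embedding):
        # feats: (B, feat_dim, H, W); embedding: (B, embed_dim)
        b, _, h, w = feats.shape
        e = embedding[:, :, None, None].expand(b, -1, h, w)  # broadcast over pixels
        return self.head(torch.cat([feats, e], dim=1))       # (B, 1, H, W) heatmap
```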

    Autonomous Exploration of Large-Scale Natural Environments

    This thesis addresses issues which arise when using robotic platforms to explore large-scale, natural environments. Two main problems are identified: the volume of data collected by autonomous platforms and the complexity of planning surveys in large environments. Autonomous platforms are able to rapidly accumulate large data sets, and the volume of data that must be processed is often too large for human experts to analyse exhaustively in a practical amount of time or in a cost-effective manner. This burden can create a bottleneck in the process of converting observations into scientifically relevant data. Although autonomous platforms can collect precisely navigated, high-resolution data, they are typically limited by finite battery capacities, data storage and computational resources. Deployments are also limited by project budgets and time frames. These constraints make it impractical to sample large environments exhaustively. To use the limited resources effectively, trajectories which maximise the amount of information gathered from the environment must be designed. This thesis addresses these problems through three primary contributions: a new classifier designed to accept probabilistic training targets rather than discrete training targets; a semi-autonomous pipeline for creating models of the environment; and an offline method for autonomously planning surveys. These contributions allow large data sets to be processed with minimal human intervention and promote efficient allocation of resources. In this thesis, environmental models are established by learning the correlation between data extracted from a digital elevation model (DEM) of the seafloor and habitat categories derived from in-situ images. The DEM of the seafloor is collected using ship-borne multibeam sonar, and the in-situ images are collected using an autonomous underwater vehicle (AUV). While the thesis specifically focuses on mapping and exploring marine habitats with an AUV, the research applies equally to other applications such as aerial and terrestrial environmental monitoring and planetary exploration.
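
    One way to realize a classifier that accepts probabilistic training targets, as in the first contribution, is to train against soft labels with a cross-entropy loss; with one-hot targets this reduces to ordinary classification loss. This is an illustrative reading of that contribution, not the thesis's specific classifier.

```python
import numpy as np

def soft_target_cross_entropy(pred_probs, target_probs, eps=1e-12):
    """Cross-entropy against probabilistic (soft) labels.
    pred_probs, target_probs: (n_samples, n_classes) row-stochastic arrays.
    Soft targets let expert uncertainty about habitat categories enter
    training instead of being forced into discrete labels."""
    return -np.mean(np.sum(target_probs * np.log(pred_probs + eps), axis=1))
```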