400 research outputs found

    Elastic Geometric Shape Matching

    Real Time Stereo Cameras System Calibration Tool and Attitude and Pose Computation with Low Cost Cameras

    Engineering of autonomous systems has many strands. The area in which this work falls, artificial vision, has become one of great interest in multiple contexts, with a focus on robotics. This work seeks to address and overcome some real difficulties encountered when developing technologies with artificial vision systems, namely the calibration process and the real-time pose computation of robots. Initially, it aims to perform real-time intrinsic (3.2.1) and extrinsic (3.3) calibration of stereo camera systems, which is needed for the main goal of this work: the real-time pose (position and orientation) computation of an active coloured target with stereo vision systems. Designed to be intuitive, easy to use and able to run in real-time applications, this work was developed for use with either low-cost, easy-to-acquire stereo vision systems or more complex, high-resolution ones, in order to compute all the parameters inherent to the system, such as the intrinsic values of each camera and the extrinsic matrices relating both cameras. It is oriented towards underwater environments, which are very dynamic and computationally more complex due to particularities such as light reflections. The available calibration information, whether generated by this tool or loaded from configurations produced by other tools, allows a simple calibration of an environment's colour space and of the detection parameters of a specific target with active visual markers (4.1.1), useful within unstructured environments. With a calibrated system and environment, it is possible to detect and compute, in real time, the pose of a target of interest. The combination of position and orientation (or attitude) is referred to as the pose of an object. To analyse the performance and the quality of the information obtained, these tools are compared with existing ones.
    [Translated from the Portuguese abstract] The engineering of autonomous systems operates across several strands. One of them, artificial vision, on which this work is based, has become one of the areas of greatest interest in multiple contexts and focuses within robotics. This work therefore seeks to address and overcome some difficulties encountered when developing technologies based on artificial vision. Initially, it sets out to provide tools to perform the necessary intrinsic (3.2.1) and extrinsic (3.3) calibrations of stereo vision systems in real time, in order to reach the main goal: a tool for computing the position and orientation of an active coloured target using stereo vision systems. Designed to be intuitive, easy to use and capable of operating in real time, these tools were developed with a view to their integration both with low-cost, easily acquired cameras and with more complex, higher-resolution cameras. They calibrate the parameters inherent to the stereo vision system, such as the intrinsics of each camera and the extrinsic matrices relating both cameras. This work was oriented towards use in underwater settings, which present environments with high visual dynamics and greater computational complexity due to particularities such as light reflections and poor visibility. The available calibration information, whether generated by the tools provided or obtained from others, can be loaded to carry out a simple calibration of the colour space and of the detection parameters of a specific target with active coloured markers (4.1.1). These markers are useful in unstructured environments. To analyse the performance and quality of the information obtained, the calibration and pose (position and orientation) computation tools will be compared with existing ones.

    Multimedia: information representation and access

    [About the book] Information retrieval (IR) is a complex human activity supported by sophisticated systems. Information science has contributed much to the design and evaluation of previous generations of IR systems and to our general understanding of how such systems should be designed. Yet, owing to the increasing success and diversity of IR systems, many recent textbooks concentrate on the systems themselves and ignore the human side of searching for information. This book is the first text to provide an information science perspective on IR.

    Informationsrouting, Korrespondenzfindung und Objekterkennung im Gehirn

    The dissertation deals with the general problem of how the brain can establish correspondences between neural patterns stored in different cortical areas. Although correspondence finding is an important capability in many cognitive areas such as language understanding, abstract reasoning and motor control, this thesis concentrates on invariant object recognition as an application of it. One part of the work presents a correspondence-based, neurally plausible system for face recognition. Other parts address the question of visual information routing over several stages by proposing optimal architectures for such routing ("switchyards") and deriving ontogenetic mechanisms for the growth of switchyards. Finally, the idea of multi-stage routing is united with the object recognition system introduced before, suggesting how the so far distinct feature-based and correspondence-based approaches to object recognition could be reconciled.
    [Translated from the German abstract] Generally speaking, this thesis is concerned with the question of how the brain can find correspondences between activity patterns. This is a central topic in visual object recognition, but it is relevant to all areas of neural information processing, from hearing to abstract thought. Correspondence finding should be invariant to changes that alter the appearance but not the meaning of the patterns. It should also work when the two patterns are connected not directly but only via intermediate stations. The prerequisites for invariant correspondence finding between patterns are, on the one hand, the existence of suitable connection structures and, on the other, a principled neural mechanism for finding correspondences. Chapter 2 deals with such a mechanism. It is based on dynamic links between the points of both patterns, which are activated by pointwise similarity of the patterns and by global consistency with neighbouring links. In multi-layer systems, dynamic links can be used not only for correspondence finding but also for controlled routing of information. Using this property, Chapter 2 develops a face recognition system that is invariant to translation, robust to deformation, and performs well on benchmark databases. Chapter 3 investigates the most economical way to connect neural patterns such that there is a path from every point of one pattern to every point of the other and visual information can be routed from one pattern to the other. The total amount of neural resources required, that is, both connections and feature-representing units of the intermediate layers, is minimized. This leads to multi-stage structures with widely spread but sparsely populated branches, which we call switchyards. Interpreting the results shows that switchyards are compatible with the qualitative and quantitative facts of the primate brain, as far as these are known. Chapter 4 addresses how such rather complicated neural connection structures can arise ontogenetically. A possible mechanism based on chemical markers is presented. The markers are produced by the units of the lowest layer and diffuse upward through the emerging connections. Connections grow preferentially between units that contain very dissimilar chemical markers. The resulting connection structures are almost identical to the architectures derived analytically in Chapter 3 and are biologically even more plausible. Chapter 5 brings together the ideas of the previous chapters to realize correspondence finding between patterns across multi-stage routing structures. It shows how switchyards can be used to find correspondences between "normal" visual patterns even though initially none of the individual stages of the switchyard has patterns present on both sides that could be matched against each other. The principle is then extended to a complete recognition system that, across several routing stages, can assign a given input pattern to one of several stored patterns in a position-invariant manner.

    Recent Trends in Computational Intelligence

    Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically inspired technologies such as swarm intelligence, as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book discusses the use of CI to solve a variety of applications optimally, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies makes a strong and reliable prediction tool for handling real-life applications.

    Registration of medical images for applications in minimally invasive procedures

    [Translated from the Italian abstract] The starting point of this thesis is the analysis of state-of-the-art methods for medical image registration, to verify whether they are suitable for assisting the physician during a minimally invasive procedure, for example a percutaneous procedure performed manually or a teleoperated intervention performed by means of a robot. The first conclusion is that, although many works are devoted to developing registration algorithms for the medical context, most of them were not designed for use in the operating room (OR) scenario, also because, compared with other applications, the OR additionally requires validation, real-time performance and the presence of other instruments. State-of-the-art algorithms are based on a three-phase iteration: optimization, transformation, and evaluation of the similarity of the registered images. In this thesis, we study the feasibility of the three-phase approach for OR applications, showing the limits that this approach encounters in the applications we are considering. We show how a simple method could be used in the OR. We then developed a theory suited to registering large sets of unstructured data extracted from medical images, taking into account the constraints of the OR. Given the impossibility of working with DICOM medical data, a method is employed to register datasets composed of unstructured point sets. The proposed algorithms are designed to find the spatial correspondence in closed form, taking into account the type of data, the time constraint and the presence of noise and/or small deformations. The theory and algorithms we developed are derived from the shape theory proposed by Kendall (Kendall's shapes) and use a global shape descriptor to compute the correspondences and the distance between the structures involved. Since registration is only one component of medical applications, the last part of the thesis is dedicated to some practical OR applications that can benefit from the registration procedure.
    The registration of medical images is necessary to establish spatial correspondences across two or more images. Registration is rarely the end goal; instead, the results of image registration are used in other tasks. The starting point of this thesis is to analyze which state-of-the-art image registration methods are suitable for assisting a physician during a minimally invasive procedure, such as a percutaneous procedure performed manually or a teleoperated intervention performed by means of a robot. The first conclusion is that, even though much previous work has been devoted to developing registration algorithms for the medical context, most of them are not designed to be used in the operating room (OR) scenario because, compared to other applications, the OR also requires strong validation, real-time performance and the presence of other instruments. Almost all of these algorithms are based on a three-phase iteration: optimize, transform, evaluate similarity. In this thesis, we study the feasibility of this three-step approach in the OR, showing the limits that such an approach encounters in the applications we are considering. We investigate how a simple method could be realized and what assumptions such a method needs in order to work. We then develop a theory that is suitable for registering large sets of unstructured data extracted from medical images, taking into account the constraints of the OR. Using the whole radiologic information is not feasible in the OR context; therefore, the method we introduce registers processed datasets extracted from the original medical images.
    The framework we propose is designed to find the spatial correspondence in closed form, taking into account the type of the data, the real-time constraint and the presence of noise and/or small deformations. The theory and algorithms we have developed are set in the framework of the shape theory proposed by Kendall (Kendall's shapes) and use a global descriptor of the shape to compute the correspondences and the distance between shapes. Since registration is only one component of a medical application, the last part of the thesis is dedicated to some practical applications in the OR that can benefit from the registration procedure.
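As a rough illustration of the closed-form flavour of Kendall's shape theory mentioned in the abstract above, the following Python sketch computes the full Procrustes distance between two planar landmark configurations. It is not the thesis's algorithm: it assumes 2D points whose one-to-one correspondences are already known, whereas the thesis works with unstructured point sets and computes the correspondences itself. The function names (`preshape`, `procrustes_distance`, `optimal_rotation`) are hypothetical and chosen only for this example.

```python
import cmath
import math


def preshape(points):
    """Remove translation and scale: center the landmarks, scale to unit size."""
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    centered = [p - centroid for p in z]
    size = math.sqrt(sum(abs(p) ** 2 for p in centered))
    if size == 0:
        raise ValueError("all landmarks coincide")
    return [p / size for p in centered]


def _inner(a, b):
    """Complex inner product of the two preshapes (landmarks matched by index)."""
    return sum(p.conjugate() * q for p, q in zip(preshape(a), preshape(b)))


def procrustes_distance(a, b):
    """Full Procrustes distance; 0 means same shape up to similarity transform."""
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 1.0 - abs(_inner(a, b)) ** 2))


def optimal_rotation(a, b):
    """Closed-form angle (radians) that best rotates a's preshape onto b's."""
    return cmath.phase(_inner(a, b))


if __name__ == "__main__":
    triangle = [(0, 0), (1, 0), (0, 1)]
    # Same triangle rotated by 90 degrees, scaled by 3, shifted by (5, -2).
    moved = [(5, -2), (5, 1), (2, -2)]
    print(procrustes_distance(triangle, moved))
    print(optimal_rotation(triangle, moved))
```

Because translation, scale and rotation are all removed analytically, no iterative optimize-transform-evaluate loop is needed in this simplified setting, which is the property the abstract emphasizes for real-time use.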

    Real-Time Structure and Object Aware Semantic SLAM

    Simultaneous Localization And Mapping (SLAM) is one of the fundamental problems in mobile robotics and addresses the reconstruction of a previously unseen environment while simultaneously localising a mobile robot with respect to it. For visual SLAM, the simplest representation of the map is a collection of 3D points that is sparse and efficient to compute and update, particularly for large-scale environments; however, it lacks semantic information and is not useful for high-level tasks such as robotic grasping and manipulation. Although methods to compute denser representations have been proposed, these reconstructions remain equivalent to a collection of points and therefore carry no additional semantic information or relationships. Man-made environments contain many structures and objects that carry high-level semantics and can potentially act as landmarks of a SLAM map, while encapsulating semantic information as opposed to a set of points. For instance, planes are good representations for feature-deprived regions, where they provide information complementary to points and can also model dominant planar layouts of the environment with very few parameters. Furthermore, a generic representation for previously unseen objects can be used as a general landmark that carries semantics in the reconstructed map. Integrating visual semantic understanding and geometric reconstruction has been studied before; however, for various reasons, including high-level geometric entities in the SLAM framework has been restricted to a slow, offline structure-from-motion context, or high-level entities merely act as regularizers for points in the map instead of independent landmarks.
One critical reason is the lack of a proper mathematical representation for high-level landmarks; the other main reasons are the challenges of detecting and tracking these landmarks and of formulating an observation model, that is, a mapping between corresponding image-observable quantities and the estimated parameters of the representations. In this work, we address these challenges to achieve an online real-time SLAM framework with scalable maps consisting of both sparse points and high-level structural and semantic landmarks such as planes and objects. We explicitly target real-time performance and keep it as a beacon that critically influences the choice of representation and all the modules of our SLAM system. In the context of factor graphs, we propose novel representations for structural entities as planes, and for general, previously unseen and not predefined objects as bounded dual quadrics, that decompose to permit a clean, fast and effective real-time implementation that is amenable to the nonlinear least-squares formulation and respects the sparsity pattern of the SLAM problem. In this representation we are not concerned with high-fidelity reconstruction of individual objects, but rather with representing the general layout and orientation of objects in the environment. A minimal representation of planes is also explored, leading to a representation that can be constructed and updated online in a least-squares framework. Another challenge that we address in this work is to marry high-level landmark detections based on deep-learned frameworks with geometric SLAM systems. Owing to the recent success of CNN-based object detection and of depth and surface normal estimation from a single image, it is now feasible to detect and estimate these semantic landmarks from single RGB images, leading us seamlessly from an RGB-D SLAM system to pure monocular SLAM thanks to the real-time predictions of the trained CNN and appropriate representations.
Furthermore, to benefit from deep-learned priors, we incorporate high-fidelity single-image reconstructions and hallucinations of objects on top of the coarse quadrics to enrich the sparse map semantically, while constraining the shape of the coarse quadrics even more. Pertinent to our beacon, the proposed landmark representations in the map also provide the potential for imposing additional constraints and priors that carry crucial semantic information about the scene, without incurring great extra computational cost. In this work, we have explored and proposed constraints such as priors on the extent and shape of the objects, a point-plane regularizer, and plane-plane (Manhattan assumption) and plane-object (supporting affordance) constraints. We evaluate our proposed SLAM system extensively using different input sensor modalities, from RGB-D to monocular, on almost all publicly available benchmarks, both indoors and outdoors, to show its applicability as a general-purpose SLAM solution. The extensive experiments show the efficacy of our SLAM system through different comparisons and ablation studies that include high-level structures and objects, with constraints imposed among them, in various scenarios. In particular, the estimated camera trajectories are improved significantly in varied sequences of visual SLAM datasets and also in our own sequences captured with a UR5 robotic arm equipped with a depth camera. In addition to more accurate camera trajectories, our system yields enriched sparse maps with semantically meaningful planar structures and generic objects in the scene, along with their mutual relationships.
    Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 201

    Internationales Kolloquium über Anwendungen der Informatik und Mathematik in Architektur und Bauwesen : 20. bis 22.7. 2015, Bauhaus-Universität Weimar

    The 20th International Conference on the Applications of Computer Science and Mathematics in Architecture and Civil Engineering will be held at the Bauhaus University Weimar from 20th to 22nd July 2015. Architects, computer scientists, mathematicians, and engineers from all over the world will meet in Weimar for an interdisciplinary exchange of experiences, to report on their results in research, development and practice, and to discuss them. The conference covers a broad range of research areas: numerical analysis, function theoretic methods, partial differential equations, continuum mechanics, engineering applications, coupled problems, computer sciences, and related topics. Several plenary lectures in the aforementioned areas will take place during the conference. We invite architects, engineers, designers, computer scientists, mathematicians, planners, project managers, and software developers from business, science and research to participate in the conference.