    Signal processing with Fourier analysis, novel algorithms and applications

    Fourier analysis is the study of how general functions may be represented or approximated by sums of simpler trigonometric functions, an approach also known as sinusoidal modeling. Fourier's original idea had a profound impact on mathematical analysis, physics and engineering because it diagonalizes time-invariant convolution operators. In the past, signal processing stayed almost exclusively within electrical engineering, where only experts could cancel noise or compress and reconstruct signals. Nowadays it is almost ubiquitous, as everyone deals with digital signals. The medical imaging, wireless communication and power systems of the future will face more demanding data processing conditions and a wider range of application requirements than today's systems. Such systems will require more powerful, efficient and flexible signal processing algorithms designed to handle these needs. No matter how advanced our hardware becomes, we will still need intelligent and efficient algorithms to address the growing demands in signal processing. In this thesis, we investigate novel techniques to solve a suite of four fundamental problems in signal processing that have a wide range of applications. For each application in electrical engineering and computer science, we discuss the relevant equations, the signal processing literature, the analysis, and the final numerical algorithms and methods that solve the problem using Fourier analysis. 
Four chapters cover the following topics of central importance in the field of signal processing:
• Fast Phasor Estimation using Adaptive Signal Processing (Chapter 2)
• Frequency Estimation from Nonuniform Samples (Chapter 3)
• 2D Polar and 3D Spherical Polar Nonuniform Discrete Fourier Transform (Chapter 4)
• Robust 3D registration using Spherical Polar Discrete Fourier Transform and Spherical Harmonics (Chapter 5)
Even though these four methods may seem completely disparate, the underlying motivation remains the same: more efficient processing by exploiting the structure of the signal in the Fourier domain. The main contribution of this thesis is innovation in the analysis, synthesis and discretization of certain well-known problems: phasor estimation, frequency estimation, computation of a particular non-uniform Fourier transform, and signal registration in the transformed domain. We propose and evaluate application-relevant algorithms, such as a frequency estimation algorithm using non-uniform sampling and the polar and spherical polar Fourier transforms. The proposed techniques are also useful in computer vision and medical imaging. From a practical perspective, the proposed algorithms are shown to improve on existing solutions in the respective fields where they are applied and evaluated. The formulations and the final proposed methods are shown to have a variety of benefits. Future work with potential in medical imaging, directional wavelets, volume rendering, video and 3D object classification, and high-dimensional registration is also discussed in the final chapter. Finally, in the spirit of reproducible research, we release the implementation of these algorithms to the public on GitHub.
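
The remark that Fourier analysis diagonalizes time-invariant convolution operators can be checked numerically on a small example. The sketch below is not taken from the thesis: it uses a naive discrete Fourier transform written with the Python standard library, and the function names and sample values are chosen purely for illustration. It compares a direct circular convolution with the pointwise product of the two spectra.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, including the 1/n normalization."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def circular_convolve(x, h):
    """Direct circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


def convolve_via_dft(x, h):
    """Circular convolution computed as a pointwise product of spectra."""
    product = [a * b for a, b in zip(dft(x), dft(h))]
    return [value.real for value in idft(product)]


# Illustrative values only (not data from the thesis).
signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 3.0]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

direct = circular_convolve(signal, kernel)
spectral = convolve_via_dft(signal, kernel)
max_error = max(abs(a - b) for a, b in zip(direct, spectral))
```

Here `max_error` stays at the level of floating-point round-off, which is the discrete form of the statement: in the Fourier basis, a circular convolution acts on each frequency component independently.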

    Reasoning about Geometric Object Interactions in 3D for Manipulation Action Understanding

    To interact efficiently with human users, intelligent agents and autonomous systems need the ability to interpret human actions. We focus our attention on manipulation actions, wherein an agent typically grasps an object and moves it, possibly altering its physical state. Agent-object and object-object interactions during a manipulation are a defining part of the performed action itself. In this thesis, we focus on extracting semantic cues, derived from geometric object interactions in 3D space during a manipulation, that are useful for action understanding at the cognitive level. First, we introduce a simple grounding model for the most common pairwise spatial relations between objects and investigate the descriptive power of their temporal evolution for action characterization. We propose a compact, abstract action descriptor that encodes the geometric object interactions during action execution, as captured by the spatial relation dynamics. Our experiments on a diverse dataset confirm both the validity and effectiveness of our spatial relation models and the discriminative power of our representation with respect to the underlying action semantics. Second, we model and detect lower-level interactions, namely object contacts and separations, viewing them as topological scene changes within a dense motion estimation setting. In addition to improving motion estimation accuracy in the challenging case of motion boundaries induced by these events, our approach shows promising performance in the explicit detection and classification of the events themselves. Building upon dense motion estimation and using detected contact events as an attention mechanism, we propose a bottom-up pipeline for the guided segmentation and rigid motion extraction of manipulated objects. 
Finally, in addition to our methodological contributions, we introduce a new open-source software library for point cloud data processing, developed for the needs of this thesis, which aims to provide an easy-to-use, flexible, and efficient framework for the rapid development of performant software for a range of 3D perception tasks.

    Reconstruction and Scalable Detection and Tracking of 3D Objects

    The task of detecting objects in images is essential for autonomous systems to categorize, comprehend and eventually navigate or manipulate their environment. Since many applications demand not only the detection of objects but also the estimation of their exact poses, 3D CAD models can prove helpful, since they provide means for feature extraction and hypothesis refinement. This work therefore explores two paths. First, we look into methods to create richly textured and geometrically accurate models of real-life objects. Second, using these reconstructions as a basis, we investigate how to improve 3D object detection and pose estimation, focusing especially on scalability, i.e. the problem of dealing with multiple objects simultaneously.

    Integration of a Minimalistic Set of Sensors for Mapping and Localization of Agricultural Robots

    Robots have recently become ubiquitous in many aspects of daily life. For in-house applications there are vacuuming, mopping and lawn-mowing robots. Swarms of robots have been used in Amazon warehouses for several years. Self-driving cars, despite being set back by several safety issues, are undeniably becoming the standard of the automobile industry. Beyond commercial applications, robots can perform various tasks such as inspecting hazardous sites and taking part in search-and-rescue missions. Regardless of the end-user application, autonomy plays a crucial role in modern robots. The essential capabilities required for autonomous operation are mapping, localization and navigation. The goal of this thesis is to develop a new approach to solve the problems of mapping, localization, and navigation for autonomous robots in agriculture. This type of environment poses unique challenges, such as repetitive patterns and large-scale environments with sparse features, in comparison to urban scenarios, where good features such as pavements, buildings, road lanes and traffic signs are abundant. In outdoor agricultural environments, a robot can rely on a Global Navigation Satellite System (GNSS) to determine its whereabouts. This often limits the robot's activities to areas with accessible GNSS signals, and it fails in indoor environments. In that case, exteroceptive sensors such as cameras (RGB, depth, thermal), laser scanners and Light Detection and Ranging (LiDAR) sensors can be fused with proprioceptive sensors such as Inertial Measurement Units (IMUs) and wheel encoders to better estimate the robot's state. Generic approaches that combine several different sensors often yield superior estimation results, but they are not always optimal in terms of cost-effectiveness, modularity, reusability, and interchangeability. 
For agricultural robots, being robust in long-term operation is as important as being cost-effective for mass production. We tackle this challenge by exploring and selectively using a handful of sensors, such as RGB-D cameras, LiDAR and IMU, for representative agricultural environments. The sensor fusion algorithms provide high precision and robustness for mapping and localization while ensuring cost-effectiveness by employing only the sensors necessary for the task at hand. In this thesis, we extend LiDAR mapping and localization methods designed for ordinary urban scenarios to cope with agricultural environments, where the presence of slopes, vegetation and trees causes traditional approaches to fail. Our mapping method substantially reduces the memory footprint of map storage, which is important for large-scale farms. We show how to handle the localization problem in dynamically growing strawberry polytunnels by using only a stereo visual-inertial (VI) sensor and a depth sensor to extract and track only invariant features. This eliminates the need for remapping to deal with dynamic scenes. In addition, to demonstrate the minimalistic requirements of autonomous agricultural robots, we show the ability to autonomously traverse between rows in a difficult, zigzag-like polytunnel environment using only a laser scanner. Furthermore, we present an autonomous navigation capability that uses only a camera, without explicitly performing mapping or localization. Finally, our mapping and localization methods are generic and platform-agnostic, so they can be applied to different types of agricultural robots. All contributions presented in this thesis have been tested and validated on real robots in real agricultural environments. All approaches have been published in, or submitted to, peer-reviewed conferences and journals.

    Sparse octree algorithms for scalable dense volumetric tracking and mapping

    This thesis is concerned with the problem of Simultaneous Localisation and Mapping (SLAM), the task of localising an agent within an unknown environment while at the same time building a representation of it. In particular, we tackle the fundamental scalability limitations of dense volumetric SLAM systems. We do so by proposing a highly efficient hierarchical data structure based on octrees, together with a set of algorithms to support the most compute-intensive operations in typical volumetric reconstruction pipelines. We employ our hierarchical representation in a novel dense pipeline based on occupancy probabilities. Crucially, the complete space representation encoded by the octree enables us to demonstrate a fully integrated system in which tracking, mapping and occupancy queries can be performed seamlessly on a single coherent representation. While achieving accuracy on par with or better than the current state of the art, we demonstrate run-time performance at least an order of magnitude better than currently available hierarchical data structures. Finally, we introduce a novel multi-scale reconstruction system that exploits our octree hierarchy. By adaptively selecting the appropriate scale to match the effective sensor resolution in both integration and rendering, we demonstrate better reconstruction results and tracking accuracy compared to single-resolution grids. Furthermore, we achieve much higher computational performance by propagating information up and down the tree in a lazy fashion, which allows us to reduce the computational load when updating distant surfaces. We have released our software as an open-source library, named supereight, which is freely available for the benefit of the wider community. One of the main advantages of our library is its flexibility. 
By carefully providing a set of algorithmic abstractions, supereight enables SLAM practitioners to experiment freely with different map representations without modifying the back-end library code and, crucially, while preserving performance. Our work has been adopted by robotics researchers in both academia and industry.
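
As a rough illustration of why a sparse octree helps with scalability, the sketch below stores occupancy values in a tree whose nodes are allocated only along paths that have actually been written. It is a toy example under stated assumptions, not the supereight API or its data layout: the class name `SparseOctree`, its methods and the dictionary-based nodes are all hypothetical.

```python
class SparseOctree:
    """Toy sparse octree over a cube of side 2**max_depth voxels.

    Hypothetical illustration only; this is not the supereight API.
    """

    def __init__(self, max_depth, default=0.0):
        if max_depth < 1:
            raise ValueError("max_depth must be at least 1")
        self.max_depth = max_depth
        self.size = 1 << max_depth      # voxels per side
        self.default = default          # value reported for unallocated space
        self.root = {}                  # child index (0-7) -> child node or leaf value
        self.allocated_leaves = 0

    def _check(self, x, y, z):
        if not all(0 <= c < self.size for c in (x, y, z)):
            raise IndexError("voxel coordinate outside the octree volume")

    @staticmethod
    def _child_index(x, y, z, level):
        # One bit per axis selects one of the eight children at this level.
        return (((x >> level) & 1)
                | (((y >> level) & 1) << 1)
                | (((z >> level) & 1) << 2))

    def set(self, x, y, z, value):
        """Store a value, allocating nodes only along the path to the voxel."""
        self._check(x, y, z)
        node = self.root
        for level in range(self.max_depth - 1, 0, -1):
            node = node.setdefault(self._child_index(x, y, z, level), {})
        index = self._child_index(x, y, z, 0)
        if index not in node:
            self.allocated_leaves += 1
        node[index] = value

    def get(self, x, y, z):
        """Return the stored value, or the default if the path was never allocated."""
        self._check(x, y, z)
        node = self.root
        for level in range(self.max_depth - 1, 0, -1):
            node = node.get(self._child_index(x, y, z, level))
            if node is None:
                return self.default
        return node.get(self._child_index(x, y, z, 0), self.default)


# A 256^3 volume in which only one voxel has been observed.
tree = SparseOctree(max_depth=8)
tree.set(3, 200, 17, 0.9)
observed = tree.get(3, 200, 17)
unobserved = tree.get(0, 0, 0)
```

In this toy version a single write into a 256^3 volume allocates one leaf and the handful of interior nodes leading to it, while queries into untouched space fall back to the default value. A dense grid of the same extent would reserve storage for every voxel up front.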

    Semantic models of scenes and objects for service and industrial robotics

    What may seem straightforward for the human perception system is still challenging for robots. Automatically segmenting the elements with the highest relevance or salience, i.e. the semantics, is non-trivial given the high level of variability in the world and the limits of vision sensors. This is especially true when multiple ambiguous sources of information are available, which is the case when dealing with moving robots. This thesis leverages the availability of contextual cues and multiple points of view to make the segmentation task easier. Four robotic applications are presented, two designed for service robotics and two for an industrial context. Semantic models of indoor environments are built by enriching geometric reconstructions with semantic information about objects, structural elements and humans. Our approach leverages the importance of context, the availability of multiple sources of information, and multiple viewpoints, showing through extensive experiments on several datasets that these are all crucial elements for improving on state-of-the-art performance. Furthermore, moving to applications in which robots analyze object surfaces instead of their surroundings, semantic models of Carbon Fiber Reinforced Polymers are built by augmenting geometric models with accurate measurements of superficial fiber orientations and of inner defects invisible to the human eye. We reached industrial-grade accuracy, making these models useful for autonomous quality inspection and process optimization. In all applications, special attention is paid to fast methods suitable for real robots, such as the two prototypes presented in this thesis.

    Robotic Assembly Using 3D and 2D Computer Vision

    This thesis concerns the development and evaluation of a robotic cell used for automated assembly. The automated assembly is made possible by a combination of an eye-in-hand 2D camera and a stationary 3D camera used to automatically detect objects. Computer vision, kinematics and programming are the main topics of the thesis. Possible approaches to object detection have been investigated and evaluated in terms of performance. The kinematic relation between the cameras in the robotic cell and the movements of the robotic manipulator has been described. A functioning solution has been implemented in the robotic cell at the Department of Production and Quality Engineering laboratory. Theory of significant importance to the developed solution is presented. The methods used to achieve each part of the solution are anchored in theory and presented together with the decisions and guidelines made throughout the project work in order to achieve the final solution. Each part of the system is presented with its associated results. The combination of these results yields a solution which shows that the methods developed to achieve automated assembly work as intended. Limitations, challenges, and future possibilities and improvements for the solution are then discussed. The results from the experiments presented in this thesis demonstrate the performance of the developed system. The system fulfills the specifications defined in the problem description and functions as intended considering the instrumentation used.

    A Multi-Sensor Fusion-Based Underwater SLAM System

    This dissertation addresses the problem of real-time Simultaneous Localization and Mapping (SLAM) in challenging environments. SLAM is one of the key enabling technologies for autonomous robots to navigate in unknown environments by processing information on their on-board computational units. In particular, we study the exploration of challenging GPS-denied underwater environments to enable a wide range of robotic applications, including historical studies, health monitoring of coral reefs, and inspection of underwater infrastructure, e.g., bridges, hydroelectric dams, water supply systems, and oil rigs. Mapping underwater structures is important in several fields, such as marine archaeology, Search and Rescue (SaR), resource management, hydrogeology, and speleology. However, due to the highly unstructured nature of such environments, navigation by human divers can be extremely dangerous, tedious, and labor intensive. Hence, an underwater robot is an excellent fit for building a map of the environment while simultaneously localizing itself in that map. The main contribution of this dissertation is the design and development of a robust real-time SLAM algorithm for small- and large-scale underwater environments. We present SVIn, a novel tightly-coupled keyframe-based non-linear optimization framework that fuses Sonar, Visual, Inertial and water depth information and provides robust initialization, loop-closing, and relocalization capabilities. Introducing acoustic range information to aid the visual data improves reconstruction and localization. The availability of depth information from water pressure enables a robust initialization, refines the scale factor, and helps to reduce drift in the tightly-coupled integration. 
The complementary characteristics of these sensing modalities provide accurate and robust localization in unstructured environments with low visibility and few visual features, which makes them an ideal choice for underwater navigation. The proposed system has been successfully tested and validated on both benchmark datasets and numerous real-world scenarios. It has also been used for planning for an underwater robot in the presence of obstacles. Experimental results on datasets collected with a custom-made underwater sensor suite and an Aqua2 autonomous underwater vehicle (AUV) in challenging underwater environments with poor visibility demonstrate accuracy and robustness not previously achieved. To aid the sparse reconstruction, a contour-based reconstruction approach has been developed that utilizes the well-defined edges between the well-lit area and darkness. In particular, low lighting conditions, or even the complete absence of natural light inside caves, result in strong lighting variations, e.g., the cone of the artificial video light intersecting underwater structures and the shadow contours. The proposed method utilizes these contours to provide additional features, resulting in a denser 3D point cloud than the usual point clouds from a visual odometry system. Experimental results in an underwater cave demonstrate the performance of our system. This enables more robust navigation of autonomous underwater vehicles, which can use the denser 3D point cloud to detect obstacles and achieve higher-resolution reconstructions.
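
The abstract does not state how water pressure is converted into depth. A common approach, assumed here rather than taken from the dissertation, is the hydrostatic relation depth = (p - p_surface) / (rho * g). In the minimal sketch below, the function name, the constants (fresh water, standard atmosphere) and the sample reading are illustrative assumptions.

```python
STANDARD_GRAVITY = 9.80665        # m/s^2
SURFACE_PRESSURE_PA = 101325.0    # Pa, assumed standard atmosphere at the surface
FRESH_WATER_DENSITY = 1000.0      # kg/m^3, assumed; sea water is denser


def depth_from_pressure(absolute_pressure_pa,
                        surface_pressure_pa=SURFACE_PRESSURE_PA,
                        density=FRESH_WATER_DENSITY,
                        gravity=STANDARD_GRAVITY):
    """Hydrostatic depth in metres below the surface for an absolute pressure reading."""
    return (absolute_pressure_pa - surface_pressure_pa) / (density * gravity)


# Synthetic reading corresponding to 10 m of fresh water above the sensor.
reading_pa = SURFACE_PRESSURE_PA + FRESH_WATER_DENSITY * STANDARD_GRAVITY * 10.0
depth_m = depth_from_pressure(reading_pa)
```

Because such a measurement is absolute and does not accumulate error over time, it is a natural constraint along the vertical axis, which is consistent with the abstract's remark that depth information helps initialization and reduces drift.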

    Vehicle localization with enhanced robustness for urban automated driving

