78 research outputs found

    Real-time 3-D Mapping with Estimating Acoustic Materials

    Full text link
    This paper proposes a real-time system integrating an acoustic material estimation from visual appearance and an on-the-fly mapping in the 3-dimension. The proposed method estimates the acoustic materials of surroundings in indoor scenes and incorporates them to a 3-D occupancy map, as a robot moves around the environment. To estimate the acoustic material from the visual cue, we apply the state-of-the-art semantic segmentation CNN network based on the assumption that the visual appearance and the acoustic materials have a strong association. Furthermore, we introduce an update policy to handle the material estimations during the online mapping process. As a result, our environment map with acoustic material can be used for sound-related robotics applications, such as sound source localization taking into account various acoustic propagation (e.g., reflection)

    ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields

    Full text link
    We present ESLAM, an efficient implicit neural representation method for Simultaneous Localization and Mapping (SLAM). ESLAM reads RGB-D frames with unknown camera poses in a sequential manner and incrementally reconstructs the scene representation while estimating the current camera position in the scene. We incorporate the latest advances in Neural Radiance Fields (NeRF) into a SLAM system, resulting in an efficient and accurate dense visual SLAM method. Our scene representation consists of multi-scale axis-aligned perpendicular feature planes and shallow decoders that, for each point in the continuous space, decode the interpolated features into Truncated Signed Distance Field (TSDF) and RGB values. Our extensive experiments on two standard and recent datasets, Replica and ScanNet, show that ESLAM improves the accuracy of 3D reconstruction and camera localization of state-of-the-art dense visual SLAM methods by more than 50%, while it runs up to Ă—\times10 faster and does not require any pre-training.Comment: Project page: https://www.idiap.ch/paper/eslam

    Efficient From-Point Visibility for Global Illumination in Virtual Scenes with Participating Media

    Get PDF
    Sichtbarkeitsbestimmung ist einer der fundamentalen Bausteine fotorealistischer Bildsynthese. Da die Berechnung der Sichtbarkeit allerdings äußerst kostspielig zu berechnen ist, wird nahezu die gesamte Berechnungszeit darauf verwendet. In dieser Arbeit stellen wir neue Methoden zur Speicherung, Berechnung und Approximation von Sichtbarkeit in Szenen mit streuenden Medien vor, die die Berechnung erheblich beschleunigen, dabei trotzdem qualitativ hochwertige und artefaktfreie Ergebnisse liefern

    Real-time GPU-accelerated Out-of-Core Rendering and Light-field Display Visualization for Improved Massive Volume Understanding

    Get PDF
    Nowadays huge digital models are becoming increasingly available for a number of different applications ranging from CAD, industrial design to medicine and natural sciences. Particularly, in the field of medicine, data acquisition devices such as MRI or CT scanners routinely produce huge volumetric datasets. Currently, these datasets can easily reach dimensions of 1024^3 voxels and datasets larger than that are not uncommon. This thesis focuses on efficient methods for the interactive exploration of such large volumes using direct volume visualization techniques on commodity platforms. To reach this goal specialized multi-resolution structures and algorithms, which are able to directly render volumes of potentially unlimited size are introduced. The developed techniques are output sensitive and their rendering costs depend only on the complexity of the generated images and not on the complexity of the input datasets. The advanced characteristics of modern GPGPU architectures are exploited and combined with an out-of-core framework in order to provide a more flexible, scalable and efficient implementation of these algorithms and data structures on single GPUs and GPU clusters. To improve visual perception and understanding, the use of novel 3D display technology based on a light-field approach is introduced. This kind of device allows multiple naked-eye users to perceive virtual objects floating inside the display workspace, exploiting the stereo and horizontal parallax. A set of specialized and interactive illustrative techniques capable of providing different contextual information in different areas of the display, as well as an out-of-core CUDA based ray-casting engine with a number of improvements over current GPU volume ray-casters are both reported. The possibilities of the system are demonstrated by the multi-user interactive exploration of 64-GVoxel datasets on a 35-MPixel light-field display driven by a cluster of PCs. ------------------------------------------------------------------------------------------------------ Negli ultimi anni si sta verificando una proliferazione sempre più consistente di modelli digitali di notevoli dimensioni in campi applicativi che variano dal CAD e la progettazione industriale alla medicina e le scienze naturali. In modo particolare, nel settore della medicina, le apparecchiature di acquisizione dei dati come RM o TAC producono comunemente dei dataset volumetrici di grosse dimensioni. Questi dataset possono facilmente raggiungere taglie dell’ordine di 10243 voxels e dataset di dimensioni maggiori possono essere frequenti. Questa tesi si focalizza su metodi efficienti per l’esplorazione di tali grossi volumi utilizzando tecniche di visualizzazione diretta su piattaforme HW di diffusione di massa. Per raggiungere tale obiettivo si introducono strutture specializzate multi-risoluzione e algoritmi in grado di visualizzare volumi di dimensioni potenzialmente infinite. Le tecniche sviluppate sono “ouput sensitive” e la loro complessità di rendering dipende soltanto dalle dimensioni delle immagini generate e non dalle dimensioni dei dataset di input. Le caratteristiche avanzate delle architetture moderne GPGPU vengono inoltre sfruttate e combinate con un framework “out-of-core” in modo da offrire una implementazione di questi algoritmi e strutture dati più flessibile, scalabile ed efficiente su singole GPU o cluster di GPU. Per migliorare la percezione visiva e la comprensione dei dati, viene introdotto inoltre l’uso di tecnologie di display 3D di nuova generazione basate su un approccio di tipo light-field. Questi tipi di dispositivi consentono a diversi utenti di percepire ad occhio nudo oggetti che galleggiano all’interno dello spazio di lavoro del display, sfruttando lo stereo e la parallasse orizzontale. Si descrivono infine un insieme di tecniche illustrative interattive in grado di fornire diverse informazioni contestuali in diverse zone del display, così come un motore di “ray-casting out-of-core” basato su CUDA e contenente una serie di miglioramenti rispetto agli attuali metodi GPU di “ray-casting” di volumi. Le possibilità del sistema sono dimostrate attraverso l’esplorazione interattiva di dataset di 64-GVoxel su un display di tipo light-field da 35-MPixel pilotato da un cluster di PC

    Towards Fully Dynamic Surface Illumination in Real-Time Rendering using Acceleration Data Structures

    Get PDF
    The improvements in GPU hardware, including hardware-accelerated ray tracing, and the push for fully dynamic realistic-looking video games, has been driving more research in the use of ray tracing in real-time applications. The work described in this thesis covers multiple aspects such as optimisations, adapting existing offline methods to real-time constraints, and adding effects which were hard to simulate without the new hardware, all working towards a fully dynamic surface illumination rendering in real-time.Our first main area of research concerns photon-based techniques, commonly used to render caustics. As many photons can be required for a good coverage of the scene, an efficient approach for detecting which ones contribute to a pixel is essential. We improve that process by adapting and extending an existing acceleration data structure; if performance is paramount, we present an approximation which trades off some quality for a 2–3× improvement in rendering time. The tracing of all the photons, and especially when long paths are needed, had become the highest cost. As most paths do not change from frame to frame, we introduce a validation procedure allowing the reuse of as many as possible, even in the presence of dynamic lights and objects. Previous algorithms for associating pixels and photons do not robustly handle specular materials, so we designed an approach leveraging ray tracing hardware to allow for caustics to be visible in mirrors or behind transparent objects.Our second research focus switches from a light-based perspective to a camera-based one, to improve the picking of light sources when shading: photon-based techniques are wonderful for caustics, but not as efficient for direct lighting estimations. When a scene has thousands of lights, only a handful can be evaluated at any given pixel due to time constraints. Current selection methods in video games are fast but at the cost of introducing bias. By adapting an acceleration data structure from offline rendering that stochastically chooses a light source based on its importance, we provide unbiased direct lighting evaluation at about 30 fps. To support dynamic scenes, we organise it in a two-level system making it possible to only update the parts containing moving lights, and in a more efficient way.We worked on top of the new ray tracing hardware to handle lighting situations that previously proved too challenging, and presented optimisations relevant for future algorithms in that space. These contributions will help in reducing some artistic constraints while designing new virtual scenes for real-time applications

    Robot Navigation in Human Environments

    Get PDF
    For the near future, we envision service robots that will help us with everyday chores in home, office, and urban environments. These robots need to work in environments that were designed for humans and they have to collaborate with humans to fulfill their tasks. In this thesis, we propose new methods for communicating, transferring knowledge, and collaborating between humans and robots in four different navigation tasks. In the first application, we investigate how automated services for giving wayfinding directions can be improved to better address the needs of the human recipients. We propose a novel method based on inverse reinforcement learning that learns from a corpus of human-written route descriptions what amount and type of information a route description should contain. By imitating the human teachers' description style, our algorithm produces new route descriptions that sound similarly natural and convey similar information content, as we show in a user study. In the second application, we investigate how robots can leverage background information provided by humans for exploring an unknown environment more efficiently. We propose an algorithm for exploiting user-provided information such as sketches or floor plans by combining a global exploration strategy based on the solution of a traveling salesman problem with a local nearest-frontier-first exploration scheme. Our experiments show that the exploration tours are significantly shorter and that our system allows the user to effectively select the areas that the robot should explore. In the second part of this thesis, we focus on humanoid robots in home and office environments. The human-like body plan allows humanoid robots to navigate in environments and operate tools that were designed for humans, making humanoid robots suitable for a wide range of applications. As localization and mapping are prerequisites for all navigation tasks, we first introduce a novel feature descriptor for RGB-D sensor data and integrate this building block into an appearance-based simultaneous localization and mapping system that we adapt and optimize for the usage on humanoid robots. Our optimized system is able to track a real Nao humanoid robot more accurately and more robustly than existing approaches. As the third application, we investigate how humanoid robots can cover known environments efficiently with their camera, for example for inspection or search tasks. We extend an existing next-best-view approach by integrating inverse reachability maps, allowing us to efficiently sample and check collision-free full-body poses. Our approach enables the robot to inspect as much of the environment as possible. In our fourth application, we extend the coverage scenario to environments that also include articulated objects that the robot has to actively manipulate to uncover obstructed regions. We introduce algorithms for navigation subtasks that run highly parallelized on graphics processing units for embedded devices. Together with a novel heuristic for estimating utility maps, our system allows to find high-utility camera poses for efficiently covering environments with articulated objects. All techniques presented in this thesis were implemented in software and thoroughly evaluated in user studies, simulations, and experiments in both artificial and real-world environments. Our approaches advance the state of the art towards universally usable robots in everyday environments.Roboternavigation in menschlichen Umgebungen In naher Zukunft erwarten wir Serviceroboter, die uns im Haushalt, im Büro und in der Stadt alltägliche Arbeiten abnehmen. Diese Roboter müssen in für Menschen gebauten Umgebungen zurechtkommen und sie müssen mit Menschen zusammenarbeiten um ihre Aufgaben zu erledigen. In dieser Arbeit schlagen wir neue Methoden für die Kommunikation, Wissenstransfer und Zusammenarbeit zwischen Menschen und Robotern bei Navigationsaufgaben in vier Anwendungen vor. In der ersten Anwendung untersuchen wir, wie automatisierte Dienste zur Generierung von Wegbeschreibungen verbessert werden können, um die Beschreibungen besser an die Bedürfnisse der Empfänger anzupassen. Wir schlagen eine neue Methode vor, die inverses bestärkendes Lernen nutzt, um aus einem Korpus von von Menschen geschriebenen Wegbeschreibungen zu lernen, wie viel und welche Art von Information eine Wegbeschreibung enthalten sollte. Indem unser Algorithmus den Stil der Wegbeschreibungen der menschlichen Lehrer imitiert, kann der Algorithmus neue Wegbeschreibungen erzeugen, die sich ähnlich natürlich anhören und einen ähnlichen Informationsgehalt vermitteln, was wir in einer Benutzerstudie zeigen. In der zweiten Anwendung untersuchen wir, wie Roboter von Menschen bereitgestellte Hintergrundinformationen nutzen können, um eine bisher unbekannte Umgebung schneller zu erkunden. Wir schlagen einen Algorithmus vor, der Hintergrundinformationen wie Gebäudegrundrisse oder Skizzen nutzt, indem er eine globale Explorationsstrategie basierend auf der Lösung eines Problems des Handlungsreisenden kombiniert mit einer lokalen Explorationsstrategie. Unsere Experimente zeigen, dass die Erkundungstouren signifikant kürzer werden und dass der Benutzer mit unserem System effektiv die zu erkundenden Regionen spezifizieren kann. Der zweite Teil dieser Arbeit konzentriert sich auf humanoide Roboter in Umgebungen zu Hause und im Büro. Der menschenähnliche Körperbau ermöglicht es humanoiden Robotern, in Umgebungen zu navigieren und Werkzeuge zu benutzen, die für Menschen gebaut wurden, wodurch humanoide Roboter für vielfältige Aufgaben einsetzbar sind. Da Lokalisierung und Kartierung Grundvoraussetzungen für alle Navigationsaufgaben sind, führen wir zunächst einen neuen Merkmalsdeskriptor für RGB-D-Sensordaten ein und integrieren diesen Baustein in ein erscheinungsbasiertes simultanes Lokalisierungs- und Kartierungsverfahren, das wir an die Besonderheiten von humanoiden Robotern anpassen und optimieren. Unser System kann die Position eines realen humanoiden Roboters genauer und robuster verfolgen, als es mit existierenden Ansätzen möglich ist. Als dritte Anwendung untersuchen wir, wie humanoide Roboter bekannte Umgebungen effizient mit ihrer Kamera abdecken können, beispielsweise zu Inspektionszwecken oder zum Suchen eines Gegenstands. Wir erweitern ein bestehendes Verfahren, das die nächstbeste Beobachtungsposition berechnet, durch inverse Erreichbarkeitskarten, wodurch wir kollisionsfreie Ganzkörperposen effizient generieren und prüfen können. Unser Ansatz ermöglicht es dem Roboter, so viel wie möglich von der Umgebung zu untersuchen. In unserer vierten Anwendung erweitern wir dieses Szenario um Umgebungen, die auch bewegbare Gegenstände enthalten, die der Roboter aktiv bewegen muss um verdeckte Regionen zu sehen. Wir führen Algorithmen für Teilprobleme ein, die hoch parallelisiert auf Grafikkarten von eingebetteten Systemen ausgeführt werden. Zusammen mit einer neuen Heuristik zur Schätzung von Nutzenkarten ermöglicht dies unserem System Beobachtungspunkte mit hohem Nutzen zu finden, um Umgebungen mit bewegbaren Objekten effizient zu inspizieren. Alle vorgestellten Techniken wurden in Software implementiert und sorgfältig evaluiert in Benutzerstudien, Simulationen und Experimenten in künstlichen und realen Umgebungen. Unsere Verfahren bringen den Stand der Forschung voran in Richtung universell einsetzbarer Roboter in alltäglichen Umgebungen
    • …
    corecore