    Fast DEM collision checks on multicore nodes.

    Many particle simulations today rely on spherical or analytical particle shape descriptions. Non-spherical, triangulated particle models are often considered computationally infeasible because collision detection is expensive. We propose a hybrid collision detection algorithm based upon an iterative solve of a minimisation problem that automatically falls back to a brute-force, comparison-based algorithm variant if the problem is ill-posed. Such a hybrid can exploit the vector facilities of modern chips and is well prepared for the arising manycore era. Our approach pushes the boundary where non-analytical particle shapes and the accompanying, more accurate first-principles physics become manageable.
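The hybrid idea described above (an iterative minimisation with a fallback to a comparison-based check) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' algorithm: it works on line segments instead of triangles to stay short, it swaps in a fixed-length projected gradient descent for the iterative solve, and it uses a crude "last update was small" test to decide whether to fall back. All function names are hypothetical.

```python
import numpy as np


def _point_segment_distance(p, q0, q1):
    """Distance from point p to the segment q0-q1 (comparison-based)."""
    d = q1 - q0
    dd = float(d @ d)
    t = 0.0 if dd == 0.0 else min(1.0, max(0.0, float((p - q0) @ d) / dd))
    return float(np.linalg.norm(p - (q0 + t * d)))


def brute_force_distance(a0, a1, b0, b1):
    """Robust variant: enumerate all geometric cases (many branches)."""
    u, v, w0 = a1 - a0, b1 - b0, a0 - b0
    uu, vv, uv = float(u @ u), float(v @ v), float(u @ v)
    uw, vw = float(u @ w0), float(v @ w0)
    candidates = [
        _point_segment_distance(a0, b0, b1),
        _point_segment_distance(a1, b0, b1),
        _point_segment_distance(b0, a0, a1),
        _point_segment_distance(b1, a0, a1),
    ]
    det = uu * vv - uv * uv
    if det > 1e-12 * uu * vv:  # segments are not (nearly) parallel
        s = (uv * vw - uw * vv) / det
        t = (uu * vw - uv * uw) / det
        if 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0:
            candidates.append(float(np.linalg.norm(w0 + s * u - t * v)))
    return min(candidates)


def iterative_distance(a0, a1, b0, b1, iterations=40, tol=1e-9):
    """Minimise |a(s) - b(t)|^2 over 0 <= s, t <= 1 with a fixed number of
    projected gradient steps, so the loop body is free of data-dependent
    branches. Returns (distance, converged)."""
    u, v, w0 = a1 - a0, b1 - b0, a0 - b0
    lipschitz = 2.0 * float(u @ u + v @ v)  # trace bound on the Hessian
    if lipschitz == 0.0:  # both segments degenerate to points
        return float(np.linalg.norm(w0)), True
    step = 1.0 / lipschitz
    s = t = 0.5
    delta = float("inf")
    for _ in range(iterations):
        w = w0 + s * u - t * v
        s_new = min(1.0, max(0.0, s - step * 2.0 * float(u @ w)))
        t_new = min(1.0, max(0.0, t + step * 2.0 * float(v @ w)))
        delta = max(abs(s_new - s), abs(t_new - t))
        s, t = s_new, t_new
    return float(np.linalg.norm(w0 + s * u - t * v)), delta < tol


def hybrid_distance(a0, a1, b0, b1):
    """Try the iterative solve first; fall back to brute force if it did
    not settle. Returns (distance, used_fallback)."""
    distance, converged = iterative_distance(a0, a1, b0, b1)
    if converged:
        return distance, False
    return brute_force_distance(a0, a1, b0, b1), True
```

In this sketch the fallback triggers for badly conditioned pairs, for example a very long segment next to a very short one, where the fixed iteration budget is not enough. The convergence test is deliberately simple and would need hardening in practice.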

    Parallel Multiscale Contact Dynamics for Rigid Non-spherical Bodies

    The simulation of large numbers of rigid bodies of non-analytical shapes or vastly varying sizes which collide with each other is computationally challenging. The fundamental problem is the identification of all contact points between all particles at every time step. In the Discrete Element Method (DEM), this is particularly difficult for particles of arbitrary geometry that exhibit sharp features (e.g. rock granulates). While most codes avoid non-spherical or non-analytical shapes due to the computational complexity, we introduce an iterative-based contact detection method for triangulated geometries. The new method is an improvement over a naive brute force approach which checks all possible geometric constellations of contact and thus exhibits a lot of execution branching. Our iterative approach has limited branching and high floating point operations per processed byte. It thus is suitable for modern Single Instruction Multiple Data (SIMD) CPU hardware. As only the naive brute force approach is robust and always yields a correct solution, we propose a hybrid solution that combines the best of the two worlds to produce fast and robust contacts. In terms of the DEM workflow, we furthermore propose a multilevel tree-based data structure strategy that holds all particles in the domain on multiple scales in grids. Grids reduce the total computational complexity of the simulation. The data structure is combined with the DEM phases to form a single touch tree-based traversal that identifies both contact points between particle pairs and introduces concurrency to the system during particle comparisons in one multiscale grid sweep. Finally, a reluctant adaptivity variant is introduced which enables us to realise an improved time stepping scheme with larger time steps than standard adaptivity while we still minimise the grid administration overhead. 
Four different parallelisation strategies that exploit multicore architectures are discussed for the triad of methodological ingredients. Each parallelisation scheme exhibits unique behaviour depending on the grid and particle geometry at hand. Fusing them into a task-based parallelisation workflow yields promising speedups. Our work shows that new computer architectures can push the boundary of DEM computability, but only if the right data structures and algorithms are chosen.

    GPU Usage for Parallel Functions and Contacts in Modelica

    This thesis investigates two ways of incorporating GPUs in Modelica. The first is to automatically generate GPU code for Modelica functions; the second is to use GPU-accelerated external code for a contact handling package. Automatic parallelization of functions is desirable, as it can potentially accelerate large simulations significantly. Special patterns of nested for-loops in Modelica code are recognized and translated into CUDA kernel functions. Inline integration allows a broader spectrum of models to take advantage of the parallelization by reducing CPU-GPU transfers. The prototype has been tested and achieved a speed-up factor of up to five compared to the CPU. The contact handling package is capable of handling both complex contact behavior between arbitrarily shaped bodies and large DEM-like simulations, something which Modelica is currently lacking. Attempts to accelerate the package with GPUs were made, with partial success for the broad phase. The package uses Morton encoding for the broad phase, and the narrow phase is based on CSG intersection with BSP trees. Contact response is calculated using a volume-dependent method, taking friction, damping and multiple contact points into account. The capability of the package was demonstrated by simulating both complex contact behavior, such as the inversion of the Tippe Top toy, and tens of thousands of colliding spheres.
One of the key components in our modern society is the ability to simulate. With simulations, industry can design new cars, phones, aircraft etc. without having to go through prototype after prototype, giving us the cheap, high-tech products most of us rely on in our day-to-day life. However, good as simulations are today, there are still severe limitations on what can be simulated. Two of the largest limiting factors are how hard simulations are to design and how long they take to run. The first of these problems is tackled by the Modelica programming language, which is designed for easy set-up of simulations. We have attacked both these problems by using GPUs to speed up Modelica simulations more than 5 times, and by extending the capability of Modelica to handle collisions between objects.
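The abstract states only that Morton encoding is used for the broad phase. As a hedged illustration of the general technique, the sketch below interleaves the bits of integer grid coordinates into a Morton code and groups particle centres by the code of their grid cell. The function names, the uniform `cell_size`, and the restriction to non-negative coordinates are assumptions made for this example and are not taken from the thesis.

```python
from collections import defaultdict


def morton3d(ix, iy, iz, bits=10):
    """Interleave the lowest `bits` bits of three non-negative integers:
    bit b of ix goes to position 3b, iy to 3b + 1, iz to 3b + 2."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def broad_phase_buckets(centres, cell_size):
    """Map each particle index to the Morton code of its grid cell and
    return the buckets ordered by code. Particles sharing a bucket are
    candidates for the narrow phase."""
    buckets = defaultdict(list)
    for index, (x, y, z) in enumerate(centres):
        key = morton3d(int(x // cell_size), int(y // cell_size), int(z // cell_size))
        buckets[key].append(index)
    return dict(sorted(buckets.items()))
```

Ordering by Morton code tends to keep spatially close cells close in the sorted sequence, which is what makes the encoding attractive for parallel sorting on GPUs. A complete broad phase would also have to compare particles in neighbouring cells, which this sketch omits.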

    A multiresolution Discrete Element Method for triangulated objects with implicit time stepping

    Simulations of many rigid bodies colliding with each other sometimes yield particularly interesting results if the colliding objects differ significantly in size and are nonspherical. The most expensive part within such a simulation code is the collision detection. We propose a family of novel multiscale collision detection algorithms that can be applied to triangulated objects within explicit and implicit time stepping methods. They are well suited to handle objects that cannot be represented by analytical shapes or assemblies of analytical objects. Inspired by multigrid methods and adaptive mesh refinement, we determine collision points iteratively over a resolution hierarchy and combine a functional minimization plus penalty parameters with the actual comparison-based geometric distance calculation. Coarse surrogate geometry representations identify “no collision” scenarios early on and otherwise yield an educated guess which triangle subsets of the next finer level might yield collisions. They prune the search tree and furthermore feed conservative contact force estimates into the iterative solve behind an implicit time stepping. Implicit time stepping and nonanalytical shapes often yield prohibitively high compute cost for rigid body simulations. Our approach reduces the object-object comparison cost algorithmically by one to two orders of magnitude. It also exhibits high vectorization efficiency due to its iterative nature.
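The pruning role of coarse surrogates can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: it replaces the coarse triangulated surrogates with a binary hierarchy of bounding spheres, and it only returns candidate triangle pairs without running any minimisation. The class `SphereNode`, the function `candidate_pairs`, and the tolerance `eps` are hypothetical names chosen for this example.

```python
import numpy as np


class SphereNode:
    """Node of a bounding-sphere hierarchy over a set of triangles.

    `triangles` has shape (n, 3, 3); `indices` selects this node's subset."""

    def __init__(self, triangles, indices, leaf_size=2):
        points = triangles[indices].reshape(-1, 3)
        self.centre = points.mean(axis=0)
        self.radius = float(np.linalg.norm(points - self.centre, axis=1).max())
        self.indices = indices
        self.children = []
        if len(indices) > leaf_size:
            # Split along the axis with the widest spread of centroids.
            centroids = triangles[indices].mean(axis=1)
            axis = int(np.argmax(centroids.max(axis=0) - centroids.min(axis=0)))
            order = indices[np.argsort(centroids[:, axis])]
            half = len(order) // 2
            self.children = [
                SphereNode(triangles, order[:half], leaf_size),
                SphereNode(triangles, order[half:], leaf_size),
            ]


def candidate_pairs(a, b, eps):
    """Return triangle index pairs that may be closer than eps.

    A coarse sphere test rules out "no collision" early; otherwise the
    larger node is refined and the test repeats on the finer level."""
    gap = float(np.linalg.norm(a.centre - b.centre)) - a.radius - b.radius
    if gap > eps:
        return []
    if not a.children and not b.children:
        return [(int(i), int(j)) for i in a.indices for j in b.indices]
    if a.children and (not b.children or a.radius >= b.radius):
        return [p for child in a.children for p in candidate_pairs(child, b, eps)]
    return [p for child in b.children for p in candidate_pairs(a, child, eps)]
```

Because every triangle lies inside its node's sphere, the test is conservative: a pair that is truly closer than `eps` is never discarded, while distant subsets are dropped without touching their triangles.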

    The Peano software---parallel, automaton-based, dynamically adaptive grid traversals

    We discuss the design decisions, design alternatives, and rationale behind the third generation of Peano, a framework for dynamically adaptive Cartesian meshes derived from spacetrees. Peano ties the mesh traversal to the mesh storage and supports only one element-wise traversal order resulting from space-filling curves. The user is not free to choose a traversal order herself. The traversal can exploit regular grid subregions and shared memory as well as distributed memory systems with almost no modifications to a serial application code. We formalize the software design by means of two interacting automata: one automaton for the multiscale grid traversal and one for the application-specific algorithmic steps. This yields a callback-based programming paradigm. We further sketch the supported application types and the two data storage schemes realized before we detail high-performance computing aspects and lessons learned. Special emphasis is put on observations regarding the used programming idioms and algorithmic concepts. This transforms our report from a “one way to implement things” code description into a generic discussion and summary of some alternatives, rationale, and design decisions to be made for any tree-based adaptive mesh refinement software.
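The callback-based paradigm can be pictured with a toy example in which a traversal routine owns the order of the cells and an application object only reacts to events. The sketch below is not Peano's API: the names `Cell`, `refine`, `traverse`, `enter_cell` and `leave_cell` are hypothetical, the three-way split per axis is an assumption, and the children are visited in plain lexicographic order rather than along a space-filling curve.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Square cell of a 2D spacetree (hypothetical data layout)."""
    x: float
    y: float
    size: float
    level: int
    children: list = field(default_factory=list)


def refine(cell, should_refine, max_level):
    """Recursively split a cell into 3 x 3 children where requested."""
    if cell.level < max_level and should_refine(cell):
        h = cell.size / 3.0
        for j in range(3):
            for i in range(3):
                child = Cell(cell.x + i * h, cell.y + j * h, h, cell.level + 1)
                refine(child, should_refine, max_level)
                cell.children.append(child)


def traverse(cell, app):
    """Traversal side: fixes the visiting order and fires the events.
    The application never chooses the order itself."""
    app.enter_cell(cell)
    for child in cell.children:
        traverse(child, app)
    app.leave_cell(cell)


class LeafCounter:
    """Application side: reacts to events and keeps its own state."""

    def __init__(self):
        self.leaves = 0

    def enter_cell(self, cell):
        if not cell.children:
            self.leaves += 1

    def leave_cell(self, cell):
        pass
```

A usage example: building `root = Cell(0.0, 0.0, 1.0, 0)`, refining it with `refine(root, lambda c: c.x == 0.0 and c.y == 0.0, 2)` and calling `traverse(root, LeafCounter())` visits every cell once, with the application code containing no loop over the mesh.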

    Efficient numerical methods for the simulation of particulate and liquid-solid flows

    In this work a set of efficient numerical methods for the simulation of particulate flows and virtual prototyping applications is proposed. These methods are implemented as modular components in the FEATFLOW software package, which is used as the fluid flow solver. In direct particulate flow simulations the calculation of the hydrodynamic forces acting on the particles is of central importance. For this task, acceleration techniques based on hierarchical spatial partitioning are proposed. For arbitrarily shaped particles, distance maps are employed and analyzed as a means to rapidly process the needed geometric information. For collisions between particles, it is shown how these same structures can be used to efficiently handle the collision broad phase and narrow phase. The computation of collision forces in the proposed particulate flow solving scheme can be handled by several collision models. The models used are based on a constraint-based formulation which leads to a linear complementarity problem (LCP). Another approach, based on the discrete element method (DEM), is added to the particulate flow solver. This approach is very well suited to an implementation on graphics processing units (GPUs), as the particles can be handled independently and thus excellent use can be made of the massively parallel computing capabilities of the GPU. In order to extend the DEM to handle non-spherical particles or rigid bodies, an inner sphere representation of such shapes is used. Furthermore, a mesh adaptation technique based on Laplacian smoothing with special weights is shown, which increases the numerical efficiency of the CFD simulations. The proposed techniques are validated in various benchmark configurations or comparisons to experimental data.

    Optimisation of LHCb Applications for Multi- and Manycore Job Submission

    Nowadays, the Worldwide LHC Computing Grid mainly consists of multi- and manycore processors. The thesis investigates how such resources can be used more efficiently, using the LHCb experiment as an example. It analyses how to improve software in terms of memory requirements and concurrency. The research involves the implementation of a moldable job scheduler and a supervised learning algorithm which helps to better predict LHCb workloads.

    Robot Navigation in Human Environments

    For the near future, we envision service robots that will help us with everyday chores in home, office, and urban environments. These robots need to work in environments that were designed for humans and they have to collaborate with humans to fulfill their tasks. In this thesis, we propose new methods for communicating, transferring knowledge, and collaborating between humans and robots in four different navigation tasks. In the first application, we investigate how automated services for giving wayfinding directions can be improved to better address the needs of the human recipients. We propose a novel method based on inverse reinforcement learning that learns from a corpus of human-written route descriptions what amount and type of information a route description should contain. By imitating the human teachers' description style, our algorithm produces new route descriptions that sound similarly natural and convey similar information content, as we show in a user study. In the second application, we investigate how robots can leverage background information provided by humans for exploring an unknown environment more efficiently. We propose an algorithm for exploiting user-provided information such as sketches or floor plans by combining a global exploration strategy based on the solution of a traveling salesman problem with a local nearest-frontier-first exploration scheme. Our experiments show that the exploration tours are significantly shorter and that our system allows the user to effectively select the areas that the robot should explore. In the second part of this thesis, we focus on humanoid robots in home and office environments. The human-like body plan allows humanoid robots to navigate in environments and operate tools that were designed for humans, making humanoid robots suitable for a wide range of applications. 
As localization and mapping are prerequisites for all navigation tasks, we first introduce a novel feature descriptor for RGB-D sensor data and integrate this building block into an appearance-based simultaneous localization and mapping system that we adapt and optimize for use on humanoid robots. Our optimized system is able to track a real Nao humanoid robot more accurately and more robustly than existing approaches. As the third application, we investigate how humanoid robots can cover known environments efficiently with their camera, for example for inspection or search tasks. We extend an existing next-best-view approach by integrating inverse reachability maps, allowing us to efficiently sample and check collision-free full-body poses. Our approach enables the robot to inspect as much of the environment as possible. In our fourth application, we extend the coverage scenario to environments that also include articulated objects that the robot has to actively manipulate to uncover obstructed regions. We introduce algorithms for navigation subtasks that run highly parallelized on graphics processing units for embedded devices. Together with a novel heuristic for estimating utility maps, our system can find high-utility camera poses for efficiently covering environments with articulated objects. All techniques presented in this thesis were implemented in software and thoroughly evaluated in user studies, simulations, and experiments in both artificial and real-world environments. Our approaches advance the state of the art towards universally usable robots in everyday environments.

    A Contribution to Resource-Aware Architectures for Humanoid Robots

    The goal of this work is to provide building blocks for resource-aware robot architectures. The topics of these blocks are the data-driven generation of context-sensitive resource models, the prediction of future resource utilization, and resource-aware computer vision and motion planning algorithms. The implementation of these algorithms is based on resource-aware concepts and methodologies originating from the Transregional Collaborative Research Center "Invasive Computing" (SFB/TR 89).
