
    Parallelized Egocentric Fields for Autonomous Navigation

    In this paper, we propose a general framework for local path-planning and steering that can be easily extended to perform high-level behaviors. Our framework is based on the concept of affordances: the possible ways an agent can interact with its environment. Each agent perceives the environment through a set of vector and scalar fields represented in the agent's local space. This egocentric property allows us to efficiently compute a local space-time plan and offers better parallel scalability than a global-fields approach. We then use these perception fields to compute a fitness measure for every possible action, defined as an affordance field. The action with the optimal value in the affordance field becomes the agent's steering decision. We propose an extension to a linear space-time prediction model for dynamic collision avoidance and present our parallelization results on multicore systems. We analyze and evaluate our framework using the comprehensive suite of test cases provided in SteerBench, and demonstrate autonomous virtual pedestrians that perform steering and path planning in unknown environments, along with the emergence of high-level responses to never-before-seen situations.
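    A minimal sketch of the decision loop the abstract describes, assuming a discretized action set of (direction, speed) pairs; all function names, field names, and weights below are illustrative placeholders, not the paper's API:

    import math

    def affordance_score(agent, action, perception_fields):
        """Combine local perception fields into a fitness value for one action.
        Field names and weights are illustrative, not taken from the paper."""
        goal = perception_fields["goal"](agent, action)          # progress toward goal
        obstacle = perception_fields["obstacle"](agent, action)  # static collision risk
        dynamic = perception_fields["dynamic"](agent, action)    # predicted agent collisions
        return 1.0 * goal - 2.0 * obstacle - 1.5 * dynamic

    def steer(agent, perception_fields, num_directions=16, speeds=(0.5, 1.0, 1.5)):
        """Evaluate the affordance field over a discretized action set
        (direction x speed) and return the best-scoring action."""
        best_action, best_score = None, -math.inf
        for i in range(num_directions):
            theta = 2.0 * math.pi * i / num_directions
            for speed in speeds:
                action = (theta, speed)
                score = affordance_score(agent, action, perception_fields)
                if score > best_score:
                    best_action, best_score = action, score
        return best_action

    # Toy usage: fields are callables scoring (agent, action) pairs.
    fields = {
        "goal": lambda a, act: act[1] * math.cos(act[0]),   # reward moving toward +x
        "obstacle": lambda a, act: 0.0,
        "dynamic": lambda a, act: 0.0,
    }
    print(steer(agent=None, perception_fields=fields))      # -> (0.0, 1.5)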

    EuroEXA - D2.6: Final ported application software

    This document describes the porting of the EuroEXA application software to the single-CRDB testbed and discusses the experience gained from the porting and optimization activities, which should be taken into account in future redesign and optimization. It accompanies the ported application software, found in the EuroEXA private repository (https://github.com/euroexa). In particular, the document describes the status of the software for each EuroEXA application, sketches the redesign and optimization strategy for each application, and discusses the issues and difficulties faced during the porting activities and the lessons learned. A few preliminary evaluation results are presented; the full evaluation will be discussed in deliverable 2.8.

    Scalable Real-Time Rendering for Extremely Complex 3D Environments Using Multiple GPUs

    In 3D visualization, real-time rendering of high-quality meshes in complex 3D environments remains one of the major challenges in computer graphics. New data-acquisition techniques such as 3D modeling and scanning have drastically increased the complexity of available models and the demand for higher display resolutions in recent years. Most existing acceleration techniques that use a single GPU suffer from the limited GPU memory budget, time-consuming sequential execution, and finite display resolution. Recently, commodity workstations with multiple GPUs and multiple displays have become common. As a result, more GPU memory is available across a distributed cluster of GPUs, more computational power is provided through the combination of multiple GPUs, and a higher display resolution can be achieved by connecting each GPU to a display monitor (resulting in a large tiled-display configuration). However, a multi-GPU workstation does not always deliver the desired rendering performance, due to imbalanced rendering workloads among the GPUs and the overhead of inter-GPU communication. In this dissertation, I contribute a multi-GPU, multi-display parallel rendering approach for complex 3D environments that supports high-performance, high-quality rendering of static and dynamic 3D scenes. A novel parallel load-balancing algorithm based on a screen-partitioning strategy dynamically balances the number of vertices and triangles rendered by each GPU. The overhead of inter-GPU communication is minimized by a novel frame-exchanging algorithm that transfers only a small number of image pixels rather than chunks of 3D primitives. State-of-the-art parallel mesh simplification and GPU out-of-core techniques are integrated into the multi-GPU, multi-display system to accelerate the rendering process.
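    The abstract does not spell out the load-balancing algorithm itself; the following is a hedged sketch of one common screen-partitioning rebalancing scheme (assuming per-strip triangle counts from the previous frame and uniform load within each strip), not the dissertation's actual method:

    def rebalance_partitions(cuts, loads, screen_width):
        """Recompute vertical screen cuts so each GPU gets ~equal triangle load.
        cuts: current right edges of each GPU's strip (last equals screen_width);
        loads: triangles each GPU rendered last frame. Assumes load is
        uniformly distributed within each current strip."""
        n = len(loads)
        target = sum(loads) / n
        edges = [0.0] + list(cuts)              # strip boundaries, length n + 1
        new_cuts = []
        strip, acc = 0, 0.0                     # walk strips, accumulating load
        for k in range(1, n):
            want = k * target                   # cumulative load at the k-th new cut
            while strip < n - 1 and acc + loads[strip] < want:
                acc += loads[strip]             # skip whole strips
                strip += 1
            frac = (want - acc) / max(loads[strip], 1e-9)
            x = edges[strip] + frac * (edges[strip + 1] - edges[strip])
            new_cuts.append(x)
        new_cuts.append(float(screen_width))
        return new_cuts

    # Two GPUs, 100-px-wide screen: GPU 0 rendered 3x the triangles of GPU 1,
    # so its strip shrinks from 50 px to ~33 px.
    print(rebalance_partitions(cuts=[50, 100], loads=[300, 100], screen_width=100))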

    A Physics-Based Approach to Modeling Wildland Fire Spread Through Porous Fuel Beds

    Wildfires are becoming increasingly erratic, at least in part because of climate change. CFD (computational fluid dynamics)-based models capable of simulating extreme fire behaviors are gaining attention as a means to predict such behavior and aid firefighting efforts. This dissertation describes a wildfire model based on the current understanding of wildfire physics. The model includes the physics of turbulence, inhomogeneous porous fuel beds, heat release, ignition, and firebrands. A discrete dynamical system for flow in porous media is derived and incorporated into the subgrid-scale model for synthetic-velocity large-eddy simulation (LES), and a general porosity-permeability model is derived and implemented to investigate the transport properties of flow through porous fuel beds. Both of these models can also be applied to other flows through porous media. Simulations of both grassland and forest fire spread are performed with an implicit LES code parallelized with OpenMP; the parallel performance of the algorithms is presented and discussed. The model and numerical scheme produce reasonably accurate results compared with previous wildfire experiments and simulations, while using coarser grids and capturing complicated subgrid-scale behaviors. It is concluded that this physics-based wildfire model can be a good learning tool for examining some of the more complex wildfire behaviors, and may become predictive in the near future.
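    The dissertation derives its own general porosity-permeability model, which the abstract does not reproduce; as a stand-in, this sketch uses the classical Kozeny-Carman relation for a packed bed together with the Darcy drag term that such models typically feed into the momentum equations (all parameter values are illustrative):

    def kozeny_carman_permeability(porosity, particle_diameter, constant=180.0):
        """Classical Kozeny-Carman permeability (m^2) for a packed bed of
        roughly spherical elements. A stand-in for the dissertation's more
        general porosity-permeability model, which is not reproduced here."""
        phi = porosity
        return (phi ** 3) * particle_diameter ** 2 / (constant * (1.0 - phi) ** 2)

    def darcy_drag(velocity, viscosity, permeability):
        """Darcy drag force per unit volume opposing flow through the fuel bed."""
        return -viscosity / permeability * velocity

    # Example: a grass-like fuel bed with 70% porosity and 2 mm element scale,
    # air viscosity ~1.8e-5 Pa*s, 1 m/s through-flow.
    k = kozeny_carman_permeability(porosity=0.7, particle_diameter=2e-3)
    print(k, darcy_drag(velocity=1.0, viscosity=1.8e-5, permeability=k))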

    Algorithms for massively parallel generic hp-adaptive finite element methods

    Efficient algorithms for the numerical solution of partial differential equations are required to solve problems on an economically viable timescale. In general, this is achieved by adapting the resolution of the discretization to the problem at hand, as well as by exploiting hardware capabilities. For the latter, parallelization plays a major role on modern multi-core and multi-node architectures, especially in the context of high-performance computing. In finite element methods, solutions are approximated by discretizing the function space of the problem with piecewise polynomials. With hp-adaptive methods, the polynomial degrees of these basis functions may vary on locally refined meshes. We present algorithms and data structures required for generic hp-adaptive finite element software, applicable to both continuous and discontinuous Galerkin methods on distributed-memory systems. Both the function space and the mesh may be adapted dynamically during the solution process. We cover details concerning the unique enumeration of degrees of freedom for continuous Galerkin methods, the communication of variable-size data, and load balancing. Furthermore, we present strategies to determine the type of adaptation based on error estimation and prediction, as well as on smoothness estimation via the decay rate of the coefficients of Fourier and Legendre series expansions. Both refinement and coarsening are considered. A reference implementation in the open-source library deal.II is provided and applied to the Laplace problem on a domain with a reentrant corner that introduces a singularity. With this example, we demonstrate the benefits of hp-adaptive methods in terms of error convergence and show that our algorithms scale up to 49,152 MPI processes.
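    A minimal one-dimensional sketch of the Legendre-coefficient smoothness indicator mentioned in the abstract: fit an exponential decay |a_k| ≈ C·exp(-σk) to the coefficient magnitudes and prefer p-refinement when the solution is smooth. The threshold and decision rule are illustrative placeholders, not deal.II's actual implementation:

    import numpy as np

    def legendre_decay_rate(sample_points, sample_values, degree=8):
        """Estimate smoothness of a 1D solution on [-1, 1] from the decay of
        its Legendre coefficients: fit |a_k| ~ C * exp(-sigma * k) by linear
        regression on log|a_k|. Larger sigma means a smoother function."""
        coeffs = np.polynomial.legendre.legfit(sample_points, sample_values, degree)
        k = np.arange(1, degree + 1)
        mags = np.abs(coeffs[1:])
        mask = mags > 1e-14            # ignore coefficients at machine precision
        slope, _ = np.polyfit(k[mask], np.log(mags[mask]), 1)
        return -slope                  # decay rate sigma

    def choose_refinement(sigma, threshold=1.0):
        """Toy decision rule (threshold is illustrative): smooth solutions
        benefit from raising the polynomial degree (p-refinement),
        non-smooth ones from splitting the cell (h-refinement)."""
        return "p-refine" if sigma > threshold else "h-refine"

    # Example: a smooth function has fast-decaying coefficients -> p-refine.
    x = np.linspace(-1, 1, 64)
    print(choose_refinement(legendre_decay_rate(x, np.exp(x))))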

    An Application Perspective on High-Performance Computing and Communications

    We review possible and probable industrial applications of HPCC, focusing on software and hardware issues. Thirty-three separate categories are illustrated by detailed descriptions of five areas: computational chemistry; Monte Carlo methods from physics to economics; manufacturing and computational fluid dynamics; command and control, or crisis management; and multimedia services to client computers and set-top boxes. The hardware varies from tightly coupled parallel supercomputers to heterogeneous distributed systems. The software models span HPF and data parallelism to distributed information systems and object/data-flow parallelism on the Web. We find that in each case it is reasonably clear that HPCC works in principle, and postulate that this knowledge can be used in a new generation of software infrastructure based on the WebWindows approach, as discussed in an accompanying paper.
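    As a toy illustration of the data-parallel Monte Carlo pattern the review covers (the paper itself discusses HPF rather than Python, so everything below is an assumption for illustration), here is an embarrassingly parallel estimate of π:

    import random
    from multiprocessing import Pool

    def count_hits(args):
        """One worker's share of a Monte Carlo estimate of pi: count random
        points in the unit square that fall inside the quarter circle."""
        samples, seed = args
        rng = random.Random(seed)
        return sum(1 for _ in range(samples)
                   if rng.random() ** 2 + rng.random() ** 2 <= 1.0)

    def parallel_pi(total_samples=1_000_000, workers=4):
        """Data-parallel decomposition: each worker gets an equal chunk of
        samples and an independent seed; results are reduced by summing."""
        chunk = total_samples // workers
        with Pool(workers) as pool:
            hits = sum(pool.map(count_hits, [(chunk, s) for s in range(workers)]))
        return 4.0 * hits / (chunk * workers)

    if __name__ == "__main__":
        print(parallel_pi())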