Out-of-Core GPU Path Tracing on Large Instanced Scenes via Geometry Streaming
We present a technique for out-of-core GPU path tracing of arbitrarily large scenes that is compatible with hardware-accelerated ray tracing. Our technique improves upon previous work by subdividing the scene spatially into streamable chunks that are loaded using a priority system that maximizes ray throughput and minimizes GPU memory usage. This allows scene complexity to scale arbitrarily. Our system required under 19 minutes to render a solid-color version of Disney's Moana Island scene (39.3 million instances, 261.1 million unique quads, and 82.4 billion instanced quads) at a resolution of 1024x429 and 1024 spp on an RTX 5000 (24 GB memory total, 22 GB used: a 13 GB geometry cache, with the remainder for temporary buffers and storage) (Wald et al.). As a scalability test, our system rendered 26 Moana Island scenes without multi-level instancing (1.02 billion instances, 2.14 trillion instanced quads, ~230 GB if all resident) in under 1 h 28 m. Compared to state-of-the-art hardware-accelerated renders of the Moana Island scene, our system can render larger scenes on a single GPU. It is faster than the previous out-of-core approach and renders larger scenes than previous in-core approaches under the same memory constraints (Hellmuth; Zellmann et al.; Wald).
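The priority-driven streaming described above can be illustrated with a small CPU-side sketch (the class, its names, and the eviction heuristic are illustrative assumptions, not the paper's implementation): chunks wanted by the most pending rays are loaded first, and resident chunks with the fewest pending rays are evicted when the memory budget is exceeded.

```python
class ChunkStreamer:
    """Toy scheduler: load scene chunks in priority order (pending rays),
    evicting the least-useful chunks once the memory budget is exceeded."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.resident = {}                      # chunk_id -> size_bytes

    def schedule(self, pending_rays, chunk_sizes):
        """pending_rays: chunk_id -> number of rays waiting on that chunk.
        Returns (loads, evictions) as lists of chunk ids."""
        # Highest ray count first: maximizes rays retired per byte streamed.
        order = sorted(pending_rays, key=pending_rays.get, reverse=True)
        loads, evictions = [], []
        used = sum(self.resident.values())
        for cid in order:
            if cid in self.resident:
                continue
            size = chunk_sizes[cid]
            # Evict resident chunks with the fewest pending rays first.
            victims = sorted(self.resident, key=lambda c: pending_rays.get(c, 0))
            while used + size > self.budget and victims:
                v = victims.pop(0)
                if pending_rays.get(v, 0) >= pending_rays[cid]:
                    break                       # eviction would not help throughput
                used -= self.resident.pop(v)
                evictions.append(v)
            if used + size <= self.budget:
                self.resident[cid] = size
                used += size
                loads.append(cid)
        return loads, evictions
```

In a real renderer this decision would run per wavefront, with rays parked in queues until their chunk becomes resident.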
Balancing Video Games: A Player-Driven Instrument
Video game balancing is a controversial and hotly debated topic, especially among players of online games. Whether a game is sufficiently balanced greatly influences its reception, player satisfaction, churn rates, and success. In particular, differing ideologies of balance can lead to worse player experiences than actual imbalances. This work builds on a fine-grained investigation of the attitudes of the Guild Wars 2 community regarding balancing factors, and introduces a player-driven quantitative tool for approximating configurations of balance that could optimize player experience and satisfaction. In an initial evaluation, theoretical constellation outputs of this tool improved players' perception of the balance between in-game build options, where aggregated opinions of players (n = 64) even showed benefits over individual opinions, indicating a potential "wisdom of the crowd" effect.
Efficient Simulation of Spectral Light Transport in Dense Participating Media and Granular Materials
Engineering Complex Computational Ecosystems
Self-organising pervasive ecosystems of devices are set to become a major vehicle for delivering infrastructure and end-user services. The inherent complexity of such systems poses new challenges to those who want to master it through engineering principles.
The recent growth in the number and distribution of devices with decent computational and communication capabilities, which accelerated suddenly with the massive diffusion of smartphones and tablets, is producing a world with a much higher density of devices in space. At the same time, communication technologies seem to be focussing on short-range, device-to-device (P2P) interactions, with technologies such as Bluetooth and Near-Field Communication gaining greater adoption.
Locality and situatedness become key to providing the best possible experience to users, and the classic model of a centralised, enormously powerful server gathering and processing data becomes less and less efficient as device density grows. Accomplishing complex global tasks without a centralised controller responsible for aggregating data, however, is challenging. In particular, there is a local-to-global issue that makes the application of engineering principles difficult: designing device-local programs that, through interaction, guarantee a certain global service level.
In this thesis, we first analyse the state of the art in coordination systems, then motivate the work by describing the main issues of pre-existing tools and practices and identifying the improvements that would benefit the design of such complex software ecosystems.
The contribution can be divided into three main branches. First, we introduce a novel simulation toolchain for pervasive ecosystems, designed to allow good expressiveness while retaining high performance. Second, we leverage existing coordination models and patterns in order to create new spatial structures. Third, we introduce a novel language, based on the existing ``Field Calculus'' and integrated with the aforementioned toolchain, designed to be usable for practical aggregate programming.
Master/worker parallel discrete event simulation
The execution of parallel discrete event simulation across metacomputing infrastructures is examined. A master/worker architecture for parallel discrete event simulation is proposed, providing robust execution under a dynamic set of services with system-level support for fault tolerance, semi-automated client-directed load balancing, portability across heterogeneous machines, and the ability to run codes on idle or time-sharing clients without significant interaction by users. Research questions and challenges associated with the work distribution paradigm, targeted computational domain, performance metrics, and the intended class of applications are analyzed and discussed. A portable web-services approach to master/worker parallel discrete event simulation is proposed and evaluated, with subsequent optimizations to increase the efficiency of large-scale simulation execution through distributed master service design and intrinsic overhead reduction. New techniques for addressing the challenges of optimistic parallel discrete event simulation across metacomputing infrastructures, such as rollbacks and message unsending, are proposed and examined using an inherently different computation paradigm built on master services and time windows. Results indicate that a master/worker approach utilizing loosely coupled resources is a viable means for high-throughput parallel discrete event simulation, enhancing existing computational capacity or providing alternate execution capability for less time-critical codes.
Ph.D. Committee Chair: Fujimoto, Richard; Committee Members: Bader, David; Perumalla, Kalyan; Riley, George; Vuduc, Richard.
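The time-window idea above can be sketched in a few lines (a toy sequential sketch; the `handler` callback, the per-LP batching, and the lookahead convention are illustrative assumptions, not the thesis's web-services design): the master hands every event inside the current window to a worker per logical process, and new events must fall beyond the window's horizon.

```python
import heapq

def master_worker_pdes(initial_events, handler, window, end_time):
    """Toy conservative master/worker PDES. Events are (timestamp, lp, payload).
    Each round, the master hands all events inside the current time window to
    a (here: sequential) worker per LP; handler may return new events, which
    must respect a lookahead >= window so they never land inside the window
    currently being processed."""
    pending = list(initial_events)
    heapq.heapify(pending)
    processed = []
    now = 0.0
    while pending and now < end_time:
        horizon = now + window
        batch = {}                              # lp -> its events this window
        while pending and pending[0][0] < horizon:
            ev = heapq.heappop(pending)
            batch.setdefault(ev[1], []).append(ev)
        for lp, events in batch.items():        # each lp could be a worker
            for ev in events:
                processed.append(ev)
                for new in handler(ev):
                    assert new[0] >= horizon, "lookahead violated"
                    heapq.heappush(pending, new)
        now = horizon
    return processed
```

The optimistic variant discussed in the thesis additionally needs rollback and message unsending; this conservative sketch sidesteps both by construction.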
A Scalable Flash-Based Hardware Architecture for the Hierarchical Temporal Memory Spatial Pooler
Hierarchical temporal memory (HTM) is a biomimetic machine learning algorithm focused on modeling the structural and algorithmic properties of the neocortex. It comprises two components, realizing pattern recognition of spatial and temporal data, respectively. HTM research has gained momentum in recent years, leading to both hardware and software exploration of its algorithmic formulation. Previous work on HTM has centered on addressing performance concerns; however, the memory-bound operation of HTM presents significant challenges to scalability.
In this work, a scalable flash-based storage processor unit, Flash-HTM (FHTM), is presented along with a detailed analysis of its potential scalability. FHTM leverages SSD flash technology to implement the spatial pooler of the HTM cortical learning algorithm. The ability of FHTM to scale with increasing model complexity is addressed with respect to design footprint, memory organization, and power efficiency. Additionally, a mathematical model of the hardware is evaluated against the MNIST dataset, yielding 91.98% classification accuracy. A fully custom layout is developed to validate the design in a TSMC 180 nm process. The area and power footprints of the spatial pooler are 30.538 mm² and 5.171 mW, respectively. Storage processor units have the potential to be viable platforms to support implementations of HTM at scale.
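For readers unfamiliar with the algorithm FHTM accelerates, one step of an HTM-style spatial pooler can be sketched in software (a deliberate simplification; the parameter names and update constants are assumptions, not the FHTM hardware design): columns compute their overlap with the input through connected synapses, the k best columns win, and winners strengthen synapses to active bits.

```python
def spatial_pooler_step(input_bits, permanences, n_active,
                        threshold=0.5, inc=0.05, dec=0.02):
    """Minimal HTM-style spatial pooler step. Each column holds permanence
    values to a subset of input bits; synapses with permanence >= threshold
    are connected. The n_active columns with the largest overlap become
    active, and their permanences are nudged Hebbian-style toward the input."""
    overlaps = []
    for col, perms in enumerate(permanences):
        # Overlap = number of connected synapses attached to active input bits.
        overlap = sum(1 for i, p in perms.items()
                      if p >= threshold and input_bits[i])
        overlaps.append((overlap, col))
    # k-winners-take-all inhibition.
    active = [col for _, col in sorted(overlaps, reverse=True)[:n_active]]
    for col in active:
        perms = permanences[col]
        for i in perms:
            perms[i] = min(1.0, perms[i] + inc) if input_bits[i] \
                       else max(0.0, perms[i] - dec)
    return active
```

The memory-bound nature noted above is visible even here: every step streams over all per-column permanence tables, which is exactly the traffic FHTM moves into flash.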
Ray Tracing Gems
This book is a must-have for anyone serious about rendering in real time. With the announcement of new ray tracing APIs and hardware to support them, developers can easily create real-time applications with ray tracing as a core component. As ray tracing on the GPU becomes faster, it will play a more central role in real-time rendering. Ray Tracing Gems provides key building blocks for developers of games, architectural applications, visualizations, and more. Experts in rendering share their knowledge by explaining everything from nitty-gritty techniques that will improve any ray tracer to mastery of the new capabilities of current and future hardware.
What you'll learn:
- The latest ray tracing techniques for developing real-time applications in multiple domains
- Guidance, advice, and best practices for rendering applications with Microsoft DirectX Raytracing (DXR)
- How to implement high-performance graphics for interactive visualizations, games, simulations, and more
Who this book is for:
- Developers looking to leverage the latest APIs and GPU technology for real-time rendering and ray tracing
- Students looking to learn about best practices in these areas
- Enthusiasts who want to understand and experiment with their new GPUs
Advances in Grid Computing
This book approaches grid computing from the perspective of the latest achievements in the field, providing an insight into current research trends and advances, and presenting a large range of innovative research papers. The topics covered include resource and data management, grid architectures and development, and grid-enabled applications. New ideas employing heuristic methods from swarm intelligence, genetic algorithms, and quantum encryption are considered in order to explain two main aspects of grid computing: resource management and data management. The book also addresses aspects of grid computing regarding architecture and development, and includes a diverse range of applications, such as a possible human grid computing system, simulation of the fusion reaction, ubiquitous healthcare service provisioning, and complex water systems.
Visualization and inspection of the geometry of particle packings
The aim of this dissertation is to find efficient techniques for visualizing and inspecting the geometry of
particle packings. Simulations of such packings are used e.g. in material sciences to predict properties of
granular materials. To better understand and supervise the behavior of these simulations, not only the
particles themselves but also special areas formed by the particles that can show the progress of the
simulation and spatial distribution of hot spots, should be visualized. This should be possible with a frame
rate that allows interaction even for large-scale packings with millions of particles. Moreover, since the simulation is conducted on the GPU, the visualization techniques should make full use of the data in GPU memory.
To improve the performance of granular materials like concrete, considerable attention has been paid to the
particle size distribution, which is the main determinant for the space filling rate and therefore affects two
of the most important properties of the concrete: the structural robustness and the durability. Given the
particle size distribution, the space filling rate can be determined by computer simulations, which are often
superior to analytical approaches due to irregularities of particles and the wide range of size distribution in
practice. One of the widely adopted simulation methods is collective rearrangement, in which particles are first placed at random positions inside a container; overlaps between particles are then resolved by pushing overlapping particles away from each other to fill empty space in the container. By cleverly adjusting the size of the container as the simulation progresses, the collective rearrangement method can produce a rather dense particle packing in the end. However, it is very hard to fine-tune or debug the whole simulation process without an interactive visualization tool.
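The overlap-resolution core of collective rearrangement can be sketched as follows (a minimal 2D CPU sketch with a damped push; the dissertation's simulation is 3D and GPU-based, and the function names here are illustrative):

```python
import math

def resolve_overlaps(particles, iterations=50, step=0.5):
    """Sketch of the collective-rearrangement core loop in 2D: whenever two
    disks overlap, push them apart along the line between their centers.
    particles: list of [x, y, r], modified in place."""
    for _ in range(iterations):
        moved = False
        for i in range(len(particles)):
            for j in range(i + 1, len(particles)):
                xi, yi, ri = particles[i]
                xj, yj, rj = particles[j]
                dx, dy = xj - xi, yj - yi
                dist = math.hypot(dx, dy) or 1e-12
                overlap = ri + rj - dist
                if overlap > 0:
                    # Move each disk half the (damped) overlap apart.
                    ux, uy = dx / dist, dy / dist
                    shift = step * overlap / 2
                    particles[i][0] -= ux * shift
                    particles[i][1] -= uy * shift
                    particles[j][0] += ux * shift
                    particles[j][1] += uy * shift
                    moved = True
        if not moved:
            break
    return particles
```

The damping factor trades convergence speed against oscillation in dense packings; it is exactly this kind of tuning decision that benefits from the interactive visualization tools developed here.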
Starting from the well-established rasterization-based method to render spheres, this dissertation first
provides new fast and pixel-accurate methods to visualize the overlaps and free spaces between spherical
particles inside a container. The rasterization-based techniques perform well for small-scale particle packings of up to about one million spheres, but deteriorate for larger packings due to their linear runtime and memory requirements, which are hard to estimate correctly in advance. To address this problem, new methods based on ray tracing are provided
along with two new kinds of bounding volume hierarchies (BVHs) to accelerate the ray tracing process ---
the first one can reuse the existing data structure for simulation and the second one is more memory efficient.
Both BVHs build on the idea of the loose octree and are the first of their kind to consider the size of primitives
for interactive ray tracing with frequently updated acceleration structures. Moreover, the visualization
techniques provided in this dissertation can also be adapted to calculate properties such as the volumes of specific areas.
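The size-aware loose-octree idea behind both BVHs can be illustrated with a small sketch (an assumed simplification, not the dissertation's exact construction): because the loose cell of each level is enlarged by a fixed factor, the storage level of a sphere follows directly from its radius, so frequently moving spheres can be reassigned in constant time.

```python
import math

def loose_octree_cell(center, radius, root_size, looseness=2.0):
    """Size-aware loose-octree placement: a sphere is stored at the deepest
    level whose loosened cell (cell_size * looseness) still contains its
    diameter, making the target cell computable directly from the radius."""
    # Deepest level where looseness * cell_size >= sphere diameter.
    max_level = int(math.floor(math.log2(looseness * root_size / (2 * radius))))
    level = max(0, max_level)
    cell_size = root_size / (2 ** level)
    # Integer cell coordinates of the sphere's center at that level.
    coords = tuple(int(c // cell_size) for c in center)
    return level, coords
```

This is what makes the structure attractive for frequently updated acceleration structures: an insertion or update is a handful of arithmetic operations rather than a tree descent.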
All these visualization techniques are then extended to non-spherical particles, where a non-spherical
particle is approximated by a rigid system of spheres to reuse the existing sphere-based simulation. To this end, a new
GPU-based method is presented to fill a non-spherical particle with polydisperse possibly overlapping
spheres efficiently, so that a particle can be filled with fewer spheres without sacrificing the space filling
rate. This eases both simulation and visualization.
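The sphere-filling idea can be sketched with a simple greedy CPU variant (illustrative only; the dissertation's method is GPU-based and more sophisticated). Here a shape is given as a distance-to-boundary function, and larger spheres are placed first so fewer spheres are needed:

```python
def fill_with_spheres(inside_dist, bounds, grid_step, min_radius):
    """Greedy 2D sketch of filling a shape with polydisperse overlapping
    spheres. inside_dist(p) returns the distance from p to the shape
    boundary (> 0 inside, <= 0 outside); each candidate grid point can
    host a sphere of exactly that radius."""
    (x0, x1), (y0, y1) = bounds
    candidates = []
    x = x0
    while x <= x1:
        y = y0
        while y <= y1:
            r = inside_dist((x, y))
            if r >= min_radius:
                candidates.append((r, (x, y)))
            y += grid_step
        x += grid_step
    # Largest spheres first; skip centers already covered by a placed sphere.
    spheres = []
    for r, (cx, cy) in sorted(candidates, reverse=True):
        covered = any((cx - sx) ** 2 + (cy - sy) ** 2 <= sr ** 2
                      for sx, sy, sr in spheres)
        if not covered:
            spheres.append((cx, cy, r))
    return spheres
```

Placing the largest admissible spheres first is what keeps the sphere count low without sacrificing the space filling rate, which in turn eases both simulation and visualization.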
Based on the approaches presented in this dissertation, more sophisticated algorithms can be developed to visualize large-scale non-spherical particle mixtures more efficiently. Furthermore, the hardware ray tracing of more recent graphics cards could be exploited instead of the software ray tracing used in this dissertation. The new techniques can also become the basis for interactively visualizing other particle-based simulations in which special areas such as free spaces or overlaps between particles are of interest.