685 research outputs found

    Mobile graphics: SIGGRAPH Asia 2017 course

    Get PDF
    Peer ReviewedPostprint (published version

    Evaluating the Performance of Vulkan GLSL Compute Shaders in Real-Time Ray-Traced Audio Propagation Through 3D Virtual Environments

    Get PDF
    Real time ray tracing is a growing area of interest with applications in audio processing. However, real time audio processing comes with strict performance requirements, which parallel computing is often used to overcome. As graphics processing units (GPUs) have become more powerful and programmable, general-purpose computing on graphics processing units (GPGPU) has allowed GPUs to become extremely powerful parallel processors, leading them to become more prevalent in the domain of audio processing through platforms such as CUDA. The aim of this research was to investigate the potential of GLSL compute shaders in the domain of real time audio processing. Specifically regarding real time ray tracing tasks. To do this a number of GLSL compute shaders were created, along with a C++ Vulkan application with which to execute them. These shaders facilitate the propagation of audio, using ray tracing, through a virtual environment, and implement 3D space partitioning and ray intersection prediction in order to gauge the effectiveness of these optimisations for this task. Statistically significant results show that the GLSL compute shaders successfully propagated audio through a virtual environment, returning results to the host system in real time, within 30 milliseconds. However, while this capability was shown, significantly detailed virtual environments prevented results from being returned in real time. Indicating a potential for future research and optimisation

    The role of concurrency in an evolutionary view of programming abstractions

    Full text link
    In this paper we examine how concurrency has been embodied in mainstream programming languages. In particular, we rely on the evolutionary talking borrowed from biology to discuss major historical landmarks and crucial concepts that shaped the development of programming languages. We examine the general development process, occasionally deepening into some language, trying to uncover evolutionary lineages related to specific programming traits. We mainly focus on concurrency, discussing the different abstraction levels involved in present-day concurrent programming and emphasizing the fact that they correspond to different levels of explanation. We then comment on the role of theoretical research on the quest for suitable programming abstractions, recalling the importance of changing the working framework and the way of looking every so often. This paper is not meant to be a survey of modern mainstream programming languages: it would be very incomplete in that sense. It aims instead at pointing out a number of remarks and connect them under an evolutionary perspective, in order to grasp a unifying, but not simplistic, view of the programming languages development process

    Dynamic Volume Rendering of Functional Medical Data on Dissimilar Hardware Platforms

    Get PDF
    In the last 30 years, medical imaging has become one of the most used diagnostic tools in the medical profession. Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) technologies have become widely adopted because of their ability to capture the human body in a non-invasive manner. A volumetric dataset is a series of orthogonal 2D slices captured at a regular interval, typically along the axis of the body from the head to the feet. Volume rendering is a computer graphics technique that allows volumetric data to be visualized and manipulated as a single 3D object. Iso-surface rendering, image splatting, shear warp, texture slicing, and raycasting are volume rendering methods, each with associated advantages and disadvantages. Raycasting is widely regarded as the highest quality renderer of these methods. Originally, CT and MRI hardware was limited to providing a single 3D scan of the human body. The technology has improved to allow a set of scans capable of capturing anatomical movements like a beating heart. The capturing of anatomical data over time is referred to as functional imaging. Functional MRI (fMRI) is used to capture changes in the human body over time. While fMRI’s can be used to capture any anatomical data over time, one of the more common uses of fMRI is to capture brain activity. The fMRI scanning process is typically broken up into a time consuming high resolution anatomical scan and a series of quick low resolution scans capturing activity. The low resolution activity data is mapped onto the high resolution anatomical data to show changes over time. Academic research has advanced volume rendering and specifically fMRI volume rendering. Unfortunately, academic research is typically a one-off solution to a singular medical case or set of data, causing any advances to be problem specific as opposed to a general capability. Additionally, academic volume renderers are often designed to work on a specific device and operating system under controlled conditions. This prevents volume rendering from being used across the ever expanding number of different computing devices, such as desktops, laptops, immersive virtual reality systems, and mobile computers like phones or tablets. This research will investigate the feasibility of creating a generic software capability to perform real-time 4D volume rendering, via raycasting, on desktop, mobile, and immersive virtual reality platforms. Implementing a GPU-based 4D volume raycasting method for mobile devices will harness the power of the increasing number of mobile computational devices being used by medical professionals. Developing support for immersive virtual reality can enhance medical professionals’ interpretation of 3D physiology with the additional depth information provided by stereoscopic 3D. The results of this research will help expand the use of 4D volume rendering beyond the traditional desktop computer in the medical field. Developing the same 4D volume rendering capabilities across dissimilar platforms has many challenges. Each platform relies on their own coding languages, libraries, and hardware support. There are tradeoffs between using languages and libraries native to each platform and using a generic cross-platform system, such as a game engine. Native libraries will generally be more efficient during application run-time, but they require different coding implementations for each platform. The decision was made to use platform native languages and libraries in this research, whenever practical, in an attempt to achieve the best possible frame rates. 4D volume raycasting provides unique challenges independent of the platform. Specifically, fMRI data loading, volume animation, and multiple volume rendering. Additionally, real-time raycasting has never been successfully performed on a mobile device. Previous research relied on less computationally expensive methods, such as orthogonal texture slicing, to achieve real-time frame rates. These challenges will be addressed as the contributions of this research. The first contribution was exploring the feasibility of generic functional data input across desktop, mobile, and immersive virtual reality. To visualize 4D fMRI data it was necessary to build in the capability to read Neuroimaging Informatics Technology Initiative (NIfTI) files. The NIfTI format was designed to overcome limitations of 3D file formats like DICOM and store functional imagery with a single high-resolution anatomical scan and a set of low-resolution anatomical scans. Allowing input of the NIfTI binary data required creating custom C++ routines, as no object oriented APIs freely available for use existed. The NIfTI input code was built using C++ and the C++ Standard Library to be both light weight and cross-platform. Multi-volume rendering is another challenge of fMRI data visualization and a contribution of this work. fMRI data is typically broken into a single high-resolution anatomical volume and a series of low-resolution volumes that capture anatomical changes. Visualizing two volumes at the same time is known as multi-volume visualization. Therefore, the ability to correctly align and scale the volumes relative to each other was necessary. It was also necessary to develop a compositing method to combine data from both volumes into a single cohesive representation. Three prototype applications were built for the different platforms to test the feasibility of 4D volume raycasting. One each for desktop, mobile, and virtual reality. Although the backend implementations were required to be different between the three platforms, the raycasting functionality and features were identical. Therefore, the same fMRI dataset resulted in the same 3D visualization independent of the platform itself. Each platform uses the same NIfTI data loader and provides support for dataset coloring and windowing (tissue density manipulation). The fMRI data can be viewed changing over time by either animation through the time steps, like a movie, or using an interface slider to “scrub” through the different time steps of the data. The prototype applications data load times and frame rates were tested to determine if they achieved the real-time interaction goal. Real-time interaction was defined by achieving 10 frames per second (fps) or better, based on the work of Miller [1]. The desktop version was evaluated on a 2013 MacBook Pro running OS X 10.12 with a 2.6 GHz Intel Core i7 processor, 16 GB of RAM, and a NVIDIA GeForce GT 750M graphics card. The immersive application was tested in the C6 CAVE™, a 96 graphics node computer cluster comprised of NVIDIA Quadro 6000 graphics cards running Red Hat Enterprise Linux. The mobile application was evaluated on a 2016 9.7” iPad Pro running iOS 9.3.4. The iPad had a 64-bit Apple A9X dual core processor with 2 GB of built in memory. Two different fMRI brain activity datasets with different voxel resolutions were used as test datasets. Datasets were tested using both the 3D structural data, the 4D functional data, and a combination of the two. Frame rates for the desktop implementation were consistently above 10 fps, indicating that real-time 4D volume raycasting is possible on desktop hardware. The mobile and virtual reality platforms were able to perform real-time 3D volume raycasting consistently. This is a marked improvement for 3D mobile volume raycasting that was previously only able to achieve under one frame per second [2]. Both VR and mobile platforms were able to raycast the 4D only data at real-time frame rates, but did not consistently meet 10 fps when rendering both the 3D structural and 4D functional data simultaneously. However, 7 frames per second was the lowest frame rate recorded, indicating that hardware advances will allow consistent real-time raycasting of 4D fMRI data in the near future

    Heterogeneous Acceleration for 5G New Radio Channel Modelling Using FPGAs and GPUs

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    CUDA-Optimized GPU Acceleration of 3GPP 3D Channel Model Simulations for 5G Network Planning

    Get PDF
    Simulation of massive multiple-input multiple-output (MIMO) channel models is becoming increasingly important for testing and validation of fifth-generation new radio (5G NR) wireless networks and beyond. However, simulation performance tends to be limited when modeling a large number of antenna elements combined with a complex and realistic representation of propagation conditions. In this paper, we propose an efficient implementation of a 3rd Generation Partnership Project (3GPP) three-dimensional (3D) channel model, specifically designed for graphics processing unit (GPU) platforms, with the goal of minimizing the computational time required for channel simulation. The channel model is highly parameterized to encompass a wide range of configurations required for real-world optimized 5G NR network deployments. We use several compute unified device architecture (CUDA)-based optimization techniques to exploit the parallelism and memory hierarchy of the GPU. Experimental data show that the developed system achieves an overall speedup of about 240Ă— compared to the original C++ model executed on an Intel processor. Compared to a design previously accelerated on a datacenter-class field programmable gate array (FPGA), the GPU design has 33.3 % higher single precision performance, but for 7.5 % higher power consumption. The proposed GPU accelerator can provide fast and accurate channel simulations for 5G NR network planning and optimization

    Efficient Ray Tracing For Mobile Devices

    Get PDF
    The demand for mobile devices with higher graphics performance has increased substantially in the past few years. Most mobile applications that demand 3D graphics use commonly available frameworks, such as Unity and Unreal Engine. In mobile devices, these frameworks are built on top of OpenGL ES and use a graphics technique called rasterization, a simple concept that yields good performance without sacrificing graphic quality. However, rasterization cannot easily handle some physical phenomena of light (i.e. reflection and refraction). In order to support such effects, the graphics framework has to emulate them, thereby leading to suboptimal results in term of quality. Other techniques, such as ray tracing, do not require such emulation to be implemented, as the aforementioned phenomena of light are inherently considered. In this work, we first design and implement a 3D renderer that uses ray tracing to generate high quality graphics for mobile devices. Our implementation yielded high quality results but at a high computational cost, which impacted performance. To alleviate this problem, we then developed special algorithms and data structures to substantially improve the performance of the rendering engine. In our experiments, we achieved frame rates that were 7 to 15 times faster than the brute force approach. Being able to render high quality graphics with good performance can potentially revolutionize the mobile gaming industry. To the best of our knowledge, this has never before been implemented in commonly available devices, such as smartphones and tablets

    Lichttransportsimulation auf Spezialhardware

    Get PDF
    It cannot be denied that the developments in computer hardware and in computer algorithms strongly influence each other, with new instructions added to help with video processing, encryption, and in many other areas. At the same time, the current cap on single threaded performance and wide availability of multi-threaded processors has increased the focus on parallel algorithms. Both influences are extremely prominent in computer graphics, where the gaming and movie industries always strive for the best possible performance on the current, as well as future, hardware. In this thesis we examine the hardware-algorithm synergies in the context of ray tracing and Monte-Carlo algorithms. First, we focus on the very basic element of all such algorithms - the casting of rays through a scene, and propose a dedicated hardware unit to accelerate this common operation. Then, we examine existing and novel implementations of many Monte-Carlo rendering algorithms on massively parallel hardware, as full hardware utilization is essential for peak performance. Lastly, we present an algorithm for tackling complex interreflections of glossy materials, which is designed to utilize both powerful processing units present in almost all current computers: the Centeral Processing Unit (CPU) and the Graphics Processing Unit (GPU). These three pieces combined show that it is always important to look at hardware-algorithm mapping on all levels of abstraction: instruction, processor, and machine.Zweifelsohne beeinflussen sich Computerhardware und Computeralgorithmen gegenseitig in ihrer Entwicklung: Prozessoren bekommen neue Instruktionen, um zum Beispiel Videoverarbeitung, Verschlüsselung oder andere Anwendungen zu beschleunigen. Gleichzeitig verstärkt sich der Fokus auf parallele Algorithmen, bedingt durch die limitierte Leistung von für einzelne Threads und die inzwischen breite Verfügbarkeit von multi-threaded Prozessoren. Beide Einflüsse sind im Grafikbereich besonders stark , wo es z.B. für die Spiele- und Filmindustrie wichtig ist, die bestmögliche Leistung zu erreichen, sowohl auf derzeitiger und zukünftiger Hardware. In Rahmen dieser Arbeit untersuchen wir die Synergie von Hardware und Algorithmen anhand von Ray-Tracing- und Monte-Carlo-Algorithmen. Zuerst betrachten wir einen grundlegenden Hardware-Bausteins für alle diese Algorithmen, die Strahlenverfolgung in einer Szene, und präsentieren eine spezielle Hardware-Einheit zur deren Beschleunigung. Anschließend untersuchen wir existierende und neue Implementierungen verschiedener MonteCarlo-Algorithmen auf massiv-paralleler Hardware, wobei die maximale Auslastung der Hardware im Fokus steht. Abschließend stellen wir dann einen Algorithmus zur Berechnung von komplexen Beleuchtungseffekten bei glänzenden Materialien vor, der versucht, die heute fast überall vorhandene Kombination aus Hauptprozessor (CPU) und Grafikprozessor (GPU) optimal auszunutzen. Zusammen zeigen diese drei Aspekte der Arbeit, wie wichtig es ist, Hardware und Algorithmen auf allen Ebenen gleichzeitig zu betrachten: Auf den Ebenen einzelner Instruktionen, eines Prozessors bzw. eines gesamten Systems
    • …
    corecore