Search CORE

3,175 research outputs found

Parallel Rendering and Large Data Visualization

Author: Eilemann Stefan
Publication venue
Publication date: 01/02/2019
Field of study

We are living in the big data age: An ever increasing amount of data is being produced through data acquisition and computer simulations. While large scale analysis and simulations have received significant attention for cloud and high-performance computing, software to efficiently visualise large data sets is struggling to keep up. Visualization has proven to be an efficient tool for understanding data, in particular visual analysis is a powerful tool to gain intuitive insight into the spatial structure and relations of 3D data sets. Large-scale visualization setups are becoming ever more affordable, and high-resolution tiled display walls are in reach even for small institutions. Virtual reality has arrived in the consumer space, making it accessible to a large audience. This thesis addresses these developments by advancing the field of parallel rendering. We formalise the design of system software for large data visualization through parallel rendering, provide a reference implementation of a parallel rendering framework, introduce novel algorithms to accelerate the rendering of large amounts of data, and validate this research and development with new applications for large data visualization. Applications built using our framework enable domain scientists and large data engineers to better extract meaning from their data, making it feasible to explore more data and enabling the use of high-fidelity visualization installations to see more detail of the data.Comment: PhD thesi

arXiv.org e-Print Archive

ZORA

A Smart Move: \u3cem\u3eA System To Intelligently Model and Control Robots in Extreme Environments

Author: Singh Surya Priyadarshi
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/06/1999
Field of study

University of Tennessee, Knoxville: Trace

Interactive Medical Image Registration With Multigrid Methods and Bounded Biharmonic Functions

Author: Wu Baohua
Publication venue: ScholarlyCommons
Publication date: 01/01/2014
Field of study

Interactive image registration is important in some medical applications since automatic image registration is often slow and sometimes error-prone. We consider interactive registration methods that incorporate user-specified local transforms around control handles. The deformation between handles is interpolated by some smooth functions, minimizing some variational energies. Besides smoothness, we expect the impact of a control handle to be local. Therefore we choose bounded biharmonic weight functions to blend local transforms, a cutting-edge technique in computer graphics. However, medical images are usually huge, and this technique takes a lot of time that makes itself impracticable for interactive image registration. To expedite this process, we use a multigrid active set method to solve bounded biharmonic functions (BBF). The multigrid approach is for two scenarios, refining the active set from coarse to fine resolutions, and solving the linear systems constrained by working active sets. We\u27ve implemented both weighted Jacobi method and successive over-relaxation (SOR) in the multigrid solver. Since the problem has box constraints, we cannot directly use regular updates in Jacobi and SOR methods. Instead, we choose a descent step size and clamp the update to satisfy the box constraints. We explore the ways to choose step sizes and discuss their relation to the spectral radii of the iteration matrices. The relaxation factors, which are closely related to step sizes, are estimated by analyzing the eigenvalues of the bilaplacian matrices. We give a proof about the termination of our algorithm and provide some theoretical error bounds. Another minor problem we address is to register big images on GPU with limited memory. We\u27ve implemented an image registration algorithm with virtual image slices on GPU. An image slice is treated similarly to a page in virtual memory. We execute a wavefront of subtasks together to reduce the number of data transfers. Our main contribution is a fast multigrid method for interactive medical image registration that uses bounded biharmonic functions to blend local transforms. We report a novel multigrid approach to refine active set quickly and use clamped updates based on weighted Jacobi and SOR. This multigrid method can be used to efficiently solve other quadratic programs that have active sets distributed over continuous regions

ScholarlyCommons@Penn

Middleware services for distributed virtual environments

Author: Lu Fengyun
Publication venue: Newcastle University
Publication date: 01/01/2006
Field of study

PhD ThesisDistributed Virtual Environments (DVEs) are virtual environments which allow dispersed users to interact with each other and the virtual world through the underlying network. Scalability is a major challenge in building a successful DVE, which is directly affected by the volume of message exchange. Different techniques have been deployed to reduce the volume of message exchange in order to support large numbers of simultaneous participants in a DVE. Interest management is a popular technique for filtering unnecessary message exchange between users. The rationale behind interest management is to resolve the "interests" of users and decide whether messages should be exchanged between them. There are three basic interest management approaches: region-based, aura-based and hybrid approaches. However, if the time taken for an interest management approach to determine interests is greater than the duration of the interaction, it is not possible to guarantee interactions will occur correctly or at all. This is termed the Missed Interaction Problem, which all existing interest management approaches are susceptible to. This thesis provides a new aura-based interest management approach, termed Predictive Interest management (PIM), to alleviate the missed interaction problem. PIM uses an enlarged aura to detect potential aura-intersections and iii initiate message exchange. It utilises variable message exchange frequencies, proportional to the intersection degree of the objects' expanded auras, to restrict bandwidth usage. This thesis provides an experimental system, the PIM system, which couples predictive interest management with the de-centralised server communication model. It utilises the Common Object Request Broker Architecture (CORBA) middleware standard to provide an interoperable middleware for DVEs. Experimental results are provided to demonstrate that PIM provides a scalable interest management approach which alleviates the missed interaction problem

Newcastle University eTheses

OpenGrey Repository

Middleware services for distributed virtual environments

Author: Lu Fengyun
Publication venue: Newcastle University
Publication date: 01/01/2006
Field of study

Newcastle University eTheses

Energy-efficient mobile GPU systems

Author: Arnau Jose Maria
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2015
Field of study

The design of mobile GPUs is all about saving energy. Smartphones and tablets are battery-operated and thus any type of rendering needs to use as little energy as possible. Furthermore, smartphones do not include sophisticated cooling systems due to their small size, making heat dissipation a primary concern. Improving the energy-efficiency of mobile GPUs will be absolutely necessary to achieve the performance required to satisfy consumer expectations, while maintaining operating time per battery charge and keeping the GPU in its thermal limits. The first step in optimizing energy consumption is to identify the sources of energy drain. Previous studies have demonstrated that the register file is one of the main sources of energy consumption in a GPU. As graphics workloads are highly data- and memory-parallel, GPUs rely on massive multithreading to hide the memory latency and keep the functional units busy. However, aggressive multithreading requires a huge register file to keep the registers of thousands of simultaneous threads. Such a big register file exceeds the power budget typically available for an embedded graphics processors and, hence, more energy-efficient memory latency tolerance techniques are necessary. On the other hand, prior research showed that the off-chip accesses to system memory are one of the most expensive operations in terms of energy in a mobile GPU. Therefore, optimizing memory bandwidth usage is a primary concern in mobile GPU design. Many bandwidth saving techniques, such as texture compression or ARM's transaction elimination, have been proposed in both industry and academia. The purpose of this thesis is to study the characteristics of mobile graphics processors and mobile workloads in order to propose different energy saving techniques specifically tailored for the low-power segment. Firstly, we focus on energy-efficient memory latency tolerance. We analyze several techniques such as multithreading and prefetching and conclude that they are effective but not energy-efficient. Next, we propose an architecture for the fragment processors of a mobile GPU that is based on the decoupled access/execute paradigm. The results obtained by using a cycle-accurate mobile GPU simulator and several commercial Android games show that the decoupled architecture combined with a small degree of multithreading provides the most energy efficient solution for hiding memory latency. More specifically, the decoupled access/execute-like design with just 4 SIMD threads/processor is able to achieve 97% of the performance of a larger GPU with 16 SIMD threads/processor, while providing 20.5% energy savings on average. Secondly, we focus on optimizing memory bandwidth in a mobile GPU. We analyze the bandwidth usage in a set of commercial Android games and find that most of the bandwidth is employed for fetching textures, and also that consecutive frames share most of the texture dataset as they tend to be very similar. However, the GPU cannot capture inter-frame texture re-use due to the big size of the texture dataset for one frame. Based on this analysis, we propose Parallel Frame Rendering (PFR), a technique that overlaps the processing of multiple frames in order to exploit inter-frame texture re-use and save bandwidth. By processing multiple frames in parallel textures are fetched once every two frames instead of being fetched in a frame basis as in conventional GPUs. PFR provides 23.8% memory bandwidth savings on average in our set of Android games, that result in 12% speedup and 20.1% energy savings. Finally, we improve PFR by introducing a hardware memoization system on top. We analyze the redundancy in mobile games and find that more than 38% of the Fragment Program executions are redundant on average. We thus propose a task-level hardware-based memoization system that provides 15% speedup and 12% energy savings on average over a PFR-enabled GPU.El diseño de las GPUs (Graphics Procesing Units) móviles se centra fundamentalmente en el ahorro energético. Los smartphones y las tabletas son dispositivos alimentados mediante baterías y, por lo tanto, cualquier tipo de renderizado debe utilizar la menor cantidad de energía posible. Mejorar la eficiencia energética de las GPUs móviles será absolutamente necesario para alcanzar el rendimiento requirido para satisfacer las expectativas de los usuarios, sin reducir el tiempo de vida de la batería. El primer paso para optimizar el consumo energético consiste en identificar qué componentes son los principales consumidores de la batería. Estudios anteriores han identificado al banco de registros y a los accessos a memoria principal como las mayores fuentes de consumo energético en una GPU. El propósito de esta tesis es estudiar las características de los procesadores gráficos móviles y de las aplicaciones móviles con el objetivo de proponer distintas técnicas de ahorro energético. En primer lugar, la investigación se centra en desarrollar métodos energéticamente eficientes para ocultar la latencia de la memoria principal. El resultado de la investigación es una arquitectura desacoplada para los Fragment Processors de la GPU. Los resultados experimentales utilizando un simulador de ciclo y distintos juegos de Android muestran que una arquitectura desacoplada, combinada con un nivel de multithreading moderado, proporciona la solución más eficiente desde el punto de vista energético para ocultar la latencia de la memoria prinicipal. Más específicamente, la arquitectura desacoplada con sólo 4 SIMD threads/processor es capaz de alcanzar el 97% del rendimiento de una GPU más grande con 16 SIMD threads/processor, al tiempo que se reduce el consumo energético en un 20.5%. En segundo lugar, el trabajo de investigación se centró en optimizar el ancho de banda en una GPU móvil. Se realizó un estudio del uso del ancho de banda en distintos juegos de Android y se observó que la mayor parte del ancho de banda se utiliza para leer texturas. Además, se observó que frames consecutivos comparten una gran parte de las texturas. Sin embargo, la GPU no puede capturar el reuso de texturas entre frames dado que el tamaño de las texturas utilizadas por un frame es mucho mayor que la caché de segundo nivel. Basándose en este análisis, se desarrolló Parallel Frame Rendering (PFR), una técnica que solapa el procesado de multiples frames consecutivos con el objetivo de explotar el reuso de texturas entre frames y ahorrar así ancho de bando. Al procesar múltiples frames en paralelo las texturas se leen de memoria principal una vez cada dos frames en lugar de leerse en cada frame como sucede en una GPU convencional. PFR proporciona un ahorro del 23.8% en ancho de banda en promedio para distintos juegos de Android, este ahorro de ancho de banda redunda en un incremento del rendimiento del 12% y un ahorro energético del 20.1%. Por último, se mejoró PFR introduciendo un sistema hardware capaz de evitar cómputos redundantes. Un análisis de distintos juegos de Android reveló que más de un 38% de las ejecuciones del Fragment Program eran redundantes en promedio. Así pues, se propuso un sistema hardware capaz de identificar y eliminar parte de los cómputos y accessos a memoria redundantes, dicho sistema proporciona un incremento del rendimiento del 15% y un ahorro energético del 12% en promedio con respecto a una GPU móvil basada en PFR

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Virtualising visualisation: A distributed service based approach to visualisation on the Grid

Author: Charters Stuart Muir
Publication venue
Publication date: 01/01/2006
Field of study

Context: Current visualisation systems are not designed to work with the large quantities of data produced by scientists today, they rely on the abilities of a single resource to perform all of the processing and visualisation of data which limits the problem size that they can investigate. Objectives: The objectives of this research are to address the issues encountered by scientists with current visualisation systems and the deficiencies highlighted in current visualisation systems. The research then addresses the question:” How do you design the ideal service oriented architecture for visualisation that meets the needs of scientists?” Method: A new design for a visualisation system based upon a Service Oriented Architecture is proposed to address the issues identified, the architecture is implemented using Java and web service technology. The implementation of the architecture also realised several case study scenarios as demonstrators. Evaluation: Evaluation was performed using case study scenarios of scientific problems and performance data was conducted through experimentation. The scenarios were assessed against the requirements for the architecture and the performance data against a base case simulating a single resource implementation. Conclusion: The virtualised visualisation architecture shows promise for applications where visualisation can be performed in a highly parallel manner and where the problem can be easily sub-divided into chunks for distributed processing

Durham e-Theses

OpenGrey Repository

Enabling Pipeline Parallelism in Heterogeneous Managed Runtime Environments via Batch Processing

Author: Blanaru Florin-Gabriel
Fumero Alfonso Juan
Kotselidis Christos-Efthymios
Stratikopoulos Athanasios
Publication venue
Publication date: 04/02/2022
Field of study

The University of Manchester - Institutional Repository