2,172 research outputs found

    Energy-efficient mobile GPU systems

    The design of mobile GPUs is all about saving energy. Smartphones and tablets are battery-operated, so any type of rendering needs to use as little energy as possible. Furthermore, smartphones do not include sophisticated cooling systems due to their small size, making heat dissipation a primary concern. Improving the energy efficiency of mobile GPUs will be necessary to achieve the performance required to satisfy consumer expectations while maintaining operating time per battery charge and keeping the GPU within its thermal limits. The first step in optimizing energy consumption is to identify the sources of energy drain. Previous studies have demonstrated that the register file is one of the main sources of energy consumption in a GPU. As graphics workloads are highly data- and memory-parallel, GPUs rely on massive multithreading to hide memory latency and keep the functional units busy. However, aggressive multithreading requires a huge register file to hold the registers of thousands of simultaneous threads. Such a big register file exceeds the power budget typically available for embedded graphics processors and, hence, more energy-efficient memory latency tolerance techniques are necessary. In addition, prior research showed that off-chip accesses to system memory are among the most expensive operations in terms of energy in a mobile GPU. Therefore, optimizing memory bandwidth usage is a primary concern in mobile GPU design. Many bandwidth-saving techniques, such as texture compression or ARM's transaction elimination, have been proposed in both industry and academia. The purpose of this thesis is to study the characteristics of mobile graphics processors and mobile workloads in order to propose different energy-saving techniques specifically tailored for the low-power segment.
Firstly, we focus on energy-efficient memory latency tolerance. We analyze several techniques such as multithreading and prefetching and conclude that they are effective but not energy-efficient. Next, we propose an architecture for the fragment processors of a mobile GPU that is based on the decoupled access/execute paradigm. The results obtained by using a cycle-accurate mobile GPU simulator and several commercial Android games show that the decoupled architecture combined with a small degree of multithreading provides the most energy-efficient solution for hiding memory latency. More specifically, the decoupled access/execute-like design with just 4 SIMD threads/processor is able to achieve 97% of the performance of a larger GPU with 16 SIMD threads/processor, while providing 20.5% energy savings on average.
Secondly, we focus on optimizing memory bandwidth in a mobile GPU. We analyze the bandwidth usage in a set of commercial Android games and find that most of the bandwidth is employed for fetching textures, and also that consecutive frames share most of the texture dataset as they tend to be very similar. However, the GPU cannot capture inter-frame texture re-use due to the big size of the texture dataset for one frame. Based on this analysis, we propose Parallel Frame Rendering (PFR), a technique that overlaps the processing of multiple frames in order to exploit inter-frame texture re-use and save bandwidth. By processing multiple frames in parallel, textures are fetched once every two frames instead of being fetched on a per-frame basis as in conventional GPUs. PFR provides 23.8% memory bandwidth savings on average in our set of Android games, which result in a 12% speedup and 20.1% energy savings. Finally, we improve PFR by introducing a hardware memoization system on top. We analyze the redundancy in mobile games and find that more than 38% of the Fragment Program executions are redundant on average. We thus propose a task-level hardware-based memoization system that provides 15% speedup and 12% energy savings on average over a PFR-enabled GPU.
The design of mobile GPUs (Graphics Processing Units) focuses fundamentally on saving energy. Smartphones and tablets are battery-powered devices and, therefore, any type of rendering must use as little energy as possible. Improving the energy efficiency of mobile GPUs will be absolutely necessary to reach the performance required to satisfy user expectations without reducing battery life. The first step in optimizing energy consumption is to identify which components are the main consumers of battery. Previous studies have identified the register file and main-memory accesses as the largest sources of energy consumption in a GPU. The purpose of this thesis is to study the characteristics of mobile graphics processors and mobile applications in order to propose different energy-saving techniques. First, the research focuses on developing energy-efficient methods to hide main-memory latency. The result of this research is a decoupled architecture for the GPU's Fragment Processors. Experimental results using a cycle-accurate simulator and several Android games show that a decoupled architecture, combined with a moderate level of multithreading, provides the most energy-efficient solution for hiding main-memory latency. More specifically, the decoupled architecture with only 4 SIMD threads/processor is able to reach 97% of the performance of a larger GPU with 16 SIMD threads/processor, while reducing energy consumption by 20.5%.
Second, the research focused on optimizing bandwidth in a mobile GPU. A study of bandwidth usage in several Android games showed that most of the bandwidth is used to read textures. It also showed that consecutive frames share a large part of their textures. However, the GPU cannot capture texture reuse across frames, because the textures used by one frame are much larger than the second-level cache. Based on this analysis, Parallel Frame Rendering (PFR) was developed, a technique that overlaps the processing of multiple consecutive frames in order to exploit inter-frame texture reuse and thereby save bandwidth. When multiple frames are processed in parallel, textures are read from main memory once every two frames instead of in every frame, as happens in a conventional GPU. PFR provides 23.8% bandwidth savings on average across several Android games; these savings translate into a 12% performance increase and 20.1% energy savings. Finally, PFR was improved by introducing a hardware system capable of avoiding redundant computations. An analysis of several Android games revealed that more than 38% of Fragment Program executions were redundant on average. A hardware system was therefore proposed that identifies and eliminates part of the redundant computations and memory accesses; it provides a 15% performance increase and 12% energy savings on average with respect to a PFR-based mobile GPU.
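The bandwidth argument behind PFR can be illustrated with a toy cache model. This is only a sketch under strong assumptions and is not the simulator used in the thesis: the fully associative LRU cache, the tile and block counts, and every function name below are hypothetical, and consecutive frames are assumed to touch exactly the same texture blocks.

```python
from collections import OrderedDict


def count_misses(accesses, cache_blocks):
    """Count misses of a fully associative LRU cache holding `cache_blocks` blocks."""
    cache = OrderedDict()
    misses = 0
    for block in accesses:
        if block in cache:
            cache.move_to_end(block)       # refresh recency on a hit
        else:
            misses += 1                    # block must be fetched from memory
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


def frame_tiles(tiles, blocks_per_tile):
    """Texture blocks touched by one frame, grouped by screen tile (hypothetical layout)."""
    return [[(t, b) for b in range(blocks_per_tile)] for t in range(tiles)]


def sequential_order(frames, tiles, blocks_per_tile):
    """Conventional rendering: finish one frame before starting the next."""
    per_tile = frame_tiles(tiles, blocks_per_tile)
    return [blk for _ in range(frames) for tile in per_tile for blk in tile]


def paired_order(frames, tiles, blocks_per_tile):
    """PFR-like order: the same tile of two consecutive frames is rendered back to back.
    Assumes an even number of frames."""
    per_tile = frame_tiles(tiles, blocks_per_tile)
    order = []
    for _ in range(frames // 2):
        for tile in per_tile:
            order.extend(tile)  # frame i
            order.extend(tile)  # frame i + 1 reuses the blocks still in the cache
    return order


if __name__ == "__main__":
    # One frame touches 64 * 16 = 1024 blocks, four times the 256-block cache.
    seq = count_misses(sequential_order(4, 64, 16), 256)
    par = count_misses(paired_order(4, 64, 16), 256)
    print(seq, par)
```

In this idealized setting the paired order fetches each block once per two frames, so misses are halved. The 23.8% average saving reported in the abstract is lower, which is expected since real consecutive frames are similar rather than identical and textures are not the only memory traffic.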

    Real-time quality visualization of medical models on commodity and mobile devices

    This thesis concerns the visualization of medical models using commodity and mobile devices. Mechanisms for medical imaging acquisition such as MRI, CT, and micro-CT scanners are continuously evolving, up to the point of obtaining volume datasets of large resolutions (> 512^3). As these datasets grow in resolution, their processing and visualization become more and more expensive due to their computational requirements. For this reason, special techniques such as data pre-processing (filtering, construction of multi-resolution structures, etc.) and sophisticated algorithms have to be introduced at different points of the visualization pipeline to achieve the best visual quality without compromising performance. The problem of managing big datasets comes from the fact that we have limited computational resources. Not long ago, the only physicians who rendered volumes were radiologists. Nowadays, the outcome of diagnosis is the data itself, and medical doctors need to render it on commodity PCs (even patients may want to render the data, and the DVDs are commonly accompanied by DICOM viewer software). Furthermore, with the increasing use of technology in daily clinical tasks, small devices such as mobile phones and tablets can fit the needs of medical doctors in some specific areas. Visualizing diagnostic images of patients becomes more challenging on these devices than on desktop computers, as they generally have more restrictive hardware specifications. The goal of this Ph.D. thesis is the real-time, quality visualization of medium to large medical volume datasets (resolutions >= 512^3 voxels) on mobile phones and commodity devices. To address this problem, we use multiresolution techniques that downsample the full-resolution datasets to produce coarser representations which are easier to handle.
We have focused our efforts on the application of volume visualization in clinical practice, so we have a particular interest in creating solutions that require short pre-processing times that quickly provide the specialists with the data outcome, maximize the preservation of features and the visual quality of the final images, achieve high frame rates that allow interactive visualizations, and make efficient use of the computational resources. The contributions achieved during this thesis comprise improvements in several stages of the visualization pipeline. The techniques we propose are located in the stages of multi-resolution generation, transfer function design, and the GPU ray casting algorithm itself.
This thesis focuses on the visualization of medical volume models on mobile and low-end devices. Medical acquisition systems such as MRI, CT, and micro-CT scanners are constantly evolving, to the point of producing high-resolution volume models (> 512^3). As these data grow in resolution, handling and visualizing them becomes increasingly costly because of their computational requirements. For this reason, special techniques such as data pre-processing (filtering, construction of multiresolution structures, etc.) and specific algorithms have to be introduced at different points of the visualization pipeline to achieve the best possible visual quality without compromising performance. The problem of handling large volumes of data stems from limited computational resources. Not long ago, the only people in the medical field who visualized volume data were radiologists. Today, the result of the diagnosis is the data itself, and doctors need to render these data on modest PCs (even patients may want to view the data, since the DVDs with the results usually come with a DICOM image viewer).
In addition, with the recent increase in the use of technology in everyday clinical practice, small devices such as mobile phones or tablets are the most convenient option in some cases. Volume visualization is harder on this type of device than on desktop machines, because their hardware limitations are greater. The goal of this doctoral thesis is real-time, quality visualization of large volume models (resolutions >= 512^3 voxels) on mobile phones and low-end devices. To tackle this problem, we use multiresolution techniques that apply data reduction to the models at their original resolution to obtain lower-resolution models. We have focused our efforts on the application of volume visualization in clinical practice, so we are especially interested in designing solutions that require short pre-processing times so that specialists quickly have the results at their disposal. We also want to maximize the preservation of details of interest and the quality of the final images, achieve high frame rates that enable interactive visualization, and make efficient use of computational resources. The contributions of this thesis are improvements in several stages of the visualization pipeline. The techniques we propose belong to the stages of multiresolution structure generation, transfer function design, and the GPU ray casting algorithm.
Postprint (published version)
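To make the multiresolution idea concrete, the following minimal sketch halves a volume along each axis by averaging 2x2x2 voxel blocks and computes the memory footprint of each level of the resulting pyramid. The box filter, the nested-list layout, and the function names are assumptions chosen for illustration; the abstract does not say which downsampling filter or data structure the thesis uses.

```python
def downsample_2x(volume):
    """Halve each dimension by averaging 2x2x2 voxel blocks (box filter).

    `volume` is a nested list indexed [z][y][x]; all dimensions must be even.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [[[sum(volume[z + dz][y + dy][x + dx]
                  for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)) / 8.0
              for x in range(0, nx, 2)]
             for y in range(0, ny, 2)]
            for z in range(0, nz, 2)]


def pyramid_sizes(side, bytes_per_voxel, levels):
    """Memory footprint in bytes of each level of a cubic volume pyramid."""
    return [(side >> level) ** 3 * bytes_per_voxel for level in range(levels)]


if __name__ == "__main__":
    # A 2x2x2 volume holding the values 0..7 collapses to their mean.
    tiny = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
    print(downsample_2x(tiny))
    # Footprint of a 512^3 volume and two coarser levels, assuming 1 byte per voxel.
    print(pyramid_sizes(512, 1, 3))
```

Each level is one eighth the size of the previous one, so under the 1-byte-per-voxel assumption a 512^3 dataset shrinks from 128 MiB to 16 MiB after a single downsampling step, which is the kind of reduction that makes a coarse representation manageable on a constrained device.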

    Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence

    Mobile Augmented Reality (MAR) integrates computer-generated virtual objects with physical environments for mobile devices. MAR systems enable users to interact with MAR devices, such as smartphones and head-worn wearables, and perform seamless transitions from the physical world to a mixed world with digital entities. These MAR systems support user experiences using MAR devices to provide universal access to digital content. Over the past 20 years, several MAR systems have been developed; however, the study and design of MAR frameworks have not yet been systematically reviewed from the perspective of user-centric design. This article presents the first effort to survey existing MAR frameworks (count: 37) and further discusses the latest studies on MAR through a top-down approach: (1) MAR applications; (2) MAR visualisation techniques adaptive to user mobility and contexts; (3) systematic evaluation of MAR frameworks, including supported platforms and corresponding features such as tracking, feature extraction, and sensing capabilities; and (4) underlying machine learning approaches supporting intelligent operations within MAR systems. Finally, we summarise the development of emerging research fields and the current state of the art, and discuss the important open challenges and possible theoretical and technical directions. This survey aims to benefit researchers and MAR system developers alike.
Peer reviewed

    Collaborative Augmented Reality

    Over the past number of years, augmented reality (AR) has become increasingly pervasive as a consumer-level technology. The principal drivers of its recent development have been the evolution of mobile and handheld devices, in conjunction with algorithms and techniques from fields such as 3D computer vision. Various commercial platforms and SDKs are now available that allow developers to quickly develop mobile AR apps with minimal understanding of the underlying technology. Much of the focus to date, in both research and commercial environments, has been on single-user AR applications. Just as collaborative mobile applications have a demonstrated role in the increasing popularity of mobile devices, we believe collaborative AR systems present a compelling use case for AR technology. The aim of this thesis is the development of a mobile collaborative augmented reality framework. We identify the elements required in the design and implementation stages of collaborative AR applications. Our solution enables developers to easily create multi-user mobile AR applications in which the users can cooperatively interact with the real environment in real time. It increases the sense of collaborative spatial interaction without requiring complex infrastructure. Assuming the given low-level communication and AR libraries have modular structures, the proposed approach is also modular and flexible enough to adapt to their requirements without requiring any major changes.

    Graphics Insertions into Real Video for Market Research

    Architectures for ubiquitous 3D on heterogeneous computing platforms

    Today, a wide scope for 3D graphics applications exists, including domains such as scientific visualization, 3D-enabled web pages, and entertainment. At the same time, the devices and platforms that run and display the applications are more heterogeneous than ever. Display environments range from mobile devices to desktop systems and ultimately to distributed displays that facilitate collaborative interaction. While the capability of the client devices may vary considerably, the visualization experiences running on them should be consistent. The field of application should dictate how and on what devices users access the application, not the technical requirements to realize the 3D output. The goal of this thesis is to examine the diverse challenges involved in providing consistent and scalable visualization experiences to heterogeneous computing platforms and display setups. While we could not address the myriad of possible use cases, we developed a comprehensive set of rendering architectures in the major domains of scientific and medical visualization, web-based 3D applications, and movie virtual production. To provide the required service quality, performance, and scalability for different client devices and displays, our architectures focus on the efficient utilization and combination of the available client, server, and network resources. We present innovative solutions that incorporate methods for hybrid and distributed rendering as well as means to manage data sets and stream rendering results. We establish the browser as a promising platform for accessible and portable visualization services. We collaborated with experts from the medical field and the movie industry to evaluate the usability of our technology in real-world scenarios. The presented architectures achieve a wide coverage of display and rendering setups and at the same time share major components and concepts. 
Thus, they build a strong foundation for a unified system that supports a variety of use cases.
Today, there is a wide range of uses for 3D graphics applications, such as scientific visualization, 3D content in web pages, and entertainment software. At the same time, the devices and platforms that run and display these applications are more heterogeneous than ever. Display devices range from mobile devices to desktop systems and on to distributed display environments that facilitate collaborative use. While the capability of the devices may vary widely, the visualizations running on them should be consistent. The field of application should determine how and on which device users access the application, not the technical requirements for generating the 3D graphics. The goal of this thesis is to examine the various challenges involved in providing consistent and scalable visualization applications on heterogeneous platforms. While we could not cover the multitude of possible use cases, we developed a representative selection of rendering architectures in the core areas of scientific visualization, web-based 3D applications, and virtual film production. To ensure the required quality, performance, and scalability for different client devices and displays, our architectures focus on the efficient use and combination of the available client, server, and network resources. We present innovative solutions that cover hybrid and distributed rendering as well as the management of data sets and streaming of the 3D output. We establish the web browser as a promising platform for accessible and portable visualization services. To test the usability of our technology in realistic scenarios, we collaborated with experts from medicine and the film industry.
Our architectures achieve comprehensive coverage of display and rendering scenarios while sharing essential components and concepts. They therefore form a strong foundation for a unified system that supports a variety of use cases.

    Virtual Reality Games for Motor Rehabilitation

    This paper presents a fuzzy-logic-based method to track user satisfaction without the need for devices that monitor users' physiological conditions. User satisfaction is the key to any product's acceptance; computer applications and video games provide a unique opportunity to offer a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in UnrealTournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature that suggests physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.
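As a rough illustration of how a software-only fuzzy estimate can work, the sketch below evaluates three fuzzy rules over two in-game signals and combines them with a weighted average. It is not FLAME and not the model described in the paper: the input signals (`success_rate`, `damage_rate`), the linear membership functions, and the rules themselves are hypothetical choices made only to show the mechanics of fuzzy inference.

```python
def low(x):
    """Degree to which x (in [0, 1]) belongs to the fuzzy set 'low'."""
    return 1.0 - x


def high(x):
    """Degree to which x (in [0, 1]) belongs to the fuzzy set 'high'."""
    return x


def estimate_satisfaction(success_rate, damage_rate):
    """Estimate player satisfaction in [0, 1] from two hypothetical game signals.

    Rules (illustrative only):
      R1: IF success is high AND damage is low  THEN satisfaction = 1.0
      R2: IF success is low  AND damage is high THEN satisfaction = 0.0
      R3: IF neither rule fires strongly        THEN satisfaction = 0.5
    """
    w1 = min(high(success_rate), low(damage_rate))  # fuzzy AND as minimum
    w2 = min(low(success_rate), high(damage_rate))
    w3 = 1.0 - max(w1, w2)                          # strength of the neutral rule
    # Weighted average of the rule outputs; the denominator is always >= 1.
    return (w1 * 1.0 + w2 * 0.0 + w3 * 0.5) / (w1 + w2 + w3)


if __name__ == "__main__":
    print(estimate_satisfaction(0.9, 0.1))
    print(estimate_satisfaction(0.2, 0.8))
```

The inputs would come from events the game already logs, which is what makes the approach independent of physiological sensors.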