
    Heterogeneity-aware scheduling and data partitioning for system performance acceleration

    Over the past decade, heterogeneous processors and accelerators have become increasingly prevalent in modern computing systems. Compared with earlier homogeneous parallel machines, the hardware heterogeneity in modern systems provides new opportunities and challenges for performance acceleration. Classic operating-system optimisation problems, such as task scheduling, and application-specific optimisation techniques, such as adaptive data partitioning of parallel algorithms, must work together to address hardware heterogeneity. Significant effort has been invested in this problem, but prior work either focuses on a specific type of heterogeneous system or algorithm, or offers a high-level framework without insight into how heterogeneity differs between types of system. A general software framework is required that can not only be adapted to multiple types of systems and workloads, but is also equipped with techniques to address a variety of hardware heterogeneity. This thesis presents approaches to designing general heterogeneity-aware software frameworks for system performance acceleration. It covers a wide variety of systems, including an OS scheduler targeting on-chip asymmetric multi-core processors (AMPs) on mobile devices, a hierarchical many-core supercomputer, and multi-FPGA systems for high performance computing (HPC) centres. Considering heterogeneity in on-chip AMPs, such as thread criticality, core sensitivity, and relative fairness, it proposes a collaboration-based approach that co-designs the task selector and core allocator in the OS scheduler. Considering the typical sources of heterogeneity in HPC systems, such as the memory hierarchy, bandwidth limitations, and asymmetric physical connections, it proposes an application-specific automatic data-partitioning method for a modern supercomputer, and a topological-ranking-heuristic-based scheduler for a multi-FPGA reconfigurable cluster.
Experiments on both a full-system simulator (GEM5) and real systems (the Sunway TaihuLight supercomputer and Xilinx multi-FPGA clusters) demonstrate the significant advantages of the proposed approaches over the state of the art on a variety of workloads. "This work is supported by St Leonards 7th Century Scholarship and Computer Science PhD funding from University of St Andrews; by UK EPSRC grant Discovery: Pattern Discovery and Program Shaping for Manycore Systems (EP/P020631/1)." -- Acknowledgement
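As a rough illustration of the topological-ranking idea, a list scheduler can rank each task by the longest-cost path from it to an exit task of the dependency DAG, then greedily place tasks in descending rank order. This sketch is a generic heuristic in that spirit, not the thesis's actual multi-FPGA scheduler; the task names, costs, and earliest-free-device policy below are all illustrative assumptions:

```python
def upward_rank(tasks, succs, cost):
    """Rank each task by the longest-cost path from it to an exit task."""
    memo = {}
    def rank(t):
        if t not in memo:
            memo[t] = cost[t] + max((rank(s) for s in succs.get(t, ())), default=0)
        return memo[t]
    return {t: rank(t) for t in tasks}

def list_schedule(tasks, succs, cost, n_devices):
    """Greedy list scheduling: highest rank first onto the earliest-free device.

    With positive costs, a task always ranks strictly higher than its
    successors, so descending-rank order respects the dependency order.
    """
    ranks = upward_rank(tasks, succs, cost)
    free_at = [0.0] * n_devices
    placement = {}
    for t in sorted(tasks, key=lambda t: -ranks[t]):
        d = min(range(n_devices), key=lambda i: free_at[i])  # earliest-free device
        placement[t] = d
        free_at[d] += cost[t]
    return placement

# Hypothetical diamond-shaped task DAG: a -> {b, c} -> d
tasks = ["a", "b", "c", "d"]
succs = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
cost = {"a": 1, "b": 2, "c": 3, "d": 1}
ranks = upward_rank(tasks, succs, cost)
placement = list_schedule(tasks, succs, cost, n_devices=2)
```

A real heterogeneity-aware scheduler would additionally weight ranks by per-device execution costs and inter-device link bandwidth, which is where the topology of the FPGA cluster enters.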

    Interactive Visualization on High-Resolution Tiled Display Walls with Network Accessible Compute- and Display-Resources

    Papers number 2-7 and appendix B and C of this thesis are not available in Munin:
    2. Hagen, T-M.S., Johnsen, E.S., Stødle, D., Bjorndalen, J.M. and Anshus, O.: 'Liberating the Desktop', First International Conference on Advances in Computer-Human Interaction (2008), pp. 89-94. Available at http://dx.doi.org/10.1109/ACHI.2008.20
    3. Tor-Magne Stien Hagen, Oleg Jakobsen, Phuong Hoai Ha, and Otto J. Anshus: 'Comparing the Performance of Multiple Single-Cores versus a Single Multi-Core' (manuscript)
    4. Tor-Magne Stien Hagen, Phuong Hoai Ha, and Otto J. Anshus: 'Experimental Fault-Tolerant Synchronization for Reliable Computation on Graphics Processors' (manuscript)
    5. Tor-Magne Stien Hagen, Daniel Stødle and Otto J. Anshus: 'On-Demand High-Performance Visualization of Spatial Data on High-Resolution Tiled Display Walls', Proceedings of the International Conference on Imaging Theory and Applications and International Conference on Information Visualization Theory and Applications (2010), pp. 112-119. Available at http://dx.doi.org/10.5220/0002849601120119
    6. Bård Fjukstad, Tor-Magne Stien Hagen, Daniel Stødle, Phuong Hoai Ha, John Markus Bjørndalen and Otto Anshus: 'Interactive Weather Simulation and Visualization on a Display Wall with Many-Core Compute Nodes', Para 2010 - State of the Art in Scientific and Parallel Computing. Available at http://vefir.hi.is/para10/extab/para10-paper-60
    7. Tor-Magne Stien Hagen, Daniel Stødle, John Markus Bjørndalen, and Otto Anshus: 'A Step towards Making Local and Remote Desktop Applications Interoperable with High-Resolution Tiled Display Walls', Lecture Notes in Computer Science (2011), Volume 6723/2011, pp. 194-207. Available at http://dx.doi.org/10.1007/978-3-642-21387-8_15
    The vast volume of scientific data produced today requires tools that can enable scientists to explore large amounts of data to extract meaningful information. One such tool is interactive visualization.
    The amount of data that can be simultaneously visualized on a computer display is proportional to the display's resolution. While computer systems in general have seen a remarkable increase in performance over the last decades, display resolution has not evolved at the same rate. Increased resolution can be provided by tiling several displays in a grid. A system composed of multiple displays tiled in such a grid is referred to as a display wall. Display walls provide orders of magnitude more resolution than typical desktop displays, and can provide insight into problems not possible to visualize on desktop displays. However, their distributed and parallel architecture creates several challenges for designing systems that can support interactive visualization. One challenge is compatibility with existing software designed for personal desktop computers. Another set of challenges includes identifying characteristics of visualization systems that can: (i) maintain synchronous state and display output when executed over multiple display nodes; (ii) scale to multiple display nodes without being limited by shared interconnect bottlenecks; (iii) utilize additional computational resources such as desktop computers, clusters and supercomputers for workload distribution; and (iv) use data from local and remote compute and data resources with interactive performance. This dissertation presents Network Accessible Compute (NAC) resources and Network Accessible Display (NAD) resources for interactive visualization of data on displays ranging from laptops to high-resolution tiled display walls. A NAD is a display with functionality that enables usage over a network connection. A NAC is a computational resource that can produce content for network accessible displays. A system consisting of NACs and NADs is either push-based (NACs provide NADs with content) or pull-based (NADs request content from NACs).
    To attack the compatibility challenge, a push-based system was developed. The system enables several simultaneous users to mirror multiple regions of their computers' desktops (NACs) onto nearby NADs (among others a 22 megapixel display wall) without requiring separate DVI/VGA cables, permanent installation of third-party software, or opening firewall ports. The system has lower performance than a DVI/VGA cable approach, but adds flexibility, such as the ability to share network accessible displays from multiple computers. At a resolution of 800 by 600 pixels, the system can mirror dynamic content between a NAC and a NAD at 38.6 frames per second (FPS). At 1600x1200 pixels, the refresh rate is 12.85 FPS. The bottleneck of the system is frame buffer capturing and the encoding/decoding of pixels. These two functional parts are executed in sequence, limiting the usage of additional CPU cores. By pipelining and executing these parts on separate CPU cores, higher frame rates can be expected, by up to a factor of two in the best case. To attack all presented challenges, a pull-based system, WallScope, was developed. WallScope enables interactive visualization of local and remote data sets on high-resolution tiled display walls. The WallScope architecture comprises a compute-side and a display-side. The compute-side comprises a set of static and dynamic NACs. Static NACs are considered permanent to the system once added. This type of NAC typically has strict underlying security and access policies. Examples of such NACs are clusters, grids and supercomputers. Dynamic NACs are compute resources that can register on-the-fly to become compute nodes in the system. Examples of this type of NAC are laptops and desktop computers. The display-side comprises a set of NADs and a data set containing data customized for the particular application domain of the NADs.
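The pipelining suggested above, running frame capture and pixel encoding concurrently on separate cores rather than in sequence, is a classic two-stage producer/consumer pattern. A minimal sketch follows; the `capture` and `encode` callables are placeholders for the system's actual functional parts, not code from the dissertation:

```python
import queue
import threading

def run_pipeline(frames, capture, encode):
    """Run capture and encode concurrently on two threads joined by a queue.

    While the consumer encodes frame i, the producer can already capture
    frame i+1, which is where the potential factor-of-two speedup comes from.
    """
    q = queue.Queue(maxsize=4)  # bounded queue applies back-pressure
    out = []

    def producer():
        for f in frames:
            q.put(capture(f))
        q.put(None)  # sentinel: no more frames

    def consumer():
        while True:
            item = q.get()
            if item is None:
                break
            out.append(encode(item))

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return out

# Toy stand-ins for the capture and encode stages.
result = run_pipeline(range(5), capture=lambda f: f * 2, encode=lambda f: f + 1)
```

A single FIFO queue between one producer and one consumer preserves frame order, so no reordering step is needed on the display side.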
NADs are based on a sort-first rendering approach where a visualization client is executed on each display-node. The state of these visualization clients is provided by a separate state server, enabling central control of load and refresh-rate. Based on the state received from the state server, the visualization clients request content from the data set. The data set is live in that it translates these requests into compute messages and forwards them to available NACs. Results of the computations are returned to the NADs for the final rendering. The live data set is close to the NADs, both in terms of bandwidth and latency, to enable interactive visualization. WallScope can visualize the Earth, gigapixel images, and other data available through the live data set. When visualizing the Earth on a 28-node display wall by combining the Blue Marble data set with the Landsat data set using a set of static NACs, the bottleneck of WallScope is the computation involved in combining the data sets. However, the time used to combine data sets on the NACs decreases by a factor of 23 when going from 1 to 26 compute nodes. The display-side can decode 414.2 megapixels of images per second (19 frames per second) when visualizing the Earth. The decoding process is multi-threaded and higher frame rates are expected using multi-core CPUs. WallScope can rasterize a 350-page PDF document into 550 megapixels of image-tiles and display these image-tiles on a 28-node display wall in 74.66 seconds (PNG) and 20.66 seconds (JPG) using a single quad-core desktop computer as a dynamic NAC. This time is reduced to 4.20 seconds (PNG) and 2.40 seconds (JPG) using 28 quad-core NACs. This shows that the application output from personal desktop computers can be decoupled from the resolution of the local desktop and display for usage on high-resolution tiled display walls. 
    It also shows that the performance can be increased by adding computational resources, giving a resulting speedup of 17.77 (PNG) and 8.59 (JPG) using 28 compute nodes. Three principles are formulated based on the concepts and systems researched and developed: (i) establishing the end-to-end principle through customization states that the setup and interaction between a display-side and a compute-side in a visualization context can be performed by customizing one or both sides; (ii) Personal Computer (PC) - Personal Compute Resource (PCR) duality states that a user's computer is both a PC and a PCR, implying that desktop applications can be utilized locally using attached interaction devices and display(s), or remotely by other visualization systems for domain-specific production of data based on a user's personal desktop install; and (iii) domain-specific best-effort synchronization states that for distributed visualization systems running on tiled display walls, state handling can be performed using a best-effort synchronization approach, where visualization clients eventually get the correct state after a given period of time. Compared to state-of-the-art systems presented in the literature, the contributions of this dissertation enable utilization of a broader range of compute resources from a display wall, while at the same time providing better control over where to provide functionality and where to distribute workload between compute nodes and display nodes in a visualization context.
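The reported speedups follow directly from the rasterization timings quoted in the abstract; the small tolerances below allow for the unrounded measurements behind the published figures:

```python
def speedup(t_serial, t_parallel):
    """Classic speedup: time on one resource divided by time on many."""
    return t_serial / t_parallel

# Timings from the WallScope PDF-rasterization experiment (seconds),
# 1 quad-core NAC versus 28 quad-core NACs.
png = speedup(74.66, 4.20)  # reported as 17.77
jpg = speedup(20.66, 2.40)  # reported as 8.59
```

Both speedups are well below the 28x ideal, consistent with the abstract's observation that display-side decoding and shared resources bound the scaling.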

    Contributions to the efficient use of general purpose coprocessors: kernel density estimation as case study

    The high performance computing landscape is shifting from assemblies of homogeneous nodes towards heterogeneous systems, in which nodes consist of a combination of traditional out-of-order execution cores and accelerator devices. Accelerators provide greater theoretical performance than traditional multi-core CPUs, but exploiting their computing power remains a challenging task. This dissertation discusses the issues that arise when trying to use general purpose accelerators efficiently. As a contribution to this task, we present a thorough survey of performance modeling techniques and tools for general purpose coprocessors. We then use the statistical technique Kernel Density Estimation (KDE) as a case study. KDE is a memory-bound application that poses several challenges for its adaptation to the accelerator-based model. We present a novel algorithm for the computation of KDE, called S-KDE, that considerably reduces its computational complexity. Furthermore, we have carried out two parallel implementations of S-KDE: one for multi- and many-core processors, and another for accelerators. The latter has been implemented in OpenCL to make it portable across a wide range of devices. We have evaluated the performance of each implementation of S-KDE on a variety of architectures, trying to highlight the bottlenecks and the limits that the code reaches on each device. Finally, we present an application of our S-KDE algorithm in the field of climatology: a novel methodology for the evaluation of environmental models.
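For reference, the textbook kernel density estimator that S-KDE accelerates evaluates, at each query point, a sum of kernels centred on the samples. A direct O(n)-per-query Gaussian version (the baseline, not the thesis's S-KDE algorithm) looks like:

```python
import math

def gaussian_kde(samples, x, bandwidth):
    """Naive Gaussian KDE at a single query point x.

    f(x) = 1/(n*h) * sum_i K((x - x_i)/h), with K the standard normal
    density. This per-query sum over all samples is what makes the
    computation memory-bound for large data sets.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )

samples = [0.0, 0.5, 1.0, 1.5, 2.0]
d_center = gaussian_kde(samples, 1.0, 0.5)  # near the middle of the data
d_tail = gaussian_kde(samples, 3.0, 0.5)    # out in the tail
```

Evaluating this at m query points costs O(n*m) kernel evaluations; reducing that cost is exactly the kind of complexity reduction an algorithm like S-KDE targets.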

    Parameter Estimation for Marine Ecosystem Models in 3-D

    The aim of this work is to provide a computational-science-based foundation for the parameter identification of marine ecosystem models. For this purpose a general programming interface is introduced to enable a flexible coupling of marine ecosystems to fluid dynamics at the source code level. This interface fits into the biogeochemical model structure as well as into an optimization context. Moreover, a parallel simulation and solver software is implemented that combines the introduced interface with an efficient, transport-matrix-based simulation. The software is built on a free and portable programming library. It is written from scratch, validated at a basic level, and used exemplarily for a derivative-based optimization experiment. Part of the software additionally provides a basis for the numerical experiments carried out subsequently. These address, respectively, an approach for computing sensitivities with respect to model parameters and an alternative optimization approach that does not require model evaluations. In addition, this work includes results achieved in collaboration with other authors: the first joint work concerns porting the software to graphics processing units, the second its usage for surrogate-based optimization.

    Realistic rendering and reconstruction of astronomical objects and an augmented reality application for astronomy

    These days, there is an ever increasing need for realistic models, renderings and visualizations of astronomical objects, to be used in planetariums and as tools in modern astrophysical research. One of the major goals of this dissertation is to develop novel algorithms for recovering and rendering 3D models of a specific set of astronomical objects. We first present a method to render the color and shape of the solar disc in different climate conditions as well as for different height-to-temperature atmospheric profiles. We then present a method to render and reconstruct the 3D distribution of reflection nebulae. The rendering model takes into account scattering and absorption to generate physically realistic visualizations of reflection nebulae. Further, we propose a reconstruction method for another type of astronomical object, planetary nebulae. We also present a novel augmented reality application called the augmented astronomical telescope, tailored for educational astronomy. The real-time application augments the view through a telescope by projecting additional information, such as images, text and video related to the currently observed object, during observation. All methods previously proposed for rendering and reconstructing astronomical objects can be used to create novel content for the presented augmented reality application.


    DYNAMIC VISUALIZATIONS: Developing a Framework for Crowd-Based Simulations

    Since its conception in the 1960s, digital computation has experienced both exponential growth in power and reduction in cost. This has allowed the production of relatively cheap electronics, which are now integrated ubiquitously in daily life. With so much computational data and ever-increasing access to intelligent objects, the potential for integrating such technologies within architectural systems becomes increasingly viable. Today, dynamic architecture is already emerging across the world; it is inevitable that one day computation will be fully integrated within the infrastructures of our cities. However, as these new forms of dynamic architecture become increasingly commonplace, the standard static medium of architectural visualization is no longer satisfactory for representing and visualizing these dynamic spaces, let alone the human interactions within them. Occupancy within a space is already inherently dynamic, and becomes even more so with the introduction of these new forms of architecture. This in turn challenges our conventional means of visualizing spaces, both in design and in communication. To fully represent dynamic architecture, the visualization must be dynamic as well. As such, current single-image rendering methods within most existing architectural design pipelines are inadequate for portraying both the architectural dynamics of a space and the interactions and influences these dynamics will have with its occupants. This thesis aims to mitigate these shortcomings in architectural visualization by investigating the creation of a crowd simulation tool that provides the foundation for a visualization framework which can be continuously built upon based on project needs. It thereby answers the question of how current technologies can be used not only to better represent responsive architecture but also to optimize existing visualization methodologies.
    By using an interdisciplinary approach that brings together architecture, computer science, and game design, it becomes possible to establish a more powerful, flexible, and efficient workflow for creating architectural visualizations. Part One will establish the foundation of this thesis by looking at the state of the current world, the dynamic qualities of its buildings, and the current state of visualization technologies used both within architectural design and outside of it. Part Two will investigate complex systems and simulation models, as well as ways of integrating them with human behaviors, to establish a methodology for creating a working crowd simulation system. Part Three will take the methodology developed in Part Two and integrate it within modern game engines, with the intent of creating an architectural visualization pipeline that can use the game engine for both crowd analytics and visualization. Part Four will look at some of the spatial typologies that can be visualized with this tool. Finally, Part Five will speculate on future directions for improving this tool beyond the current scope of this thesis.
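A crowd-simulation core of the kind described can be reduced to a per-tick agent update combining goal seeking with pairwise separation, in the spirit of classic steering-behavior models. This sketch is illustrative only; the weights, radius, and time step are assumptions, not values from the thesis:

```python
def step(agents, goal, dt=0.1, sep_radius=1.0):
    """Advance a minimal 2D crowd model by one tick.

    Each agent steers toward a shared goal while being pushed away
    from neighbors closer than sep_radius (collision avoidance).
    """
    new = []
    for i, (x, y) in enumerate(agents):
        # Unit vector toward the goal.
        vx, vy = goal[0] - x, goal[1] - y
        norm = max((vx * vx + vy * vy) ** 0.5, 1e-9)
        vx, vy = vx / norm, vy / norm
        # Separation: repulsion from nearby agents.
        for j, (ox, oy) in enumerate(agents):
            if i == j:
                continue
            dx, dy = x - ox, y - oy
            d = (dx * dx + dy * dy) ** 0.5
            if 0.0 < d < sep_radius:
                vx += dx / d
                vy += dy / d
        new.append((x + vx * dt, y + vy * dt))
    return new
```

In a game-engine pipeline, such a tick function would run per frame, with the engine handling rendering and the resulting trajectories doubling as crowd-analytics data.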

    Simulated Annealing

    The book contains 15 chapters presenting recent contributions of top researchers working with Simulated Annealing (SA). Although it represents only a small sample of the research activity on SA, the book will certainly serve as a valuable tool for researchers interested in getting involved in this multidisciplinary field. In fact, one of its salient features is that it is highly multidisciplinary in terms of application areas, assembling experts from the fields of Biology, Telecommunications, Geology, Electronics and Medicine.
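The algorithm shared by all these application areas is the Metropolis-style SA loop: always accept improving moves, accept worsening moves with probability exp(-Δ/T), and cool T over time. A minimal sketch on a toy one-dimensional cost follows; the geometric cooling schedule, step size, and toy objective are illustrative choices, not taken from any chapter:

```python
import math
import random

def simulated_annealing(cost, neighbor, x0, t0=1.0, cooling=0.995, steps=5000):
    """Minimize cost(x) by simulated annealing from start point x0."""
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(steps):
        y = neighbor(x)
        fy = cost(y)
        # Accept improvements always; accept worsening moves with
        # probability exp(-(fy - fx) / t), which shrinks as t cools.
        if fy < fx or random.random() < math.exp((fx - fy) / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling  # geometric cooling schedule
    return best, fbest

random.seed(1)
best, fbest = simulated_annealing(
    cost=lambda v: (v - 3.0) ** 2,          # toy objective, minimum at v = 3
    neighbor=lambda v: v + random.uniform(-0.5, 0.5),
    x0=10.0,
)
```

At high temperature the walk explores broadly; as T shrinks the acceptance of uphill moves vanishes and the search settles into a local (ideally global) minimum.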

    Instrumentation, Data, And Algorithms For Visually Understanding Haptic Surface Properties

    Autonomous robots need to efficiently walk over varied surfaces and grasp diverse objects. We hypothesize that the association between how such surfaces look and how they physically feel during contact can be learned from a database of matched haptic and visual data recorded from various end-effectors' interactions with hundreds of real-world surfaces. Testing this hypothesis required the creation of a new multimodal sensing apparatus, the collection of a large multimodal dataset, and development of a machine-learning pipeline. This thesis begins by describing the design and construction of the Portable Robotic Optical/Tactile ObservatioN PACKage (PROTONPACK, or Proton for short), an untethered handheld sensing device that emulates the capabilities of the human senses of vision and touch. Its sensory modalities include RGBD vision, egomotion, contact force, and contact vibration. Three interchangeable end-effectors (a steel tooling ball, an OptoForce three-axis force sensor, and a SynTouch BioTac artificial fingertip) allow for different material properties at the contact point and provide additional tactile data. We then detail the calibration process for the motion and force sensing systems, as well as several proof-of-concept surface discrimination experiments that demonstrate the reliability of the device and the utility of the data it collects. This thesis then presents a large-scale dataset of multimodal surface interaction recordings, including 357 unique surfaces such as furniture, fabrics, outdoor fixtures, and items from several private and public material sample collections. Each surface was touched with one, two, or three end-effectors, comprising approximately one minute per end-effector of tapping and dragging at various forces and speeds. We hope that the larger community of robotics researchers will find broad applications for the published dataset. 
    Lastly, we demonstrate an algorithm that learns to estimate haptic surface properties given visual input. Surfaces were rated on hardness, roughness, stickiness, and temperature by the human experimenter and by a pool of purely visual observers. We then trained an algorithm to perform the same task, as well as to infer quantitative properties calculated from the haptic data. Overall, the task of predicting haptic properties from vision alone proved difficult for both humans and computers, but a hybrid algorithm using a deep neural network and a support vector machine achieved correlations between expected and actual regression outputs of approximately ρ = 0.3 to ρ = 0.5 on previously unseen surfaces.
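The reported ρ values are correlation coefficients between predicted and ground-truth property ratings; assuming Pearson correlation (the usual choice for regression output, though the thesis may use a rank correlation), the metric is computed as:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences.

    rho = cov(x, y) / (std(x) * std(y)), ranging from -1 (perfect
    anti-correlation) through 0 (no linear relation) to +1.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

On this scale, ρ between 0.3 and 0.5 indicates a weak-to-moderate linear relationship, consistent with the conclusion that vision alone carries limited haptic information.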