    ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code

    Automatic code optimization is a complex process that typically applies multiple discrete algorithms, each of which modifies the program structure irreversibly. The design of these algorithms is often monolithic, and because they do not cooperate, similar analyses must be implemented repeatedly. Modern optimization techniques such as equality saturation address this issue by allowing exhaustive term rewriting at various levels of the input, thereby simplifying compiler design. In this paper, we apply equality saturation to optimize the sequential code used in directive-based GPU programming. Our approach simultaneously achieves less computation, fewer memory accesses, and higher memory throughput. Our fully automated framework converts input programs into single-assignment form so that they can be rewritten in their entirety while dependencies are preserved, and then extracts the optimal variants. Through practical benchmarks, we demonstrate significant performance improvements across several compilers. Furthermore, we highlight the advantages of computational reordering and emphasize the significance of memory-access order on modern GPUs.
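
    The rewrite-then-extract loop at the heart of equality saturation can be illustrated with a toy sketch. The Python below is an illustration only, not the ACC Saturator implementation: the rewrite rules, the tuple term representation, and the cost model are assumptions chosen for the example, and it stores whole terms and rewrites only at the root, whereas production systems (e.g. the egg library) use e-graphs to share subterms.

```python
# Toy illustration of equality saturation: grow a set of terms equivalent
# to the input by exhaustively applying rewrite rules, then extract the
# cheapest term under a cost model. Terms are tuples like ("mul", "x", 2).

RULES = [
    lambda t: ("shl", t[1], 1) if t[0] == "mul" and t[2] == 2 else None,  # x*2 -> x<<1
    lambda t: t[1] if t[0] == "add" and t[2] == 0 else None,              # x+0 -> x
    lambda t: ("mul", t[2], t[1]) if t[0] == "mul" else None,             # commutativity
]

COST = {"mul": 4, "add": 1, "shl": 1}  # assumed per-operation costs

def cost(term):
    if not isinstance(term, tuple):    # variables and constants are free
        return 0
    return COST[term[0]] + sum(cost(arg) for arg in term[1:])

def saturate_and_extract(term, max_rounds=10):
    seen, frontier = {term}, {term}
    for _ in range(max_rounds):        # saturation loop
        new = {r for t in frontier if isinstance(t, tuple)
                 for rule in RULES
                 for r in [rule(t)] if r is not None and r not in seen}
        if not new:                    # saturated: no new equivalent terms
            break
        seen |= new
        frontier = new
    return min(seen, key=cost)         # extraction step

print(saturate_and_extract(("add", ("mul", "x", 2), 0)))  # ('shl', 'x', 1)
```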

    Advanced semantics for accelerated graph processing

    Large-scale graph applications are of great national, commercial, and societal importance, with direct use in fields such as counter-intelligence, proteomics, and data mining. Unfortunately, graph-based problems exhibit certain basic characteristics that make them a poor match for conventional computing systems in terms of structure, scale, and semantics. Graph processing kernels emphasize sparse data structures and computations with irregular memory access patterns that destroy the temporal and spatial locality upon which modern processors rely for performance. Furthermore, applications in this area utilize large data sets and have been shown to be more data intensive than typical floating-point applications, two properties that lead to inefficient utilization of the hierarchical memory system. Current approaches to processing large graph data sets leverage traditional HPC systems and programming models for shared-memory and message-passing computation, and are thus limited in efficiency, scalability, and programmability. The research presented in this thesis investigates the potential of a new model of execution, hypothesized to be a promising alternative to conventional practices for graph-based applications. A new approach to graph processing is developed and presented in this thesis. The application of the experimental ParalleX execution model to graph processing balances continuation-migration-style fine-grain concurrency with constraint-based synchronization through embedded futures. A collection of parallel graph application kernels provides experiment-control drivers for the analysis and evaluation of this innovative strategy. Finally, an experimental software library for scalable graph processing, the ParalleX Graph Library, is defined using the HPX runtime system, providing an implementation of the key concepts and a framework for the development of ParalleX-based graph applications.
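
    The execution style the abstract describes can be suggested by a small sketch: expose each vertex expansion as its own fine-grained task and let futures, rather than a global barrier construct, gate progress to the next level. The Python below is a hypothetical illustration using standard thread-pool futures, not the ParalleX Graph Library or HPX API; the graph and all names are assumptions.

```python
# Sketch: one task per frontier vertex in a BFS; the futures embody the
# synchronization constraint that must be satisfied before the next level.

from concurrent.futures import ThreadPoolExecutor, wait

GRAPH = {0: [1, 2], 1: [3], 2: [3, 4], 3: [], 4: []}  # adjacency list

def parallel_bfs(graph, source):
    dist = {source: 0}
    with ThreadPoolExecutor() as pool:
        frontier = [source]
        while frontier:
            # Each vertex expansion is a fine-grained task.
            futures = [pool.submit(lambda v=v: [u for u in graph[v] if u not in dist])
                       for v in frontier]
            wait(futures)  # dist is only mutated after all tasks finish
            level = dist[frontier[0]] + 1
            frontier = []
            for f in futures:
                for u in f.result():
                    if u not in dist:   # re-check after the join
                        dist[u] = level
                        frontier.append(u)
    return dist

print(parallel_bfs(GRAPH, 0))  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 2}
```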

    Dagstuhl Reports : Volume 1, Issue 2, February 2011

    Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
    Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzè, Martin C. Rinard, Westley Weimer and Andreas Zeller
    Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
    Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
    Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Young

    Value-based coherence under failure (Cohérences basées sur les valeurs en échec)

    Get PDF
    International audience. Abstract not available.

    Behavioural access control in distributed environments

    Applications and services in distributed environments are an increasingly important topic, and approaches to security in such applications are therefore becoming essential. Crucial information must be protected properly, and mechanisms must be developed for this protection. Access control is one of the topics that underlie security: it concerns ensuring that data and resources are accessed only by the correct entities. A commonly used approach is access control lists, which are widely applied in most operating systems. However, this approach has weaknesses with regard to scalability, and so it is not well suited to distributed environments, which usually have variable populations. Capabilities, on the other hand, offer scalability and adaptability advantages over access control lists. Capabilities are unforgeable tickets that can be propagated between entities, and they fit well in distributed environments. But capabilities also have limits due to their simple structure: they grant an unlimited number of accesses for given types of actions, but cannot capture sequences and branches of actions, which may be called aspects of behaviours. In this thesis, behaviour control approaches are introduced, progressing from Vistas to Treaties. Vistas provide explicit access control for each component of an object, together with primitive control over action sequences. Treaties develop behaviour control further by containing behaviour descriptors that can specify sequencing, branching, and terminating aspects, and hence provide much finer control over behaviours. Because treaties inherit the scalable attributes of capabilities, they also fit well in distributed environments. An interesting feature of treaty systems is that they allow users to refine the specifications of behaviours and generate new treaties from existing ones. A number of treaty combinator operations are proposed to realize this functionality, and they are shown to be safe with respect to the security of access control. A novel issue created by the treaty approach is identified in the thesis: the duplication problem, in which users could gain more permissions than they should have by making copies of unprotected treaties. Any treaty system must provide a solution to this problem. Three models that solve the duplication problem are proposed, with an analysis of their differences, advantages, and disadvantages. Treaties are a general concept, and in real systems they can be represented in various ways. Several components of treaties admit a variety of implementation options, and the developers of services and applications can combine these options to fit their particular requirements, making treaties flexible and adaptable. Implementations of concrete treaties and treaty systems are introduced, and the implemented treaties are used to test their behaviour control abilities. Evaluations of different treaty representations compare their performance, and the scalability of treaty systems is also evaluated, showing that treaties are well suited to deployment in distributed environments.
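
    The core distinction between plain capabilities and treaties, as described above, can be sketched as a capability object that embeds a behaviour descriptor (here, a finite-state transition table over actions) and a refinement combinator that can only weaken it. The class below is an illustrative assumption, not the thesis's implementation.

```python
# Sketch of a treaty-like capability: permitted actions are constrained
# by an embedded finite automaton, so sequencing, branching, and
# termination can all be expressed. Refinement only removes transitions,
# so a refined treaty never grants more than its parent (safety).

class Treaty:
    def __init__(self, transitions, state="start"):
        self._transitions = transitions  # {state: {action: next_state}}
        self._state = state

    def permit(self, action):
        """Consume one action; raise if the behaviour descriptor forbids it."""
        allowed = self._transitions.get(self._state, {})
        if action not in allowed:
            raise PermissionError(f"{action!r} not allowed in state {self._state!r}")
        self._state = allowed[action]

    def refine(self, drop_actions):
        """Derive a new, strictly weaker treaty (a treaty combinator)."""
        pruned = {s: {a: n for a, n in acts.items() if a not in drop_actions}
                  for s, acts in self._transitions.items()}
        return Treaty(pruned, self._state)

# 'open' must precede any number of 'read's, and 'close' terminates.
spec = {"start": {"open": "opened"},
        "opened": {"read": "opened", "close": "done"}}
t = Treaty(spec)
t.permit("open"); t.permit("read"); t.permit("close")  # allowed sequence

no_read = Treaty(spec).refine({"read"})
no_read.permit("open")
try:
    no_read.permit("read")          # reads were refined away
except PermissionError as e:
    print(e)                        # 'read' not allowed in state 'opened'
```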

    Probabilistic intraday PV power forecast using ensembles of deep Gaussian mixture density networks

    There is growing interest in estimating the inherent uncertainty of photovoltaic (PV) power forecasts with probabilistic forecasting methods to mitigate the accompanying risks for system operators. This study aims to advance the field of probabilistic PV power forecasting by introducing and extending deep Gaussian mixture density networks (MDNs). By minimizing the negative log likelihood of a weighted sum of multiple Gaussian distributions, MDNs can estimate flexible uncertainty distributions with nearly all neural network structures, so advances in machine learning, in this case deep neural networks, can be exploited. To account for epistemic (i.e., model) uncertainty as well, this study applies two ensemble approaches to MDNs. This is particularly relevant for industrial applications, as there is often no extensive (manual) adjustment of the forecast model structure for each site, and only a limited amount of training data is available during commissioning. The results suggest that as few as seven days of training data are sufficient to generate significant improvements of 23.9% in forecast quality, measured by the normalized continuous ranked probability score (NCRPS), compared to the reference case. Furthermore, the use of multiple Gaussian distributions and of ensembles yields relative improvements in forecast quality of up to 20.5% and 19.5%, respectively.
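
    The training objective just described, the negative log likelihood of a weighted mixture of Gaussians, can be written down compactly. The NumPy sketch below is illustrative rather than the study's code: the array shapes, the mocked network outputs, and the stabilized log-sum-exp formulation are assumptions.

```python
# Negative log likelihood of a K-component Gaussian mixture, the loss an
# MDN minimizes. Network outputs (weights w, means mu, scales sigma) are
# mocked with random values here.

import numpy as np

def mixture_nll(y, w, mu, sigma):
    """y: (N,) targets; w, mu, sigma: (N, K) per-sample mixture parameters."""
    # log N(y | mu_k, sigma_k) for every component
    log_pdf = (-0.5 * ((y[:, None] - mu) / sigma) ** 2
               - np.log(sigma) - 0.5 * np.log(2 * np.pi))
    # log-sum-exp over components for numerical stability
    log_w = np.log(w)
    m = (log_w + log_pdf).max(axis=1, keepdims=True)
    log_mix = m.squeeze(1) + np.log(np.exp(log_w + log_pdf - m).sum(axis=1))
    return -log_mix.mean()

rng = np.random.default_rng(0)
y = rng.normal(size=8)                # observed PV power (toy data)
w = np.full((8, 3), 1 / 3)            # equal mixture weights
mu = rng.normal(size=(8, 3))          # component means
sigma = np.full((8, 3), 0.5)          # component standard deviations
print(mixture_nll(y, w, mu, sigma))
```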