Towards an Adaptive Skeleton Framework for Performance Portability
The proliferation of widely available, but very different, parallel architectures
makes the ability to deliver good parallel performance
on a range of architectures, or performance portability, highly desirable.
Irregularly-parallel problems, where the number and size
of tasks is unpredictable, are particularly challenging and require
dynamic coordination.
The paper outlines a novel approach to delivering portable parallel
performance for irregularly parallel programs. The approach
combines declarative parallelism with JIT technology, dynamic
scheduling, and dynamic transformation.
We present the design of an adaptive skeleton library, with a task
graph implementation, JIT trace costing, and adaptive transformations.
We outline the architecture of the prototype adaptive skeleton
execution framework in Pycket, describing tasks, serialisation,
and the current scheduler. We report a preliminary evaluation of the
prototype framework using 4 micro-benchmarks and a small case
study on two NUMA servers (24 and 96 cores) and a small cluster
(17 hosts, 272 cores). Key results include Pycket delivering good
sequential performance, e.g. almost as fast as C for some benchmarks;
good absolute speedups on all architectures (up to 120 on
128 cores for sumEuler); and that the adaptive transformations do
improve performance.
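The skeleton approach described above can be pictured with a minimal sketch: a "farm"/map skeleton whose dynamic scheduler lets idle workers pull chunks of irregular tasks. The names `parmap` and `sum_euler` are illustrative, not the paper's API, and a thread pool stands in for Pycket's distributed runtime.

```python
# A minimal sketch of a farm skeleton with dynamic scheduling, in the
# spirit of an adaptive skeleton library (hypothetical names, not the
# paper's API). A thread pool stands in for the distributed runtime.
from math import gcd
from multiprocessing.pool import ThreadPool

def totient(n):
    # Euler's totient, counted naively: |{k in 1..n : gcd(k, n) = 1}|.
    # Task cost varies with n, so the workload is irregular.
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def parmap(f, xs, workers=4, chunksize=8):
    # Dynamic scheduling: idle workers pull the next chunk of tasks,
    # which balances load when task sizes are unpredictable.
    with ThreadPool(workers) as pool:
        return pool.map(f, xs, chunksize=chunksize)

def sum_euler(n, workers=4):
    # The sumEuler benchmark named in the abstract: sum of totients.
    return sum(parmap(totient, range(1, n + 1), workers))
```

An adaptive version, as the paper describes, would additionally use JIT trace costs at run time to tune parameters such as the chunk size.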
A Compiler and Runtime Infrastructure for Automatic Program Distribution
This paper presents the design and the implementation of a compiler and runtime infrastructure for automatic program distribution. We are building a research infrastructure that enables experimentation with various program partitioning and mapping strategies and the study of automatic distribution's effect on resource consumption (e.g., CPU, memory, communication). Since many optimization techniques are faced with conflicting optimization targets (e.g., memory and communication), we believe that it is important to be able to study their interaction.
We present a set of techniques that enable flexible resource modeling and program distribution. These are: dependence analysis, weighted graph partitioning, code and communication generation, and profiling. We have developed these ideas in the context of the Java language. We present in detail the design and implementation of each of the techniques as part of our compiler and runtime infrastructure. Then, we evaluate our design and present preliminary experimental data for each component, as well as for the entire system.
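The weighted-graph-partitioning step can be illustrated with a toy sketch: program units become nodes weighted by resource cost (CPU, memory), edges carry communication volume, and the partitioner balances node weight while keeping heavily-communicating units together. The greedy bisection below is a hypothetical stand-in, not the infrastructure's actual partitioner.

```python
# Toy weighted graph bisection for program distribution (illustrative
# only). Node weights model resource cost; edge weights model
# communication volume between program units.
def greedy_bisect(node_weights, edges):
    """node_weights: {node: weight}; edges: {(u, v): comm_cost}.
    Returns two node sets, trading off load balance against cut cost."""
    parts = (set(), set())
    loads = [0, 0]
    # Place heaviest nodes first (classic greedy load balancing).
    for node in sorted(node_weights, key=node_weights.get, reverse=True):
        # Affinity: total edge weight to nodes already in each part.
        affinity = [sum(w for (u, v), w in edges.items()
                        if (u == node and v in p) or (v == node and u in p))
                    for p in parts]
        # Favour the part with high affinity (less communication cut)
        # and low load (better balance).
        side = max((0, 1), key=lambda i: affinity[i] - loads[i])
        parts[side].add(node)
        loads[side] += node_weights[node]
    return parts
```

A production partitioner would use a multilevel algorithm and refine the cut, but the objective, balancing conflicting resource and communication costs, is the one the abstract highlights.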
AIOCJ: A Choreographic Framework for Safe Adaptive Distributed Applications
We present AIOCJ, a framework for programming distributed adaptive
applications. Applications are programmed using AIOC, a choreographic language
suited for expressing patterns of interaction from a global point of view. AIOC
allows the programmer to specify which parts of the application can be adapted.
Adaptation takes place at runtime by means of rules, which can change during
the execution to tackle possibly unforeseen adaptation needs. AIOCJ relies on a
solid theory that ensures applications to be deadlock-free by construction also
after adaptation. We describe the architecture of AIOCJ, the design of the AIOC
language, and an empirical validation of the framework.
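The rule-based adaptation mechanism can be pictured with a toy sketch: an adaptable scope runs its default behaviour unless a rule applicable in the current environment replaces it, and rules may be added or removed while the application runs. `AdaptScope` and `Rule` are hypothetical names, not AIOC constructs, and this sketch carries none of AIOCJ's choreographic deadlock-freedom guarantees.

```python
# Toy model of runtime adaptation by rules (illustrative only; AIOC
# programs are choreographies, and these class names are made up).
class Rule:
    def __init__(self, applies, body):
        self.applies = applies  # predicate over the runtime environment
        self.body = body        # replacement behaviour for the scope

class AdaptScope:
    def __init__(self, default):
        self.default = default
        self.rules = []  # rules can change while the application runs

    def run(self, env):
        # First applicable rule replaces the scope's default body,
        # handling adaptation needs unforeseen at deployment time.
        for rule in self.rules:
            if rule.applies(env):
                return rule.body(env)
        return self.default(env)
```

In AIOCJ the analogous check is done choreographically across all participants, which is what lets the theory guarantee deadlock freedom after adaptation.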
Tupleware: Redefining Modern Analytics
There is a fundamental discrepancy between the targeted and actual users of
current analytics frameworks. Most systems are designed for the data and
infrastructure of the Googles and Facebooks of the world---petabytes of data
distributed across large cloud deployments consisting of thousands of cheap
commodity machines. Yet, the vast majority of users operate clusters ranging
from a few to a few dozen nodes, analyze relatively small datasets of up to a
few terabytes, and perform primarily compute-intensive operations. Targeting
these users fundamentally changes the way we should build analytics systems.
This paper describes the design of Tupleware, a new system specifically aimed
at the challenges faced by the typical user. Tupleware's architecture brings
together ideas from the database, compiler, and programming languages
communities to create a powerful end-to-end solution for data analysis. We
propose novel techniques that consider the data, computations, and hardware
together to achieve maximum performance on a case-by-case basis. Our
experimental evaluation quantifies the impact of our novel techniques and shows
orders of magnitude performance improvement over alternative systems.
Managing Climatic Risks to Combat Land Degradation and Enhance Food security: Key Information Needs
This paper discusses the key information needs to reduce the negative impacts of weather variability and climate change on land degradation and food security, and identifies the opportunities and barriers between the information and services needed. It suggests that vulnerability assessments based on a livelihood concept that includes climate information and key socio-economic variables can overcome the narrow focus of common one-dimensional vulnerability studies. Both current and future climatic risks can be managed better if there is appropriate policy and institutional support, together with technological interventions to address the complexities of the multiple risks that agriculture faces. This would require effective partnerships among agencies dealing with meteorological and hydrological services, agricultural research, land degradation, and food security issues. In addition, a state-of-the-art infrastructure to measure, record, store and disseminate data on weather variables, and access to weather and seasonal climate forecasts at desired spatial and temporal scales, would be needed.
Digital zero noise extrapolation for quantum error mitigation
Zero-noise extrapolation (ZNE) is an increasingly popular technique for
mitigating errors in noisy quantum computations without using additional
quantum resources. We review the fundamentals of ZNE and propose several
improvements to noise scaling and extrapolation, the two key components in the
technique. We introduce unitary folding and parameterized noise scaling. These
are digital noise scaling frameworks, i.e. one can apply them using only
gate-level access common to most quantum instruction sets. We also study
different extrapolation methods, including a new adaptive protocol that uses a
statistical inference framework. Benchmarks of our techniques show error
reductions of 18X to 24X over non-mitigated circuits and demonstrate ZNE
effectiveness at larger qubit numbers than have been tested previously. In
addition to presenting new results, this work is a self-contained introduction
to the practical use of ZNE by quantum programmers.
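The two components the abstract names can be sketched concretely. Unitary folding replaces a circuit U by U(U†U)^n, multiplying its depth, and roughly its noise, by 2n+1, using only gate-level access; extrapolation then estimates the zero-noise expectation value from measurements at the scaled noise levels. The exponential-decay "backend" below is a made-up model standing in for a real noisy device.

```python
# Sketch of digital ZNE: unitary folding for noise scaling, plus
# Richardson (polynomial) extrapolation to the zero-noise limit.
import math

def fold_scale(num_folds):
    # Folding U into U (U†U)^n multiplies circuit depth, and hence
    # (approximately) noise, by 2n + 1.
    return 2 * num_folds + 1

def richardson_zero(points):
    """points: [(noise_scale, expectation)]. Lagrange-interpolate the
    measured values and evaluate at scale = 0 (Richardson extrapolation)."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (0.0 - xj) / (xi - xj)
        total += term
    return total

def noisy_expectation(scale):
    # Made-up noise model: the true (noiseless) value is 1.0,
    # decaying exponentially with the noise scale.
    return math.exp(-0.05 * scale)

# Measure at scales 1, 3, 5 (0, 1, 2 folds), then extrapolate to 0.
points = [(fold_scale(n), noisy_expectation(fold_scale(n)))
          for n in range(3)]
estimate = richardson_zero(points)
```

Under this model the unmitigated value at scale 1 is about 0.95, while the extrapolated estimate recovers the noiseless value 1.0 to within about 0.001, using no extra quantum resources beyond the folded circuit runs.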