An Enhancement of Futures Runtime in Presence of Cache Memory Hierarchy
A future is a simple abstraction mechanism for exposing potential concurrency in programs. In this paper, we propose an enhancement of our previously developed runtime for scheduling and executing futures, based on the lazy task creation technique, that aims to reflect the cache memory hierarchy present in modern multi-core and multiprocessor systems.
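The future abstraction can be illustrated with Python's standard concurrent.futures module; this is an analogous API for exposing potential concurrency, not the runtime described in the paper:

```python
from concurrent.futures import ThreadPoolExecutor

def fib(n):
    # Deliberately naive recursive Fibonacci: a small, CPU-bound task.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

with ThreadPoolExecutor(max_workers=4) as pool:
    # Each submit() returns a future immediately; the work may run
    # concurrently with the caller until .result() forces the value.
    futures = [pool.submit(fib, n) for n in (10, 15, 20)]
    results = [f.result() for f in futures]

print(results)  # [55, 610, 6765]
```

Whether submitting such a task actually pays off depends on the task's size relative to the scheduling overhead, which is precisely what lazy task creation tries to optimize.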
Architecture aware parallel programming in Glasgow parallel Haskell (GPH)
General purpose computing architectures are evolving quickly to become manycore and hierarchical: i.e. a core can communicate more quickly locally than globally. To be effective on such architectures, programming models must be aware of the communications hierarchy. This thesis investigates a programming model that aims to share the responsibility of task placement, load balance, thread creation, and synchronisation between the application developer and the runtime system.
The main contribution of this thesis is the development of four new architecture-aware constructs for Glasgow parallel Haskell that exploit information about task size and aim to reduce communication for small tasks, preserve data locality, or distribute large units of work. We define a semantics for the constructs that specifies the sets of PEs that each construct identifies, and we check four properties of the semantics using QuickCheck.
We report a preliminary investigation of architecture-aware programming models that abstract over the new constructs. In particular, we propose architecture-aware evaluation strategies and skeletons. We investigate three common paradigms, namely data parallelism, divide-and-conquer and nested parallelism, on hierarchical architectures with up to 224 cores. The results show that the architecture-aware programming model consistently delivers better speedup and scalability than existing constructs, together with a dramatic reduction in execution time variability.
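As a rough illustration of the divide-and-conquer paradigm that such skeletons abstract over (a Python sketch, not GpH; the threshold, problem, and function names are all illustrative), a skeleton typically solves small problems sequentially and splits larger ones for potential parallel evaluation:

```python
def divide_and_conquer(problem, threshold, solve, split, combine):
    # Generic divide-and-conquer skeleton: below the threshold the
    # problem is solved sequentially; above it, sub-problems could be
    # handed to workers on nearby cores (kept sequential here for clarity).
    if len(problem) <= threshold:
        return solve(problem)
    subs = split(problem)
    return combine([divide_and_conquer(s, threshold, solve, split, combine)
                    for s in subs])

# Example: summing a list by recursively splitting it in half.
total = divide_and_conquer(
    list(range(100)),
    threshold=8,
    solve=sum,
    split=lambda xs: [xs[:len(xs) // 2], xs[len(xs) // 2:]],
    combine=sum,
)
print(total)  # 4950
```

The threshold parameter is where architecture awareness enters: an architecture-aware skeleton can pick it, and the placement of sub-problems, based on task size and the communication hierarchy.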
We present a comparison of functional multicore technologies that reports some of the first multicore results for Feedback Directed Implicit Parallelism (FDIP) and for the semi-explicit parallel languages GpH and Eden. The comparison reflects the growing maturity of the field by systematically evaluating four parallel Haskell implementations on a common multicore architecture, contrasting the programming effort each language requires with the parallel performance delivered.
We investigate the minimum thread granularity required to achieve satisfactory performance for three parallel functional language implementations on a multicore platform. The results show that GHC-GUM requires a larger thread granularity than Eden and GHC-SMP, and that the required thread granularity rises as the number of cores rises.
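The granularity effect can be sketched outside Haskell as well. In the hypothetical Python sketch below, chunking items into larger tasks reduces the number of scheduler interactions while computing the same result; the chunk size, worker count, and function names are arbitrary choices for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def process(item):
    return item * item

def run_fine(items, workers=4):
    # Fine-grained: one task per item, so scheduling overhead is paid
    # once per element.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, items))

def run_coarse(items, workers=4, chunk=25):
    # Coarse-grained: one task per chunk of 25 items; the same work is
    # done with far fewer scheduler interactions, i.e. a larger
    # thread granularity.
    chunks = [items[i:i + chunk] for i in range(0, len(items), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: [process(x) for x in c], chunks)
    return [y for part in parts for y in part]

items = list(range(100))
assert run_fine(items) == run_coarse(items)
```

Both versions produce identical results; only the ratio of useful work to per-task overhead changes, which is the quantity the granularity experiments above measure.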
Speeding up Energy System Models - a Best Practice Guide
Background
Energy system models (ESM) are widely used in research and industry to analyze today's and future energy systems and potential pathways for the European energy transition. Current studies address future policy design, the analysis of technology pathways and of future energy systems. To address these questions and support the transformation of today's energy systems, ESM have to increase in complexity to provide valuable quantitative insights for policy makers and industry. Especially when dealing with uncertainty and when integrating large shares of renewable energies, ESM require a detailed implementation of the underlying electricity system. This increased complexity makes the application of ESM more and more difficult, as the models are limited by the computational power available on today's decentralized workstations. Severe simplifications of the models are a common strategy for solving problems in a reasonable amount of time – naturally, this significantly influences the validity of the results and the reliability of the models in general.
Solutions for Energy-System Modelling
Within BEAM-ME, a consortium of researchers from different research fields (system analysis, mathematics, operations research and informatics) develops new strategies to increase the computational performance of energy system models and to adapt energy system models for use on high performance computing clusters. Within the project, an ESM will be applied on two of Germany's fastest supercomputers. To further demonstrate the general applicability of these techniques to ESM, a model experiment is implemented as part of the project, in which up to six energy system models will be used to jointly develop, implement and benchmark speed-up methods. Finally, experiences from the project and the experiment will be collected continually, and the efficient strategies identified, together with general standards for increasing computational performance and for applying ESM on high performance computing, will be documented in a best-practice guide.
Technical Viewpoint of Challenges, Opportunities, and Future Directions of Policy Change and Information-Flow in Digital Healthcare Systems
Source: https://www.thinkmind.org/
Digital healthcare systems often run on heterogeneous devices in a distributed multi-cluster environment, and maintain their healthcare policies for managing data, securing information flow, and controlling interactions among system components. As healthcare systems become more digitally distributed, the lack of integration and of safe interpretation between heterogeneous system clusters becomes problematic and might lead to healthcare policy violations. Communication overhead and high computation consumption might impact the system at different levels and affect the flow of information among system clusters. This paper provides a technical viewpoint of the challenges, opportunities, and future work in digital healthcare systems, focusing on the mechanisms for monitoring, detecting, and recovering from healthcare policy changes/updates and their imprint on information flow.
Strategies for improving efficiency and efficacy of image quality assessment algorithms
Image quality assessment (IQA) research aims to predict the quality of images in a manner that agrees with subjective quality ratings. Over the last several decades, the major impetus in IQA research has focused on improving prediction efficacy globally (across images) for distortion-specific or general distortion types; very few studies have explored local image quality (within images) or IQA algorithms for improved JPEG2000 coding. Even fewer studies have focused on analyzing and improving the runtime performance of IQA algorithms. Moreover, reduced-reference (RR) IQA, in which side information about the original image is received along with the distorted image when the transmission bandwidth is limited, is also a new field to be explored. This report explores these four topics. For local image quality, we provide a local sharpness database, and we analyze the database along with current sharpness metrics. We reveal that humans agree closely when rating the sharpness of small blocks. Overall, this sharpness database is a true representation of human subjective ratings, and current sharpness algorithms can reach an SROCC score of 0.87 on it. For JPEG2000 coding using IQA, we provide a new JPEG2000 image database which includes only images with the same total distortion. Analysis of existing IQA algorithms on this database reveals that even though current algorithms perform reasonably well on JPEG2000-compressed images in popular image-quality databases, they often fail to predict the correct rankings on our database's images. Based on the framework of Most Apparent Distortion (MAD), a new algorithm, MADDWT, is then proposed that uses local DWT coefficient statistics to predict the perceived distortion due to subband quantization. MADDWT outperforms all other algorithms on this database and shows promise for use in JPEG2000 coding.
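For reference, the SROCC score mentioned above is the Spearman rank-order correlation coefficient, i.e. the Pearson correlation of the ranks of two variables. A minimal pure-Python sketch, with hypothetical metric outputs and subjective ratings:

```python
def ranks(values):
    # 1-based rank positions; tied values receive the average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def srocc(x, y):
    # Spearman rank-order correlation: Pearson correlation of the ranks.
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

scores = [1.0, 2.0, 3.0, 4.0]    # hypothetical sharpness metric outputs
ratings = [1.2, 2.1, 2.9, 4.3]   # hypothetical subjective ratings
print(round(srocc(scores, ratings), 3))  # 1.0 (identical rank order)
```

An SROCC of 0.87 thus means the metric's ordering of blocks largely, but not perfectly, matches the ordering implied by the subjective ratings.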
For efficiency of IQA algorithms, this report is the first to examine IQA algorithms from the perspective of their interaction with the underlying hardware and microarchitectural resources, and to perform a systematic performance analysis using state-of-the-art tools and techniques from other computing disciplines. We implemented four popular full-reference IQA algorithms and two no-reference algorithms in C++ based on the code provided by their respective authors. Hotspot and microarchitectural analyses of each algorithm were performed and compared. Despite the fact that all six algorithms share common algorithmic operations (e.g., filterbanks and statistical computations), our results reveal that different IQA algorithms overwhelm different microarchitectural resources and give rise to different types of bottlenecks. For RR IQA, we also provide a new framework based on multiscale sharpness maps, which serve as the reduced information. As we demonstrate, our framework with 2% reduced information can outperform other frameworks that employ from 2% to 3% reduced information. Our framework is also competitive with current state-of-the-art FR algorithms.
Adaptive object management for distributed systems
This thesis describes an architecture supporting the management of pluggable software components and evaluates it against the requirements for an enterprise integration platform for the manufacturing and petrochemical industries. In a distributed environment, we need mechanisms to manage objects and their interactions. At the least, we must be able to create objects in different processes on different nodes; we must be able to link them together so that they can pass messages to each other across the network; and we must deliver their messages in a timely and reliable manner. Object-based environments which support these services already exist, for example ANSAware (ANSA, 1989), DEC's ObjectBroker (ACA, 1992) and Iona's Orbix (Orbix, 1994). Yet such environments provide limited support for composing applications from pluggable components. Pluggability is the ability to install and configure a component into an environment dynamically when the component is used, without specifying static dependencies between components when they are produced. Pluggability is supported to a degree by dynamic binding: components may be programmed to import references to other components and to explore their interfaces at runtime, without using static type dependencies. Yet this overloads the component with the responsibility to explore bindings. What is still generally missing is an efficient general-purpose binding model for managing bindings between independently produced components. In addition, existing environments provide no clear strategy for dealing with fine-grained objects. The overhead of runtime binding and remote messaging will severely reduce performance where there are many objects with complex patterns of interaction. We need an adaptive approach to managing configurations of pluggable components according to the needs and constraints of the environment.
Management is made difficult by embedding bindings in component implementations and by relying on strong typing as the only means of verifying and validating bindings. To solve these problems we have built a set of configuration tools on top of an existing distributed support environment. Specification tools facilitate the construction of independent pluggable components. Visual composition tools facilitate the configuration of components into applications and the verification of composite behaviours. A configuration model is constructed which maintains the environmental state. Adaptive management is made possible by changing the management policy according to this state. Such policy changes affect the location of objects, their bindings, and the choice of messaging system.
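As a loose illustration of the pluggability idea (not of the thesis's actual tools), a component registry with runtime (late) binding can be sketched in Python; all class and component names here are hypothetical:

```python
class ComponentRegistry:
    # Minimal plug-in registry: components are installed by name at
    # runtime, so clients bind to an interface rather than to a
    # statically linked implementation.
    def __init__(self):
        self._components = {}

    def register(self, name, factory):
        self._components[name] = factory

    def create(self, name, *args, **kwargs):
        # Late (dynamic) binding: the implementation is looked up only
        # when the component is actually needed.
        return self._components[name](*args, **kwargs)

class LocalTransport:
    def send(self, msg):
        return f"local:{msg}"

class RemoteTransport:
    def send(self, msg):
        return f"remote:{msg}"

registry = ComponentRegistry()
registry.register("local", LocalTransport)
registry.register("remote", RemoteTransport)

# The caller chooses a binding by configuration, not by static import.
transport = registry.create("local")
print(transport.send("hello"))  # local:hello
```

Note how the binding decision lives in the registry rather than in the component: this is the separation that lets a management policy relocate objects or swap messaging systems without rewriting component implementations.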
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010, Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be an annual meeting to expose and discuss gathered expertise as well as state-of-the-art research around SoC-related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion-transistor era, taking into account emerging techniques and architectures that explore the synergy between flexible on-chip communication and system reconfigurability.