21,843 research outputs found

    Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential

    Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while the energy cost of data movement is increasingly dominant. The understanding and characterization of data locality properties of computations is critical in order to guide efforts to enhance data locality. Reuse distance analysis of memory address traces is a valuable tool to perform data locality characterization of programs. A single reuse distance analysis can be used to estimate the number of cache misses in a fully associative LRU cache of any size, thereby providing estimates on the minimum bandwidth requirements at different levels of the memory hierarchy to avoid being bandwidth bound. However, such an analysis only holds for the particular execution order that produced the trace. It cannot estimate potential improvement in data locality through dependence preserving transformations that change the execution schedule of the operations in the computation. In this article, we develop a novel dynamic analysis approach to characterize the inherent locality properties of a computation and thereby assess the potential for data locality enhancement via dependence preserving transformations. The execution trace of a code is analyzed to extract a computational directed acyclic graph (CDAG) of the data dependences. The CDAG is then partitioned into convex subsets, and the convex partitioning is used to reorder the operations in the execution trace to enhance data locality. The approach enables us to go beyond reuse distance analysis of a single specific order of execution of the operations of a computation in characterization of its data locality properties. It can serve a valuable role in identifying promising code regions for manual transformation, as well as assessing the effectiveness of compiler transformations for data locality enhancement. We demonstrate the effectiveness of the approach using a number of benchmarks, including case studies where the potential shown by the analysis is exploited to achieve lower data movement costs and better performance. Comment: ACM Transactions on Architecture and Code Optimization (2014).
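
    As a concrete illustration of the reuse distance analysis the article builds on, the following minimal Python sketch (not taken from the paper; the function names and toy trace are invented) computes the LRU stack distance of every access in a memory address trace. Because an access to a fully associative LRU cache of capacity C misses exactly when its reuse distance is at least C, a single histogram yields estimated miss counts for every cache size.

        # Reuse distance (LRU stack distance) analysis over an address trace.
        # The reuse distance of an access is the number of distinct addresses
        # touched since the previous access to the same address (inf on first use).
        from collections import Counter
        import math

        def reuse_distance_histogram(trace):
            stack = []                 # most-recently-used address at the end
            hist = Counter()
            for addr in trace:
                if addr in stack:
                    pos = stack.index(addr)
                    dist = len(stack) - pos - 1   # distinct addresses since last use
                    stack.pop(pos)
                else:
                    dist = math.inf               # cold (compulsory) miss
                stack.append(addr)
                hist[dist] += 1
            return hist

        def misses_for_cache_size(hist, capacity):
            # An access misses in a fully associative LRU cache of the given
            # capacity exactly when its reuse distance is >= capacity.
            return sum(n for d, n in hist.items() if d >= capacity)

        if __name__ == "__main__":
            trace = ["a", "b", "c", "a", "b", "c", "d", "a"]
            hist = reuse_distance_histogram(trace)
            for size in (1, 2, 4, 8):
                print(f"cache size {size}: {misses_for_cache_size(hist, size)} misses")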

    A framework for the simulation of structural software evolution

    As functionality is added to an aging piece of software, its original design and structure will tend to erode. This can lead to high coupling, low cohesion and other undesirable effects associated with spaghetti architectures. The underlying forces that cause such degradation have been the subject of much research. However, progress in this field is slow, as its complexity makes it difficult to isolate the causal flows leading to these effects. This is further complicated by the difficulty of generating empirical data in sufficient quantity and attributing such data to specific points in the causal chain. This article describes a framework for simulating the structural evolution of software. A complete simulation model is built by incrementally adding modules to the framework, each of which contributes an individual evolutionary effect. These effects are then combined to form a multifaceted simulation that evolves a fictitious code base in a manner approximating real-world behavior. We describe the underlying principles and structures of our framework from a theoretical and user perspective; a validation of a simple set of evolutionary parameters is then provided, and three empirical software studies generated from open-source software (OSS) are used to support the claims and generated results. The research illustrates how simulation can be used to investigate a complex and under-researched area of the development cycle. It also shows the value of incorporating certain human traits into a simulation: factors that, in real-world system development, can significantly influence evolutionary structures.
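
    To make the plug-in structure of such a framework concrete, here is a hypothetical Python sketch (the class names and the two effects are invented for illustration and are not the article's actual framework): each module contributes one evolutionary effect per step, and the framework composes them into a single simulation over a fictitious code base.

        # Plug-in based simulator of structural software evolution (illustrative).
        import random

        class CodeBase:
            def __init__(self):
                self.deps = {}          # module name -> set of modules it depends on

            def add_module(self, name):
                self.deps.setdefault(name, set())

        class GrowthEffect:
            """Adds a new module each step (functional growth)."""
            def apply(self, code, step):
                code.add_module(f"m{step}")

        class CouplingEffect:
            """Adds a dependency between two random modules (structural erosion)."""
            def apply(self, code, step):
                if len(code.deps) >= 2:
                    a, b = random.sample(list(code.deps), 2)
                    code.deps[a].add(b)

        class Simulator:
            def __init__(self, effects):
                self.effects = effects

            def run(self, steps):
                code = CodeBase()
                for step in range(steps):
                    for effect in self.effects:   # combine the individual effects
                        effect.apply(code, step)
                return code

        if __name__ == "__main__":
            code = Simulator([GrowthEffect(), CouplingEffect()]).run(50)
            fan_out = sum(len(d) for d in code.deps.values()) / len(code.deps)
            print(f"{len(code.deps)} modules, average fan-out {fan_out:.2f}")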

    Static locality analysis for cache management

    Most memory references in numerical codes correspond to array references whose indices are affine functions of the surrounding loop indices. These array references follow a regular, predictable memory pattern that can be analysed at compile time. This analysis can provide valuable information, such as the locality exhibited by the program, which can be used to implement more intelligent caching strategies. In this paper we propose a static locality analysis oriented to the management of data caches. We show that previous proposals on locality analysis are not appropriate when the analysed programs have a high conflict miss ratio, and we build on those proposals by introducing a compile-time interference analysis that significantly improves their performance. We first show how this analysis can be used to characterize the dynamic locality properties of numerical codes. This evaluation shows, for instance, that a large percentage of references exhibit some type of locality. This motivates the use of two cache organizations: a dual data cache, which has a module specialized to exploit temporal locality, and a selective cache. Then, the performance provided by these two cache organizations is evaluated. In both organizations, the static locality analysis is responsible for tagging each memory instruction according to the particular type(s) of locality that it exhibits.
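
    The kind of tagging such an analysis produces can be illustrated with a small hypothetical Python sketch (the stride-based rules and the cache-line size below are simplifications invented for illustration, not the paper's actual algorithm): each affine array reference is classified by the locality it exhibits with respect to the innermost loop.

        # Classify affine array references by the locality they exhibit in the
        # innermost loop: stride 0 reuses the same element (temporal locality),
        # a small stride reuses the same cache line (spatial locality), and a
        # large stride exhibits neither.
        LINE_ELEMS = 8   # assumed number of array elements per cache line

        def classify(stride_in_innermost_loop):
            if stride_in_innermost_loop == 0:
                return "temporal"
            if abs(stride_in_innermost_loop) < LINE_ELEMS:
                return "spatial"
            return "none"

        # Example: matrix multiply with loop order i, k, j (j innermost):
        #   C[i][j] += A[i][k] * B[k][j]
        # gives these per-reference strides with respect to j:
        refs = {"C[i][j]": 1, "A[i][k]": 0, "B[k][j]": 1}
        for ref, stride in refs.items():
            print(ref, "->", classify(stride))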

    CBR and MBR techniques: review for an application in the emergencies domain

    The purpose of this document is to provide an in-depth analysis of current reasoning engine practice and of the integration strategies for Case-Based Reasoning (CBR) and Model-Based Reasoning (MBR) that will be used in the design and development of the RIMSAT system. RIMSAT (Remote Intelligent Management Support and Training) is a European Commission funded project designed to: (a) provide an innovative, 'intelligent', knowledge-based solution aimed at improving the quality of critical decisions, and (b) enhance the competencies and responsiveness of individuals and organisations involved in highly complex, safety-critical incidents, irrespective of their location. In other words, RIMSAT aims to design and implement a decision support system that applies Case-Based Reasoning and Model-Based Reasoning technology to the management of emergency situations. This document is part of a deliverable for the RIMSAT project and, although it has been written in close contact with the requirements of the project, it provides an overview wide enough to serve as a state of the art in integration strategies between CBR and MBR technologies.
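
    As a toy illustration of the CBR side of such a system, the following Python sketch (the feature names, weights and cases are invented, not taken from RIMSAT) implements the retrieve step: a new incident is matched against past cases by a weighted nearest-neighbour similarity, and the closest case's response plan is proposed for reuse.

        # Minimal Case-Based Reasoning "retrieve" step for an emergency-management
        # setting: pick the most similar past case and suggest its response plan.
        CASE_BASE = [
            {"type": "fire", "severity": 4, "indoor": 1, "plan": "evacuate and deploy foam"},
            {"type": "flood", "severity": 2, "indoor": 0, "plan": "sandbag perimeter"},
            {"type": "fire", "severity": 1, "indoor": 0, "plan": "single engine response"},
        ]

        WEIGHTS = {"type": 3.0, "severity": 1.0, "indoor": 0.5}

        def similarity(query, case):
            score = 0.0
            score += WEIGHTS["type"] * (query["type"] == case["type"])
            score += WEIGHTS["severity"] * (1 - abs(query["severity"] - case["severity"]) / 5)
            score += WEIGHTS["indoor"] * (query["indoor"] == case["indoor"])
            return score

        def retrieve(query):
            return max(CASE_BASE, key=lambda case: similarity(query, case))

        if __name__ == "__main__":
            incident = {"type": "fire", "severity": 3, "indoor": 1}
            print("Suggested plan:", retrieve(incident)["plan"])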

    A protocol reconfiguration and optimization system for MPI

    Modern high performance computing (HPC) applications, for example adaptive mesh refinement and multi-physics codes, have dynamic communication characteristics which result in poor performance on current Message Passing Interface (MPI) implementations. The degraded application performance can be attributed to a mismatch between changing application requirements and static communication library functionality. To improve the performance of these applications, MPI libraries should change their protocol functionality in response to changing application requirements, and tailor their functionality to take advantage of hardware capabilities. This dissertation describes the Protocol Reconfiguration and Optimization system for MPI (PRO-MPI), a framework for constructing profile-driven reconfigurable MPI libraries; these libraries use past application characteristics (profiles) to dynamically change their functionality to match the changing application requirements. The framework addresses the challenges of designing and implementing the reconfigurable MPI libraries, which include collecting and reasoning about application characteristics to drive the protocol reconfiguration and defining abstractions required for implementing these reconfigurations. Two prototype reconfigurable MPI implementations based on the framework, Open PRO-MPI and Cactus PRO-MPI, are also presented to demonstrate the utility of the framework. To demonstrate the effectiveness of reconfigurable MPI libraries, this dissertation presents experimental results showing the impact of using these libraries on application performance. The results show that PRO-MPI improves the performance of important HPC applications and benchmarks. They also show that HyperCLaw performance improves by approximately 22% when exact profiles are available, and by approximately 18% when only approximate profiles are available.
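
    The profile-driven idea can be sketched in a few lines of Python (the class and the eager/rendezvous threshold below are illustrative and are not PRO-MPI's actual interface): message sizes recorded in a previous run drive the choice of point-to-point protocol for each destination.

        # Profile-driven protocol selection in the spirit of a reconfigurable MPI
        # library: record message sizes during a profiling run, then pick an
        # eager or rendezvous protocol per destination from that profile.
        from collections import defaultdict

        class ProfiledProtocolSelector:
            def __init__(self, eager_limit=64 * 1024):
                self.eager_limit = eager_limit
                self.sizes = defaultdict(list)      # dest rank -> observed message sizes

            def record(self, dest, nbytes):
                """Collect application characteristics during a profiling run."""
                self.sizes[dest].append(nbytes)

            def protocol_for(self, dest):
                """Reconfigure: choose a protocol from the recorded profile."""
                sizes = self.sizes.get(dest)
                if not sizes:
                    return "rendezvous"             # conservative default
                typical = sorted(sizes)[len(sizes) // 2]
                return "eager" if typical <= self.eager_limit else "rendezvous"

        if __name__ == "__main__":
            sel = ProfiledProtocolSelector()
            for nbytes in (2_048, 4_096, 8_192):
                sel.record(dest=1, nbytes=nbytes)
            sel.record(dest=2, nbytes=8 * 1024 * 1024)
            print(sel.protocol_for(1), sel.protocol_for(2))   # eager rendezvous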

    Compiler analysis for trace-level speculative multithreaded architectures

    Trace-level speculative multithreaded processors exploit trace-level speculation by means of two threads working cooperatively. One thread, called the speculative thread, executes instructions ahead of the other by speculating on the results of several traces. The other thread executes the speculated traces and verifies the speculation made by the first thread. In this paper, we propose a static program analysis for identifying candidate traces to be speculated. This approach identifies large regions of code whose live-output values may be successfully predicted. We present several heuristics to determine the best opportunities for dynamic speculation, based on compiler analysis and program profiling information. Simulation results show that the proposed trace recognition techniques achieve, on average, a speed-up close to 38% for a collection of SPEC2000 benchmarks.
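
    A hypothetical sketch of such a heuristic is shown below (the profile fields and thresholds are invented for illustration and are not the paper's actual heuristics): a trace becomes a speculation candidate when its live-output values were highly predictable during profiling and it is large enough to be worth skipping.

        # Profile-based heuristic for selecting traces to speculate.
        def live_out_predictability(trace_profile):
            # Fraction of profiled executions in which every live-output value
            # matched the value produced by a simple last-value predictor.
            return trace_profile["predicted_ok"] / trace_profile["executions"]

        def candidate_traces(profiles, min_insns=20, min_predictability=0.9):
            picked = []
            for trace_id, prof in profiles.items():
                if prof["insns"] >= min_insns and \
                   live_out_predictability(prof) >= min_predictability:
                    picked.append(trace_id)
            return picked

        if __name__ == "__main__":
            profiles = {
                "loop_body_17": {"insns": 48, "executions": 1000, "predicted_ok": 960},
                "init_block_3": {"insns": 12, "executions": 200, "predicted_ok": 200},
                "hash_update":  {"insns": 80, "executions": 500, "predicted_ok": 110},
            }
            print(candidate_traces(profiles))   # ['loop_body_17']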

    Education alignment

    This essay reviews recent developments in embedding data management and curation skills into information technology, library and information science, and research-based postgraduate courses in various national contexts. The essay also investigates means of joining up formal education with professional development training opportunities more coherently. The potential for using professional internships as a means of improving communication and understanding between disciplines is also explored. A key aim of this essay is to identify what level of complementarity is needed across various disciplines to most effectively and efficiently support the entire data curation lifecycle.

    DRIVER Technology Watch Report

    This report is part of the Discovery Workpackage (WP4) and is the third of four deliverables. The objective of this report is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. This report consists of two main parts: one part focuses on interoperability standards for enhanced publications, the other consists of three subchapters, which give a landscape picture of current and emerging technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.

    Servicing the federation : the case for metadata harvesting

    The paper presents a comparative analysis of data harvesting and distributed computing as complementary models of service delivery within large-scale federated digital libraries. Informed by requirements of flexibility and scalability of federated services, the analysis focuses on the identification and assessment of model invariants. In particular, it abstracts over application domains, services, and protocol implementations. The analytical evidence produced shows that the harvesting model offers stronger guarantees of satisfying the identified requirements. In addition, it suggests a first characterisation of services based on their suitability to either model and thus indicates how they could be integrated in the context of a single federated digital library.
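
    Although the paper deliberately abstracts over protocol implementations, the harvesting model is easiest to picture with OAI-PMH, its most common concrete realisation. The Python sketch below (the endpoint URL is a placeholder) pulls all records from a repository, following resumption tokens, so that a central federation service can index them locally instead of forwarding queries at search time.

        # Minimal OAI-PMH harvester illustrating the metadata-harvesting model.
        import urllib.parse
        import urllib.request
        import xml.etree.ElementTree as ET

        OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

        def harvest(base_url, metadata_prefix="oai_dc"):
            """Yield all <record> elements, following OAI-PMH resumption tokens."""
            params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
            while True:
                url = base_url + "?" + urllib.parse.urlencode(params)
                with urllib.request.urlopen(url) as resp:
                    tree = ET.parse(resp)
                yield from tree.iterfind(".//oai:record", OAI_NS)
                token = tree.find(".//oai:resumptionToken", OAI_NS)
                if token is None or not (token.text or "").strip():
                    break
                params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

        if __name__ == "__main__":
            # Placeholder endpoint; any OAI-PMH compliant repository would do.
            for record in harvest("https://repository.example.org/oai"):
                print(record.findtext(".//oai:identifier", namespaces=OAI_NS))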