
    XARK: an extensible framework for automatic recognition of computational kernels

    This is a post-peer-review, pre-copyedit version of an article published in ACM Transactions on Programming Languages and Systems. The final authenticated version is available online at: http://dx.doi.org/10.1145/1391956.1391959
    [Abstract] The recognition of program constructs that are frequently used by software developers is a powerful mechanism for optimizing and parallelizing compilers to improve the performance of the object code. The development of techniques for automatic recognition of computational kernels such as inductions, reductions and array recurrences was an intensive research area in compiler technology during the 1990s. This article presents a new compiler framework that, unlike previous techniques that focus on specific and isolated kernels, recognizes a comprehensive collection of computational kernels that appear frequently in full-scale real applications. The XARK compiler operates on top of the Gated Single Assignment (GSA) form of a high-level intermediate representation (IR) of the source code. Recognition is carried out through a demand-driven analysis of this high-level IR at two different levels. First, the dependences between the statements that compose the strongly connected components (SCCs) of the data-dependence graph of the GSA form are analyzed. As a result of this intra-SCC analysis, the computational kernels corresponding to the execution of the statements of the SCCs are recognized. Second, the dependences between statements of different SCCs are examined in order to recognize more complex kernels that result from combining simpler kernels in the same code. Overall, the XARK compiler builds a hierarchical representation of the source code as kernels and dependence relationships between those kernels. This article describes in detail the collection of computational kernels recognized by the XARK compiler. In addition, the internals of the recognition algorithms are presented. The design of the algorithms makes it possible to extend the recognition capabilities of XARK to cope with new kernels, and it provides an advanced symbolic analysis framework for running other compiler techniques on demand. Finally, extensive experiments showing the effectiveness of XARK for a collection of benchmarks from different application domains are presented. In particular, the SparsKit-II library for the manipulation of sparse matrices, the Perfect benchmarks, the SPEC CPU2000 collection and the PLTMG package for solving elliptic partial differential equations are analyzed in detail.
    Funding: Ministerio de Educación y Ciencia, TIN2004-07797-C02 and TIN2007-67537-C03; Xunta de Galicia, PGIDIT05PXIC10504PN and PGIDIT06PXIB105228P.
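
    To make the recognized constructs concrete, the loops below sketch three of the kernel classes named in the abstract: an induction, a scalar reduction and an array recurrence. This is an illustrative C fragment written for this listing, not code taken from the article.

        /* Illustrative examples of computational kernels of the kind a
         * recognizer such as XARK is described as detecting. */
        void kernels(int n, const double *b, double *a, double *x)
        {
            /* Induction: j is a linear function of the loop index.
             * (x must hold at least 2*n - 1 elements.) */
            int j = 0;
            for (int i = 0; i < n; i++) {
                x[j] = b[i];
                j = j + 2;
            }

            /* Scalar reduction: sum accumulates a value across iterations. */
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += b[i];

            /* Array recurrence: a[i] depends on the previously computed a[i-1]. */
            a[0] = sum;
            for (int i = 1; i < n; i++)
                a[i] = a[i - 1] + b[i];
        }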

    NoCo: ILP-based worst-case contention estimation for mesh real-time manycores

    Manycores are capable of providing the computational demands required by functionally-advanced critical applications in domains such as automotive and avionics. In manycores, a network-on-chip (NoC) provides access to shared caches and memories and hence concentrates most of the contention that tasks suffer, with effects on the worst-case contention delay (WCD) of packets and on tasks' WCET. While several proposals minimize the impact of individual NoC parameters on WCD, e.g., mapping and routing, there are strong dependences among these NoC parameters. Hence, finding the optimal NoC configuration requires optimizing all parameters simultaneously, which represents a multidimensional optimization problem. In this paper we propose NoCo, a novel approach that combines ILP and stochastic optimization to find NoC configurations in terms of packet routing, application mapping, and arbitration weight allocation. Our results show that NoCo improves on other techniques that optimize only a subset of the NoC parameters.
    Funding: This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-65316-P and the HiPEAC Network of Excellence. It also received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (agreement No. 772773). Carles Hernández is jointly supported by the MINECO and FEDER funds through grant TIN2014-60404-JIN. Jaume Abella has been partially supported by the Spanish Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. Enrico Mezzetti has been partially supported by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva-Incorporación postdoctoral fellowship number IJCI-2016-27396.
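
    As a rough intuition for the quantity being optimized, the sketch below computes a deliberately simplified worst-case contention bound for a packet under weighted arbitration along a fixed route. The model and all names in it are illustrative assumptions made for this listing, not the ILP formulation used in the paper.

        #include <stdio.h>

        /* Simplified contention model: a packet crossing `hops` routers may
         * wait, at each router, for the arbitration slots granted to the
         * competing input ports before its own slot is served. An optimizer
         * like the one described in the abstract would search routes,
         * task-to-core mappings and weight assignments that minimize the
         * largest such bound over all packets. */
        static double wcd_bound(int hops, int ports, const int weights[],
                                int my_port, double slot_latency)
        {
            double wcd = 0.0;
            for (int h = 0; h < hops; h++) {
                int competing = 0;
                for (int p = 0; p < ports; p++)
                    if (p != my_port)
                        competing += weights[p];       /* slots that may precede ours */
                wcd += (competing + 1) * slot_latency; /* waiting time plus own slot  */
            }
            return wcd;
        }

        int main(void)
        {
            const int weights[4] = {2, 1, 1, 1};  /* hypothetical arbitration weights */
            printf("WCD bound: %.1f cycles\n", wcd_bound(3, 4, weights, 0, 2.0));
            return 0;
        }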

    CBR and MBR techniques: review for an application in the emergencies domain

    The purpose of this document is to provide an in-depth analysis of current reasoning engine practice and of the integration strategies for Case-Based Reasoning (CBR) and Model-Based Reasoning (MBR) that will be used in the design and development of the RIMSAT system. RIMSAT (Remote Intelligent Management Support and Training) is a European Commission funded project designed to: (a) provide an innovative, 'intelligent', knowledge-based solution aimed at improving the quality of critical decisions, and (b) enhance the competencies and responsiveness of individuals and organisations involved in highly complex, safety-critical incidents, irrespective of their location. In other words, RIMSAT aims to design and implement a decision support system that applies Case-Based Reasoning and Model-Based Reasoning technology to the management of emergency situations. This document is part of a deliverable of the RIMSAT project, and although it has been written in close contact with the requirements of the project, it provides an overview broad enough to serve as a state of the art in integration strategies between CBR and MBR technologies.
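
    As background for the CBR side, the sketch below shows the retrieve step of the classic CBR cycle (retrieve, reuse, revise, retain) using a weighted nearest-neighbour distance over case attributes. The attributes, weights and names are hypothetical; this is an illustration of the basic mechanism, not the RIMSAT design.

        #include <stdio.h>

        /* A stored case: a numeric description of a past emergency and the
         * identifier of the response plan that was applied to it. */
        typedef struct {
            double attrs[4];
            int    plan;
        } Case;

        /* Retrieve: pick the stored case closest to the query under a
         * weighted squared distance. */
        static int retrieve(const Case *base, int n, const double query[4],
                            const double weights[4])
        {
            int best = -1;
            double best_dist = 0.0;
            for (int i = 0; i < n; i++) {
                double dist = 0.0;
                for (int k = 0; k < 4; k++) {
                    double d = base[i].attrs[k] - query[k];
                    dist += weights[k] * d * d;
                }
                if (best < 0 || dist < best_dist) {
                    best = i;
                    best_dist = dist;
                }
            }
            return best;   /* the caller then reuses/revises base[best].plan */
        }

        int main(void)
        {
            const Case base[] = {
                { {3.0, 1.0, 0.0, 2.0}, 11 },   /* e.g., small chemical spill */
                { {8.0, 4.0, 1.0, 9.0}, 42 },   /* e.g., large urban fire     */
            };
            const double query[4]   = {7.5, 3.0, 1.0, 8.0};
            const double weights[4] = {1.0, 1.0, 2.0, 1.0};
            int i = retrieve(base, 2, query, weights);
            printf("most similar case: %d (plan %d)\n", i, base[i].plan);
            return 0;
        }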

    Automated Failure Explanation Through Execution Comparison

    When fixing a bug in software, developers must build an understanding or explanation of the bug and how the bug flows through a program. The effort that developers must put into building this explanation is costly and laborious. Thus, developers need tools that can assist them in explaining the behavior of bugs. Dynamic slicing is one technique that can effectively show how a bug propagates through an execution up to the point where a program fails. However, dynamic slices are large because they do not just explain the bug itself; they include extra information that explains any observed behavior that might be connected to the bug. Thus, the explanation of the bug is hidden within this other, tangentially related information. This dissertation addresses this problem and shows how a failing execution and a correct execution may be compared in order to construct explanations that include only information about what caused the bug. As a result, these automated explanations are significantly more concise than the explanations produced by existing dynamic slicing techniques. To enable the comparison of executions, we develop new dynamic analysis techniques that identify the commonalities and differences between executions. First, we devise and implement the notion of a point within an execution that may exist across multiple executions. We also note that comparing executions involves comparing the state, that is, the variables and their values, that exists within the executions at different execution points. Thus, we design an approach for identifying the locations of variables in different executions so that their values may be compared. Leveraging these tools, we design a system for identifying the behaviors within an execution that can be blamed for a bug and that together compose an explanation for the bug. These explanations are up to two orders of magnitude smaller than those produced by existing state-of-the-art techniques. We also examine how different choices of a correct execution for comparison can impact the practicality or potential quality of the explanations produced by our system.
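
    The toy program below, written for this listing rather than taken from the dissertation, illustrates the contrast drawn in the abstract: a dynamic slice of the failing result keeps every statement the bad value flowed through, while comparing the failing run with a passing one isolates the branch that actually differs.

        #include <assert.h>

        /* Toy example: the run with x = -1 fails the assertion in main.
         * A dynamic slice of the returned value keeps every statement it
         * flowed through (marked S), including statements that also appear
         * in a passing run; comparing against the passing run with x = 1
         * narrows the explanation to the differing branch (marked D). */
        static int compute(int x)
        {
            int base = 10;            /* S: in the slice, also in the passing run */
            int sign;
            if (x < 0)
                sign = -1;            /* S, D: executed only in the failing run   */
            else
                sign = 1;             /*       executed only in the passing run   */
            return sign * base * 2;   /* S: in the slice, also in the passing run */
        }

        int main(void)
        {
            int r = compute(-1);      /* failing run; compute(1) would pass */
            assert(r > 0);            /* fails here: r == -20               */
            return 0;
        }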

    Automatically generating complex test cases from simple ones

    While source code expresses and implements the design considerations for a software system, test cases capture and represent the domain knowledge of the software developer, her assumptions about the implicit and explicit interaction protocols in the system, and the expected behavior of different modules of the system in normal and exceptional conditions. Moreover, test cases capture information about the environment and the data the system operates on. As such, together with the system source code, test cases integrate important system and domain knowledge. Besides being an important project artifact, test cases account for up to half of the overall software development cost and effort. Software projects produce many test cases of different kinds and granularity to thoroughly check the system functionality, aiming to prevent, detect, and remove different types of faults. Simple test cases exercise small parts of the system, aiming to detect faults in single modules. More complex integration and system test cases exercise larger parts of the system, aiming to detect problems in module interactions and to verify the functionality of the system as a whole. Not surprisingly, test case complexity comes at a cost: developing complex test cases is a laborious and expensive task that is hard to automate. Our intuition is that important information that is naturally present in test cases can be reused to reduce the effort of generating new test cases. This thesis develops this intuition and investigates the phenomenon of information reuse among test cases. We first empirically investigated many test cases from real software projects and demonstrated that test cases of different granularity indeed share code fragments and build upon each other. We then proposed an approach for automatically generating complex test cases by extracting and exploiting information in existing simple ones. In particular, our approach automatically generates integration test cases from unit ones. We implemented our approach in a prototype to evaluate its ability to generate new and useful test cases for real software systems. Our studies show that test cases generated with our approach reveal new interaction faults even in well-tested applications. We evaluated the effectiveness of our approach by comparing it with state-of-the-art test generation techniques. The evaluation results show that our approach is effective: it finds relevant faults not found by the other approaches, which tend to find different and usually less relevant faults.
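
    The toy example below, a hypothetical sketch rather than code from the thesis, illustrates the kind of reuse the approach builds on: an integration test assembled from fragments that already appear in the unit tests of the individual modules.

        #include <assert.h>
        #include <string.h>

        /* Hypothetical module 1: parses "a+b" with single-digit operands. */
        static int parse_expr(const char *text, int expr[3])
        {
            if (strlen(text) != 3 || text[1] != '+')
                return -1;
            expr[0] = text[0] - '0';
            expr[1] = '+';
            expr[2] = text[2] - '0';
            return 0;
        }

        /* Hypothetical module 2: evaluates a parsed expression. */
        static int evaluate(const int expr[3])
        {
            return expr[1] == '+' ? expr[0] + expr[2] : 0;
        }

        /* Unit test exercising the parser in isolation. */
        static void test_parse_expr(void)
        {
            int expr[3];
            assert(parse_expr("2+3", expr) == 0);
            assert(expr[0] == 2 && expr[2] == 3);
        }

        /* Unit test exercising the evaluator in isolation. */
        static void test_evaluate(void)
        {
            const int expr[3] = {2, '+', 3};
            assert(evaluate(expr) == 5);
        }

        /* Integration test composed from fragments of the two unit tests:
         * the parser's output feeds the evaluator, checking their interaction. */
        static void test_parse_then_evaluate(void)
        {
            int expr[3];
            assert(parse_expr("2+3", expr) == 0);   /* fragment from test_parse_expr */
            assert(evaluate(expr) == 5);            /* fragment from test_evaluate   */
        }

        int main(void)
        {
            test_parse_expr();
            test_evaluate();
            test_parse_then_evaluate();
            return 0;
        }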

    Reduction Drawing: Language Constructs and Polyhedral Compilation for Reductions on GPUs

    Reductions are common in scientific and data-crunching codes, and are a typical source of bottlenecks on massively parallel architectures such as GPUs. Reductions are memory-bound, and achieving peak performance involves sophisticated optimizations. Libraries such as CUB and Thrust provide highly tuned implementations of reductions on GPUs. However, library APIs are not flexible enough to express user-defined reductions on arbitrary data types and array indexing schemes. Languages such as OpenACC provide declarative syntax to express reductions, but such approaches support a limited range of reduction operators and do not facilitate the application of complex program transformations in the presence of reductions. We present language constructs that let a programmer express arbitrary reductions on user-defined data types while matching the performance of tuned library implementations. We also extend a polyhedral compilation flow to process these user-defined reductions, enabling optimizations such as the fusion of multiple reductions, combining reductions with other loop transformations, and optimizing data transfers and storage in the presence of reductions. We implemented these language constructs and compilation methods in the PPCG framework and conducted experiments on multiple GPU targets. For single reductions the generated code performs on par with highly tuned libraries, and for multiple reductions it significantly outperforms both libraries and OpenACC on all platforms.
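
    To illustrate what a user-defined reduction on a user-defined data type looks like, the sketch below uses standard OpenMP declare reduction syntax rather than the language constructs proposed in the paper; it computes an argmax, i.e., the maximum value together with its index.

        #include <stdio.h>

        /* User-defined type and combiner for an argmax reduction. */
        typedef struct { double val; int idx; } argmax_t;

        static argmax_t argmax_combine(argmax_t a, argmax_t b)
        {
            return (a.val >= b.val) ? a : b;
        }

        static argmax_t argmax_init(void)
        {
            argmax_t z = { -1.0e300, -1 };   /* neutral element */
            return z;
        }

        /* Teach OpenMP how to combine and initialize private copies. */
        #pragma omp declare reduction(argmax : argmax_t : \
                omp_out = argmax_combine(omp_out, omp_in)) \
                initializer(omp_priv = argmax_init())

        int main(void)
        {
            enum { N = 1000000 };
            static double x[N];
            for (int i = 0; i < N; i++)
                x[i] = (double)((i * 2654435761u) % 1000);   /* arbitrary data */

            argmax_t best = argmax_init();
            #pragma omp parallel for reduction(argmax : best)
            for (int i = 0; i < N; i++) {
                argmax_t cur = { x[i], i };
                best = argmax_combine(best, cur);
            }

            printf("max %.1f at index %d\n", best.val, best.idx);
            return 0;
        }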