657 research outputs found

A general framework to realize an abstract machine as an ILP processor with application to java

Author: WANG HAICHEN
Publication date: 05/05/2007

Ph.D (Doctor of Philosophy)

ScholarBank@NUS

Performance Enhancement of Multicore Processors using Dynamic Load Balancing

Author: Ajay Tiwari
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/11/2014

The introduction of multi-core architectures has opened a new area for researchers, in which dynamic load balancing can be applied to distribute the workload among the cores. A multi-core architecture provides hardware parallelism through multiple cores inside the CPU. Its increased performance and low cost compared with single-core machines attract the High Performance Computing (HPC) community. The paper proposes a user-level dynamic load balancing model for multi-core processors that uses Java multi-threading and the Java I/O framework for I/O operations.
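The general idea of user-level dynamic load balancing can be sketched as follows. This is a minimal illustration in Python, not the paper's Java model: the function name `run_balanced`, the shared-queue design, and the per-worker counters are all assumptions made for the example. Workers pull the next task whenever they become idle, so faster workers naturally take on more work.

```python
import queue
import threading


def run_balanced(tasks, num_workers=4):
    """Run zero-argument callables on worker threads that pull from a shared queue.

    Hypothetical helper for illustration only; it is not taken from the paper.
    Returns (results in task order, number of tasks handled by each worker).
    """
    work = queue.Queue()
    for index, task in enumerate(tasks):
        work.put((index, task))

    results = [None] * len(tasks)
    handled = [0] * num_workers

    def worker(worker_id):
        # Each worker keeps taking tasks until the shared queue is empty,
        # so the load is distributed at run time rather than in advance.
        while True:
            try:
                index, task = work.get_nowait()
            except queue.Empty:
                return
            results[index] = task()
            handled[worker_id] += 1

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, handled


if __name__ == "__main__":
    squares, per_worker = run_balanced([lambda n=n: n * n for n in range(20)], 3)
    print(squares)
    print(per_worker)  # the split between workers varies from run to run
```

Note that in standard CPython the global interpreter lock prevents threads from executing Python bytecode on several cores at once, so this sketch shows only the scheduling pattern; a Java implementation, as the abstract describes, can run such threads in parallel on separate cores.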

International Journal on Recent and Innovation Trends in Computing and Communication

A Survey on Compiler Autotuning using Machine Learning

Author: Ashouri Amir H.
Cavazos John
Killian William
Palermo Gianluca
Silvano Cristina
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/09/2018

Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order of applying optimizations). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grained classification among different approaches and, finally, the influential papers of the field.

Comment: version 5.0 (updated September 2018); preprint version of the article accepted at ACM CSUR 2018 (42 pages). The authors state that the survey will be updated quarterly and invite submissions of newly published papers for subsequent versions. History: received November 2016; revised August 2017; revised February 2018; accepted March 2018.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Fast, Interactive Worst-Case Execution Time Analysis With Back-Annotation

Author: Harmon Trevor
Kim Kwang H.
Kirner Raimund
Klefstad Raymond
Lowry Michael R.
Schoeberl Martin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

For hard real-time systems, static code analysis is needed to derive a safe bound on the worst-case execution time (WCET). Virtually all prior work has focused on the accuracy of WCET analysis without regard to the speed of analysis. The resulting algorithms are often too slow to be integrated into the development cycle, requiring WCET analysis to be postponed until a final verification phase. In this paper we propose interactive WCET analysis as a new method to provide near-instantaneous WCET feedback to the developer during software programming. We show that interactive WCET analysis is feasible using tree-based WCET calculation. The feedback is realized with a plugin for the Java editor jEdit, where the WCET values are back-annotated to the Java source at the statement level. Comparison of this tree-based approach with the implicit path enumeration technique (IPET) shows that tree-based analysis scales better with respect to program size and gives similar WCET values. Index Terms: Real time systems, performance analysis, software performance, software reliability, software algorithms, safety.
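Tree-based WCET calculation can be illustrated with a small sketch. The version below follows the textbook "timing schema" idea (a sequence costs the sum of its parts, a branch costs its condition plus the most expensive alternative, a loop costs its bounded iterations) and is only an assumption about the general technique, not the authors' implementation. The node classes, cycle counts, and loop bound are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Basic:
    cycles: int  # assumed worst-case cost of a straight-line statement


@dataclass
class Seq:
    parts: List["Node"]


@dataclass
class Branch:
    cond: "Node"
    alternatives: List["Node"]


@dataclass
class Loop:
    cond: "Node"
    body: "Node"
    bound: int  # maximum number of iterations, supplied by the developer


Node = Union[Basic, Seq, Branch, Loop]


def wcet(node: Node) -> int:
    """Compute a WCET bound bottom-up over the program's syntax tree."""
    if isinstance(node, Basic):
        return node.cycles
    if isinstance(node, Seq):
        return sum(wcet(part) for part in node.parts)
    if isinstance(node, Branch):
        # Condition is always evaluated; then assume the costliest alternative.
        return wcet(node.cond) + max((wcet(a) for a in node.alternatives), default=0)
    if isinstance(node, Loop):
        # The condition runs once more than the body (the final failing test).
        return (node.bound + 1) * wcet(node.cond) + node.bound * wcet(node.body)
    raise TypeError(f"unknown node type: {type(node).__name__}")


if __name__ == "__main__":
    program = Seq([
        Basic(2),
        Loop(cond=Basic(1),
             body=Branch(cond=Basic(1), alternatives=[Basic(5), Basic(3)]),
             bound=10),
    ])
    print(wcet(program))  # 2 + 11*1 + 10*(1 + 5) = 73
```

Because each node's bound depends only on its children, a change to one statement requires recomputing only the path from that statement to the root, which is one plausible reason a tree-based scheme suits interactive, statement-level feedback.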

CiteSeerX

Online Research Database In Technology

University of Hertfordshire Research Archive

An automated OpenCL FPGA compilation framework targeting a configurable, VLIW chip multiprocessor

Author: Samuel J. Parker
Publication date: 01/01/2015

Modern systems-on-chip augment their baseline CPU with coprocessors and accelerators to increase overall computational capacity and power efficiency, and thus have evolved into heterogeneous systems. Several languages have been developed to enable this paradigm shift, including CUDA and OpenCL. This thesis discusses a unified compilation environment to enable heterogeneous system design through the use of OpenCL and a customised VLIW chip multiprocessor (CMP) architecture, known as the LE1. An LLVM compilation framework was researched and a prototype developed to enable the execution of OpenCL applications on the LE1 CPU. The framework fully automates the compilation flow and supports work-item coalescing to better utilise the CPU cores and alleviate the effects of thread divergence. This thesis discusses in detail both the software stack and target hardware architecture and evaluates the scalability of the proposed framework on a highly precise cycle-accurate simulator. This is achieved through the execution of 12 benchmarks across 240 different machine configurations, as well as further results utilising an incomplete development branch of the compiler. It is shown that the problems generally scale well with the LE1 architecture up to eight cores, at which point the memory system becomes a serious bottleneck. Results demonstrate superlinear performance on certain benchmarks (x9 for the bitonic sort benchmark with 8 dual-issue cores) with further improvements from compiler optimisations (x14 for bitonic with the same configuration).

Loughborough University Institutional Repository

Exploiting managed language semantics to optimize for hardware heterogeneity

Author: Akram Shoaib
Publication date: 01/01/2019

Ghent University Academic Bibliography

Code Generation and Global Optimization Techniques for a Reconfigurable PRAM-NUMA Multicore Architecture

Publication venue: 'Linköping University Electronic Press'

Crossref

Towards a Time-predictable Dual-Issue Microprocessor: The Patmos Approach

Author: Christian W. Probst
Ens De Lyon
Florian Br
Martin Schoeberl
Pascal Schleuniger
Sven Karlsson
Tommy Thorn
Wolfgang Puffitsch
Publication venue: OASICS
Publication date: 01/01/2011

Current processors are optimized for average-case performance, often leading to a high worst-case execution time (WCET). Many architectural features that increase average-case performance are hard to model for WCET analysis. In this paper we present Patmos, a processor optimized for low WCET bounds rather than high average-case performance. Patmos is a dual-issue, statically scheduled RISC processor. The instruction cache is organized as a method cache and the data cache is organized as a split cache in order to simplify the cache WCET analysis. To fill the dual-issue pipeline with enough useful instructions, Patmos relies on a customized compiler. The compiler also plays a central role in optimizing the application for the WCET instead of average-case performance.

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

Online Research Database In Technology

Hal-Diderot