Search CORE

19,183 research outputs found

Parameter dependencies for reusable performance specifications of software components

Author: Koziolek Heiko
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2008
Field of study

To avoid design-related performance problems, model-driven performance prediction methods analyse the response times, throughputs, and resource utilizations of software architectures before and during implementation. This thesis proposes new modeling languages and according model transformations, which allow a reusable description of usage profile dependencies to the performance of software components. Predictions based on this new methods can support performance-related design decisions

KITopen

Directory of Open Access Books (DOAB)

ILP and TLP in Shared Memory Applications: A Limit Study

Author: Fatehi Ehsan
Publication venue
Publication date: 21/09/2015
Field of study

The work in this dissertation explores the limits of Chip-multiprocessors (CMPs) with respect to shared-memory, multi-threaded benchmarks, which will help aid in identifying microarchitectural bottlenecks. This, in turn, will lead to more efficient CMP design. In the first part we introduce DotSim, a trace-driven toolkit designed to explore the limits of instruction and thread-level scaling and identify microarchitectural bottlenecks in multi-threaded applications. DotSim constructs an instruction-level Data Flow Graph (DFG) from each thread in multi-threaded applications, adjusting for inter-thread dependencies. The DFGs dynamically change depending on the microarchitectural constraints applied. Exploiting these DFGs allows for the easy extraction of the performance upper bound. We perform a case study on modeling the upper-bound performance limits of a processor microarchitecture modeled off a AMD Opteron. In the second part, we conduct a limit study simultaneously analyzing the two dominant forms of parallelism exploited by modern computer architectures: Instruction Level Parallelism (ILP) and Thread Level Parallelism (TLP). This study gives insight into the upper bounds of performance that future architectures can achieve. Furthermore, it identifies the bottlenecks of emerging workloads. To the best of our knowledge, our work is the first study that combines the two forms of parallelism into one study with modern applications. We evaluate the PARSEC multithreaded benchmark suite using DotSim. We make several contributions describing the high-level behavior of next-generation applications. For example, we show that these applications contain up to a factor of 929X more ILP than what is currently being extracted from real machines. We then show the effects of breaking the application into increasing numbers of threads (exploiting TLP), instruction window size, realistic branch prediction, realistic memory latency, and thread dependencies on exploitable ILP. Our examination shows that theses benchmarks differ vastly from one another. As a result, we expect that no single, homogeneous, micro-architecture will work optimally for all, arguing for reconfigurable, heterogeneous designs. In the third part of this thesis, we use our novel simulator DotSim to study the benefits of prefetching shared memory within critical sections. In this chapter we calculate the upper bound of performance under our given constraints. Our intent is to provide motivation for new techniques to exploit the potential benefits of reducing latency of shared memory among threads. We conduct an idealized workload characterization study focusing on the data that is truly shared among threads, using a simplified memory model. We explore the degree of shared memory criticality, and characterize the benefits of being able to use latency reducing techniques to reduce execution time and increase ILP. We find that on average true sharing among benchmarks is quite low compared to overall memory accesses on the critical path and overall program. We also find that truly shared memory between threads does not affect the critical path for the majority of benchmarks, and when it does the impact is less than 1%. Therefore, we conclude that it is not worth exploring latency reducing techniques of truly shared memory within critical sections

Texas A&M Repository

BRAHMS: Novel middleware for integrated systems computation

Author: Ben Mitchinson
Brown
Charles Fox
Chavarriaga
Dennett
Djurfeldt
Dominey
Fleischer
Franz
Gewaltig
Girard
Gurney
Howell
Jon Chambers
Kevin Gurney
Mall
Mark Humphries
Martin Pearson
Michel
Mitchinson
Mitchinson
Parnas
Pearson
Pearson
Prescott
Tak-Shing Chan
Tony J. Prescott
Weitzenfeld
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010
Field of study

Biological computational modellers are becoming increasingly interested in building large, eclectic models, including components on many different computational substrates, both biological and non-biological. At the same time, the rise of the philosophy of embodied modelling is generating a need to deploy biological models as controllers for robots in real-world environments. Finally, robotics engineers are beginning to find value in seconding biomimetic control strategies for use on practical robots. Together with the ubiquitous desire to make good on past software development effort, these trends are throwing up new challenges of intellectual and technological integration (for example across scales, across disciplines, and even across time) - challenges that are unmet by existing software frameworks. Here, we outline these challenges in detail, and go on to describe a newly developed software framework, BRAHMS. that meets them. BRAHMS is a tool for integrating computational process modules into a viable, computable system: its generality and flexibility facilitate integration across barriers, such as those described above, in a coherent and effective way. We go on to describe several cases where BRAHMS has been successfully deployed in practical situations. We also show excellent performance in comparison with a monolithic development approach. Additional benefits of developing in the framework include source code self-documentation, automatic coarse-grained parallelisation, cross-language integration, data logging, performance monitoring, and will include dynamic load-balancing and 'pause and continue' execution. BRAHMS is built on the nascent, and similarly general purpose, model markup language, SystemML. This will, in future, also facilitate repeatability and accountability (same answers ten years from now), transparent automatic software distribution, and interfacing with other SystemML tools. (C) 2009 Elsevier Ltd. All rights reserved

Crossref

UWE Bristol Research Repository

White Rose Research Online

Making intelligent systems team players: Overview for designers

Author: Malin Jane T.
Schreckenghost Debra L.
Publication venue
Publication date
Field of study

This report is a guide and companion to the NASA Technical Memorandum 104738, 'Making Intelligent Systems Team Players,' Volumes 1 and 2. The first two volumes of this Technical Memorandum provide comprehensive guidance to designers of intelligent systems for real-time fault management of space systems, with the objective of achieving more effective human interaction. This report provides an analysis of the material discussed in the Technical Memorandum. It clarifies what it means for an intelligent system to be a team player, and how such systems are designed. It identifies significant intelligent system design problems and their impacts on reliability and usability. Where common design practice is not effective in solving these problems, we make recommendations for these situations. In this report, we summarize the main points in the Technical Memorandum and identify where to look for further information

NASA Technical Reports Server