Search CORE

164 research outputs found

Scalability of microkernel-based systems

Author: Uhlig Volkmar
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2005
Field of study

Performance visualization for parallel programs: task-based, object-oriented approach

Author: Kim Jungsun
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/1994
Field of study

Developing and analyzing the performance of concurrent programs on distributed memory concurrent systems is normally a challenging task. Recently, performance visualization gains its importance as a critical tool for programmers. Programmers can have an insight into the development of parallel programs through a performance visualization. Most of the visualization tool to date have been developed for ad-hoc environments in hardware and software, and therefore its lifetime is limited. Since, however, new architectures keep emerging and application domains for distributed memory concurrent computer systems keep growing, the visualization tool should be flexible enough to accommodate unknown future demands of users (eg. new performance perspectives, application-specific views and disparate trace record formats);The Concurrent Object-Oriented ParaGraph (COOPG) is a prototype, general-purpose performance visualization package developed using an object-oriented approach. An object-oriented approach, both in design and implementation, provides a mechanism to build a simple, flexible, effective, and extensible performance visualization tool. The salient features of the COOPG include its flexible adaptability to disparate trace record formats and the incremental extensibility for incorporating user\u27s special-purpose views

Digital Repository @ Iowa State University (ISU)

Recommended from our members

Better Semijoins Using Tuple Bit-Vectors

Author: Li Zhe
Ross Kenneth A.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1994
Field of study

This paper presents the idea of "tuple-bit-vectors" for distributed query processing. Using tuple bit-vectors, a new two-way semijoin operator called 2SJ++ that enhances the semijoin with an essentially "free" backward reduction capability is proposed. We explore in detail the benefits and costs of 2SJ++ compared with other semijoin variants, and its effect on distributed query processing performance. We then focus on one particular distributed query processing algorithm, called the "one-shot" algorithm. We modify the one-shot algorithm by using 2SJ++ and demonstrate the improvements achieved in network transmission cost compared with the original one-shot technique. We use this improvement to demonstrate that equipped with the 2SJ++ technique, one can improve the performance of distributed query processing algorithms significantly without adding much complexity to the algorithms

Columbia University Academic Commons

An evaluation between Bloom Filter join and PERF join in Distributed Query Processing

Author: Pei Ming
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2008
Field of study

Nowadays, with the explosion of information and the telecommunication era\u27s coming, more and more huge applications encourage decentralization of data while accessing data from different sites [HFB00]. The process of retrieving data from different sites called Distributed Query Processing. The objective of distributed query optimization is to find the most cost-effective of executing query across the network [OV99]. Semijoin [BC81] [BG+81] is known as an effective operator to eliminate the tuples of a relation which are not contributive to a query. 2-way semijoin [KR87] is an extended version of semijoin which not only performs forward reduction like traditional semijoin does, but also provides backward reduction always in cost-effective way. Bloom Filter[B70] and PERF [LR95] are 2 filter based techniques which use a bit vector to represent of the original join attributes projection during the data transmission. Compare with generating a bit array with hash function in bloom filter, Perf join is based on the tuples scan order to avoid losing information caused by hash collision. In the thesis, we will apply both bloom filter and pert on 2-way semijoin algorithms to reduce transmission cost of distributed queries. Performance of propose algorithms will compare against each others and IFS (Initial Feasible Solution) through amount of experiments. \u27Keywords:\u27 Distributed Query Processing, Semijoin, Bloom Filter, Perf Join

Scholarship at UWindsor

Doctor of Philosophy

Author: Burtsev Anton
Publication venue: University of Utah
Publication date: 01/05/2013
Field of study

dissertationA modern software system is a composition of parts that are themselves highly complex: operating systems, middleware, libraries, servers, and so on. In principle, compositionality of interfaces means that we can understand any given module independently of the internal workings of other parts. In practice, however, abstractions are leaky, and with every generation, modern software systems grow in complexity. Traditional ways of understanding failures, explaining anomalous executions, and analyzing performance are reaching their limits in the face of emergent behavior, unrepeatability, cross-component execution, software aging, and adversarial changes to the system at run time. Deterministic systems analysis has a potential to change the way we analyze and debug software systems. Recorded once, the execution of the system becomes an independent artifact, which can be analyzed offline. The availability of the complete system state, the guaranteed behavior of re-execution, and the absence of limitations on the run-time complexity of analysis collectively enable the deep, iterative, and automatic exploration of the dynamic properties of the system. This work creates a foundation for making deterministic replay a ubiquitous system analysis tool. It defines design and engineering principles for building fast and practical replay machines capable of capturing complete execution of the entire operating system with an overhead of several percents, on a realistic workload, and with minimal installation costs. To enable an intuitive interface of constructing replay analysis tools, this work implements a powerful virtual machine introspection layer that enables an analysis algorithm to be programmed against the state of the recorded system through familiar terms of source-level variable and type names. To support performance analysis, the replay engine provides a faithful performance model of the original execution during replay

The University of Utah: J. Willard Marriott Digital Library

Development of an Intelligent Monitoring and Control System for a Heterogeneous Numerical Propulsion System Simulation

Author: Afjeh Abdollah A.
Homer Patrick T.
Lewandowski Henry
Reed John A.
Schlichting Richard D.
Publication venue
Publication date
Field of study

The NASA Numerical Propulsion System Simulation (NPSS) project is exploring the use of computer simulation to facilitate the design of new jet engines. Several key issues raised in this research are being examined in an NPSS-related research project: zooming, monitoring and control, and support for heterogeneity. The design of a simulation executive that addresses each of these issues is described. In this work, the strategy of zooming, which allows codes that model at different levels of fidelity to be integrated within a single simulation, is applied to the fan component of a turbofan propulsion system. A prototype monitoring and control system has been designed for this simulation to support experimentation with expert system techniques for active control of the simulation. An interconnection system provides a transparent means of connecting the heterogeneous systems that comprise the prototype

NASA Technical Reports Server

A Uniform Approach to Constraint Satisfaction and Constraint Satisfiability in Deductive Databases

Author: Bry François
Decker Hendrik
Manthey Rainer
Schmidt Joachim W.
Publication venue
Publication date: 01/01/1988
Field of study

Open Access LMU

Proceedings Work-In-Progress Session of the 13th Real-Time and Embedded Technology and Applications Symposium

Author: Lu Chenyang
Publication venue: Washington University Open Scholarship
Publication date: 03/04/2007
Field of study

The Work-In-Progress session of the 13th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS\u2707) presents papers describing contributions both to state of the art and state of the practice in the broad field of real-time and embedded systems. The 17 accepted papers were selected from 19 submissions. This proceedings is also available as Washington University in St. Louis Technical Report WUCSE-2007-17, at http://www.cse.seas.wustl.edu/Research/FileDownload.asp?733. Special thanks go to the General Chairs – Steve Goddard and Steve Liu and Program Chairs - Scott Brandt and Frank Mueller for their support and guidance

Washington University St. Louis: Open Scholarship

Distributed Duplicate Removal

Author: Schlag Sebastian
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2013
Field of study

Ziel der verteilten Duplikaterkennung ist die Identifikation von Elementen, welche mehrfach in einer großen, über mehrere Rechenknoten verteilten Datenmenge vorkommen. Sanders et al. [48] präsentieren einen verteilten Algorithmus, welcher dieses Problem in einer besonders kommunikationseffizienten Art und Weise löst. In einer Vorverarbeitungsphase werden mit Hilfe eines verteilten, platz-effizienten Bloom Filters zunächst möglichst viele distinkte Elemente als solche identifiziert und somit die Gesamtmenge der noch zu betrachtenden Elemente stark reduziert. Da hierbei jedoch auch falsch positive Ergebnisse auftreten, müssen alle als potentiell nicht distinkt erkannten Elemente in einer zweiten Phase noch einmal überprüft werden. Hierzu wird ein klassischer Hash-basierter Algorithmus zur verteilten Duplikaterkennung angewendet. Die vorliegende Arbeit ergänzt die theoretische Analyse durch eine praktische Evaluation. Wir erarbeiten hierzu eine effiziente Implementierung für Shared-Nothing Systeme. Besonders rechenintensive Schritte des Algorithmus werden zusätzlich durch Shared-Memory-Programmierung innerhalb eines Knotens parallelisiert. Die Ergebnisse unserer experimentellen Untersuchung untermauern die durch die Theorie vorhergesagten Vorteile des Algorithmus. Unsere Implementierung ist signifikant schneller als der am besten geeignete klassische Ansatz solange die Eingabedaten zu weniger als 50% aus Duplikaten bestehen. Wird der Algorithmus auf Datensätzen ausgeführt, die zu weniger als 10% aus Duplikaten bestehen, so ist das gesamte Kommunikationsvolumen zudem mehr als eine Größenordnung kleiner als das des klassischen Konkurrenten

KITopen

Summarizing multiprocessor program execution with versatile, microarchitecture-independent snapshots

Author: Barr Kenneth C. (Kenneth Charles), 1978-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2006
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.Includes bibliographical references (p. 131-137).Computer architects rely heavily on software simulation to evaluate, refine, and validate new designs before they are implemented. However, simulation time continues to increase as computers become more complex and multicore designs become more common. This thesis investigates software structures and algorithms for quickly simulating modern cache-coherent multiprocessors by amortizing the time spent to simulate the memory system and branch predictors. The Memory Timestamp Record (MTR) summarizes the directory and cache state of a multiprocessor system in a compact data structure. A single MTR snapshot is versatile enough to reconstruct the microarchitectural state resulting from various coherence protocols and cache organizations. The MTR may be quickly updated by each simulated processor during a fast-forwarding phase and optionally stored off-line for reuse. To fill large branch prediction tables, we introduce Branch Predictor-based Compression (BPC) which compactly stores a branch trace so that it may be used to fill in any branch predictor structure. An entire BPC trace requires less space than single discrete predictor snapshots, and it may be decompressed 3-6x faster than performing functional simulation.by Kenneth C. Barr.Ph.D

DSpace@MIT