Search CORE

15,759 research outputs found

Tarmo: A Framework for Parallelized Bounded Model Checking

Author: Alessandro Cimatti
Antti E. J. Hyvärinen
Antti Eero Johannes Hyvärinen
Armin Biere
Armin Biere
Armin Biere
Daniel Le Berre
Dimosthenis Mpekas
Erika Ábrahám
Hantao Zhang
Jaco van de Pol
Keijo Heljanko
Keijo Heljanko
Lubos Brim
Martin Davis
Matthew W. Moskewicz
Matti Niemenmaa
Niklas Eén
Niklas Eén
Siert Wieringa
Tobias Schubert
William Gropp
Youssef Hamadi
Publication venue: 'Open Publishing Association'
Publication date: 01/12/2009
Field of study

This paper investigates approaches to parallelizing Bounded Model Checking (BMC) for shared memory environments as well as for clusters of workstations. We present a generic framework for parallelized BMC named Tarmo. Our framework can be used with any incremental SAT encoding for BMC but for the results in this paper we use only the current state-of-the-art encoding for full PLTL. Using this encoding allows us to check both safety and liveness properties, contrary to an earlier work on distributing BMC that is limited to safety properties only. Despite our focus on BMC after it has been translated to SAT, existing distributed SAT solvers are not well suited for our application. This is because solving a BMC problem is not solving a set of independent SAT instances but rather involves solving multiple related SAT instances, encoded incrementally, where the satisfiability of each instance corresponds to the existence of a counterexample of a specific length. Our framework includes a generic architecture for a shared clause database that allows easy clause sharing between SAT solver threads solving various such instances. We present extensive experimental results obtained with multiple variants of our Tarmo implementation. Our shared memory variants have a significantly better performance than conventional single threaded approaches, which is a result that many users can benefit from as multi-core and multi-processor technology is widely available. Furthermore we demonstrate that our framework can be deployed in a typical cluster of workstations, where several multi-core machines are connected by a network

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

The Design of a System Architecture for Mobile Multimedia Computers

Author: Havinga Paul Johannes Mattheus
Publication venue: University of Twente
Publication date: 01/01/2000
Field of study

This chapter discusses the system architecture of a portable computer, called Mobile Digital Companion, which provides support for handling multimedia applications energy efficiently. Because battery life is limited and battery weight is an important factor for the size and the weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies

CiteSeerX

University of Twente Research Information

The role of the host in a cooperating mainframe and workstation environment, volumes 1 and 2

Author: Kusmanoff Antone
Martin Nancy L.
Publication venue
Publication date
Field of study

In recent years, advancements made in computer systems have prompted a move from centralized computing based on timesharing a large mainframe computer to distributed computing based on a connected set of engineering workstations. A major factor in this advancement is the increased performance and lower cost of engineering workstations. The shift to distributed computing from centralized computing has led to challenges associated with the residency of application programs within the system. In a combined system of multiple engineering workstations attached to a mainframe host, the question arises as to how does a system designer assign applications between the larger mainframe host and the smaller, yet powerful, workstation. The concepts related to real time data processing are analyzed and systems are displayed which use a host mainframe and a number of engineering workstations interconnected by a local area network. In most cases, distributed systems can be classified as having a single function or multiple functions and as executing programs in real time or nonreal time. In a system of multiple computers, the degree of autonomy of the computers is important; a system with one master control computer generally differs in reliability, performance, and complexity from a system in which all computers share the control. This research is concerned with generating general criteria principles for software residency decisions (host or workstation) for a diverse yet coupled group of users (the clustered workstations) which may need the use of a shared resource (the mainframe) to perform their functions

NASA Technical Reports Server

The role of graphics super-workstations in a supercomputing environment

Author: Levin E.
Publication venue
Publication date
Field of study

A new class of very powerful workstations has recently become available which integrate near supercomputer computational performance with very powerful and high quality graphics capability. These graphics super-workstations are expected to play an increasingly important role in providing an enhanced environment for supercomputer users. Their potential uses include: off-loading the supercomputer (by serving as stand-alone processors, by post-processing of the output of supercomputer calculations, and by distributed or shared processing), scientific visualization (understanding of results, communication of results), and by real time interaction with the supercomputer (to steer an iterative computation, to abort a bad run, or to explore and develop new algorithms)

NASA Technical Reports Server

Proceedings of the NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications

Author: Blackwell Kim
Blasso Len
Lipscomb Ann
Publication venue
Publication date
Field of study

The proceedings of the National Space Science Data Center Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications held July 23 through 25, 1991 at the NASA/Goddard Space Flight Center are presented. The program includes a keynote address, invited technical papers, and selected technical presentations to provide a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include magnetic disk and tape technologies, optical disk and tape, software storage and file management systems, and experiences with the use of a large, distributed storage system. The technical presentations describe integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990's

NASA Technical Reports Server

Distributed systems status and control

Author: Kreidler David
Vickers David
Publication venue
Publication date
Field of study

Concepts are investigated for an automated status and control system for a distributed processing environment. System characteristics, data requirements for health assessment, data acquisition methods, system diagnosis methods and control methods were investigated in an attempt to determine the high-level requirements for a system which can be used to assess the health of a distributed processing system and implement control procedures to maintain an accepted level of health for the system. A potential concept for automated status and control includes the use of expert system techniques to assess the health of the system, detect and diagnose faults, and initiate or recommend actions to correct the faults. Therefore, this research included the investigation of methods by which expert systems were developed for real-time environments and distributed systems. The focus is on the features required by real-time expert systems and the tools available to develop real-time expert systems

NASA Technical Reports Server

Characterizing Deep-Learning I/O Workloads in TensorFlow

Author: Chien Steven W. D.
Herman Pawel
Laure Erwin
Markidis Stefano
Narasimhamurthy Sai
Santos Luis
Sishtla Chaitanya Prasad
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/10/2018
Field of study

The performance of Deep-Learning (DL) computing frameworks rely on the performance of data ingestion and checkpointing. In fact, during the training, a considerable high number of relatively small files are first loaded and pre-processed on CPUs and then moved to accelerator for computation. In addition, checkpointing and restart operations are carried out to allow DL computing frameworks to restart quickly from a checkpoint. Because of this, I/O affects the performance of DL applications. In this work, we characterize the I/O performance and scaling of TensorFlow, an open-source programming framework developed by Google and specifically designed for solving DL problems. To measure TensorFlow I/O performance, we first design a micro-benchmark to measure TensorFlow reads, and then use a TensorFlow mini-application based on AlexNet to measure the performance cost of I/O and checkpointing in TensorFlow. To improve the checkpointing performance, we design and implement a burst buffer. We find that increasing the number of threads increases TensorFlow bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. The use of the tensorFlow prefetcher results in a complete overlap of computation on accelerator and input pipeline on CPU eliminating the effective cost of I/O on the overall performance. The use of a burst buffer to checkpoint to a fast small capacity storage and copy asynchronously the checkpoints to a slower large capacity storage resulted in a performance improvement of 2.6x with respect to checkpointing directly to slower storage on our benchmark environment.Comment: Accepted for publication at pdsw-DISCS 201

arXiv.org e-Print Archive

Crossref