ReSHAPE: A Framework for Dynamic Resizing and Scheduling of Homogeneous Applications in a Parallel Environment
Applications in science and engineering often require huge computational
resources for solving problems within a reasonable time frame. Parallel
supercomputers provide the computational infrastructure for solving such
problems. A traditional application scheduler running on a parallel cluster
only supports static scheduling where the number of processors allocated to an
application remains fixed throughout the lifetime of execution of the job. Due
to the unpredictability in job arrival times and varying resource requirements,
static scheduling can result in idle system resources thereby decreasing the
overall system throughput. In this paper we present a prototype framework
called ReSHAPE, which supports dynamic resizing of parallel MPI applications
executed on distributed memory platforms. The framework includes a scheduler
that supports resizing of applications, an API to enable applications to
interact with the scheduler, and a library that makes resizing viable.
Applications executed using the ReSHAPE scheduler framework can expand to take
advantage of additional free processors or can shrink to accommodate a high
priority application, without getting suspended. In our research, we have
mainly focused on structured applications that have two-dimensional data arrays
distributed across a two-dimensional processor grid. The resize library
includes algorithms for processor selection and processor mapping. Experimental
results show that the ReSHAPE framework can improve individual job turn-around
time and overall system throughput.
Comment: 15 pages, 10 figures, 5 tables. Submitted to International Conference
on Parallel Processing (ICPP'07)
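To make the idea of a two-dimensional array distributed across a two-dimensional processor grid concrete, the following is a minimal sketch of a plain block distribution. It is not ReSHAPE's resize library: the function names (`block_range`, `layout`) and the choice of contiguous blocks are assumptions made for illustration only, and the abstract does not say which distribution or redistribution scheme the framework actually uses.

```python
def block_range(n, parts, idx):
    """Half-open [start, stop) range of the idx-th of `parts` contiguous blocks of n items."""
    base, extra = divmod(n, parts)
    # The first `extra` blocks each take one additional item.
    start = idx * base + min(idx, extra)
    stop = start + base + (1 if idx < extra else 0)
    return start, stop


def layout(shape, grid):
    """Map each processor coordinate (r, c) to the (row range, column range) it owns."""
    rows, cols = shape
    grid_rows, grid_cols = grid
    return {
        (r, c): (block_range(rows, grid_rows, r), block_range(cols, grid_cols, c))
        for r in range(grid_rows)
        for c in range(grid_cols)
    }


# Hypothetical resize: an 8x8 array moves from a 2x2 grid to a 2x4 grid.
before = layout((8, 8), (2, 2))
after = layout((8, 8), (2, 4))
```

Comparing `before` and `after` shows which index ranges each processor owns on either side of an expansion, which is the information a redistribution step would need.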
Characterizing Deep-Learning I/O Workloads in TensorFlow
The performance of Deep-Learning (DL) computing frameworks relies on the
performance of data ingestion and checkpointing. In fact, during training,
a considerably large number of relatively small files are first loaded and
pre-processed on CPUs and then moved to accelerators for computation. In
addition, checkpointing and restart operations are carried out to allow DL
computing frameworks to restart quickly from a checkpoint. Because of this, I/O
affects the performance of DL applications. In this work, we characterize the
I/O performance and scaling of TensorFlow, an open-source programming framework
developed by Google and specifically designed for solving DL problems. To
measure TensorFlow I/O performance, we first design a micro-benchmark to
measure TensorFlow reads, and then use a TensorFlow mini-application based on
AlexNet to measure the performance cost of I/O and checkpointing in TensorFlow.
To improve the checkpointing performance, we design and implement a burst
buffer. We find that increasing the number of threads increases TensorFlow
bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. The use
of the TensorFlow prefetcher results in a complete overlap of computation on
the accelerator and the input pipeline on the CPU, eliminating the effective
cost of I/O on the overall performance. Using a burst buffer to checkpoint to
fast, small-capacity storage and asynchronously copy the checkpoints to slower,
large-capacity storage resulted in a performance improvement of 2.6x with
respect to checkpointing directly to the slower storage on our benchmark
environment.
Comment: Accepted for publication at pdsw-DISCS 201
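The burst-buffer idea described above (write the checkpoint to a fast tier, then copy it to a slower tier in the background) can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' implementation and not a TensorFlow API: the class name `BurstBufferCheckpointer`, its methods, and the use of plain directories to stand in for the two storage tiers are all assumptions.

```python
import shutil
import threading
from pathlib import Path


class BurstBufferCheckpointer:
    """Hypothetical helper: blocking write to a fast tier, background copy to a slow tier."""

    def __init__(self, fast_dir, slow_dir):
        self.fast_dir = Path(fast_dir)
        self.slow_dir = Path(slow_dir)
        self.fast_dir.mkdir(parents=True, exist_ok=True)
        self.slow_dir.mkdir(parents=True, exist_ok=True)
        self._pending = []

    def save(self, name, payload):
        """Write `payload` (bytes) to the fast tier and start copying it to the slow tier."""
        fast_path = self.fast_dir / name
        fast_path.write_bytes(payload)  # the caller only waits for this write
        copier = threading.Thread(
            target=shutil.copy2, args=(fast_path, self.slow_dir / name)
        )
        copier.start()  # the copy to slow storage proceeds in the background
        self._pending.append(copier)
        return fast_path

    def wait(self):
        """Block until every background copy started so far has finished."""
        for copier in self._pending:
            copier.join()
        self._pending.clear()
```

In a training loop one would call `save` at each checkpoint step and `wait` once before exiting, so that the training thread pays only for the write to the fast tier.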
Exploiting parallel computing with limited program changes using a network of microcomputers
Network computing and multiprocessor computers are two discernible trends in parallel processing. The computational behavior of an iterative distributed process in which some subtasks are completed later than others because of an imbalance in computational requirements is of significant interest. The effects of asynchronous processing were studied. A small existing program was converted to perform finite element analysis by distributing substructure analysis over a network of four Apple IIe microcomputers connected to a shared disk, simulating a parallel computer. The substructure analysis uses an iterative, fully stressed, structural resizing procedure. A framework of beams divided into three substructures is used as the finite element model. The effects of asynchronous processing on the convergence of the design variables are determined by not resizing particular substructures on various iterations.
Boosting Multi-Core Reachability Performance with Shared Hash Tables
This paper focuses on data structures for multi-core reachability, which is a
key component in model checking algorithms and other verification methods. A
cornerstone of an efficient solution is the storage of visited states. In
related work, static partitioning of the state space was combined with
thread-local storage and resulted in reasonable speedups, but left open whether
improvements are possible. In this paper, we present a scaling solution for
shared state storage which is based on a lockless hash table implementation.
The solution is specifically designed for the cache architecture of modern
CPUs. Because model checking algorithms impose loose requirements on the hash
table operations, their design can be streamlined substantially compared to
related work on lockless hash tables. Still, an implementation of the hash
table presented here has dozens of sensitive performance parameters (bucket
size, cache line size, data layout, probing sequence, etc.). We analyzed their
impact and compared the resulting speedups with related tools. Our
implementation outperforms two state-of-the-art multi-core model checkers (SPIN
and DiVinE) by a substantial margin, while placing fewer constraints on the
load balancing and search algorithms.
Comment: preliminary report
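One reading of the "loose requirements" mentioned above is that reachability only ever needs to ask whether a state has been seen and, if not, to record it, with no deletions. The sketch below illustrates that single operation on an open-addressing table with linear probing. It is sequential Python and therefore does not demonstrate the lockless, cache-aware design the paper is about; the names `VisitedSet`, `find_or_put` and `reachable` are assumptions for this example.

```python
class VisitedSet:
    """Fixed-capacity open-addressing table supporting only find-or-put (no deletes)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity

    def find_or_put(self, state):
        """Return True if `state` was already stored; otherwise store it and return False."""
        n = len(self.slots)
        start = hash(state) % n
        for offset in range(n):  # linear probing from the home slot
            idx = (start + offset) % n
            if self.slots[idx] is None:
                self.slots[idx] = state
                return False
            if self.slots[idx] == state:
                return True
        raise MemoryError("visited-state table is full")


def reachable(initial, successors, capacity=1024):
    """Count the states reachable from `initial` using the visited set above."""
    visited = VisitedSet(capacity)
    visited.find_or_put(initial)
    frontier = [initial]
    count = 1
    while frontier:
        state = frontier.pop()
        for nxt in successors(state):
            if not visited.find_or_put(nxt):  # newly discovered state
                frontier.append(nxt)
                count += 1
    return count
```

In a multi-core setting, the "store if empty" step is where an atomic compare-and-swap would typically replace the plain assignment; that part is deliberately left out here.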
Web-based user interface prototyping and simulation
PVSio-web is a platform for the simulation and prototyping of user interfaces that has been developed by researchers at Queen Mary University of London. This platform aims to reduce barriers to the use of PVS by users unfamiliar with formal methods.
The main features of PVSio-web focus on creating, opening and saving projects, loading images and creating widget areas over them, and, not least, editing PVS files. Editing is limited to a single file per project, and the platform has no image editing features. PVSio-web can therefore be improved by implementing support for editing multiple files and images (such as cropping and resizing). The interaction areas can also be improved, thereby enhancing the quality of the prototype, by adjusting the precision of the dimensions and positioning of the area relative to the image.
This dissertation describes the improvements made to the editing of files, images and interaction areas in PVSio-web in order to increase its quality and optimize its use in a real environment.
Operating-system support for distributed multimedia
Multimedia applications place new demands upon processors, networks and operating systems. While some network designers, through ATM for example, have considered revolutionary approaches to supporting multimedia, the same cannot be said for operating systems designers. Most work is evolutionary in nature, attempting to identify additional features that can be added to existing systems to support multimedia. Here we describe the Pegasus project's attempt to build an integrated hardware and operating system environment from the ground up specifically targeted towards multimedia.