Efficient Task Replication for Fast Response Times in Parallel Computation

Author: Joshi Gauri
Wang Da
Wornell Gregory
Publication date: 04/04/2014

One typical use case of large-scale distributed computing in data centers is to decompose a computation job into many independent tasks and run them in parallel on different machines, sometimes known as "embarrassingly parallel" computation. For this type of computation, one challenge is that the time each machine takes to execute a task is inherently variable, and the overall response time is constrained by the execution time of the slowest machine. To address this issue, system designers introduce task replication, which sends the same task to multiple machines and obtains the result from the machine that finishes first. While task replication reduces response time, it usually increases resource usage. In this work, we propose a theoretical framework to analyze the trade-off between response time and resource usage. We show that, while in general there is a tension between response time and resource usage, there exist scenarios where replicating tasks judiciously reduces completion time and resource usage simultaneously. Given the execution time distribution for machines, we investigate the conditions for a scheduling policy to achieve the optimal performance trade-off, and propose efficient algorithms to search for optimal or near-optimal scheduling policies. Our analysis gives insights into when and why replication helps, which can be used to guide scheduler design in large-scale distributed computing systems.
Comment: Extended version of the 2-page paper accepted to ACM SIGMETRICS 201
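The trade-off described in this abstract can be illustrated with a small sketch. It is not the authors' framework: it assumes that all replicas of a task start at time 0 and that the remaining replicas are cancelled as soon as the first one finishes, and the function name `job_metrics` and the execution times are hypothetical.

```python
from typing import List, Tuple


def job_metrics(times: List[List[float]]) -> Tuple[float, float]:
    """Return (response_time, machine_time) for one job.

    times[i][j] is the execution time of replica j of task i.
    Assumption (not from the abstract): all replicas of a task start at
    time 0, and the others are cancelled when the first one finishes.
    """
    # A task completes when its fastest replica completes.
    finish = [min(replicas) for replicas in times]
    # The job completes when its slowest task completes.
    response_time = max(finish)
    # Every replica of a task occupies a machine until that task completes.
    machine_time = sum(len(replicas) * t for replicas, t in zip(times, finish))
    return response_time, machine_time


# Hypothetical execution times: the second task lands on a slow machine.
no_replication = [[1.0], [10.0]]
two_replicas = [[1.0, 2.0], [10.0, 1.0]]

print(job_metrics(no_replication))  # (10.0, 11.0)
print(job_metrics(two_replicas))    # (1.0, 4.0)
```

With these made-up numbers, replication lowers both the response time and the total machine time, which is the kind of scenario the abstract says can exist. With other inputs, replication can raise machine time instead.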

arXiv.org e-Print Archive

CiteSeerX

Rethinking State-Machine Replication for Parallelism

Author: Bezerra Carlos Eduardo
Marandi Parisa Jalili
Pedone Fernando
Publication date: 24/11/2013

State-machine replication, a fundamental approach to designing fault-tolerant services, requires commands to be executed in the same order by all replicas. Moreover, command execution must be deterministic: each replica must produce the same output upon executing the same sequence of commands. These requirements usually result in single-threaded replicas, which hinders service performance. This paper introduces Parallel State-Machine Replication (P-SMR), a new approach to parallelism in state-machine replication. P-SMR scales better than previous proposals since no component plays a centralizing role in the execution of independent commands, that is, those that can be executed concurrently, as defined by the service. The paper introduces P-SMR, describes a "commodified architecture" to implement it, and compares its performance to other proposals using a key-value store and a networked file system.
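The two requirements named in this abstract, identical command order and deterministic execution, can be sketched with a toy key-value store. This is not P-SMR itself. The command format, the helper names, and the rule that commands on different keys are independent are all assumptions made for illustration; the abstract says only that independence is defined by the service.

```python
from typing import Dict, List, Tuple

# Hypothetical command format: (operation, key, value).
Command = Tuple[str, str, int]


def apply_command(state: Dict[str, int], cmd: Command) -> None:
    """Deterministically apply one command to a replica's state."""
    op, key, value = cmd
    if op == "put":
        state[key] = value
    elif op == "add":
        state[key] = state.get(key, 0) + value
    else:
        raise ValueError(f"unknown operation: {op}")


def run_replica(log: List[Command]) -> Dict[str, int]:
    """Execute a command log in order, starting from an empty state."""
    state: Dict[str, int] = {}
    for cmd in log:
        apply_command(state, cmd)
    return state


def independent(a: Command, b: Command) -> bool:
    """Assumed service rule: commands on different keys do not conflict."""
    return a[1] != b[1]


log = [("put", "x", 1), ("add", "x", 2), ("put", "y", 5)]
# Two replicas executing the same log reach the same state.
print(run_replica(log) == run_replica(list(log)))  # True
# Moving the independent command on "y" to the front changes nothing.
print(run_replica([log[2], log[0], log[1]]) == run_replica(log))  # True
# Swapping the two dependent commands on "x" changes the result.
print(run_replica([log[1], log[0], log[2]]) == run_replica(log))  # False
```

Commands for which `independent` returns true are the ones a parallel replica could, in principle, execute concurrently without breaking agreement between replicas.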

arXiv.org e-Print Archive

Crossref

Dynamic scheduling in a multi-product manufacturing system

Author: Hassan Adnan
Mohd. Shaharoun Awaluddin
Oktaviandri Muchamad
Publication venue: Universiti Teknologi Malaysia
Publication date: 30/03/2005

To remain competitive in the global marketplace, manufacturing companies need to improve their operational practices. One way to increase competitiveness in manufacturing is to implement a proper scheduling system. This is important to enable job orders to be completed on time, minimize waiting time and maximize utilization of equipment and machinery. The dynamics of real manufacturing systems are very complex in nature. Schedules developed with deterministic algorithms are unable to deal effectively with uncertainties in demand and capacity. Significant differences can be found between planned schedules and actual schedule implementation. This study attempted to develop a scheduling system that is able to react quickly and reliably to accommodate changes in product demand and manufacturing capacity. A case study, a 6 by 6 job shop scheduling problem, was adapted with uncertainty elements added to the data sets. A simulation model was designed and implemented using the ARENA simulation package to generate various job shop scheduling scenarios. Their performance was evaluated using scheduling rules, namely first-in-first-out (FIFO), earliest due date (EDD), and shortest processing time (SPT). An artificial neural network (ANN) model was developed and trained using various scheduling scenarios generated by the ARENA simulation. The experimental results suggest that the ANN scheduling model can provide moderately reliable predictions for limited scenarios when predicting the number of completed jobs, maximum flowtime, average machine utilization, and average queue length. This study has provided a better understanding of the effects of changes in demand and capacity on job shop schedules. Areas for further study include: (i) fine-tuning the proposed ANN scheduling model, (ii) considering a greater variety of job shop environments, and (iii) incorporating an expert system for interpretation of results. The theoretical framework proposed in this study can be used as a basis for further investigation.
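The three scheduling rules named in this abstract can be sketched as sort keys over a job queue. This is a minimal single-machine illustration, not the study's ARENA model or its 6 by 6 job shop: the `Job` fields, the sample data, and the assumption that every job is already waiting at time 0 are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Job:
    name: str
    arrival: float     # time the job entered the queue
    processing: float  # time the machine needs for the job
    due: float         # due date


# Each rule picks the job with the smallest key first.
RULES: Dict[str, Callable[[Job], float]] = {
    "FIFO": lambda job: job.arrival,
    "EDD": lambda job: job.due,
    "SPT": lambda job: job.processing,
}


def sequence(jobs: List[Job], rule: str) -> List[str]:
    """Order the queued jobs according to the named dispatching rule."""
    return [job.name for job in sorted(jobs, key=RULES[rule])]


def total_completion_time(jobs: List[Job], rule: str) -> float:
    """Sum of completion times on one machine, all jobs waiting at time 0."""
    clock = 0.0
    total = 0.0
    for job in sorted(jobs, key=RULES[rule]):
        clock += job.processing
        total += clock
    return total


# Hypothetical queue of three jobs.
jobs = [Job("A", 0.0, 5.0, 20.0), Job("B", 1.0, 2.0, 6.0), Job("C", 2.0, 3.0, 30.0)]
for rule in RULES:
    print(rule, sequence(jobs, rule), total_completion_time(jobs, rule))
```

On this sample queue the three rules produce three different sequences, which is why a simulation study would compare them across many scenarios.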

Universiti Teknologi Malaysia Institutional Repository

MOON: MapReduce On Opportunistic eNvironments

Author: Archuleta Jeremy
Feng Wu-chun
Gardner Mark
Lin Heshan
Ma Xiaosong
Zhang Zhe
Publication date: 01/01/2009

MapReduce offers a flexible programming model for processing and generating large data sets on dedicated resources, where only a small fraction of such resources are ever unavailable at any given time. In contrast, when MapReduce is run on volunteer computing systems, which opportunistically harness idle desktop computers via frameworks like Condor, it performs poorly because of the volatility of the resources, in particular the high rate of node unavailability. Specifically, the data and task replication scheme adopted by existing MapReduce implementations is woefully inadequate for resources with high unavailability. To address this, we propose MOON, short for MapReduce On Opportunistic eNvironments. MOON extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms in order to offer reliable MapReduce services on a hybrid resource architecture, where volunteer computing systems are supplemented by a small set of dedicated nodes. The adaptive task and data scheduling algorithms in MOON distinguish between (1) different types of MapReduce data and (2) different types of node outages in order to strategically place tasks and data on both volatile and dedicated nodes. Our tests demonstrate that MOON can deliver a 3-fold performance improvement to Hadoop in volatile, volunteer computing environments.

Computer Science Technical Reports @Virginia Tech

CSP channels for CAN-bus connected embedded control systems

Author: Broenink Jan F.
Orlic Bojan
Publication venue: STW Technology Foundation
Publication date: 01/01/2002

A closed-loop control system typically contains a multitude of sensors and actuators operated simultaneously, so such systems are parallel and distributed in essence. But when this parallelism is mapped to software, many obstacles arise concerning multithreading communication and synchronization. To overcome this problem, the CT kernel/library based on CSP algebra has been developed. This project (TES.5410) is about developing a communication extension to the CT library to make it applicable in distributed systems. Since the library is tailored for control systems, the properties and requirements of control systems are taken into special consideration. The applicability of existing middleware solutions is examined. Applicable fieldbus protocols are compared in order to determine the most suitable ones, and the CAN fieldbus is chosen as the first fieldbus to be used. A brief overview of CSP and existing CSP-based libraries is given. A middleware architecture is proposed along with a few novel ideas.

University of Twente Research Information

The "MIND" Scalable PIM Architecture

Author: Brodowicz Maciej
Sterling Thomas
Publication date: 01/01/2005

MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high-performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore, with multiple memory/processor nodes on each chip, and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real-time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture.

Caltech Authors