464 research outputs found
Network calculus for parallel processing
In this note, we present preliminary results on the use of "network calculus"
for parallel processing systems, specifically MapReduce.
Parallel discrete event simulation: A shared memory approach
With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
Deterministic Consistency: A Programming Model for Shared Memory Parallelism
The difficulty of developing reliable parallel software is generating
interest in deterministic environments, where a given program and input can
yield only one possible result. Languages or type systems can enforce
determinism in new code, and runtime systems can impose synthetic schedules on
legacy parallel code. To parallelize existing serial code, however, we would
like a programming model that is naturally deterministic without language
restrictions or artificial scheduling. We propose "deterministic consistency",
a parallel programming model as easy to understand as the "parallel assignment"
construct in sequential languages such as Perl and JavaScript, where concurrent
threads always read their inputs before writing shared outputs. DC supports
common data- and task-parallel synchronization abstractions such as fork/join
and barriers, as well as non-hierarchical structures such as producer/consumer
pipelines and futures. A preliminary prototype suggests that software-only
implementations of DC can run applications written for popular parallel
environments such as OpenMP with low (<10%) overhead for some applications. Comment: 7 pages, 3 figures
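The "read inputs before writing shared outputs" rule described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' prototype: the function name `fork_join_step`, the snapshot-and-merge mechanism, and the choice to raise an error on conflicting writes are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def fork_join_step(state, tasks):
    """Run tasks against one snapshot of state, then merge their writes at the join.

    Hypothetical helper: each task receives a read-only snapshot and returns
    a dict of the shared variables it wants to write.
    """
    snapshot = MappingProxyType(dict(state))  # read-only view shared by all tasks
    with ThreadPoolExecutor() as pool:
        # map() yields results in task order, independent of thread scheduling
        write_sets = list(pool.map(lambda task: task(snapshot), tasks))

    merged = dict(state)
    written = set()
    for writes in write_sets:
        conflict = written.intersection(writes)
        if conflict:
            # Assumption: two tasks writing the same variable is reported as an error
            raise ValueError(f"conflicting writes to {sorted(conflict)}")
        merged.update(writes)
        written.update(writes)
    return merged


# Two concurrent tasks swap x and y, like the parallel assignment `x, y = y, x`.
state = {"x": 1, "y": 2}
swapped = fork_join_step(state, [
    lambda s: {"x": s["y"]},
    lambda s: {"y": s["x"]},
])
```

Because both tasks read from the same snapshot, the result does not depend on which thread runs first.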
Efficient Redundancy Techniques in Cloud and Desktop Grid Systems using MAP/G/c-type Queues
Cloud computing continues to prove its flexibility and versatility in providing needed computing capacity to industry, business, and academia. As an important alternative to cloud computing, desktop grids make it possible to utilize the idle computer resources of an enterprise or community by means of a distributed computing system, providing a more secure and controllable environment with lower operational expenses. Further, both cloud computing and desktop grids are meant to optimize limited resources and at the same time to decrease the expected latency for users. The crucial parameter for optimization, both in cloud computing and in desktop grids, is the level of redundancy (replication) for service requests/workunits. In this paper we study the optimal replication policies by considering three variations of Fork-Join systems in the context of a multi-server queueing system with a versatile point process for the arrivals. For services we consider phase type distributions as well as shifted exponential and Weibull distributions. We use both analytical and simulation approaches in our analysis and report some interesting qualitative results.
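To make the role of the redundancy level concrete, the following Python sketch estimates the latency of a single request replicated on r idle servers, where the request completes as soon as the first replica finishes. It is a deliberately simplified illustration and not the paper's model: it ignores queueing and the arrival process entirely, assumes independent shifted exponential service times, and the function names and parameter values are hypothetical.

```python
import random


def simulated_latency(r, shift, rate, trials=20_000, seed=0):
    """Monte Carlo mean of the minimum of r shifted exponential service times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The request is done when the fastest of its r replicas is done
        total += min(shift + rng.expovariate(rate) for _ in range(r))
    return total / trials


def exact_latency(r, shift, rate):
    """Mean of the minimum of r i.i.d. shifted exponentials: shift + 1 / (r * rate)."""
    return shift + 1.0 / (r * rate)


for r in (1, 2, 3):
    print(r, round(simulated_latency(r, 1.0, 2.0), 3), round(exact_latency(r, 1.0, 2.0), 3))
```

In this isolated setting, replication shortens only the exponential part of the service time and never the constant shift, so each additional replica yields a smaller gain. Once replicas also occupy servers that other requests are waiting for, this gain has to be weighed against the extra load.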
Nested Fork-Join Queuing Networks and Their Application to Mobility Airfield Operations Analysis
A single-chain nested fork-join queuing network (FJQN) model of mobility airfield ground processing is proposed. In order to analyze the queuing network model, advances on two fronts are made. First, a general technique for decomposing nested FJQNs with probabilistic forks is proposed, which consists of incorporating feedback loops into the embedded Markov chain of the synchronization station, then using Marie's Method to decompose the network. Numerical studies show this strategy to be effective, with less than two percent relative error in the approximate performance measures in most realistic cases. The second contribution is the identification of a quick, efficient method for solving for the stationary probabilities of the λn/Ck/r/N queue. Unpreconditioned Conjugate Gradient Squared is shown to be the method of choice in the context of decomposition using Marie's Method, thus broadening the class of networks where the method is of practical use. The mobility airfield model is analyzed using the strategies described above, and accurate approximations of airfield performance measures are obtained in a fraction of the time needed for a simulation study. The proposed airfield modeling approach is especially effective for quick-look studies and sensitivity analysis.
A Simple, Practical Prioritization Scheme for a Job Shop Processing Multiple Job Types
The maintenance, repair, and overhaul (MRO) process is used to recondition equipment in the railroad, off-shore drilling, aircraft, and shipping industries. In the typical MRO process, the equipment is disassembled into component parts and these parts are routed to back-shops for repair. Repaired parts are returned for reassembling the equipment. Scheduling the back-shop for smooth flow often requires prioritizing the repair of component parts from different original assemblies at different machines. To enable such prioritization, we model the back-shop as a multi-class queueing network with a ConWIP execution system and introduce a new priority scheme to maximize the system performance. In this scheme, we identify the bottleneck machine based on overall workload and classify machines into two categories: the bottleneck machine and the non-bottleneck machine(s). Assemblies with the lowest cycle time receive the highest priority on the bottleneck machine and the lowest priority on non-bottleneck machine(s). Our experimental results show that this priority scheme increases the system performance by lowering the average cycle times without adversely impacting the total throughput.
The contribution of this thesis consists primarily of three parts. First, we develop a simple priority scheme for multi-class, multi-server, ConWIP queueing systems with the disassembly/reassembly feature, so that schedulers in a job-shop environment know which part should be given priority, in what order, and where. Next, we provide an exact analytical solution to a two-class, two-server closed queueing model with a mixed non-preemptive priority scheme. The queueing network model we study has not been analyzed in the literature, and there are no existing models that address the underlying problem of deciding prioritization by job types to maximize the system performance. Finally, we explore conditions under which the non-preemptive priority discipline can be approximated by a preemptive priority discipline.
Hierarchical Analyses Applied to Computer System Performance: Review and Call for Further Studies
We review studies based on analytic and simulation methods for hierarchical
performance analysis of Queueing Network - QN models, which result in an order
of magnitude reduction in performance evaluation cost with respect to
simulation. The computational cost at the lower level is reduced when the
computer system is modeled as a product-form QN. A Continuous Time Markov Chain
- CTMC or discrete-event simulation can then be used at the higher level. We
first consider a multiprogrammed transaction - txn processing system with
Poisson arrivals and predeclared lock requests. Txn throughputs obtained by
the analysis of multiprogrammed computer systems serve as the transition rates
in a higher level CTMC to determine txn response times. We next analyze a task
system where task precedence relationships are specified by a directed acyclic
graph to determine its makespan. Task service demands are specified on the
devices of a computer system. The composition of tasks in execution determines
txn throughputs, which serve as transition rates among the states of the higher
level CTMC model. As a third example we consider the hierarchical simulation of
a timesharing system with two user classes. Txn throughputs in processing
various combinations of requests are obtained by analyzing a closed
product-form QN model. A discrete event simulator is provided. More detailed QN
modeling parameters, such as the distribution of the number of cycles in a
central server model - CSM, affect the performance of a fork/join queueing
system. This detail can be taken into account in Schwetman's hybrid simulation
method, which counts remaining cycles in CSM. We propose an extension to hybrid
simulation to adjust job service demands according to elapsed time, rather than
counting cycles. An example where Equilibrium Point Analysis is used to reduce
computational cost is provided.
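The lower level of such a hierarchical analysis can be sketched with exact Mean Value Analysis (MVA), a standard algorithm for closed product-form QNs. The Python sketch below is an illustration under stated assumptions, not code from the reviewed studies: it assumes a single job class and fixed-rate (load-independent) queueing stations, and the function name and service demands are hypothetical.

```python
def mva_throughputs(demands, max_jobs, think_time=0.0):
    """Exact single-class MVA for a closed product-form queueing network.

    demands[k] is the total service demand of one job at station k.
    Returns [X(1), ..., X(max_jobs)], the throughput at each population level.
    """
    queue = [0.0] * len(demands)  # mean queue lengths with zero jobs
    throughputs = []
    for n in range(1, max_jobs + 1):
        # Arrival theorem: an arriving job sees the network with n - 1 jobs
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        x = n / (think_time + sum(residence))
        queue = [x * r for r in residence]  # Little's law per station
        throughputs.append(x)
    return throughputs


# Hypothetical demands in seconds per job at a CPU and two disks
rates = mva_throughputs([0.05, 0.08, 0.04], max_jobs=5)
```

The resulting X(n) values are the kind of population-dependent throughputs that can serve as transition rates in a higher-level CTMC or drive a higher-level simulation.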