Computationally Efficient Simulation of Queues: The R Package queuecomputer
Large networks of queueing systems model important real-world systems such as
MapReduce clusters, web-servers, hospitals, call centers and airport passenger
terminals. To model such systems accurately, we must infer queueing parameters
from data. Unfortunately, for many queueing networks there is no clear way to
proceed with parameter inference from data. Approximate Bayesian computation
could offer a straightforward way to infer parameters for such networks if we
could simulate data quickly enough.
We present a computationally efficient method for simulating from a very
general set of queueing networks with the R package queuecomputer. Remarkable
speedups of more than 2 orders of magnitude are observed relative to the
popular DES packages simmer and simpy. We replicate output from these packages
to validate the package.
The package is modular and integrates well with the popular R package dplyr.
Complex queueing networks with tandem, parallel and fork/join topologies can
easily be built with these two packages together. We show how to use this
package with two examples: a call center and an airport terminal.
Comment: Updated for queuecomputer_0.8.
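To make concrete what simulating a queue involves, the following is a minimal Python sketch of departure-time computation for a first-come-first-served queue with several identical servers. It is an illustration only and is not taken from queuecomputer, simmer or simpy; the function name `departure_times` and its parameters are hypothetical, and it assumes arrival times are already sorted.

```python
import heapq

def departure_times(arrivals, services, servers=1):
    """Departure times for a FIFO queue with `servers` identical servers.

    Assumes `arrivals` is sorted in ascending order and that
    `services[i]` is the service time of the i-th arriving customer.
    """
    # Min-heap holding the time at which each server next becomes free.
    free_at = [0.0] * servers
    heapq.heapify(free_at)
    departures = []
    for arrival, service in zip(arrivals, services):
        earliest_free = heapq.heappop(free_at)
        # Service starts once the customer has arrived and a server is free.
        start = max(arrival, earliest_free)
        finish = start + service
        heapq.heappush(free_at, finish)
        departures.append(finish)
    return departures
```

With one server this reduces to the familiar recursion in which each departure is the later of the customer's arrival and the previous departure, plus the service time. Tandem networks can then be sketched by feeding the sorted departures of one queue in as the arrivals of the next.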
Low-Impact Profiling of Streaming, Heterogeneous Applications
Computer engineers are continually faced with the task of translating improvements in fabrication process technology (i.e., Moore's Law) into architectures that allow computer scientists to accelerate application performance. As feature size continues to shrink, architects of commodity processors are placing ever more cores on a chip. While additional cores can operate independently on some tasks (e.g., the OS and user tasks), many applications see little to no improvement from adding more processor cores alone. For many applications, heterogeneous systems offer a path toward higher performance. Significant performance and power gains have been realized by combining specialized processors (e.g., Field-Programmable Gate Arrays, Graphics Processing Units) with general-purpose multi-core processors. Heterogeneous applications need to be programmed differently than traditional software. One approach, stream processing, fits these systems particularly well because of the segmented memories and explicit expression of parallelism. Unfortunately, debugging and performance tools that support streaming, heterogeneous applications do not exist. This dissertation presents TimeTrial, a performance measurement system that enables performance optimization of streaming applications by profiling the application deployed on a heterogeneous system. TimeTrial performs low-impact measurements by dedicating computing resources to monitoring and by aggressively compressing performance traces into statistical summaries guided by user specification of the performance queries of interest.
Adaptive heterogeneous parallelism for semi-empirical lattice dynamics in computational materials science.
With the variability in performance of the multitude of parallel environments available today, the conceptual overhead created by the need to anticipate runtime information to make design-time decisions has become overwhelming. Performance-critical applications and libraries carry implicit assumptions based on incidental metrics that are not portable to emerging computational platforms or even alternative contemporary architectures. Furthermore, the significance of runtime concerns such as makespan, energy efficiency and fault tolerance depends on the situational context. This thesis presents a case study in the application of both Mattson's prescriptive pattern-oriented approach and the more principled structured parallelism formalism to the computational simulation of inelastic neutron scattering spectra on hybrid CPU/GPU platforms. The original ad hoc implementation as well as new pattern-based and structured implementations are evaluated for relative performance and scalability. Two new structural abstractions are introduced to facilitate adaptation by lazy optimisation and runtime feedback. A deferred-choice abstraction represents a unified space of alternative structural program variants, allowing static adaptation through model-specific exhaustive calibration with regard to the extrafunctional concerns of runtime, average instantaneous power and total energy usage. Instrumented queues serve as a mechanism for structural composition and provide a representation of extrafunctional state that allows realisation of a market-based decentralised coordination heuristic for competitive resource allocation and the Lyapunov drift algorithm for cooperative scheduling.
Sprinklers: A Randomized Variable-Size Striping Approach to Reordering-Free Load-Balanced Switching
Internet traffic continues to grow exponentially, calling for switches that
can scale well in both size and speed. While load-balanced switches can achieve
such scalability, they suffer from a fundamental packet reordering problem.
Existing proposals either suffer from poor worst-case packet delays or require
sophisticated matching mechanisms. In this paper, we propose a new family of
stable load-balanced switches called "Sprinklers" whose implementation cost
and performance are comparable to those of the baseline load-balanced switch,
yet which can guarantee packet ordering. The main idea is to force all packets within
the same virtual output queue (VOQ) to traverse the same "fat path" through the
switch, so that packet reordering cannot occur. At the core of Sprinklers are
two key innovations: a randomized way to determine the "fat path" for each VOQ,
and a way to determine its "fatness" roughly in proportion to the rate of the
VOQ. These innovations enable Sprinklers to achieve near-perfect load-balancing
under arbitrary admissible traffic. Proving this property rigorously using
novel worst-case large deviation techniques is another key contribution of this
work.
A Robust Aggregation Approach To Simplification Of Manufacturing Flow Line Models
One of the more difficult tasks facing a modeler in developing a simulation model of a discrete part manufacturing system is deciding at what level of abstraction to represent the resources of the system. For example, questions about plant capacity can be modeled with a simple model, whereas questions regarding the efficiency of different part scheduling rules can only be answered with a more detailed model. In developing a simulation model, most of the actual features of the system under study must be ignored and an abstraction must be developed. If done correctly, this idealization provides a useful approximation of the real system. Unfortunately, many individuals claim that the process of building a simulation model is an “intuitive art.” The objective of this research is to introduce aspects of “science” to model development by defining quantitative techniques for developing an aggregate simulation model for estimating part cycle time of a manufacturing flow line. The methodology integrates aspects of queueing theory, a recursive algorithm, and simulation to develop the specifications necessary for combining resources of a flow line into a reduced set of aggregation resources. Experimentation shows that developing a simulation model with the aggregation resources results in accurate interval estimates of the average part cycle time.
A Review of Models of Urban Traffic Networks (With Particular Reference to the Requirements for Modelling Dynamic Route Guidance Systems)
This paper reviews a number of existing models of urban traffic networks developed in Europe and North America. The primary intention is to evaluate the various models with regard to their suitability to simulate traffic conditions and driver behavior when a dynamic route guidance system is in operation.
Collaborative Uploading in Heterogeneous Networks: Optimal and Adaptive Strategies
Collaborative uploading describes a type of crowdsourcing scenario in
networked environments where a device utilizes multiple paths over neighboring
devices to upload content to a centralized processing entity such as a cloud
service. Intermediate devices may aggregate and preprocess this data stream.
Such scenarios arise in the composition and aggregation of information, e.g.,
from smartphones or sensors. We use a queueing-theoretic description of the
collaborative uploading scenario, capturing the ability to split data into
chunks that are then transmitted over multiple paths, and finally merged at the
destination. We analyze replication and allocation strategies that control the
mapping of data to paths and provide closed-form expressions that pinpoint the
optimal strategy given a description of the paths' service distributions.
Finally, we provide an online path-aware adaptation of the allocation strategy
that uses statistical inference to sequentially minimize the expected waiting
time for the uploaded data. Numerical results show the effectiveness of the
adaptive approach compared to the proportional allocation and a variant of the
join-the-shortest-queue allocation, especially for bursty path conditions.
Comment: 15 pages, 11 figures, extended version of a conference paper accepted
for publication in the Proceedings of the IEEE International Conference on
Computer Communications (INFOCOM), 201
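As a rough illustration of splitting data into chunks across multiple paths and merging at the destination, the Python sketch below allocates chunks in proportion to assumed per-path service rates and computes when the merged upload completes. It is a deterministic toy under stated assumptions, not the paper's model or its closed-form results: the names `proportional_allocation` and `completion_time`, the fixed rates, and the largest-remainder rounding are all hypothetical choices made for illustration.

```python
def proportional_allocation(n_chunks, rates):
    """Split n_chunks across paths in proportion to their service rates.

    Rounds with the largest-remainder method so the counts sum to n_chunks.
    """
    total = sum(rates)
    shares = [n_chunks * r / total for r in rates]
    alloc = [int(share) for share in shares]
    leftover = n_chunks - sum(alloc)
    # Hand the remaining chunks to the paths with the largest fractional parts.
    order = sorted(range(len(rates)),
                   key=lambda i: shares[i] - alloc[i],
                   reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc

def completion_time(alloc, rates):
    """Time until all chunks are merged: the slowest path determines it.

    Assumes each path serves its chunks at a constant rate (chunks per
    unit time), which ignores the randomness a queueing model captures.
    """
    return max(chunks / rate for chunks, rate in zip(alloc, rates))
```

For example, `proportional_allocation(8, [1, 3])` assigns 2 chunks to the slower path and 6 to the faster one, so both paths finish at the same time under the constant-rate assumption.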