1,804 research outputs found

    Computationally Efficient Simulation of Queues: The R Package queuecomputer

    Get PDF
    Large networks of queueing systems model important real-world systems such as MapReduce clusters, web-servers, hospitals, call centers and airport passenger terminals. To model such systems accurately, we must infer queueing parameters from data. Unfortunately, for many queueing networks there is no clear way to proceed with parameter inference from data. Approximate Bayesian computation could offer a straightforward way to infer parameters for such networks if we could simulate data quickly enough. We present a computationally efficient method for simulating from a very general set of queueing networks with the R package queuecomputer. Remarkable speedups of more than 2 orders of magnitude are observed relative to the popular DES packages simmer and simpy. We replicate output from these packages to validate the package. The package is modular and integrates well with the popular R package dplyr. Complex queueing networks with tandem, parallel and fork/join topologies can easily be built with these two packages together. We show how to use this package with two examples: a call center and an airport terminal.Comment: Updated for queuecomputer_0.8.

    Low-Impact Profiling of Streaming, Heterogeneous Applications

    Get PDF
    Computer engineers are continually faced with the task of translating improvements in fabrication process technology: i.e., Moore\u27s Law) into architectures that allow computer scientists to accelerate application performance. As feature-size continues to shrink, architects of commodity processors are designing increasingly more cores on a chip. While additional cores can operate independently with some tasks: e.g. the OS and user tasks), many applications see little to no improvement from adding more processor cores alone. For many applications, heterogeneous systems offer a path toward higher performance. Significant performance and power gains have been realized by combining specialized processors: e.g., Field-Programmable Gate Arrays, Graphics Processing Units) with general purpose multi-core processors. Heterogeneous applications need to be programmed differently than traditional software. One approach, stream processing, fits these systems particularly well because of the segmented memories and explicit expression of parallelism. Unfortunately, debugging and performance tools that support streaming, heterogeneous applications do not exist. This dissertation presents TimeTrial, a performance measurement system that enables performance optimization of streaming applications by profiling the application deployed on a heterogeneous system. TimeTrial performs low-impact measurements by dedicating computing resources to monitoring and by aggressively compressing performance traces into statistical summaries guided by user specification of the performance queries of interest

    On packet switch design

    Get PDF

    Adaptive heterogeneous parallelism for semi-empirical lattice dynamics in computational materials science.

    Get PDF
    With the variability in performance of the multitude of parallel environments available today, the conceptual overhead created by the need to anticipate runtime information to make design-time decisions has become overwhelming. Performance-critical applications and libraries carry implicit assumptions based on incidental metrics that are not portable to emerging computational platforms or even alternative contemporary architectures. Furthermore, the significance of runtime concerns such as makespan, energy efficiency and fault tolerance depends on the situational context. This thesis presents a case study in the application of both Mattsons prescriptive pattern-oriented approach and the more principled structured parallelism formalism to the computational simulation of inelastic neutron scattering spectra on hybrid CPU/GPU platforms. The original ad hoc implementation as well as new patternbased and structured implementations are evaluated for relative performance and scalability. Two new structural abstractions are introduced to facilitate adaptation by lazy optimisation and runtime feedback. A deferred-choice abstraction represents a unified space of alternative structural program variants, allowing static adaptation through model-specific exhaustive calibration with regards to the extrafunctional concerns of runtime, average instantaneous power and total energy usage. Instrumented queues serve as mechanism for structural composition and provide a representation of extrafunctional state that allows realisation of a market-based decentralised coordination heuristic for competitive resource allocation and the Lyapunov drift algorithm for cooperative scheduling

    Sprinklers: A Randomized Variable-Size Striping Approach to Reordering-Free Load-Balanced Switching

    Full text link
    Internet traffic continues to grow exponentially, calling for switches that can scale well in both size and speed. While load-balanced switches can achieve such scalability, they suffer from a fundamental packet reordering problem. Existing proposals either suffer from poor worst-case packet delays or require sophisticated matching mechanisms. In this paper, we propose a new family of stable load-balanced switches called "Sprinklers" that has comparable implementation cost and performance as the baseline load-balanced switch, but yet can guarantee packet ordering. The main idea is to force all packets within the same virtual output queue (VOQ) to traverse the same "fat path" through the switch, so that packet reordering cannot occur. At the core of Sprinklers are two key innovations: a randomized way to determine the "fat path" for each VOQ, and a way to determine its "fatness" roughly in proportion to the rate of the VOQ. These innovations enable Sprinklers to achieve near-perfect load-balancing under arbitrary admissible traffic. Proving this property rigorously using novel worst-case large deviation techniques is another key contribution of this work

    A Robust Aggregation Approach To Simplification Of Manufacturing Flow Line Models

    Get PDF
    One of the more difficult tasks facing a modeler in developing a simulation model of a discrete part manufacturing system is deciding at what level of abstraction to represent the resources of the system. For example, questions about plant capacity can be modeled with a simple model, whereas questions regarding the efficiency of different part scheduling rules can only be answered with a more detailed model. In developing a simulation model, most of the actual features of the system under study must be ignored and an abstraction must be developed. If done correctly, this idealization provides a useful approximation of the real system. Unfortunately, many individuals claim that the process of building a simulation model is an “intuitive art.” The objective of this research is to introduce aspects of “science” to model development by defining quantitative techniques for developing an aggregate simulation model for estimating part cycle time of a manufacturing flow line. The methodology integrates aspects of queueing theory, a recursive algorithm, and simulation to develop the specifications necessary for combining resources of a flow line into a reduced set of aggregation resources. Experimentation shows that developing a simulation model with the aggregation resources results in accurate interval estimates of the average part cycle time

    A Review of Models of Urban Traffic Networks (With Particular reference to the Requirements for Modelling Dynamic Route Guidance Systems)

    Get PDF
    This paper reviews a number of existing models of urban traffic networks developed in Europe and North America. The primary intention is to evaluate the various models with regard to their suitability to simulate traffic conditions and driver behavior when a dynamic route guidance system is in operation

    Collaborative Uploading in Heterogeneous Networks: Optimal and Adaptive Strategies

    Full text link
    Collaborative uploading describes a type of crowdsourcing scenario in networked environments where a device utilizes multiple paths over neighboring devices to upload content to a centralized processing entity such as a cloud service. Intermediate devices may aggregate and preprocess this data stream. Such scenarios arise in the composition and aggregation of information, e.g., from smartphones or sensors. We use a queuing theoretic description of the collaborative uploading scenario, capturing the ability to split data into chunks that are then transmitted over multiple paths, and finally merged at the destination. We analyze replication and allocation strategies that control the mapping of data to paths and provide closed-form expressions that pinpoint the optimal strategy given a description of the paths' service distributions. Finally, we provide an online path-aware adaptation of the allocation strategy that uses statistical inference to sequentially minimize the expected waiting time for the uploaded data. Numerical results show the effectiveness of the adaptive approach compared to the proportional allocation and a variant of the join-the-shortest-queue allocation, especially for bursty path conditions.Comment: 15 pages, 11 figures, extended version of a conference paper accepted for publication in the Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), 201
    corecore