6,831 research outputs found

    SQPR: Stream Query Planning with Reuse

    Get PDF
    When users submit new queries to a distributed stream processing system (DSPS), a query planner must allocate physical resources, such as CPU cores, memory and network bandwidth, from a set of hosts to queries. Allocation decisions must provide the correct mix of resources required by queries, while achieving an efficient overall allocation to scale in the number of admitted queries. By exploiting overlap between queries and reusing partial results, a query planner can conserve resources but has to carry out more complex planning decisions. In this paper, we describe SQPR, a query planner that targets DSPSs in data centre environments with heterogeneous resources. SQPR models query admission, allocation and reuse as a single constrained optimisation problem and solves an approximate version to achieve scalability. It prevents individual resources from becoming bottlenecks by re-planning past allocation decisions and supports different allocation objectives. As our experimental evaluation in comparison with a state-of-the-art planner shows SQPR makes efficient resource allocation decisions, even with a high utilisation of resources, with acceptable overheads

    Influences on Throughput and Latency in Stream Programs

    Get PDF
    Vu Thien Nga Nguyen and Raimund Kirner, 'Influences on Throughput and Latency in Stream Programs' paper presented at the 2nd Workshop on Feedback-Directed Compiler Optimization for Multi-Core Architectures. Berlin, Germany. 22 January 2013Stream programming is a promising approach to execute programs on parallel hardware such as multi-core systems. It allows to reuse sequential code at component level and to extend such code with concurrency-handling at the communication level. In this paper we investigate in the performance of stream programs in terms of throughput and latency. We identify factors that affect these performance metrics and propose an efficient scheduling approach to obtain the maximal performance

    Working with OpenCL to Speed Up a Genetic Programming Financial Forecasting Algorithm: Initial Results

    Get PDF
    The genetic programming tool EDDIE has been shown to be a successful financial forecasting tool, however it has suffered from an increase in execution time as new features have been added. Speed is an important aspect in financial problems, especially in the field of algorithmic trading, where a delay in taking a decision could cost millions. To offset this performance loss, EDDIE has been modified to take advantage of multi-core CPUs and dedicated GPUs. This has been achieved by modifying the candidate solution evaluation to use an OpenCL kernel, allowing the parallel evaluation of solutions. Our computational results have shown improvements in the running time of EDDIE when the evaluation was delegated to the OpenCL kernel running on a multi-core CPU, with speed ups up to 21 times faster than the original EDDIE algorithm. While most previous works in the literature reported significantly improvements in performance when running an OpenCL kernel on a GPU device, we did not observe this in our results. Further investigation revealed that memory copying overheads and branching code in the kernel are potentially causes of the (under-)performance of the OpenCL kernel when running on the GPU device

    Automatizing Price Negotiation in Commodities Markets

    Get PDF
    This is an introductory work to trade automatization of the futures market, so far operated by human traders. We are not focusing on maximizing individual profits of any trader as done in many studies, but rather we try to build a stable electronic trading system allowing to obtain a fair price, based on supply and demand dynamics, in order to avoid speculative bubbles and crashes. In our setup, producers and consumers release regularly their forecasts of output and consumption respectively. Automated traders will use this information to negotiate price of the underlying commodity. We suggested a set of analytical criteria allowing to measure the efficiency of the automatic trading strategy in respect to market stability.Automated Traders, Optimal Strategies, Futures Market, Commodities Trading

    Network overload avoidance by traffic engineering and content caching

    Get PDF
    The Internet traffic volume continues to grow at a great rate, now driven by video and TV distribution. For network operators it is important to avoid congestion in the network, and to meet service level agreements with their customers. This thesis presents work on two methods operators can use to reduce links loads in their networks: traffic engineering and content caching. This thesis studies access patterns for TV and video and the potential for caching. The investigation is done both using simulation and by analysis of logs from a large TV-on-Demand system over four months. The results show that there is a small set of programs that account for a large fraction of the requests and that a comparatively small local cache can be used to significantly reduce the peak link loads during prime time. The investigation also demonstrates how the popularity of programs changes over time and shows that the access pattern in a TV-on-Demand system very much depends on the content type. For traffic engineering the objective is to avoid congestion in the network and to make better use of available resources by adapting the routing to the current traffic situation. The main challenge for traffic engineering in IP networks is to cope with the dynamics of Internet traffic demands. This thesis proposes L-balanced routings that route the traffic on the shortest paths possible but make sure that no link is utilised to more than a given level L. L-balanced routing gives efficient routing of traffic and controlled spare capacity to handle unpredictable changes in traffic. We present an L-balanced routing algorithm and a heuristic search method for finding L-balanced weight settings for the legacy routing protocols OSPF and IS-IS. We show that the search and the resulting weight settings work well in real network scenarios

    An Efficient Execution Model for Reactive Stream Programs

    Get PDF
    Stream programming is a paradigm where a program is structured by a set of computational nodes connected by streams. Focusing on data moving between computational nodes via streams, this programming model fits well for applications that process long sequences of data. We call such applications reactive stream programs (RSPs) to distinguish them from stream programs with rather small and finite input data. In stream programming, concurrency is expressed implicitly via communication streams. This helps to reduce the complexity of parallel programming. For this reason, stream programming has gained popularity as a programming model for parallel platforms. However, it is also challenging to analyse and improve the performance without an understanding of the program's internal behaviour. This thesis targets an effi cient execution model for deploying RSPs on parallel platforms. This execution model includes a monitoring framework to understand the internal behaviour of RSPs, scheduling strategies for RSPs on uniform shared-memory platforms; and mapping techniques for deploying RSPs on heterogeneous distributed platforms. The foundation of the execution model is based on a study of the performance of RSPs in terms of throughput and latency. This study includes quantitative formulae for throughput and latency; and the identification of factors that influence these performance metrics. Based on the study of RSP performance, this thesis exploits characteristics of RSPs to derive effective scheduling strategies on uniform shared-memory platforms. Aiming to optimise both throughput and latency, these scheduling strategies are implemented in two heuristic-based schedulers. Both of them are designed to be centralised to provide load balancing for RSPs with dynamic behaviour as well as dynamic structures. The first one uses the notion of positive and negative data demands on each stream to determine the scheduling priorities. This scheduler is independent from the runtime system. The second one requires the runtime system to provide the position information for each computational node in the RSP; and uses that to decide the scheduling priorities. Our experiments show that both schedulers provides similar performance while being significantly better than a reference implementation without dynamic load balancing. Also based on the study of RSP performance, we present in this thesis two new heuristic partitioning algorithms which are used to map RSPs onto heterogeneous distributed platforms. These are Kernighan-Lin Adaptation (KLA) and Congestion Avoidance (CA), where the main objective is to optimise the throughput. This is a multi-parameter optimisation problem where existing graph partitioning algorithms are not applicable. Compared to the generic meta-heuristic Simulated Annealing algorithm, both proposed algorithms achieve equally good or better results. KLA is faster for small benchmarks while slower for large ones. In contrast, CA is always orders of magnitudes faster even for very large benchmarks

    Sum-of-Squares approach to feedback control of laminar wake flows

    Get PDF
    A novel nonlinear feedback control design methodology for incompressible fluid flows aiming at the optimisation of long-time averages of flow quantities is presented. It applies to reduced-order finite-dimensional models of fluid flows, expressed as a set of first-order nonlinear ordinary differential equations with the right-hand side being a polynomial function in the state variables and in the controls. The key idea, first discussed in Chernyshenko et al. 2014, Philos. T. Roy. Soc. 372(2020), is that the difficulties of treating and optimising long-time averages of a cost are relaxed by using the upper/lower bounds of such averages as the objective function. In this setting, control design reduces to finding a feedback controller that optimises the bound, subject to a polynomial inequality constraint involving the cost function, the nonlinear system, the controller itself and a tunable polynomial function. A numerically tractable approach to the solution of such optimisation problems, based on Sum-of-Squares techniques and semidefinite programming, is proposed. To showcase the methodology, the mitigation of the fluctuation kinetic energy in the unsteady wake behind a circular cylinder in the laminar regime at Re=100, via controlled angular motions of the surface, is numerically investigated. A compact reduced-order model that resolves the long-term behaviour of the fluid flow and the effects of actuation, is derived using Proper Orthogonal Decomposition and Galerkin projection. In a full-information setting, feedback controllers are then designed to reduce the long-time average of the kinetic energy associated with the limit cycle. These controllers are then implemented in direct numerical simulations of the actuated flow. Control performance, energy efficiency, and physical control mechanisms identified are analysed. Key elements, implications and future work are discussed

    Improving Runtime Overheads for detectEr

    Full text link
    We design monitor optimisations for detectEr, a runtime-verification tool synthesising systems of concurrent monitors from correctness properties for Erlang programs. We implement these optimisations as part of the existing tool and show that they yield considerably lower runtime overheads when compared to the unoptimised monitor synthesis.Comment: In Proceedings FESCA 2015, arXiv:1503.0437
    corecore