SQPR: Stream Query Planning with Reuse
When users submit new queries to a distributed stream processing system (DSPS), a query planner must allocate physical resources, such as CPU cores, memory and network bandwidth, from a set of hosts to queries. Allocation decisions must provide the correct mix of resources required by queries, while achieving an efficient overall allocation to scale in the number of admitted queries. By exploiting overlap between queries and reusing partial results, a query planner can conserve resources but has to carry out more complex planning decisions. In this paper, we describe SQPR, a query planner that targets DSPSs in data centre environments with heterogeneous resources. SQPR models query admission, allocation and reuse as a single constrained optimisation problem and solves an approximate version to achieve scalability. It prevents individual resources from becoming bottlenecks by re-planning past allocation decisions and supports different allocation objectives. As our experimental evaluation against a state-of-the-art planner shows, SQPR makes efficient resource allocation decisions with acceptable overheads, even at high resource utilisation.
Influences on Throughput and Latency in Stream Programs
Vu Thien Nga Nguyen and Raimund Kirner, 'Influences on Throughput and Latency in Stream Programs', paper presented at the 2nd Workshop on Feedback-Directed Compiler Optimization for Multi-Core Architectures, Berlin, Germany, 22 January 2013.
Stream programming is a promising approach to executing programs on parallel hardware such as multi-core systems. It allows sequential code to be reused at the component level and extended with concurrency handling at the communication level. In this paper we investigate the performance of stream programs in terms of throughput and latency. We identify factors that affect these performance metrics and propose an efficient scheduling approach to obtain the maximal performance.
Working with OpenCL to Speed Up a Genetic Programming Financial Forecasting Algorithm: Initial Results
The genetic programming tool EDDIE has been shown to be a successful financial forecasting tool; however, its execution time has increased as new features have been added. Speed is an important aspect in financial problems, especially in the field of algorithmic trading, where a delay in taking a decision could cost millions. To offset this performance loss, EDDIE has been modified to take advantage of multi-core CPUs and dedicated GPUs. This has been achieved by modifying the candidate solution evaluation to use an OpenCL kernel, allowing the parallel evaluation of solutions. Our computational results show improvements in the running time of EDDIE when the evaluation is delegated to the OpenCL kernel running on a multi-core CPU, with speed-ups of up to 21 times over the original EDDIE algorithm. While most previous works in the literature report significant improvements in performance when running an OpenCL kernel on a GPU device, we did not observe this in our results. Further investigation revealed that memory copying overheads and branching code in the kernel are potential causes of the (under-)performance of the OpenCL kernel when running on the GPU device.
Automatizing Price Negotiation in Commodities Markets
This is an introductory work on automating trade in the futures market, which has so far been operated by human traders. We do not focus on maximizing the individual profits of any trader, as many studies do; rather, we try to build a stable electronic trading system that yields a fair price, based on supply and demand dynamics, in order to avoid speculative bubbles and crashes. In our setup, producers and consumers regularly release their forecasts of output and consumption, respectively. Automated traders use this information to negotiate the price of the underlying commodity. We suggest a set of analytical criteria for measuring the efficiency of an automatic trading strategy with respect to market stability.
Keywords: Automated Traders, Optimal Strategies, Futures Market, Commodities Trading
Network overload avoidance by traffic engineering and content caching
Internet traffic volume continues to grow rapidly, now driven by video and TV distribution. For network operators it is important to avoid congestion in the network, and to meet service level agreements with their customers. This thesis presents work on two methods operators can use to reduce link loads in their networks: traffic engineering and content caching.
This thesis studies access patterns for TV and video and the potential for caching. The investigation is done both using simulation and by analysis of logs from a large TV-on-Demand system over four months.
The results show that there is a small set of programs that account for a large fraction of the requests and that a comparatively small local cache can be used to significantly reduce the peak link loads during prime time. The investigation also demonstrates how the popularity of programs changes over time and shows that the access pattern in a TV-on-Demand system very much depends on the content type.
For traffic engineering the objective is to avoid congestion in the network and to make better use of available resources by adapting the routing to the current traffic situation. The main challenge for traffic engineering in IP networks is to cope with the dynamics of Internet traffic demands.
This thesis proposes L-balanced routings, which route the traffic on the shortest paths possible while ensuring that no link is utilised beyond a given level L. L-balanced routing gives efficient routing of traffic and controlled spare capacity to handle unpredictable changes in traffic. We present an L-balanced routing algorithm and a heuristic search method for finding L-balanced weight settings for the legacy routing protocols OSPF and IS-IS. We show that the search and the resulting weight settings work well in real network scenarios.
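The defining property of an L-balanced routing, that no link is utilised beyond the level L, can be checked with a few lines of code. The following is a minimal sketch only: the function names, link names, capacities and loads are hypothetical illustration values, and the code is not the routing algorithm or weight search presented in the thesis.

```python
def max_utilisation(loads, capacities):
    """Return the highest load/capacity ratio over all links."""
    return max(loads.get(link, 0.0) / capacities[link] for link in capacities)


def is_l_balanced(loads, capacities, level):
    """True if no link is utilised to more than the given level L."""
    return max_utilisation(loads, capacities) <= level


# Hypothetical three-link network; capacities and loads in Mbit/s.
capacities = {"a-b": 100.0, "b-c": 100.0, "a-c": 50.0}
loads = {"a-b": 60.0, "b-c": 75.0, "a-c": 20.0}

print(is_l_balanced(loads, capacities, 0.8))  # True: the busiest link is at 0.75
print(is_l_balanced(loads, capacities, 0.7))  # False: link b-c exceeds 0.7
```

With L = 0.8 the routing leaves at least 20% spare capacity on every link, which is the controlled headroom the abstract refers to.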
An Efficient Execution Model for Reactive Stream Programs
Stream programming is a paradigm where a program is structured as a set of computational nodes connected by streams. Because it focuses on data moving between computational nodes via streams, this programming model is well suited to applications that process long sequences of data. We call such applications reactive stream programs (RSPs) to distinguish them from stream programs with rather small and finite input data.
In stream programming, concurrency is expressed implicitly via communication streams. This helps to reduce the complexity of parallel programming. For this reason, stream programming has gained popularity as a programming model for parallel platforms.
However, it is also challenging to analyse and improve the performance without an understanding of the program's internal behaviour. This thesis targets an efficient execution model for deploying RSPs on parallel platforms. This execution model includes a monitoring framework to understand the internal behaviour of RSPs, scheduling strategies for RSPs on uniform shared-memory platforms, and mapping techniques for deploying RSPs on heterogeneous distributed platforms. The foundation of the execution model is a study of the performance of RSPs in terms of throughput and latency. This study includes quantitative formulae for throughput and latency, and the identification of factors that influence these performance metrics.
Based on the study of RSP performance, this thesis exploits characteristics of RSPs to derive effective scheduling strategies on uniform shared-memory platforms. Aiming to optimise both throughput and latency, these scheduling strategies are implemented in two heuristic-based schedulers. Both are designed to be centralised to provide load balancing for RSPs with dynamic behaviour as well as dynamic structures. The first uses the notion of positive and negative data demands on each stream to determine the scheduling priorities. This scheduler is independent of the runtime system. The second requires the runtime system to provide the position information for each computational node in the RSP, and uses that to decide the scheduling priorities.
Our experiments show that both schedulers provide similar performance while being significantly better than a reference implementation without dynamic load balancing.
Also based on the study of RSP performance, we present in this thesis two new heuristic partitioning algorithms which are used to map RSPs onto heterogeneous distributed platforms. These are Kernighan-Lin Adaptation (KLA) and Congestion Avoidance (CA), where the main objective is to optimise the throughput. This is a multi-parameter optimisation problem where existing graph partitioning algorithms are not applicable. Compared to the generic meta-heuristic Simulated Annealing algorithm, both proposed algorithms achieve equally good or better results. KLA is faster for small benchmarks but slower for large ones. In contrast, CA is always orders of magnitude faster, even for very large benchmarks.
Sum-of-Squares approach to feedback control of laminar wake flows
A novel nonlinear feedback control design methodology for incompressible
fluid flows aiming at the optimisation of long-time averages of flow quantities
is presented. It applies to reduced-order finite-dimensional models of fluid
flows, expressed as a set of first-order nonlinear ordinary differential
equations with the right-hand side being a polynomial function in the state
variables and in the controls. The key idea, first discussed in Chernyshenko et
al. 2014, Philos. T. Roy. Soc. 372(2020), is that the difficulties of treating
and optimising long-time averages of a cost are relaxed by using the
upper/lower bounds of such averages as the objective function. In this setting,
control design reduces to finding a feedback controller that optimises the
bound, subject to a polynomial inequality constraint involving the cost
function, the nonlinear system, the controller itself and a tunable polynomial
function. A numerically tractable approach to the solution of such optimisation
problems, based on Sum-of-Squares techniques and semidefinite programming, is
proposed.
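The bounding idea described above can be written out as a short derivation. This is a sketch in assumed notation, none of which appears in the abstract: $x$ is the state, $u = k(x)$ the feedback controller, $f$ the polynomial right-hand side, $\Phi$ the cost, $V$ the tunable polynomial function, $C$ the upper bound, and an overbar denotes the long-time average. It assumes trajectories remain bounded, so that the long-time average of $\mathrm{d}V/\mathrm{d}t$ vanishes.

```latex
\begin{align}
  \dot{x} &= f\bigl(x, k(x)\bigr), \\
  \overline{\Phi}
    &= \overline{\Phi + \tfrac{\mathrm{d}V}{\mathrm{d}t}}
     = \overline{\Phi\bigl(x, k(x)\bigr)
       + \nabla V(x) \cdot f\bigl(x, k(x)\bigr)}, \\
  \Phi\bigl(x, k(x)\bigr) + \nabla V(x) \cdot f\bigl(x, k(x)\bigr) \le C
    \quad \forall x
    &\;\Longrightarrow\; \overline{\Phi} \le C .
\end{align}
% Sum-of-Squares relaxation: replace the pointwise inequality by the
% sufficient condition that the polynomial below is a sum of squares,
% which can be tested by semidefinite programming.
\begin{equation}
  C - \Phi\bigl(x, k(x)\bigr) - \nabla V(x) \cdot f\bigl(x, k(x)\bigr)
  \in \Sigma[x],
\end{equation}
% where \Sigma[x] denotes the set of sum-of-squares polynomials in x.
```

Minimising $C$ over $V$ and $k$ subject to the last condition then gives a controller that optimises the bound rather than the average itself.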
To showcase the methodology, the mitigation of the fluctuation kinetic energy
in the unsteady wake behind a circular cylinder in the laminar regime at
Re=100, via controlled angular motions of the surface, is numerically
investigated. A compact reduced-order model that resolves the long-term
behaviour of the fluid flow and the effects of actuation is derived using
Proper Orthogonal Decomposition and Galerkin projection. In a full-information
setting, feedback controllers are then designed to reduce the long-time average
of the kinetic energy associated with the limit cycle. These controllers are
then implemented in direct numerical simulations of the actuated flow. Control
performance, energy efficiency, and physical control mechanisms identified are
analysed. Key elements, implications and future work are discussed.
Improving Runtime Overheads for detectEr
We design monitor optimisations for detectEr, a runtime-verification tool
synthesising systems of concurrent monitors from correctness properties for
Erlang programs. We implement these optimisations as part of the existing tool
and show that they yield considerably lower runtime overheads when compared to
the unoptimised monitor synthesis.
Comment: In Proceedings FESCA 2015, arXiv:1503.0437