Measuring and Managing Answer Quality for Online Data-Intensive Services
Online data-intensive services parallelize query execution across distributed
software components. Interactive response time is a priority, so online query
executions return answers without waiting for slow running components to
finish. However, data from these slow components could lead to better answers.
We propose Ubora, an approach to measure the effect of slow running components
on the quality of answers. Ubora randomly samples online queries and executes
them twice. The first execution elides data from slow components and provides
fast online answers; the second execution waits for all components to complete.
Ubora uses memoization to speed up mature executions by replaying network
messages exchanged between components. Our systems-level implementation works
for a wide range of platforms, including Hadoop/Yarn, Apache Lucene, the
EasyRec Recommendation Engine, and the OpenEphyra question answering system.
Ubora computes answer quality much faster than competing approaches that do not
use memoization. With Ubora, we show that answer quality can and should be used
to guide online admission control. Our adaptive controller processed 37% more
queries than a competing controller guided by the rate of timeouts.
Comment: Technical Report
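The sample-and-execute-twice idea in the abstract above can be illustrated with a minimal sketch. This is not Ubora's implementation: the abstract says Ubora replays network messages, whereas this sketch caches per-component results in a dictionary. The names `run_online`, `run_mature`, `answer_quality`, and `sample_quality` are hypothetical, component latency is a simulated number rather than a measured time, and the overlap-based quality score is an assumed stand-in for whatever metric a given service uses.

```python
import random


def run_online(query, components, budget, memo):
    """Online execution: use only components whose (simulated) latency fits the budget.

    `components` is a list of (name, latency, fn) tuples; this layout is an
    assumption made for the sketch. Results that do arrive are stored in `memo`.
    """
    answer = set()
    for name, latency, fn in components:
        if latency <= budget:
            result = fn(query)
            memo[(name, query)] = result
            answer |= set(result)
    return answer


def run_mature(query, components, memo):
    """Mature execution: wait for every component, reusing memoized results."""
    answer = set()
    for name, _latency, fn in components:
        key = (name, query)
        if key not in memo:
            # Only components that were elided online are actually re-run.
            memo[key] = fn(query)
        answer |= set(memo[key])
    return answer


def answer_quality(online, mature):
    """Assumed metric: fraction of the mature answer present in the online answer."""
    if not mature:
        return 1.0
    return len(online & mature) / len(mature)


def sample_quality(queries, components, budget, sample_rate, seed=0):
    """Randomly sample queries, execute each sampled query twice, and score it."""
    rng = random.Random(seed)
    scores = []
    for query in queries:
        if rng.random() < sample_rate:
            memo = {}
            online = run_online(query, components, budget, memo)
            mature = run_mature(query, components, memo)
            scores.append(answer_quality(online, mature))
    return scores
```

In this sketch the memo means the fast components are called once per sampled query rather than twice, which is the cost saving the abstract attributes to memoization.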
Scaling of a large-scale simulation of synchronous slow-wave and asynchronous awake-like activity of a cortical model with long-range interconnections
Cortical synapse organization supports a range of dynamic states on multiple
spatial and temporal scales, from synchronous slow wave activity (SWA),
characteristic of deep sleep or anesthesia, to fluctuating, asynchronous
activity during wakefulness (AW). Such dynamic diversity poses a challenge for
producing efficient large-scale simulations that embody realistic metaphors of
short- and long-range synaptic connectivity. In fact, during SWA and AW
different spatial extents of the cortical tissue are active in a given timespan
and at different firing rates, which implies a wide variety of loads of local
computation and communication. A balanced evaluation of simulation performance
and robustness should therefore include tests of a variety of cortical dynamic
states. Here, we demonstrate performance scaling of our proprietary Distributed
and Plastic Spiking Neural Networks (DPSNN) simulation engine in both SWA and
AW for bidimensional grids of neural populations, which reflects the modular
organization of the cortex. We explored networks up to 192x192 modules, each
composed of 1250 integrate-and-fire neurons with spike-frequency adaptation,
and exponentially decaying inter-modular synaptic connectivity with varying
spatial decay constant. For the largest networks the total number of synapses
was over 70 billion. The execution platform included up to 64 dual-socket
nodes, each socket mounting 8 Intel Xeon Haswell processor cores @ 2.40GHz
clock rates. Network initialization time, memory usage, and execution time
showed good scaling performances from 1 to 1024 processes, implemented using
the standard Message Passing Interface (MPI) protocol. We achieved simulation
speeds of between 2.3x10^9 and 4.1x10^9 synaptic events per second for both
cortical states in the explored range of inter-modular interconnections.
Comment: 22 pages, 9 figures, 4 tables
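The sizes quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract. The function name `dpsnn_scale` is hypothetical, and the even split of modules across processes is an assumption for illustration, not a description of how DPSNN distributes work.

```python
def dpsnn_scale():
    """Derive aggregate sizes from the figures quoted in the abstract."""
    modules = 192 * 192              # largest grid explored
    neurons = modules * 1250         # 1250 neurons per module
    cores = 64 * 2 * 8               # nodes x sockets per node x cores per socket
    min_synapses = 70e9              # "over 70 billion", so a lower bound
    return {
        "modules": modules,
        "neurons": neurons,
        "cores": cores,
        # Assumes an even split over the maximum of 1024 processes.
        "modules_per_process": modules // 1024,
        # Lower bound on the average number of synapses per neuron.
        "synapses_per_neuron": min_synapses / neurons,
    }


if __name__ == "__main__":
    for key, value in dpsnn_scale().items():
        print(key, value)
```

The core count works out to 1024, which matches the upper end of the 1 to 1024 process range reported in the abstract.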
Performance and energy optimization on terasort algorithm by task self-resizing
Among MapReduce applications, Terasort is one of the most successful, having helped Hadoop win the Sort Benchmark three times. While Terasort is known for its sorting speed on big data, its performance and energy consumption can still be optimized. We have analyzed the characteristics of Terasort and identified the existence of idle nodes, which not only waste energy but also degrade performance. We therefore optimize Terasort through a single-task distributed algorithm and a task self-resizing algorithm, which save time and reduce the energy consumed by map nodes waiting for tasks and by reduce nodes waiting for input. A series of experiments shows that the proposed algorithm is effective in optimizing both performance and energy consumption. It can also be adapted to other applications in the MapReduce environment.
Code management automation for Erlang remote actors
Distributed Erlang provides mechanisms for spawning actors remotely through its remote spawn BIF. However, for remote spawn to function properly, the node hosting the spawned actor must share the same codebase as the node launching the actor. This assumption turns out to be too strong for various distributed settings. We propose a higher-level framework for the remote spawn of side-effect-free actors, abstracting from and automating codebase migration and management.
Peer-reviewed
Load sharing in distributed computer systems
PhD Thesis
In this thesis the problem of load sharing in distributed computer systems is
investigated. Fundamental issues that need to be resolved in order to
implement a load sharing scheme in a distributed system are identified and
possible solutions suggested. A load sharing scheme has been designed and
implemented on an existing Unix United system. The performance of this load
sharing scheme is then measured for different types of programs. It is
demonstrated that a load sharing scheme can be implemented on the Unix
United systems using the existing mechanisms provided by the Newcastle
Connection, and without making any significant changes to the existing
software. It is concluded that under some circumstances a substantial
improvement in the system performance can be obtained by the load sharing
scheme.
Science and Engineering Research Council