Object-oriented Tools for Distributed Computing
Distributed computing systems are proliferating, owing to the availability of powerful, affordable microcomputers and inexpensive communication networks. A critical problem in developing such systems is getting application programs to interact with one another across a computer network. Remote interprogram connectivity is particularly challenging across heterogeneous environments, where applications run on different kinds of computers and operating systems. NetWorks! (trademark) is an innovative software product that provides an object-oriented messaging solution to these problems. This paper describes the design and functionality of NetWorks! and illustrates how it is being used to build complex distributed applications for NASA and in the commercial sector.
Dynamic Control Flow in Large-Scale Machine Learning
Many recent machine learning models rely on fine-grained dynamic control flow
for training and inference. In particular, models based on recurrent neural
networks and on reinforcement learning depend on recurrence relations,
data-dependent conditional execution, and other features that call for dynamic
control flow. These applications benefit from the ability to make rapid
control-flow decisions across a set of computing devices in a distributed
system. For performance, scalability, and expressiveness, a machine learning
system must support dynamic control flow in distributed and heterogeneous
environments.
This paper presents a programming model for distributed machine learning that
supports dynamic control flow. We describe the design of the programming model,
and its implementation in TensorFlow, a distributed machine learning system.
Our approach extends the use of dataflow graphs to represent machine learning
models, offering several distinctive features. First, the branches of
conditionals and bodies of loops can be partitioned across many machines to run
on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs.
Second, programs written in our model support automatic differentiation and
distributed gradient computations, which are necessary for training machine
learning models that use control flow. Third, our choice of non-strict
semantics enables multiple loop iterations to execute in parallel across
machines, and to overlap compute and I/O operations.
We have done our work in the context of TensorFlow, and it has been used
extensively in research and production. We evaluate it using several real-world
applications, and demonstrate its performance and scalability.
Comment: Appeared in EuroSys 2018. 14 pages, 16 figures.
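The non-strict semantics described above lets independent loop iterations proceed concurrently so that compute and I/O overlap. A minimal, framework-free Python analogue of that idea is sketched below; the names `compute_step` and `io_step` are illustrative placeholders, not TensorFlow APIs, and a thread pool stands in for the paper's distributed devices.

```python
# Illustrative analogue of non-strict loop execution: independent loop
# iterations run concurrently on a thread pool, overlapping the compute
# phase of one iteration with the I/O phase of another. This is a sketch
# of the idea only, not TensorFlow's actual dataflow implementation.
from concurrent.futures import ThreadPoolExecutor
import time

def compute_step(i):
    # stand-in for a device-side computation
    return i * i

def io_step(x):
    # stand-in for an I/O-bound operation (e.g. a send to another worker)
    time.sleep(0.01)
    return x + 1

def strict_loop(n):
    # strict semantics: iteration i+1 starts only after iteration i finishes
    return [io_step(compute_step(i)) for i in range(n)]

def non_strict_loop(n, workers=4):
    # non-strict semantics: independent iterations execute in parallel;
    # pool.map preserves the output order, so results match strict_loop
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: io_step(compute_step(i)), range(n)))

print(non_strict_loop(8))  # → [1, 2, 5, 10, 17, 26, 37, 50]
```

Because the iterations here are independent, relaxing strictness changes only the schedule, not the result, which is what makes the overlap safe.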
Limits and dynamics of stochastic neuronal networks with random heterogeneous delays
Realistic networks display heterogeneous transmission delays. We analyze here
the limits of large stochastic multi-population networks with stochastic
coupling and random interconnection delays. We show that, depending on the
nature of the delay distributions, a quenched or averaged propagation of chaos
takes place in these networks, and that the network equations converge towards
a delayed McKean-Vlasov equation with distributed delays. Our approach is
chiefly suited to neuroscience applications. In particular, we instantiate a
classical neuronal model, the Wilson-Cowan system, and show that the
obtained limit equations have Gaussian solutions whose mean and standard
deviation satisfy a closed set of coupled delay differential equations in which
the distribution of delays and the noise levels appear as parameters. This
allows us to uncover precisely the effects of noise, delays and coupling on the
dynamics of such heterogeneous networks, in particular their role in the
emergence of synchronized oscillations. We show in several examples that not
only the average delay but also its dispersion governs the dynamics of such
networks.
Comment: Corrected a misprint (useless stopping time) in the proof of Lemma 1 and clarified a regularity hypothesis (Remark 1).
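The limit object the abstract describes can be written schematically as follows. This is an illustrative form only; the symbols $\theta$, $J$, $S$, $\mu$, $\sigma$ are generic placeholders and not necessarily the paper's notation.

```latex
% Schematic McKean-Vlasov equation with distributed delays (illustrative)
\[
  dX_t \;=\; \Big( -\frac{X_t}{\theta}
      \;+\; J \int_0^{\infty} \mathbb{E}\big[ S(X_{t-s}) \big] \, d\mu(s) \Big)\, dt
  \;+\; \sigma \, dW_t ,
\]
```

where $\mu$ is the delay distribution, $S$ a sigmoidal firing-rate function, $J$ a coupling coefficient, and $W_t$ a standard Brownian motion. In the Wilson-Cowan instance discussed above, the mean and standard deviation of the Gaussian solutions then satisfy coupled delay differential equations in which $\mu$ and the noise level enter as parameters.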
TensorFlow Doing HPC
TensorFlow is a popular emerging open-source programming framework supporting
the execution of distributed applications on heterogeneous hardware. While
TensorFlow was initially designed for developing Machine Learning (ML)
applications, it aims to support a much broader range of applications outside
the ML domain, potentially including HPC applications. However, very few
experiments have been conducted to evaluate TensorFlow's performance when
running HPC workloads on supercomputers. This work addresses that gap by
implementing four traditional HPC
benchmark applications: STREAM, matrix-matrix multiply, Conjugate Gradient (CG)
solver and Fast Fourier Transform (FFT). We analyze their performance on two
supercomputers with accelerators and evaluate the potential of TensorFlow for
developing HPC applications. Our tests show that TensorFlow can fully take
advantage of high performance networks and accelerators on supercomputers.
Running our TensorFlow STREAM benchmark, we obtain over 50% of theoretical
communication bandwidth on our testing platform. We find an approximately 2x,
1.7x and 1.8x performance improvement when increasing the number of GPUs from
two to four in the matrix-matrix multiply, CG and FFT applications
respectively. All our performance results demonstrate that TensorFlow has high
potential to emerge as an HPC programming framework for heterogeneous
supercomputers.
Comment: Accepted for publication at The Ninth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES'19).
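The paper's TensorFlow benchmark code is not reproduced here; as a minimal single-node sketch of what the STREAM "triad" kernel measures, the NumPy snippet below computes `a = b + s*c` and reports the implied memory bandwidth. The array size and the bytes-moved accounting (three float64 arrays touched) are standard STREAM conventions, not taken from the paper.

```python
# Single-node sketch of the STREAM "triad" kernel (a = b + s*c), here in
# NumPy rather than the paper's TensorFlow implementation, to show what
# the benchmark measures: sustained memory bandwidth.
import numpy as np
import time

n, scalar = 1_000_000, 3.0
b = np.random.rand(n)
c = np.random.rand(n)

t0 = time.perf_counter()
a = b + scalar * c              # triad: two reads + one write per element
elapsed = time.perf_counter() - t0

bytes_moved = 3 * n * 8         # three float64 arrays touched
gbps = bytes_moved / elapsed / 1e9
print(f"triad moved {bytes_moved / 1e6:.0f} MB at ~{gbps:.1f} GB/s")
```

A production STREAM run repeats the kernel and keeps the best timing; the single pass above only illustrates the accounting.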
A component-based middleware framework for configurable and reconfigurable Grid computing
Significant progress has been made in the design and development of Grid middleware which, in its present form, is founded on Web services technologies. However, we argue that present-day Grid middleware is severely limited in supporting projected next-generation applications, which will involve pervasive and heterogeneous networked infrastructures and advanced services such as collaborative distributed visualization. In this paper we discuss a new Grid middleware framework that features (i) support for advanced network services based on the novel concept of pluggable overlay networks, and (ii) an architectural framework for constructing bespoke Grid middleware platforms in terms of 'middleware domains' such as extensible interaction types and resource discovery. We believe that such features will become increasingly essential with the emergence of next-generation e-Science applications. Copyright (c) 2005 John Wiley & Sons, Ltd.
Symbiot: Congestion-driven Multi-resource Fairness for Multi-User Sensor Networks
© 2015 IEEE. In this paper, we study the problem of multi-resource fairness in multi-user sensor networks with heterogeneous and time-varying resources. In particular, we focus on data-gathering applications run on Wireless Sensor Networks (WSNs) or the Internet of Things (IoT), in which users need to run a series of sensing operations with various resource requirements. We consider both the resource demands of sensing tasks and of the data-forwarding tasks needed to establish multi-hop relay communications. By exploiting graph theory, queueing theory, and the notion of dominant resource shares, we develop Symbiot, a lightweight, distributed algorithm that ensures multi-resource fairness between these users. With Symbiot, nodes can independently schedule their resources while maintaining network-level resource fairness by observing traffic congestion levels. Large-scale simulations based on Contiki OS and the Cooja network emulator show the effectiveness of Symbiot in adaptively utilizing available resources and reducing average completion times.
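The notion of dominant resource shares that Symbiot builds on can be illustrated with a small sketch of Dominant Resource Fairness (DRF): a user's dominant share is the largest fraction of any one resource it holds, and the allocator repeatedly grants a task to the user with the smallest dominant share. The greedy loop and the two-user example below are illustrative, not Symbiot's actual algorithm.

```python
# Dominant Resource Fairness (DRF) sketch: each user's dominant share is
# the maximum, over resource types, of the fraction of that resource the
# user holds; the allocator repeatedly grants one task to the user with
# the smallest dominant share. Illustrative only, not Symbiot's code.

def dominant_share(allocation, capacity):
    return max(allocation[r] / capacity[r] for r in capacity)

def drf_allocate(capacity, demands, rounds=100):
    alloc = {u: {r: 0.0 for r in capacity} for u in demands}
    used = {r: 0.0 for r in capacity}
    for _ in range(rounds):
        # pick the user currently holding the smallest dominant share
        u = min(demands, key=lambda u: dominant_share(alloc[u], capacity))
        d = demands[u]
        if any(used[r] + d[r] > capacity[r] for r in capacity):
            break  # that user's next task no longer fits
        for r in capacity:
            alloc[u][r] += d[r]
            used[r] += d[r]
    return alloc

# Classic two-user example: a CPU-light/memory-heavy user A versus a
# CPU-heavy/memory-light user B, on 9 CPUs and 18 GB of memory.
capacity = {"cpu": 9.0, "mem": 18.0}
demands = {"A": {"cpu": 1.0, "mem": 4.0}, "B": {"cpu": 3.0, "mem": 1.0}}
alloc = drf_allocate(capacity, demands)
print(alloc)  # A ends with 3 tasks, B with 2; equal dominant shares of 2/3
```

Both users end up with the same dominant share (2/3), which is the equalization property that makes DRF attractive for heterogeneous demands.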
Consensus-based Networked Tracking in Presence of Heterogeneous Time-Delays
We propose a distributed (single) target tracking scheme based on networked
estimation and consensus algorithms over static sensor networks. The tracking
part is based on the linear time-difference-of-arrival (TDOA) measurements
proposed in our previous works. This paper, in particular, develops delay-tolerant
distributed filtering solutions over sparse data-transmission networks. We
assume general, arbitrary heterogeneous delays on different links. This may
occur in many realistic large-scale applications where data sharing between
nodes is subject to latency due to communication-resource constraints or to
the large spatial extent of the sensor network. The solution we propose in this
work shows improved performance (verified by both theory and simulations) in
such scenarios. Another advantage of such distributed schemes is the
possibility of adding localized fault-detection and isolation (FDI) strategies
along with survivable graph-theoretic design, which opens many follow-up avenues
for this research. To the best of our knowledge, no such delay-tolerant distributed
linear algorithm is given in the existing distributed tracking literature.
Comment: ICRoM2
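The core difficulty the abstract describes, consensus updates fed by neighbor data arriving with heterogeneous link delays, can be sketched in a few lines. The complete graph, fixed integer delays, and gain below are illustrative choices, not the paper's filter; with small enough gain and bounded delays the nodes still agree.

```python
# Minimal sketch of delay-tolerant average consensus: each node updates
# using neighbor values that arrive with fixed, link-specific delays.
# The graph (complete), gain, and delay model are illustrative only.
import random

def delayed_consensus(x0, delays, steps=400, eps=0.2):
    n = len(x0)
    hist = [list(x0)]  # hist[t][i] = node i's value at time t
    for t in range(steps):
        cur = hist[-1]
        nxt = []
        for i in range(n):
            v = cur[i]
            for j in range(n):
                if i == j:
                    continue
                d = delays[i][j]              # heterogeneous link delay
                xj = hist[max(0, t - d)][j]   # delayed neighbor value
                v += eps / n * (xj - cur[i])
            nxt.append(v)
        hist.append(nxt)
    return hist[-1]

random.seed(0)
n = 5
x0 = [random.uniform(0, 10) for _ in range(n)]
delays = [[random.randint(0, 4) for _ in range(n)] for _ in range(n)]
final = delayed_consensus(x0, delays)
print(final)  # all entries nearly equal despite heterogeneous delays
```

Note that the update is a convex combination of current and past states, so every node stays inside the convex hull of the initial values; with delays, however, the agreement point generally differs from the exact initial average.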