Computation-Communication Trade-offs and Sensor Selection in Real-time Estimation for Processing Networks
Recent advances in electronics are enabling substantial processing to be
performed at each node (robots, sensors) of a networked system. Local
processing enables data compression and may mitigate measurement noise, but it
is still slower than processing at a central computer (it entails a larger
computational delay). However, while nodes can process the data in parallel,
the centralized computation is sequential in nature. On the other hand, if a
node sends raw data to a central computer for processing, it incurs
communication delay. This leads to a fundamental communication-computation
trade-off, where each node has to decide on the optimal amount of preprocessing
in order to maximize the network performance. We consider a network in charge
of estimating the state of a dynamical system and provide three contributions.
First, we provide a rigorous problem formulation for optimal real-time
estimation in processing networks in the presence of delays. Second, we show
that, in the case of a homogeneous network (where all sensors have the same
computation) that monitors a continuous-time scalar linear system, the optimal
amount of local preprocessing maximizing the network estimation performance can
be computed analytically. Third, we consider the realistic case of a
heterogeneous network monitoring a discrete-time multi-variate linear system
and provide algorithms to decide on suitable preprocessing at each node, and to
select a sensor subset when computational constraints make using all sensors
suboptimal. Numerical simulations show that selecting the sensors is crucial.
Moreover, we show that if the nodes apply the preprocessing policy suggested by
our algorithms, they can largely improve the network estimation performance.
Comment: 15 pages, 16 figures. Accepted journal version
Federated Neural Architecture Search
To preserve user privacy while enabling mobile intelligence, techniques have
been proposed to train deep neural networks on decentralized data. However,
training over decentralized data makes the design of neural architectures even
more difficult than it already was. Such difficulty is further amplified when
designing and deploying different neural architectures for heterogeneous mobile
platforms. In this work, we propose integrating automatic neural architecture
search into decentralized training, as a new DNN training paradigm called
Federated Neural Architecture Search, or federated NAS. To deal with the
primary challenge of limited on-client computational and communication
resources, we present FedNAS, a highly optimized framework for efficient
federated NAS. FedNAS fully exploits the key observation that model candidates
need not be fully re-trained during the architecture search process, and incorporates
three key optimizations: parallel candidates training on partial clients, early
dropping candidates with inferior performance, and dynamic round numbers.
Tested on large-scale datasets and typical CNN architectures, FedNAS achieves
model accuracy comparable to that of a state-of-the-art NAS algorithm that
trains models with centralized data, and also reduces the client cost by up to
two orders of magnitude compared to a straightforward design of federated NAS.
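The "early dropping" optimization named in the abstract above can be illustrated with a toy sketch. This is not the FedNAS implementation: the function name `early_drop_search`, the `evaluate` callback (a hypothetical stand-in for training a candidate on some clients and measuring its accuracy), and the `keep_fraction` parameter are all assumptions made for illustration.

```python
from typing import Callable, List


def early_drop_search(
    candidates: List[str],
    evaluate: Callable[[str, int], float],
    rounds: int = 3,
    keep_fraction: float = 0.5,
) -> List[str]:
    """Return the candidates that survive `rounds` of score-and-drop.

    `evaluate(candidate, round_index)` is a hypothetical stand-in for
    briefly training a candidate and returning its validation accuracy.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    alive = list(candidates)
    for round_index in range(rounds):
        # Score every surviving candidate in this round.
        scores = {c: evaluate(c, round_index) for c in alive}
        # Sort best-first and keep the top fraction (always at least one).
        alive.sort(key=lambda c: scores[c], reverse=True)
        keep = max(1, int(len(alive) * keep_fraction))
        alive = alive[:keep]
    return alive


if __name__ == "__main__":
    # Made-up, fixed scores purely to exercise the function.
    quality = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.7}
    print(early_drop_search(list(quality), lambda c, r: quality[c], rounds=2))
```

Dropping weak candidates after each round means later rounds spend client resources only on the architectures that still look promising, which is the intuition behind the cost reduction the abstract reports.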
On the descriptional complexity of iterative arrays
The descriptional complexity of iterative arrays (IAs) is studied. Iterative arrays are a parallel computational model with a sequential processing of the input. It is shown that IAs when compared to deterministic finite automata or pushdown automata may provide savings in size which are not bounded by any recursive function, so-called non-recursive trade-offs. Additional non-recursive trade-offs are proven to exist between IAs working in linear time and IAs working in real time. Furthermore, the descriptional complexity of IAs is compared with cellular automata (CAs) and non-recursive trade-offs are proven between two restricted classes. Finally, it is shown that many decidability questions for IAs are undecidable and not semidecidable.
Nature-Inspired Interconnects for Self-Assembled Large-Scale Network-on-Chip Designs
Future nano-scale electronics built up from an Avogadro number of components
need efficient, highly scalable, and robust means of communication in order to
be competitive with traditional silicon approaches. In recent years, the
Networks-on-Chip (NoC) paradigm emerged as a promising solution to interconnect
challenges in silicon-based electronics. Current NoC architectures are either
highly regular or fully customized, both of which represent implausible
assumptions for emerging bottom-up self-assembled molecular electronics that
are generally assumed to have a high degree of irregularity and imperfection.
Here, we pragmatically and experimentally investigate important design
trade-offs and properties of an irregular, abstract, yet physically plausible
3D small-world interconnect fabric that is inspired by modern network-on-chip
paradigms. We vary the framework's key parameters, such as the connectivity,
the number of switch nodes, the distribution of long- versus short-range
connections, and measure the network's relevant communication characteristics.
We further explore the robustness against link failures and the ability and
efficiency to solve a simple toy problem, the synchronization task. The results
confirm that (1) computation in irregular assemblies is a promising and
disruptive computing paradigm for self-assembled nano-scale electronics and (2)
that 3D small-world interconnect fabrics with a power-law decaying distribution
of shortcut lengths are physically plausible and have major advantages over
local 2D and 3D regular topologies.
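As a rough illustration of what a power-law decaying distribution of shortcut lengths means, the sketch below draws integer shortcut lengths with probability proportional to length^(-alpha). The function name, the exponent `alpha`, and the discrete length range are assumptions made here for illustration; the abstract does not state the parameters the authors used.

```python
import random
from typing import List, Optional


def sample_shortcut_lengths(
    n: int,
    max_length: int,
    alpha: float = 2.0,
    seed: Optional[int] = None,
) -> List[int]:
    """Draw n shortcut lengths from {1, ..., max_length}.

    Each length l is chosen with probability proportional to l ** -alpha,
    so short links are common and long-range links are rare.
    `alpha` is an illustrative assumption, not a value from the paper.
    """
    rng = random.Random(seed)
    lengths = list(range(1, max_length + 1))
    weights = [length ** -alpha for length in lengths]
    return rng.choices(lengths, weights=weights, k=n)


if __name__ == "__main__":
    sample = sample_shortcut_lengths(n=10, max_length=20, seed=0)
    print(sample)
```

A larger `alpha` concentrates the links on near neighbours, while a smaller one produces more long-range shortcuts.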
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (which is virtually impossible given the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters, which have been receiving increasing attention recently
and pose interesting and novel research problems.
Comment: Accepted for Publication in IEEE Communications Surveys and Tutorials
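Traffic shaping, one of the mechanisms listed in the abstract above, is often explained with a token bucket. The sketch below is a generic textbook version, not a scheme taken from the paper; the class name, the byte-based units, and the explicit `now` timestamp argument are assumptions chosen to keep the example deterministic.

```python
class TokenBucket:
    """Minimal token-bucket shaper (generic sketch, not from the paper).

    Tokens accumulate at `rate` per second up to `capacity`; a packet of
    `size` tokens is admitted only if enough tokens are available.
    """

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = 0.0         # timestamp of the last update, in seconds

    def allow(self, size: float, now: float) -> bool:
        # Refill for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=100.0, capacity=200.0)
    for size, t in [(150, 0.0), (100, 0.0), (100, 0.5)]:
        print(size, t, bucket.allow(size, t))
```

The capacity bounds the largest burst that can pass at once, while the rate bounds the long-run average throughput.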