3 research outputs found
Decentralized detection in resource-limited sensor network architectures
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (leaves 201-207).We consider the problem of decentralized binary detection in a network consisting of a large number of nodes arranged as a tree of bounded height. We show that the error probability decays exponentially fast with the number of nodes under both a Neyman-Pearson criterion and a Bayesian criterion, and provide bounds for the optimal error exponent. Furthermore, we show that under the Neyman-Pearson criterion, the optimal error exponent is often the same as that corresponding to a parallel configuration, implying that a large network can be designed to operate efficiently without significantly affecting the detection performance. We provide sufficient, as well as necessary, conditions for this to happen. For those networks satisfying the sufficient conditions, we propose a simple strategy that nearly achieves the optimal error exponent, and in which all non-leaf nodes need only send 1-bit messages. We also investigate the impact of node failures and unreliable communications on the detection performance. Node failures are modeled by a Galton-Watson branching process, and binary symmetric channels are assumed for the case of unreliable communications. We characterize the asymptotically optimal detection performance, develop simple strategies that nearly achieve the optimal performance, and compare the performance of the two types of networks. Our results suggest that in a large scale sensor network, it is more important to ensure that nodes can communicate reliably with each other(e.g.,by boosting the transmission power) than to ensure that nodes are robust to failures. In the case of networks with unbounded height, we establish the validity of a long-standing conjecture regarding the sub-exponential decay of Bayesian detection error probabilities in a tandem network. We also provide bounds for the error probability, and show that under the additional assumption of bounded Kullback-Leibler divergences, the error probability is (e cnd ), for all d> 1/2, with c c(logn)d being a positive constant. Furthermore, the bound (e), for all d> 1, holds under an additional mild condition on the distributions. This latter bound is shown to be tight. Moreover, for the Neyman-Pearson case, we establish that if the sensors act myopically, the Type II error probabilities also decay at a sub-exponential rate.(cont.) Finally, we consider the problem of decentralized detection when sensors have access to side-information that affects the statistics of their measurements, and the network has an overall cost constraint. Nodes can decide whether or not to make a measurement and transmit a message to the fusion center("censoring"), and also have a choice of the transmission function. We study the tradeoff in the detection performance with the cost constraint, and also the impact of sensor cooperation and global sharing of side-information. In particular, we show that if the Type I error probability is constrained to be small, then sensor cooperation is not necessary to achieve the optimal Type II error exponent.by Wee Peng Tay.Ph.D
Bayesian Detection in Bounded Height Tree Networks
We study the detection performance of large scale sensor networks, configured as trees with bounded height, in which information is progressively compressed as it moves towards the root of the tree. We show that, under a Bayesian formulation, the error probability decays exponentially fast, and we provide bounds for the error exponent. We then focus on the case where the tree has certain symmetry properties. We derive the form of the optimal exponent within a restricted class of easily implementable strategies, as well as optimal strategies within that class. We also find conditions under which (suitably defined) majority rules are optimal. Finally, we provide evidence that in designing a network it is preferable to keep the branching factor small for nodes other than the neighbors of the leaves
Distributed Inference and Learning with Byzantine Data
We are living in an increasingly networked world with sensing networks of varying shapes and sizes: the network often comprises of several tiny devices (or nodes) communicating with each other via different topologies. To make the problem even more complicated, the nodes in the network can be unreliable due to a variety of reasons: noise, faults and attacks, thus, providing
corrupted data. Although the area of statistical inference has been an active area of research in the
past, distributed learning and inference in a networked setup with potentially unreliable components
has only gained attention recently. The emergence of big and dirty data era demands new
distributed learning and inference solutions to tackle the problem of inference with corrupted data.
Distributed inference networks (DINs) consist of a group of networked entities which acquire
observations regarding a phenomenon of interest (POI), collaborate with other entities in the network
by sharing their inference via different topologies to make a global inference. The central
goal of this thesis is to analyze the effect of corrupted (or falsified) data on the inference performance
of DINs and design robust strategies to ensure reliable overall performance for several
practical network architectures. Specifically, the inference (or learning) process can be that of detection
or estimation or classification, and the topology of the system can be parallel, hierarchical
or fully decentralized (peer to peer).
Note that, the corrupted data model may seem similar to the scenario where local decisions
are transmitted over a Binary Symmetric Channel (BSC) with a certain cross over probability,
however, there are fundamental differences. Over the last three decades, research community
has extensively studied the impact of transmission channels or faults on the distributed detection
system and related problems due to its importance in several applications. However, corrupted
(Byzantine) data models considered in this thesis, are philosophically different from the BSC or
the faulty sensor cases. Byzantines are intentional and intelligent, therefore, they can optimize
over the data corruption parameters. Thus, in contrast to channel aware detection, both the FC and
the Byzantines can optimize their utility by choosing their actions based on the knowledge of their
opponent’s behavior. Study of these practically motivated scenarios in the presence of Byzantines
is of utmost importance, and is missing from the channel aware detection and fault tolerant detection
literature. This thesis advances the distributed inference literature by providing fundamental
limits of distributed inference with Byzantine data and provides optimal counter-measures (using
the insights provided by these fundamental limits) from a network designer’s perspective. Note
that, the analysis of problems related to strategical interaction between Byzantines and network
designed is very challenging (NP-hard is many cases). However, we show that by utilizing the
properties of the network architecture, efficient solutions can be obtained. Specifically, we found
that several problems related to the design of optimal counter-measures in the inference context
are, in fact, special cases of these NP-hard problems which can be solved in polynomial time.
First, we consider the problem of distributed Bayesian detection in the presence of data falsification
(or Byzantine) attacks in the parallel topology. Byzantines considered in this thesis are those
nodes that are compromised and reprogrammed by an adversary to transmit false information to
a centralized fusion center (FC) to degrade detection performance. We show that above a certain
fraction of Byzantine attackers in the network, the detection scheme becomes completely incapable
(or blind) of utilizing the sensor data for detection. When the fraction of Byzantines is not
sufficient to blind the FC, we also provide closed form expressions for the optimal attacking strategies
for the Byzantines that most degrade the detection performance. Optimal attacking strategies
in certain cases have the minimax property and, therefore, the knowledge of these strategies has
practical significance and can be used to implement a robust detector at the FC.
In several practical situations, parallel topology cannot be implemented due to limiting factors,
such as, the FC being outside the communication range of the nodes and limited energy budget of
the nodes. In such scenarios, a multi-hop network is employed, where nodes are organized hierarchically
into multiple levels (tree networks). Next, we study the problem of distributed inference
in tree topologies in the presence of Byzantines under several practical scenarios. We analytically
characterize the effect of Byzantines on the inference performance of the system. We also look at
the possible counter-measures from the FC’s perspective to protect the network from these Byzantines.
These counter-measures are of two kinds: Byzantine identification schemes and Byzantine
tolerant schemes. Using learning based techniques, Byzantine identification schemes are designed
that learn the identity of Byzantines in the network and use this information to improve system
performance. For scenarios where this is not possible, Byzantine tolerant schemes, which use
game theory and error-correcting codes, are developed that tolerate the effect of Byzantines while
maintaining a reasonably good inference performance in the network.
Going a step further, we also consider scenarios where a centralized FC is not available. In
such scenarios, a solution is to employ detection approaches which are based on fully distributed
consensus algorithms, where all of the nodes exchange information only with their neighbors. For
such networks, we analytically characterize the negative effect of Byzantines on the steady-state
and transient detection performance of conventional consensus-based detection schemes. To avoid
performance deterioration, we propose a distributed weighted average consensus algorithm that is
robust to Byzantine attacks. Next, we exploit the statistical distribution of the nodes’ data to devise
techniques for mitigating the influence of data falsifying Byzantines on the distributed detection
system. Since some parameters of the statistical distribution of the nodes’ data might not be known
a priori, we propose learning based techniques to enable an adaptive design of the local fusion or
update rules.
The above considerations highlight the negative effect of the corrupted data on the inference
performance. However, it is possible for a system designer to utilize the corrupted data for network’s
benefit. Finally, we consider the problem of detecting a high dimensional signal based on
compressed measurements with secrecy guarantees. We consider a scenario where the network
operates in the presence of an eavesdropper who wants to discover the state of the nature being
monitored by the system. To keep the data secret from the eavesdropper, we propose to use cooperating
trustworthy nodes that assist the FC by injecting corrupted data in the system to deceive the
eavesdropper. We also design the system by determining the optimal values of parameters which
maximize the detection performance at the FC while ensuring perfect secrecy at the eavesdropper