    Engineering Competitive and Query-Optimal Minimal-Adaptive Randomized Group Testing Strategies

    Suppose we are given a collection of $n$ elements, $d$ of which are \emph{defective}. We can query an arbitrarily chosen subset of elements; the query returns Yes if the subset contains at least one defective and No if it is free of defectives. The problem of group testing is to identify the defectives with a minimum number of such queries. By the information-theoretic lower bound, at least $\log_2 \binom{n}{d} \approx d\log_2 (\frac{n}{d}) \approx d\log_2 n$ queries are needed. Using adaptive group testing, i.e., asking one query at a time, the lower bound can be easily achieved. However, strategies are preferred that work in a fixed small number of stages, where the queries in a stage are asked in parallel. A group testing strategy is called \emph{competitive} if it works for completely unknown $d$ and requires only $O(d\log_2 n)$ queries. Usually competitive group testing is based on sequential queries. We have shown that competitive group testing with an expected $O(d\log_2 n)$ queries is in fact possible in only $2$ or $3$ stages. We then focused on minimizing the hidden constant factor in the query number and proposed a systematic approach for this purpose. Another main result concerns the design of query-optimal and minimal-adaptive strategies. We have shown that a $2$-stage randomized strategy with prescribed success probability can asymptotically achieve the information-theoretic lower bound for $d \ll n$ with $d$ growing much slower than $n$. Similarly, we can approach the entropy lower bound in $4$ stages when $d=o(n)$.
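To make the query model and the lower bound concrete, the following is a minimal Python sketch, not one of the strategies from the abstract. It computes $\lceil \log_2 \binom{n}{d} \rceil$ and runs a textbook fully adaptive strategy (repeated binary search for the leftmost remaining defective), which uses roughly $d\log_2 n$ queries. The function names `info_lower_bound`, `query` and `adaptive_binary_splitting` are hypothetical and chosen only for illustration.

```python
import math


def info_lower_bound(n, d):
    """Worst-case information-theoretic bound: ceil(log2 C(n, d)) queries."""
    return math.ceil(math.log2(math.comb(n, d)))


def query(pool, defectives):
    """Group test: True (Yes) if the pool contains at least one defective."""
    return any(x in defectives for x in pool)


def adaptive_binary_splitting(elements, defectives):
    """Sequentially identify all defectives; d is not needed in advance.

    Returns (found, num_queries). Each defective costs one query on the
    remaining set plus at most ceil(log2 n) binary-search queries.
    """
    remaining = list(elements)
    found = set()
    queries = 0
    while remaining:
        queries += 1
        if not query(remaining, defectives):
            break  # the remaining elements are all non-defective
        # Invariant: remaining[:lo] is defective-free and
        # remaining[lo:hi] contains at least one defective.
        lo, hi = 0, len(remaining)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            queries += 1
            if query(remaining[lo:mid], defectives):
                hi = mid
            else:
                lo = mid
        found.add(remaining[lo])
        # Everything up to and including index lo is now classified.
        remaining = remaining[lo + 1:]
    return found, queries


if __name__ == "__main__":
    n = 1000
    hidden = {17, 256, 400, 731, 999}
    found, used = adaptive_binary_splitting(range(n), hidden)
    print("found:", sorted(found))
    print("queries used:", used)
    print("worst-case lower bound:", info_lower_bound(n, len(hidden)))
```

The sketch is sequential, so it illustrates only the adaptive baseline; the point of the abstract is that comparable query counts are achievable in 2 or 3 stages of parallel queries.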

    New Constructions for Competitive and Minimal-Adaptive Group Testing

    Group testing (GT) was originally proposed during World War II in an attempt to minimize the \emph{cost} and \emph{waiting time} of performing identical blood tests on soldiers for a low-prevalence disease. Formally, the GT problem asks to find $d\ll n$ \emph{defective} elements out of $n$ elements by querying subsets (pools) for the presence of defectives. By the information-theoretic lower bound, essentially $d\log_2 n$ queries are needed in the worst case. An \emph{adaptive} strategy proceeds sequentially by performing one query at a time, and it can achieve the lower bound. In various applications, nothing is known about $d$ beforehand, and a strategy for this scenario is called \emph{competitive}. Such strategies are usually adaptive and achieve query optimality within a constant factor called the \emph{competitive ratio}. In many applications, queries are time-consuming. Therefore, \emph{minimal-adaptive} strategies, which run in a small number $s$ of stages of parallel queries, are favorable. This work is mainly devoted to the design of minimal-adaptive strategies combined with other demands of both theoretical and practical interest. First we target unknown $d$ and show that competitive GT is in fact possible in as few as $2$ stages. The main ingredient is our randomized estimate of a previously unknown $d$ using nonadaptive queries. In addition, we have developed a systematic approach to obtain optimal competitive ratios for our strategies. When an upper bound $d$ on the number of defectives is known, we propose randomized GT strategies which asymptotically achieve query optimality in just $2$, $3$ or $4$ stages, depending on the growth of $d$ versus $n$. Inspired by application settings, such as at the American Red Cross, where in most cases GT is applied to small instances, \textit{e.g.}, $n=16$, we extended our study of query-optimal GT strategies to solving a given problem instance with fixed values $n$, $d$ and $s$.
We also considered the situation where the elements to test cannot be divided physically (e.g., electronic devices), so the pools must be disjoint. For GT with \emph{disjoint} simultaneous pools, we show that $\Theta (sd(n/d)^{1/s})$ tests are sufficient, and also necessary for certain ranges of the parameters.
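One simple way to see where an upper bound of the form $sd(n/d)^{1/s}$ can come from is sketched below in Python. This is an illustrative construction under stated assumptions, not necessarily the one used in the work: in stage $i$ the current candidates are partitioned into disjoint pools of size about $(n/d)^{(s-i)/s}$, only elements of positive pools survive, and the last stage tests singletons. With at most $d$ defectives, each stage then uses about $d(n/d)^{1/s}$ tests, up to rounding. The function name `disjoint_stage_testing` is hypothetical.

```python
import math


def disjoint_stage_testing(n, defectives, d, s):
    """Identify the defectives among range(n) in s stages of disjoint pools.

    d is an assumed upper bound on the number of defectives.
    Returns (found, tests_used).
    """
    candidates = list(range(n))
    tests = 0
    ratio = n / d
    for stage in range(1, s + 1):
        # Pool size shrinks geometrically; in the last stage the exponent
        # is 0, so every pool is a single element.
        size = max(1, math.floor(ratio ** ((s - stage) / s)))
        pools = [candidates[i:i + size]
                 for i in range(0, len(candidates), size)]
        tests += len(pools)  # all pools of a stage are tested in parallel
        # Keep only the elements of pools that tested positive.
        candidates = [x for pool in pools
                      if any(e in defectives for e in pool)
                      for x in pool]
    return set(candidates), tests


if __name__ == "__main__":
    n, d = 10_000, 4
    hidden = {3, 777, 5000, 9999}
    for s in (1, 2, 3):
        found, used = disjoint_stage_testing(n, hidden, d, s)
        bound = s * d * (n / d) ** (1 / s)
        print(f"s={s}: found={sorted(found)}, tests={used}, "
              f"s*d*(n/d)^(1/s)={bound:.1f}")
```

For $s=1$ the sketch degenerates to testing every element individually, which matches the formula since $sd(n/d)^{1/s} = n$ in that case.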

    Scalable fault management architecture for dynamic optical networks : an information-theoretic approach

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (leaves 255-262). All-optical switching, in place of electronic switching, of high data-rate lightpaths at intermediate nodes is one of the key enabling technologies for economically scalable future data networks. This replacement of electronic switching with optical switching at intermediate nodes, however, presents new challenges for fault detection and localization in reconfigurable all-optical networks. Presently, fault detection and localization techniques, as implemented in SONET/G.709 networks, rely on electronic processing of parity checks at intermediate nodes. If similar techniques are adapted to all-optical reconfigurable networks, optical signals need to be tapped out at intermediate nodes for parity checks. This additional electronic processing would break the all-optical transparency paradigm and thus significantly diminish the cost advantages of all-optical networks. In this thesis, we propose new fault-diagnosis approaches specifically tailored to all-optical networks, with the objective of keeping both the diagnostic capital expenditure and the diagnostic operational effort low. Instead of the aforementioned passive monitoring paradigm based on parity checks, we propose a proactive lightpath probing paradigm: optical probing signals are sent along a set of lightpaths in the network, and the network state (i.e., the failure pattern) is then inferred from the results of this set of end-to-end lightpath measurements. Moreover, we assume that a subset of network nodes (up to all the nodes) is equipped with diagnostic agents, including both transmitters/receivers for probe transmission/detection and software processes for probe management, to perform fault detection and localization.
The design objectives of the proposed proactive probing paradigm are twofold: i) to minimize the number of lightpath probes, keeping the diagnostic operational effort low, and ii) to minimize the amount of diagnostic hardware, keeping the diagnostic capital expenditure low. The network fault-diagnosis problem can be mathematically modeled with a group-testing-over-graphs framework. In particular, the network is abstracted as a graph in which the failure status of each node/link is modeled with a random variable (e.g., with a Bernoulli distribution). A probe over any path in the graph results in a value, defined as the probe syndrome, which is a function of all the random variables associated with that path. A network failure pattern is inferred from a set of probe syndromes resulting from a set of optimally chosen probes. This framework enriches the traditional group-testing problem by introducing a topological structure, and it can be extended to model many other network-monitoring problems (e.g., packet delay, packet drop ratio, noise, etc.) by choosing appropriate state variables. Under the group-testing-over-graphs framework with a probabilistic failure model, we initiate an information-theoretic approach to minimizing the average number of lightpath probes needed to identify all possible network failure patterns. Specifically, we have established an isomorphic mapping between the fault-diagnosis problem in network management and the source-coding problem in information theory. This mapping suggests that the minimum average number of lightpath probes required is lower bounded by the information entropy of the network state, and that efficient source-coding algorithms (e.g., the run-length code) can be translated into scalable fault-diagnosis schemes under some additional probe feasibility constraints.
Our analytical and numerical investigations yield a guideline for designing scalable fault-diagnosis algorithms: each probe should provide approximately 1 bit of state information, and thus the total number of probes required is approximately equal to the entropy of the network state. To address the hardware cost of diagnosis, we also developed a probabilistic analysis framework to characterize the trade-off between hardware cost (i.e., the number of nodes equipped with Tx/Rx pairs) and diagnosis capability (i.e., the probability of successful failure detection and localization). Our results suggest that, for practical situations, the hardware cost can be reduced significantly by accepting a small amount of uncertainty about the failure status. By Yonggang Wen. Ph.D.
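The entropy guideline above can be illustrated with a short Python sketch. It rests on two assumptions that are not taken from the thesis: link failures are independent Bernoulli variables, so the entropy of the network state is the sum of the per-link binary entropies, and a probe syndrome is the logical OR of the failure states along the path (the abstract only says the syndrome is some function of those variables). The names `binary_entropy`, `network_state_entropy` and `probe_syndrome` are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) random variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def network_state_entropy(failure_probs):
    """Entropy of the network state, ASSUMING independent link failures.

    Under the mapping to source coding, this value lower-bounds the
    average number of probes needed to identify the failure pattern.
    """
    return sum(binary_entropy(p) for p in failure_probs)


def probe_syndrome(path, failed):
    """Syndrome of a probe along `path`, ASSUMING an OR model:
    1 if any element on the path has failed, 0 otherwise."""
    return int(any(element in failed for element in path))


if __name__ == "__main__":
    # 100 links, each failing independently with probability 1%.
    h = network_state_entropy([0.01] * 100)
    print(f"network state entropy: {h:.2f} bits")
    # A probe over links 1-2-3 when link 2 has failed.
    print("syndrome:", probe_syndrome([1, 2, 3], failed={2}))
```

Under these assumptions, a network with many reliable links has far fewer bits of state entropy than links, which is why a well-designed probing scheme can need far fewer probes than one probe per link.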