
    Computationally Tractable Algorithms for Finding a Subset of Non-defective Items from a Large Population

    In the classical non-adaptive group testing setup, pools of items are tested together, and the main goal of a recovery algorithm is to identify the "complete defective set" given the outcomes of different group tests. In contrast, the main goal of a "non-defective subset recovery" algorithm is to identify a "subset" of non-defective items given the test outcomes. In this paper, we present a suite of computationally efficient and analytically tractable non-defective subset recovery algorithms. By analyzing the probability of error of the algorithms, we obtain bounds on the number of tests required for non-defective subset recovery with arbitrarily small probability of error. Our analysis accounts for the impact of both the additive noise (false positives) and dilution noise (false negatives). By comparing with the information theoretic lower bounds, we show that the upper bounds on the number of tests are order-wise tight up to a $\log^2 K$ factor, where $K$ is the number of defective items. We also provide simulation results that compare the relative performance of the different algorithms and provide further insights into their practical utility. The proposed algorithms significantly outperform the straightforward approaches of testing items one-by-one, and of first identifying the defective set and then choosing the non-defective items from the complement set, in terms of the number of measurements required to ensure a given success rate. Comment: In this revision: Unified some proofs and reorganized the paper, corrected a small mistake in one of the proofs, added more references.
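    As a concrete illustration of the setting (a minimal sketch, not one of the paper's algorithms), the snippet below certifies non-defective items from noiseless non-adaptive pooled tests: any item appearing in at least one negative pool is certainly non-defective. The Bernoulli design and all parameter values are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        N, K, T = 200, 5, 60                    # population, defectives, tests
        defective = rng.choice(N, size=K, replace=False)

        A = rng.random((T, N)) < 1.0 / K        # random Bernoulli pooling design
        y = A[:, defective].any(axis=1)         # noiseless pooled test outcomes

        # Any item included in at least one negative test is non-defective.
        certified = np.flatnonzero(A[~y].any(axis=0))
        assert not set(certified) & set(defective)
        print(f"certified {certified.size} items as non-defective")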

    On Finding a Subset of Healthy Individuals from a Large Population

    In this paper, we derive mutual information based upper and lower bounds on the number of nonadaptive group tests required to identify a given number of "non-defective" items from a large population containing a small number of "defective" items. We show that a reduction in the number of tests is achievable compared to the approach of first identifying all the defective items and then picking the required number of non-defective items from the complement set. In the asymptotic regime with the population size $N \rightarrow \infty$, to identify $L$ non-defective items out of a population containing $K$ defective items, when the tests are reliable, our results show that $\frac{C_s K}{1-o(1)} (\Phi(\alpha_0, \beta_0) + o(1))$ measurements are sufficient, where $C_s$ is a constant independent of $N$, $K$ and $L$, and $\Phi(\alpha_0, \beta_0)$ is a bounded function of $\alpha_0 \triangleq \lim_{N\rightarrow \infty} \frac{L}{N-K}$ and $\beta_0 \triangleq \lim_{N\rightarrow \infty} \frac{K}{N-K}$. Further, in the nonadaptive group testing setup, we obtain rigorous upper and lower bounds on the number of tests under both dilution and additive noise models. Our results are derived using a general sparse signal model, by virtue of which, they are also applicable to other important sparse signal based applications such as compressive sensing. Comment: 32 pages, 2 figures, 3 tables, revised version of a paper submitted to IEEE Trans. Inf. Theory.
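    A minimal sketch of the two noise models analysed above, under assumed flip probabilities u (dilution) and q (additive); the function name and parameter values are illustrative, not the paper's notation.

        import numpy as np

        def noisy_outcomes(A, defective, u=0.1, q=0.05, rng=None):
            """Pooled test outcomes under dilution and additive noise."""
            rng = rng or np.random.default_rng()
            contrib = A[:, defective]               # which defectives hit each pool
            diluted = contrib & (rng.random(contrib.shape) >= u)  # false negatives
            y = diluted.any(axis=1)
            return y | (rng.random(A.shape[0]) < q)               # false positives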

    Near-Optimal Noisy Group Testing via Separate Decoding of Items

    The group testing problem consists of determining a small set of defective items from a larger set of items based on a number of tests, and is relevant in applications such as medical testing, communication protocols, pattern matching, and more. In this paper, we revisit an efficient algorithm for noisy group testing in which each item is decoded separately (Malyutov and Mateev, 1980), and develop novel performance guarantees via an information-theoretic framework for general noise models. For the special cases of no noise and symmetric noise, we find that the asymptotic number of tests required for vanishing error probability is within a factor $\log 2 \approx 0.7$ of the information-theoretic optimum at low sparsity levels, and that with a small fraction of allowed incorrectly decoded items, this guarantee extends to all sublinear sparsity levels. In addition, we provide a converse bound showing that if one tries to move slightly beyond our low-sparsity achievability threshold using separate decoding of items and i.i.d. randomized testing, the average number of items decoded incorrectly approaches that of a trivial decoder. Comment: Submitted to IEEE Journal of Selected Topics in Signal Processing.
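    A minimal sketch of separate decoding of items in the spirit described above: each item is decoded on its own by thresholding the empirical mutual information between its test-participation indicator and the outcomes. The threshold gamma is an illustrative assumption, not the paper's tuned constant.

        import numpy as np

        def separate_decode(A, y, gamma=0.05):
            """Decode each item separately via empirical mutual information."""
            T, N = A.shape
            declared = []
            for i in range(N):
                x = A[:, i]
                mi = 0.0
                for xv in (False, True):
                    for yv in (False, True):
                        p_xy = np.mean((x == xv) & (y == yv))
                        if p_xy > 0:
                            mi += p_xy * np.log(
                                p_xy / (np.mean(x == xv) * np.mean(y == yv)))
                if mi > gamma:
                    declared.append(i)   # item i declared defective
            return declared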

    How little does non-exact recovery help in group testing?

    We consider the group testing problem, in which one seeks to identify a subset of defective items within a larger set of items based on a number of tests. We characterize the information-theoretic performance limits in the presence of list decoding, in which the decoder may output a list containing more elements than the number of defectives, and the only requirement is that the true defective set is a subset of the list, or more generally, that their overlap exceeds a given threshold. We show that even under this highly relaxed criterion, in several scaling regimes the asymptotic number of tests is no smaller than in the exact recovery setting. However, we also provide examples where a reduction is provably attained. We support our theoretical findings with numerical experiments.
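    The relaxed criteria above reduce to a simple check, sketched here under assumed set representations: success if the true defective set is inside the output list, or more generally if the overlap fraction meets a threshold.

        def list_success(defective, output_list, overlap=1.0):
            # overlap=1.0 recovers the "true set is a subset of the list" criterion
            d = set(defective)
            return len(d & set(output_list)) >= overlap * len(d)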

    Discovery of low-dimensional structure in high-dimensional inference problems

    Many learning and inference problems involve high-dimensional data such as images, video or genomic data, which cannot be processed efficiently using conventional methods due to their dimensionality. However, high-dimensional data often exhibit an inherent low-dimensional structure, for instance they can often be represented sparsely in some basis or domain. The discovery of an underlying low-dimensional structure is important to develop more robust and efficient analysis and processing algorithms. The first part of the dissertation investigates the statistical complexity of sparse recovery problems, including sparse linear and nonlinear regression models, feature selection and graph estimation. We present a framework that unifies sparse recovery problems and construct an analogy to channel coding in classical information theory. We perform an information-theoretic analysis to derive bounds on the number of samples required to reliably recover sparsity patterns independent of any specific recovery algorithm. In particular, we show that sample complexity can be tightly characterized using a mutual information formula similar to channel coding results. Next, we derive major extensions to this framework, including dependent input variables and a lower bound for sequential adaptive recovery schemes, which helps determine whether adaptivity provides performance gains. We compute statistical complexity bounds for various sparse recovery problems, showing our analysis improves upon the existing bounds and leads to intuitive results for new applications.

    In the second part, we investigate methods for improving the computational complexity of subgraph detection in graph-structured data, where we aim to discover anomalous patterns present in a connected subgraph of a given graph. This problem arises in many applications such as detection of network intrusions, community detection, detection of anomalous events in surveillance videos or disease outbreaks. Since optimization over connected subgraphs is a combinatorial and computationally difficult problem, we propose a convex relaxation that offers a principled approach to incorporating connectivity and conductance constraints on candidate subgraphs. We develop a novel nearly-linear time algorithm to solve the relaxed problem, establish convergence and consistency guarantees and demonstrate its feasibility and performance with experiments on real networks.
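    The channel-coding analogy mentioned above typically yields bounds of the following shape; this is an illustrative form with assumed notation (support S of size K among N variables, n samples), not a formula quoted from the dissertation:

        % Illustrative mutual-information sample-complexity bound (assumed form)
        n \;\gtrsim\; \max_{\emptyset \neq S' \subseteq S}
            \frac{\log \binom{N-K+|S'|}{|S'|}}{I\big(X_{S'};\, Y \mid X_{S \setminus S'}\big)}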

    A Metaheuristic-Based Simulation Optimization Framework For Supply Chain Inventory Management Under Uncertainty

    The need for inventory control models for practical real-world applications is growing with the global expansion of supply chains. The widely used traditional optimization procedures usually require an explicit mathematical model formulated based on some assumptions. The validity of such models and approaches for real-world applications depends greatly upon whether the assumptions made match closely with reality. The use of metaheuristics, as opposed to a traditional method, does not require such assumptions and has allowed more realistic modeling of the inventory control system and its solution. In this dissertation, a metaheuristic-based simulation optimization framework is developed for supply chain inventory management under uncertainty. In the proposed framework, any effective metaheuristic can be employed to serve as the optimizer to intelligently search the solution space, using an appropriate simulation inventory model as the evaluation module. To be realistic and practical, the proposed framework supports inventory decision-making under supply-side and demand-side uncertainty in a supply chain. The supply-side uncertainty specifically considered includes quality imperfection. As far as demand-side uncertainty is concerned, the new framework does not make any assumption on demand distribution and can process any demand time series. This salient feature enables users to have the flexibility to evaluate data of practical relevance. In addition, other realistic factors, such as capacity constraints, limited shelf life of products and type-compatible substitutions, are also considered and studied by the new framework. The proposed framework has been applied to single-vendor multi-buyer supply chains with the single vendor facing the direct impact of quality deviation and capacity constraint from its supplier and the buyers facing demand uncertainty. In addition, it has been extended to the supply chain inventory management of highly perishable products. Blood products with limited shelf life and ABO compatibility have been examined in detail. It is expected that the proposed framework can be easily adapted to different supply chain systems, including healthcare organizations. Computational results have shown that the proposed framework can effectively assess the impacts of different realistic factors on the performance of a supply chain from different angles, and determine the optimal inventory policies accordingly.
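    A minimal sketch of the optimizer-plus-simulator structure described above: a candidate (s, S) inventory policy is scored by simulating it against an arbitrary demand time series, and a simple local search stands in for the metaheuristic (any GA, PSO or SA could replace it). The policy form, cost parameters and move scheme are illustrative assumptions, not the dissertation's models.

        import random

        def simulate_cost(policy, demand, hold=1.0, short=10.0, order=50.0):
            """Score an (s, S) policy over a given demand time series."""
            s, S = policy
            inv, cost = S, 0.0
            for d in demand:
                inv -= d
                cost += hold * max(inv, 0) + short * max(-inv, 0)
                if inv < s:                      # reorder up to S
                    cost += order
                    inv = S
            return cost

        def optimize(demand, iters=500, rng=random.Random(0)):
            """Local search stand-in for the metaheuristic optimizer."""
            best = (10, 50)
            best_cost = simulate_cost(best, demand)
            for _ in range(iters):
                s = max(0, best[0] + rng.randint(-5, 5))
                S = max(s + 1, best[1] + rng.randint(-5, 5))
                c = simulate_cost((s, S), demand)
                if c < best_cost:                # keep improving neighbours
                    best, best_cost = (s, S), c
            return best, best_cost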

    Group testing: an information theory perspective

    The group testing problem concerns discovering a small number of defective items within a large population by performing tests on pools of items. A test is positive if the pool contains at least one defective, and negative if it contains no defectives. This is a sparse inference problem with a combinatorial flavour, with applications in medical testing, biology, telecommunications, information technology, data science, and more. In this monograph, we survey recent developments in the group testing problem from an information-theoretic perspective. We cover several related developments: efficient algorithms with practical storage and computation requirements, achievability bounds for optimal decoding methods, and algorithm-independent converse bounds. We assess the theoretical guarantees not only in terms of scaling laws, but also in terms of the constant factors, leading to the notion of the "rate" of group testing, indicating the amount of information learned per test. Considering both noiseless and noisy settings, we identify several regimes where existing algorithms are provably optimal or near-optimal, as well as regimes where there remains greater potential for improvement. In addition, we survey results concerning a number of variations on the standard group testing problem, including partial recovery criteria, adaptive algorithms with a limited number of stages, constrained test designs, and sublinear-time algorithms. Comment: Survey paper, 140 pages, 19 figures. To be published in Foundations and Trends in Communications and Information Theory.
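    For reference, the rate mentioned above is defined in this line of work as the number of bits of information learned per test; with n items, k defectives and T tests:

        % Rate of a group testing scheme (standard definition in this literature)
        \text{rate} \;=\; \frac{\log_2 \binom{n}{k}}{T} \ \text{bits per test}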

    Optimum Allocation of Inspection Stations in Multistage Manufacturing Processes by Using Max-Min Ant System

    In multistage manufacturing processes it is common to locate inspection stations after some or all of the processing workstations. The purpose of the inspection is to reduce the total manufacturing cost, which results from unidentified defective items being processed unnecessarily through subsequent manufacturing operations. This total cost is the sum of the costs of production, inspection and failures (during production and after shipment). Introducing inspection stations into a serial multistage manufacturing process, although constituting an additional cost, is expected to be a profitable course of action. Specifically, at some positions the associated inspection costs will be recovered from the benefits realised through the detection of defective items, before additional cost is wasted by continuing to process them. In this research, a novel general cost model for allocating a limited number of inspection stations in serial multistage manufacturing processes is formulated. In the allocation of inspection stations (AOIS) problem, as the number of workstations increases, the number of inspection station allocation possibilities increases exponentially. To identify the appropriate approach for the AOIS problem, different optimisation methods are investigated. The MAX-MIN Ant System (MMAS) algorithm is proposed as a novel approach to explore AOIS in serial multistage manufacturing processes. MMAS is an ant colony optimisation algorithm that was designed originally to begin with an explorative search phase and, subsequently, to make a slow transition to the intensive exploitation of the best solutions found during the search, by allowing only one ant to update the pheromone trails. Two novel forms of heuristic information for the MMAS algorithm are created and exploited as a novel means to guide ants towards building reasonably good solutions from the very beginning of the search. To improve the performance of the MMAS algorithm, six well-known local search methods suitable for the AOIS problem are used. Selecting relevant parameter values for the MMAS algorithm can have a great impact on the algorithm’s performance. As a result, a method for tuning the most influential parameter values for the MMAS algorithm is developed. The contribution of this research is that, for the first time, a methodology using MMAS to solve the AOIS problem in serial multistage manufacturing processes has been developed. The methodology takes into account the constraints on inspection resources, in terms of a limited number of inspection stations. As a result, the total manufacturing cost of a product can be reduced, while maintaining the quality of the product. Four numerical experiments are conducted to assess the MMAS algorithm for the AOIS problem. The performance of the MMAS algorithm is compared with a number of other methods, including the complete enumeration method (CEM), a rule of thumb, a pure random search algorithm, particle swarm optimisation, simulated annealing and a genetic algorithm. The experimental results show that the effectiveness of the MMAS algorithm lies in its considerably shorter execution time and robustness. Further, under certain conditions the results obtained by the MMAS algorithm are identical to those of the CEM. In addition, the results show that applying local search to the MMAS algorithm significantly improves its performance. The results also demonstrate that it is essential to use heuristic information with the MMAS algorithm for the AOIS problem in order to obtain a high-quality solution. It was found that the main parameters of MMAS, namely the pheromone trail intensity, the heuristic information and the pheromone evaporation rate, become less sensitive within the specified range as the number of workstations increases significantly.
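    A minimal sketch of the MMAS mechanics described above, applied to a binary inspect/skip decision at each of W workstations: only the best-so-far ant reinforces the pheromone trails, which are clamped to [tau_min, tau_max]. The cost function is a user-supplied stand-in mapping a W-length boolean allocation to total manufacturing cost; all parameter values are illustrative assumptions, and the thesis's heuristic information and local search are omitted.

        import random

        def mmas_aois(cost, W, ants=20, iters=100, rho=0.02,
                      tau_min=0.01, tau_max=1.0, rng=random.Random(0)):
            """MAX-MIN Ant System sketch for a binary allocation problem."""
            tau = [[tau_max, tau_max] for _ in range(W)]  # pheromone per choice
            best, best_cost = None, float("inf")
            for _ in range(iters):
                for _ in range(ants):
                    sol = [rng.random() < tau[w][1] / (tau[w][0] + tau[w][1])
                           for w in range(W)]             # True = inspect here
                    c = cost(sol)
                    if c < best_cost:
                        best, best_cost = sol, c
                for w in range(W):
                    for j in (0, 1):
                        tau[w][j] *= 1 - rho              # evaporation
                    tau[w][int(best[w])] += rho           # only the best ant
                    for j in (0, 1):                      # deposits pheromone
                        tau[w][j] = min(tau_max, max(tau_min, tau[w][j]))
            return best, best_cost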