Computationally Tractable Algorithms for Finding a Subset of Non-defective Items from a Large Population
In the classical non-adaptive group testing setup, pools of items are tested
together, and the main goal of a recovery algorithm is to identify the
"complete defective set" given the outcomes of different group tests. In
contrast, the main goal of a "non-defective subset recovery" algorithm is to
identify a "subset" of non-defective items given the test outcomes. In this
paper, we present a suite of computationally efficient and analytically
tractable non-defective subset recovery algorithms. By analyzing the
probability of error of the algorithms, we obtain bounds on the number of tests
required for non-defective subset recovery with arbitrarily small probability
of error. Our analysis accounts for the impact of both the additive noise
(false positives) and dilution noise (false negatives). By comparing with the
information theoretic lower bounds, we show that the upper bounds on the number
of tests are order-wise tight up to a factor that depends on the number
of defective items. We also provide simulation results that compare the
relative performance of the different algorithms and provide further insights
into their practical utility. The proposed algorithms significantly outperform
the straightforward approaches of testing items one-by-one, and of first
identifying the defective set and then choosing the non-defective items from
the complement set, in terms of the number of measurements required to ensure a
given success rate.
Comment: In this revision: Unified some proofs and reorganized the paper,
corrected a small mistake in one of the proofs, added more references.
On Finding a Subset of Healthy Individuals from a Large Population
In this paper, we derive mutual information based upper and lower bounds on
the number of nonadaptive group tests required to identify a given number of
"non defective" items from a large population containing a small number of
"defective" items. We show that a reduction in the number of tests is
achievable compared to the approach of first identifying all the defective
items and then picking the required number of non-defective items from the
complement set. In the asymptotic regime of a growing population size, to
identify a given number of non-defective items out of a population containing
a given number of defective items, when the tests are reliable, our results
give a sufficient number of measurements, expressed through a constant and a
bounded function of the problem parameters. Further, in the nonadaptive group
testing setup, we obtain rigorous upper and lower bounds on the number of tests
under both dilution and additive noise models. Our results are derived using a
general sparse signal model, by virtue of which, they are also applicable to
other important sparse signal based applications such as compressive sensing.
Comment: 32 pages, 2 figures, 3 tables, revised version of a paper submitted
to IEEE Trans. Inf. Theory.
Near-Optimal Noisy Group Testing via Separate Decoding of Items
The group testing problem consists of determining a small set of defective
items from a larger set of items based on a number of tests, and is relevant in
applications such as medical testing, communication protocols, pattern
matching, and more. In this paper, we revisit an efficient algorithm for noisy
group testing in which each item is decoded separately (Malyutov and Mateev,
1980), and develop novel performance guarantees via an information-theoretic
framework for general noise models. For the special cases of no noise and
symmetric noise, we find that the asymptotic number of tests required for
vanishing error probability is within a factor of the
information-theoretic optimum at low sparsity levels, and that with a small
fraction of allowed incorrectly decoded items, this guarantee extends to all
sublinear sparsity levels. In addition, we provide a converse bound showing
that if one tries to move slightly beyond our low-sparsity achievability
threshold using separate decoding of items and i.i.d. randomized testing, the
average number of items decoded incorrectly approaches that of a trivial
decoder.
Comment: Submitted to IEEE Journal of Selected Topics in Signal Processing.
How little does non-exact recovery help in group testing?
We consider the group testing problem, in which one seeks to identify a subset of defective items within a larger set of items based on a number of tests. We characterize the information-theoretic performance limits in the presence of list decoding, in which the decoder may output a list containing more elements than the number of defectives, and the only requirement is that the true defective set is a subset of the list, or more generally, that their overlap exceeds a given threshold. We show that even under this highly relaxed criterion, in several scaling regimes the asymptotic number of tests is no smaller than in the exact recovery setting. However, we also provide examples where a reduction is provably attained. We support our theoretical findings with numerical experiments.
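The relaxed success criterion described above can be written as a small check. This is a minimal sketch: the function name and the `min_overlap` parameter are hypothetical, and treating the threshold as "at least `min_overlap` true defectives in the list" is an assumption about how the overlap condition is counted.

```python
def list_decoding_success(true_defectives, decoded_list, min_overlap=None):
    """Check a list-decoding outcome against the true defective set."""
    true_set, out = set(true_defectives), set(decoded_list)
    if min_overlap is None:
        # Strict form: every true defective must appear in the list.
        return true_set <= out
    # Relaxed form: the list only needs to overlap enough with the truth.
    return len(true_set & out) >= min_overlap

ok_superset = list_decoding_success({3, 7}, [1, 3, 7, 9])
ok_partial = list_decoding_success({3, 7}, [3, 9], min_overlap=1)
fails_strict = list_decoding_success({3, 7}, [3, 9])
```

The list may be longer than the defective set, which is exactly what makes the criterion weaker than exact recovery.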
Discovery of low-dimensional structure in high-dimensional inference problems
Many learning and inference problems involve high-dimensional data such as images, video or genomic data, which cannot be processed efficiently using conventional methods due to their dimensionality. However, high-dimensional data often exhibit an inherent low-dimensional structure, for instance they can often be represented sparsely in some basis or domain. The discovery of an underlying low-dimensional structure is important to develop more robust and efficient analysis and processing algorithms.
The first part of the dissertation investigates the statistical complexity of sparse recovery problems, including sparse linear and nonlinear regression models, feature selection and graph estimation. We present a framework that unifies sparse recovery problems and construct an analogy to channel coding in classical information theory. We perform an information-theoretic analysis to derive bounds on the number of samples required to reliably recover sparsity patterns independent of any specific recovery algorithm. In particular, we show that sample complexity can be tightly characterized using a mutual information formula similar to channel coding results. Next, we derive major extensions to this framework, including dependent input variables and a lower bound for sequential adaptive recovery schemes, which helps determine whether adaptivity provides performance gains. We compute statistical complexity bounds for various sparse recovery problems, showing our analysis improves upon the existing bounds and leads to intuitive results for new applications.
In the second part, we investigate methods for improving the computational complexity of subgraph detection in graph-structured data, where we aim to discover anomalous patterns present in a connected subgraph of a given graph. This problem arises in many applications such as detection of network intrusions, community detection, and detection of anomalous events in surveillance videos or disease outbreaks. Since optimization over connected subgraphs is a combinatorial and computationally difficult problem, we propose a convex relaxation that offers a principled approach to incorporating connectivity and conductance constraints on candidate subgraphs. We develop a novel nearly-linear time algorithm to solve the relaxed problem, establish convergence and consistency guarantees, and demonstrate its feasibility and performance with experiments on real networks.
A Metaheuristic-Based Simulation Optimization Framework For Supply Chain Inventory Management Under Uncertainty
The need for inventory control models for practical real-world applications is growing with the global expansion of supply chains. The widely used traditional optimization procedures usually require an explicit mathematical model formulated based on some assumptions. The validity of such models and approaches for real-world applications depends greatly upon whether the assumptions made closely match reality. The use of metaheuristics, as opposed to a traditional method, does not require such assumptions and has allowed more realistic modeling of the inventory control system and its solution. In this dissertation, a metaheuristic-based simulation optimization framework is developed for supply chain inventory management under uncertainty. In the proposed framework, any effective metaheuristic can be employed to serve as the optimizer to intelligently search the solution space, using an appropriate simulation inventory model as the evaluation module. To be realistic and practical, the proposed framework supports inventory decision-making under supply-side and demand-side uncertainty in a supply chain. The supply-side uncertainty specifically considered includes quality imperfection. As far as demand-side uncertainty is concerned, the new framework does not make any assumption on demand distribution and can process any demand time series. This salient feature gives users the flexibility to evaluate data of practical relevance. In addition, other realistic factors, such as capacity constraints, limited shelf life of products and type-compatible substitutions, are also considered and studied by the new framework. The proposed framework has been applied to single-vendor multi-buyer supply chains with the single vendor facing the direct impact of quality deviation and capacity constraint from its supplier and the buyers facing demand uncertainty. In addition, it has been extended to the supply chain inventory management of highly perishable products.
Blood products with limited shelf life and ABO compatibility have been examined in detail. It is expected that the proposed framework can be easily adapted to different supply chain systems, including healthcare organizations. Computational results have shown that the proposed framework can effectively assess the impacts of different realistic factors on the performance of a supply chain from different angles, and can determine the optimal inventory policies accordingly.
Group testing: an information theory perspective
The group testing problem concerns discovering a small number of defective
items within a large population by performing tests on pools of items. A test
is positive if the pool contains at least one defective, and negative if it
contains no defectives. This is a sparse inference problem with a combinatorial
flavour, with applications in medical testing, biology, telecommunications,
information technology, data science, and more. In this monograph, we survey
recent developments in the group testing problem from an information-theoretic
perspective. We cover several related developments: efficient algorithms with
practical storage and computation requirements, achievability bounds for
optimal decoding methods, and algorithm-independent converse bounds. We assess
the theoretical guarantees not only in terms of scaling laws, but also in terms
of the constant factors, leading to the notion of the "rate" of group
testing, indicating the amount of information learned per test. Considering
both noiseless and noisy settings, we identify several regimes where existing
algorithms are provably optimal or near-optimal, as well as regimes where there
remains greater potential for improvement. In addition, we survey results
concerning a number of variations on the standard group testing problem,
including partial recovery criteria, adaptive algorithms with a limited number
of stages, constrained test designs, and sublinear-time algorithms.
Comment: Survey paper, 140 pages, 19 figures. To be published in Foundations
and Trends in Communications and Information Theory.
Optimum Allocation of Inspection Stations in Multistage Manufacturing Processes by Using Max-Min Ant System
In multistage manufacturing processes it is common to locate inspection stations after some or all of the processing workstations. The purpose of the inspection is to reduce the total manufacturing cost resulting from unidentified defective items being processed unnecessarily through subsequent manufacturing operations. This total cost is the sum of the costs of production, inspection and failures (during production and after shipment). Introducing inspection stations into a serial multistage manufacturing process, although constituting an additional cost, is expected to be a profitable course of action. Specifically, at some positions the associated inspection costs will be recovered from the benefits realised by detecting defective items before additional cost is wasted on continuing to process them.
In this research, a novel general cost model for allocating a limited number of inspection stations in serial multistage manufacturing processes is formulated. In the allocation of inspection stations (AOIS) problem, as the number of workstations increases, the number of inspection station allocation possibilities increases exponentially. To identify the appropriate approach for the AOIS problem, different optimisation methods are investigated. The MAX-MIN Ant System (MMAS) algorithm is proposed as a novel approach to explore AOIS in serial multistage manufacturing processes. MMAS is an ant colony optimisation algorithm that was designed originally to begin with an explorative search phase and, subsequently, to make a slow transition to the intensive exploitation of the best solutions found during the search, by allowing only one ant to update the pheromone trails. Two novel kinds of heuristic information for the MMAS algorithm are created. The heuristic information is exploited as a novel means to guide ants to build reasonably good solutions from the very beginning of the search. To improve the performance of the MMAS algorithm, six local search methods which are well-known and suitable for the AOIS problem are used. Selecting relevant parameter values for the MMAS algorithm can have a great impact on the algorithm’s performance. As a result, a method for tuning the most influential parameter values for the MMAS algorithm is developed.
The contribution of this research is that, for the first time, a methodology using MMAS to solve the AOIS problem in serial multistage manufacturing processes has been developed. The methodology takes into account the constraints on inspection resources, in terms of a limited number of inspection stations. As a result, the total manufacturing cost of a product can be reduced, while maintaining the quality of the product. Four numerical experiments are conducted to assess the MMAS algorithm for the AOIS problem. The performance of the MMAS algorithm is compared with a number of other methods, including the complete enumeration method (CEM), rule of thumb, a pure random search algorithm, particle swarm optimisation, simulated annealing and a genetic algorithm. The experimental results show that the effectiveness of the MMAS algorithm lies in its considerably shorter execution time and robustness. Further, in certain conditions results obtained by the MMAS algorithm are identical to those of the CEM. In addition, the results show that applying local search to the MMAS algorithm significantly improves the performance of the algorithm. The results also demonstrate that it is essential to use heuristic information with the MMAS algorithm for the AOIS problem in order to obtain a high-quality solution. It was found that the main parameters of MMAS, including the pheromone trail intensity, heuristic information and evaporation of pheromone, are less sensitive within the specified range as the number of workstations is significantly increased.
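The pheromone update that the abstract above attributes to MMAS (evaporation, then a deposit by a single ant) can be sketched as follows. This follows the generic MAX-MIN Ant System scheme, not the thesis's own implementation: the function name, the `1 / best_cost` deposit, and the use of workstation indices as solution components are assumptions made for illustration.

```python
def mmas_update(pheromone, best_solution, best_cost, rho, tau_min, tau_max):
    """One MMAS pheromone update over a dict {component: trail value}."""
    updated = {}
    for component, tau in pheromone.items():
        tau = (1.0 - rho) * tau       # evaporation on every component
        if component in best_solution:
            tau += 1.0 / best_cost    # only the single best ant deposits
        # Clamp the trail to [tau_min, tau_max], the "MAX-MIN" part.
        updated[component] = min(tau_max, max(tau_min, tau))
    return updated

# Hypothetical instance: components are workstation indices after which an
# inspection station may be placed; the best ant placed one after station 1.
trails = mmas_update({0: 1.0, 1: 1.0, 2: 1.0}, best_solution={1},
                     best_cost=4.0, rho=0.5, tau_min=0.6, tau_max=0.7)
```

Clamping keeps every placement selectable with non-zero probability, which supports the early explorative phase before the search shifts to exploiting the best solutions found.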