
    Classification via sequential testing

    The problem of generating the sequence of tests required to reach a diagnostic conclusion with minimum average cost, also known as the test sequencing problem, is considered. The test sequencing problem is formulated as an optimal binary AND/OR decision tree construction problem, which is known to be NP-complete. The problem can be solved optimally using dynamic programming or AND/OR graph search methods (AO*, CF, and HS). However, for large systems, the computational effort associated with these methods is substantial because the number of nodes in the AND/OR search graph grows rapidly. To prevent this computational explosion, one-step or multistep lookahead heuristic algorithms have been developed to solve the test sequencing problem. Our approach integrates concepts from the one-step lookahead heuristic algorithms with the strategies used in Huffman coding. The effectiveness of the algorithms is demonstrated on several test cases. The traditional test sequencing problem is also generalized here to include asymmetrical tests. Our approach to test sequencing can be adapted to solve a wide variety of binary identification problems arising in decision table programming, medical diagnosis, database query processing, quality assurance, and pattern recognition.
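
To make the idea of a one-step lookahead heuristic concrete, here is a minimal Python sketch. It is not the authors' algorithm (it does not include the Huffman-coding ideas mentioned in the abstract); it only illustrates a common greedy rule, chosen here as an assumption: at each node, pick the test with the largest information gain per unit cost. The names `binary_entropy` and `build_tree`, the dictionary layout, and the assumptions of perfect binary tests and positive prior probabilities are all hypothetical choices made for illustration.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary outcome that occurs with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def build_tree(probs, tests, costs, states):
    """Greedy one-step lookahead test sequencing (illustrative sketch).

    probs  : dict state -> prior probability (assumed positive)
    tests  : dict test name -> dict state -> 1 if the test fails in that state
    costs  : dict test name -> cost of running the test
    states : non-empty list of states still possible at this node

    Returns (tree, expected_cost). A leaf is a state name, or a tuple of
    states when no remaining test can separate them.
    """
    if len(states) == 1:
        return states[0], 0.0

    mass = sum(probs[s] for s in states)  # probability of reaching this node
    best = None  # (score, test name, failing states, passing states)
    for name, outcomes in tests.items():
        fail = [s for s in states if outcomes[s]]
        ok = [s for s in states if not outcomes[s]]
        if not fail or not ok:
            continue  # the test does not split the ambiguity set
        p_fail = sum(probs[s] for s in fail) / mass
        # For a deterministic binary test, the information gain equals
        # the entropy of its outcome.
        score = binary_entropy(p_fail) / costs[name]
        if best is None or score > best[0]:
            best = (score, name, fail, ok)

    if best is None:
        return tuple(states), 0.0

    _, name, fail, ok = best
    fail_tree, fail_cost = build_tree(probs, tests, costs, fail)
    pass_tree, pass_cost = build_tree(probs, tests, costs, ok)
    expected = costs[name] * mass + fail_cost + pass_cost
    return {name: {"fail": fail_tree, "pass": pass_tree}}, expected
```

The expected cost is accumulated as the cost of each chosen test weighted by the probability of reaching its node, which matches the "minimum average cost" objective only approximately, since the greedy rule carries no optimality guarantee.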

    Change-point model on nonhomogeneous Poisson processes with application in copy number profiling by next-generation DNA sequencing

    We propose a flexible change-point model for inhomogeneous Poisson processes, which arise naturally from next-generation DNA sequencing, and derive score and generalized likelihood statistics for shifts in intensity functions. We construct a modified Bayesian information criterion (mBIC) to guide model selection, and point-wise approximate Bayesian confidence intervals for assessing the confidence in the segmentation. The model is applied to DNA copy number profiling with sequencing data and evaluated on simulated spike-in and real data sets. Comment: Published at http://dx.doi.org/10.1214/11-AOAS517 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
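
As a rough illustration of a likelihood-based change-point scan, the Python sketch below handles a much simpler setting than the abstract describes: a single sequence of binned counts, a rate that is constant within each segment, and at most one change point. It computes the Poisson log-likelihood ratio for a split after bin k and returns the best split. The function names `xlogx_over` and `poisson_glr_scan` are hypothetical, and neither the paper's statistics for sequencing data nor its mBIC are reproduced here.

```python
import math


def xlogx_over(s, m):
    """Return s * log(s / m), using the convention 0 * log(0) = 0."""
    return s * math.log(s / m) if s > 0 else 0.0


def poisson_glr_scan(counts):
    """Scan for a single change in Poisson rate across equally sized bins.

    For a split after bin k, the log-likelihood ratio against a constant
    rate is
        S1*log(S1/k) + S2*log(S2/(n-k)) - S*log(S/n),
    where S1 and S2 are the counts left and right of the split and S is
    the total. The linear rate terms cancel at the maximum-likelihood rates.

    Returns (best_k, best_stat); best_k is None if no split improves the fit.
    """
    n = len(counts)
    total = sum(counts)
    null_term = xlogx_over(total, n)

    best_k, best_stat = None, 0.0
    left = 0
    for k in range(1, n):
        left += counts[k - 1]
        right = total - left
        stat = xlogx_over(left, k) + xlogx_over(right, n - k) - null_term
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat
```

In practice the maximum statistic would be compared against a threshold or a model-selection penalty before a change point is declared; that step is left out of this sketch.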

    Quantitative Comparison of Abundance Structures of Generalized Communities: From B-Cell Receptor Repertoires to Microbiomes

    The community, the assemblage of organisms co-existing in a given space and time, has the potential to become one of the unifying concepts of biology, especially with the advent of high-throughput sequencing experiments that reveal genetic diversity exhaustively. In this spirit we show that a tool from community ecology, the Rank Abundance Distribution (RAD), can be turned by the new MaxRank normalization method into a generic, expressive descriptor for quantitative comparison of communities in many areas of biology. To illustrate the versatility of the method, we analyze RADs from various generalized communities, i.e., assemblages of genetically diverse cells or organisms, including human B cells, gut microbiomes under antibiotic treatment and of different ages and countries of origin, and other human and environmental microbial communities. We show that normalized RADs enable novel quantitative approaches that help to understand the structures and dynamics of complex generalized communities.
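
For readers unfamiliar with rank abundance distributions, the short Python sketch below shows how a plain RAD can be computed from raw counts: types are sorted by decreasing abundance and each count is divided by the total. The function name `rank_abundance` is hypothetical, and the MaxRank normalization named in the abstract is not implemented here, since the abstract does not describe its details.

```python
def rank_abundance(counts):
    """Return a plain rank abundance distribution from raw counts.

    counts : iterable of non-negative abundances, one per type
             (for example, per species or per B-cell clone)

    Types with zero abundance are dropped. The result is a list of
    (rank, relative_abundance) pairs, with rank 1 the most abundant type.
    """
    positive = sorted((c for c in counts if c > 0), reverse=True)
    total = sum(positive)
    return [(rank, c / total) for rank, c in enumerate(positive, start=1)]
```

Two communities with different richness yield RADs of different lengths, which is the comparison problem a normalization step is meant to address.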