466 research outputs found

    Constraining the Number of Positive Responses in Adaptive, Non-Adaptive, and Two-Stage Group Testing

    Group testing is a well-known search problem that consists of detecting the defective members of a set of objects O by performing tests on properly chosen subsets (pools) of the given set O. In classical group testing the goal is to find all defectives by using as few tests as possible. We consider a variant of classical group testing in which one is concerned not only with minimizing the total number of tests but also with reducing the number of tests involving defective elements. The rationale behind this search model is that in many practical applications the devices used for the tests are subject to deterioration due to exposure to or interaction with the defective elements. In this paper we consider adaptive, non-adaptive and two-stage group testing. For all three considered scenarios, we derive upper and lower bounds on the number of "yes" responses that must be admitted by any strategy performing at most a certain number t of tests. In particular, for the adaptive case we provide an algorithm that uses a number of "yes" responses that exceeds the given lower bound by a small constant. Interestingly, this bound can also be asymptotically attained by our two-stage algorithm, which is a phenomenon analogous to the one occurring in classical group testing. For the non-adaptive scenario we give almost matching upper and lower bounds on the number of "yes" responses. In particular, we give two constructions, both achieving the same asymptotic bound. An interesting feature of one of these constructions is that it is explicit. The bounds for the non-adaptive and the two-stage cases follow from the bounds on the optimal sizes of new variants of d-cover free families and (p,d)-cover free families introduced in this paper, which we believe may also be of interest in other contexts.
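
To make the two quantities in this abstract concrete (total tests versus "yes" responses), here is a minimal sketch of generic adaptive group testing by repeated halving. It is not the algorithm from the paper, which is not described in the abstract; the function name `binary_splitting` and the simulated test oracle are assumptions made for illustration only.

```python
from typing import List, Set, Tuple


def binary_splitting(items: List[int], defectives: Set[int]) -> Tuple[Set[int], int, int]:
    """Find all defectives by repeated halving.

    Returns (found, tests, yes_responses). The set `defectives` is only
    used to simulate the outcome of each pooled test.
    """
    tests = 0
    yes_responses = 0

    def pool_test(pool: List[int]) -> bool:
        # A pooled test answers "yes" iff the pool contains a defective.
        nonlocal tests, yes_responses
        tests += 1
        positive = any(x in defectives for x in pool)
        if positive:
            yes_responses += 1
        return positive

    found: Set[int] = set()
    stack = [items]
    while stack:
        pool = stack.pop()
        if not pool or not pool_test(pool):
            continue  # empty pool or "no" response: nothing to find here
        if len(pool) == 1:
            found.add(pool[0])  # a positive singleton is a defective
            continue
        mid = len(pool) // 2
        stack.append(pool[:mid])
        stack.append(pool[mid:])
    return found, tests, yes_responses


found, tests, yes_responses = binary_splitting(list(range(8)), {5})
print(found, tests, yes_responses)
```

In this plain halving scheme every level of the search produces a "yes" response on the path to each defective, which is the kind of cost the variant described above tries to limit.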

    Ortho2ExpressMatrix—a web server that interprets cross-species gene expression data by gene family information

    Background: The study of gene families is pivotal for understanding gene evolution across different organisms, and such phylogenetic background is often used to infer the biochemical functions of genes. Modern high-throughput experiments offer the possibility to analyze the entire transcriptome of an organism; however, it is often difficult to deduce functional information from those data.
    Results: To improve functional interpretation of gene expression we introduce Ortho2ExpressMatrix, a novel tool that integrates complex gene family information, computed from sequence similarity, with comparative gene expression profiles of two pre-selected biological objects: gene families are displayed as two-dimensional matrices. Parameters of the tool are object type (two organisms, two individuals, two tissues, etc.), type of computational gene family inference, experimental meta-data, microarray platform, gene annotation level and genome build. Family information in Ortho2ExpressMatrix is based on computationally different protein family approaches such as EnsemblCompara, InParanoid, SYSTERS and Ensembl Family. Currently, the respective all-against-all associations are available for five species: human, mouse, worm, fruit fly and yeast. Additionally, microRNA expression can be examined with respect to miRBase or TargetScan families. The visualization, which is typical for Ortho2ExpressMatrix, is performed as a matrix view that displays functional traits of genes (differential expression) as well as sequence similarity of protein family members (BLAST e-values) in colour codes. Such translations are intended to facilitate the user's perception of the research object.
    Conclusions: Ortho2ExpressMatrix integrates gene family information with genome-wide expression data in order to enhance functional interpretation of high-throughput analyses on diseases, environmental factors, or genetic modification or compound treatment experiments. The tool explores differential gene expression in the light of orthology, paralogy and structure of gene families up to the point of ambiguity analyses. Results can be used for filtering and prioritization in functional genomic, biomedical and systems biology applications. The web server is freely accessible at http://bioinf-data.charite.de/o2em/cgi-bin/o2em.pl.

    Bayesian Optimization Algorithm for Non-unique Oligonucleotide Probe Selection

    One important application of DNA microarrays is measuring the expression levels of genes. The quality of the microarray design, which includes selecting short oligonucleotide sequences (probes) to be affixed to the surface of the microarray, becomes a major issue. A good design is one that contains the minimum possible number of probes while retaining an acceptable ability to identify the targets present in the sample. We focus on the problem of computing the minimal set of probes that is able to identify each target of a sample, referred to as Non-unique Oligonucleotide Probe Selection. We present the application of an Estimation of Distribution Algorithm named Bayesian Optimization Algorithm (BOA) to this problem, and consider the integration of BOA with one simple heuristic. We also present the application of our method, integrated with a decoding approach in a multiobjective optimization framework, for solving the problem in the case of multiple targets in the sample.

    Non-Unique oligonucleotide probe selection heuristics

    The non-unique probe selection problem consists of selecting both unique and non-unique oligonucleotide probes for oligonucleotide microarrays, which are widely used tools to identify viruses or bacteria in biological samples. The non-unique probes, designed to hybridize to at least one target, are used as alternatives when the design of unique probes is particularly difficult for closely related target genes. The goal of the non-unique probe selection problem is to determine a smallest set of probes able to identify all targets present in a biological sample. This problem is known to be NP-hard. In this thesis, several novel heuristics, based on a greedy strategy, genetic algorithms and an evolutionary strategy respectively, are presented for the minimization problem arising from non-unique probe selection, using the best-known ILP formulation. Experimental results show that our methods are capable of reducing the number of probes required compared with state-of-the-art methods.
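
As a rough illustration of what a greedy strategy for this minimization problem can look like, the sketch below treats the single-target case as a set-cover instance. It is not one of the heuristics from the thesis. It assumes (this is not stated in the abstract) that a probe set "identifies" the targets when every target is hit by at least one chosen probe and every pair of targets is separated by at least one chosen probe. The names `requirements_met` and `greedy_probe_selection` and the toy probe data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def requirements_met(hybridizes_to: Set[str], targets: List[str]) -> Set[Tuple[str, ...]]:
    """Requirements a single probe satisfies (assumed model, see lead-in)."""
    # Coverage: the probe hybridizes to the target.
    met: Set[Tuple[str, ...]] = {("cover", t) for t in hybridizes_to}
    # Separation: the probe hybridizes to exactly one target of the pair.
    met |= {
        ("separate", a, b)
        for a, b in combinations(targets, 2)
        if (a in hybridizes_to) != (b in hybridizes_to)
    }
    return met


def greedy_probe_selection(probe_targets: Dict[str, Set[str]]) -> List[str]:
    """Repeatedly pick the probe that satisfies the most open requirements."""
    targets = sorted(set().union(*probe_targets.values()))
    met_by = {p: requirements_met(ts, targets) for p, ts in probe_targets.items()}
    # Only requirements that at least one candidate probe can satisfy.
    open_reqs = set().union(*met_by.values())
    chosen: List[str] = []
    while open_reqs:
        # Ties are broken by probe name so the result is deterministic.
        best = max(sorted(met_by), key=lambda p: len(met_by[p] & open_reqs))
        chosen.append(best)
        open_reqs -= met_by[best]
    return chosen


# Toy incidence data: probe name -> targets it hybridizes to (hypothetical).
probes = {"p1": {"A", "B"}, "p2": {"B", "C"}, "p3": {"A"}, "p4": {"C"}}
print(greedy_probe_selection(probes))
```

A greedy pass like this gives a feasible probe set quickly but carries no optimality guarantee, which is one reason the problem is also attacked with ILP formulations and evolutionary methods.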

    Promoting synergistic research and education in genomics and bioinformatics

    Bioinformatics and Genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology

    Expression cartography of human tissues using self organizing maps

    Background: The availability of parallel, high-throughput microarray and sequencing experiments poses the challenge of how best to arrange and analyze the resulting mass of multidimensional data in a concerted way. Self organizing maps (SOM), a machine learning method, enable a parallel sample- and gene-centered view of the data combined with strong visualization and second-level analysis capabilities. The paper addresses aspects of the method with practical impact in the context of expression analysis of complex data sets.
    Results: The method was applied to generate a SOM characterizing the whole genome expression profiles of 67 healthy human tissues selected from ten tissue categories (adipose, endocrine, homeostasis, digestion, exocrine, epithelium, sexual reproduction, muscle, immune system and nervous tissues). SOM mapping reduces the dimension of expression data from tens of thousands of genes to a few thousand metagenes, where each metagene acts as a representative of a minicluster of co-regulated single genes. Tissue-specific and common properties shared between groups of tissues emerge as a handful of localized spots in the tissue maps collecting groups of co-regulated and co-expressed metagenes. The functional context of the spots was discovered using overrepresentation analysis with respect to pre-defined gene sets of known functional impact. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. Analysis techniques normally used at the gene level, such as two-way hierarchical clustering, provide a better signal-to-noise ratio and a better representativeness of the method if applied to the metagenes. Metagene-based clustering analyses aggregate the tissues into essentially three clusters containing nervous, immune system and the remaining tissues.
    Conclusions: The global view of the behavior of a few well-defined modules of correlated and differentially expressed genes is more intuitive and more informative than the separate discovery of the expression levels of hundreds or thousands of individual genes. The metagene approach is less sensitive to a priori selection of genes. It can detect a coordinated expression pattern whose components would not pass single-gene significance thresholds, and it is able to extract context-dependent patterns of gene expression in complex data sets.

    Determining the Drivers of Anti-Tropical Distributions Across the Fish Tree of Life

    Anti-tropical distributions are those where populations of a single species, or multiple closely related taxa, are distributed outside of, and on opposing sides of, the tropics. These latitudinally disjunct distributions have been noted for over a century. Despite this long history of interest, little has been concluded regarding the actual mechanisms that drive this pattern, with several prominent hypotheses competing with one another in the literature. Here I review the proposed drivers of anti-tropicality, and subsequently test them using fishes with a variety of life history and taxonomic differences. This includes (1) a temperately restricted family with anti-tropical distributions – Cheilodactylidae, (2) a tropical reef fish family with a single temperate anti-tropical genus – Prionurus, and (3) a variety of fishes from across the fish tree of life that have populations split by the tropics. Using complete taxonomic sampling, and phylogenomic approaches coupled with fossil calibration points, I find evidence for recent equatorial divergence events in the Pleistocene and Pliocene, as well as divergence events dating to the Miocene for both Cheilodactylidae and Prionurus. Furthermore, taxonomic issues were detected and explored within both of these groups. To disentangle the multiple hypotheses that can explain recent transitions, I used ecological niche models coupled with extant distributional data for a variety of species across the fish tree of life that exhibit intra-specific anti-tropicality. These data reveal distinct support for both glacial dispersal and biotic exclusion from the tropics. These results are then interpreted in a comprehensive framework to determine what drives anti-tropical distributions in marine systems. Overall, multiple mechanisms seem responsible that act in concert over time to produce these distributions. Certain equatorial divergence events are recovered in time periods currently not associated with any anti-tropical hypotheses. It seems likely that stochastic crossing events may be important in the initial colonization of a new hemisphere.