
    DMPFinder - Finding differentiating pathways with gaps from two groups of metabolic networks

    Session 2B: Biological and Regulatory Networks. Why some strains of a species exhibit a certain phenotype (e.g. drug resistance) while other strains of the same species do not is a critical question. Studying the metabolism of the two groups of strains may reveal pathways that are conserved in the first group but not in the second. However, only a few tools provide functions to compare two groups of metabolic networks, and these are usually limited to the reaction level, not the pathway level. In this paper, we formulate the DMP (Differentiating Metabolic Pathway) problem of finding conserved pathways that exist in the first group but not in the second. The problem formulation also captures mutations in pathways and derives measures (p-value and e-score) for evaluating the confidence of the pathways. We then develop an algorithm, DMPFinder, to solve the DMP problem. Experimental results show that DMPFinder is able to identify pathways that are critical for the first group to exhibit a certain phenotype that is absent in the other group. Some of these pathways cannot be identified by other tools, which only consider the reaction level or do not take into account possible mutations among species. The software is available at: http://i.cs.hku.hk/alse/hkubrg/projects/DMPFinder/
    Postprint. The 3rd International Conference on Bioinformatics and Computational Biology (BICoB 2011), New Orleans, LA, 23-25 March 2011.
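
    To make the contrast with reaction-level comparison concrete, the following is a minimal sketch of that simpler baseline, not of DMPFinder itself. It assumes each metabolic network is reduced to a set of reaction identifiers; the function name and the identifiers (R1, R2, ...) are hypothetical. The pathway-level formulation, the handling of mutations, and the p-value and e-score are not specified in the abstract and are not attempted here.

```python
def differentiating_reactions(group_a, group_b):
    """Reactions present in every network of group_a and in no network of group_b.

    Each network is modelled as a set of reaction identifiers (an assumption
    made for this sketch; real networks also carry topology).
    """
    if not group_a:
        return set()
    # Conserved in the first group: intersection over all of its networks.
    conserved_in_a = set.intersection(*(set(net) for net in group_a))
    # Present anywhere in the second group: union over all of its networks.
    present_in_b = set().union(*(set(net) for net in group_b))
    return conserved_in_a - present_in_b


# Hypothetical example: two resistant strains versus two susceptible strains.
resistant = [{"R1", "R2", "R3"}, {"R1", "R2"}]
susceptible = [{"R2"}, {"R4"}]
result = differentiating_reactions(resistant, susceptible)
```

    A reaction-by-reaction comparison of this kind says nothing about whether the surviving reactions form a connected pathway, which is the gap the abstract describes.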

    Efficiency is dependent on the control of supply.

    At a time when health care systems are undergoing reform, it is useful to review the causes of inefficiency in health care, along with potential solutions. Such solutions can affect suppliers (supply-side measures) or users of care (demand-side measures). This paper argues that to have an efficient health care system, supply-side measures must be implemented. Some examples of supply-side measures, with particular relevance to the Hong Kong situation, are discussed. By their nature, supply-side measures require government intervention. Only then can allocative efficiency, as well as technical efficiency, be achieved. Once a health care system is operating efficiently, it is easier to determine whether the system requires more resources, either currently or in the future.

    Nonvolatile organic transistor-memory devices using various thicknesses of silver nanoparticle layers

    Author names used in this publication: C. W. Leung; P. K. L. Chan. 2010-2011 > Academic research: refereed > Publication in refereed journal. Version of Record.

    Effects of qigong training on physical and psychosocial well-being of breast cancer survivors: a systematic review


    Non-adaptive complex group testing with multiple positive sets

    LNCS v. 6648 is the conference proceedings of TAMC 2011.
    Given n items, at most d of which have a particular property (referred to as positive items), a single test on a selected subset of them is positive if the subset contains any positive item. The non-adaptive group testing problem is to design how to group the items so as to minimize the number of tests required to identify all positive items, where all tests are performed in parallel. This problem is well studied, and algorithms exist that match the lower bound up to a small gap of $\log d$ asymptotically. An important generalization of the problem is the case in which an individual positive item cannot make a test positive, but a combination of them (referred to as a positive subset) can. This problem is referred to as non-adaptive complex group testing. Assuming there are at most d positive subsets whose sizes are at most s, existing algorithms either require $\Omega(\log^s n)$ tests for general n or $O\left(\binom{s+d}{d} \log n\right)$ tests for some special values of n. However, in real situations the number of items in each test cannot be very small or very large. The above algorithms cannot be applied because they give no control over the number of items in each test. In this paper, we provide a novel and practical derandomized algorithm to construct the tests, which has two important properties. (1) Our algorithm requires only $O\left(\frac{(d+s)^{d+s+1}}{d^d s^s} \log n\right)$ tests for all positive integers n, which matches the upper bound on the number of tests when all positive subsets are singletons, i.e. s = 1. (2) All tests in our algorithm can have the same number of tested items k. Thus, our algorithm can solve the problem with additional constraints on the number of tested items in each test, such as a maximum or minimum number of tested items. © 2011 Springer-Verlag.
    Postprint. The 8th Annual Conference on Theory and Applications of Models of Computation (TAMC 2011), Tokyo, Japan, 23-25 May 2011. In Lecture Notes in Computer Science, 2011, v. 6648, p. 172-18

    MetaCluster-TA: taxonomic annotation for metagenomic data based on assembly-assisted binning

    This article is part of the supplement: Selected articles from the Twelfth Asia Pacific Bioinformatics Conference (APBC 2014): Genomics.
    Background: Taxonomic annotation of reads is an important problem in metagenomic analysis. Existing annotation tools, which rely on aligning each read to the taxonomic structure, are unable to annotate many reads efficiently and accurately because reads (100 bp) are short and most of them come from unknown genomes. Previous work has suggested assembling the reads into longer contigs before annotation. More reads/contigs can then be annotated, because a longer contig (in Kbp) can be aligned to a taxon even if it is from an unknown species, as long as it contains a conserved region of that taxon. Unfortunately, existing metagenomic assembly tools are not mature enough to produce long enough contigs. Binning tries to group reads/contigs of similar species together. Intuitively, reads in the same group (cluster) should be annotated to the same taxon, and these reads together should cover a significant portion of the genome, alleviating the problem of short contigs if the quality of binning is high. However, no existing work has tried to use binning results to help solve the annotation problem. This work explores this direction.
    Results: In this paper, we describe MetaCluster-TA, an assembly-assisted binning-based annotation tool which relies on the idea of annotating binned reads instead of aligning each read or contig to the taxonomic structure separately. We propose the novel concept of the 'virtual contig' (which can be up to 10 Kb in length) to represent a set of reads, and then represent each cluster as a set of 'virtual contigs' (which together can total up to 1 Mb in length) for annotation. MetaCluster-TA can outperform the widely used MEGAN4 and can annotate (1) more reads, since the virtual contigs are much longer; (2) more accurately, since each cluster of long virtual contigs contains global information about the sampled genome, which tends to be more accurate than short reads or assembled contigs that contain only local information about the genome; and (3) more efficiently, since there are far fewer long virtual contigs to align than short reads. MetaCluster-TA outperforms MetaCluster 5.0 as a binning tool, since binning itself can be more sensitive and precise given long virtual contigs and the binning results can be improved using the reference taxonomic database.
    Conclusions: MetaCluster-TA can outperform the widely used MEGAN4 and can annotate more reads with higher accuracy and higher efficiency. It also outperforms MetaCluster 5.0 as a binning tool.

    T-IDBA: A de novo Iterative de Bruijn Graph Assembler for Transcriptome

    LNCS v. 6577 entitled: Research in computational molecular biology: 15th annual international conference, RECOMB 2011 ... : proceedings.
    RNA-seq data produced by next-generation sequencing technology is a useful tool for analyzing transcriptomes. However, existing de novo transcriptome assemblers do not fully utilize the properties of transcriptomes and may produce short contigs because of the splicing nature (shared exons) of the genes. We propose the T-IDBA algorithm to reconstruct expressed isoforms without a reference genome. By using paired-end information to resolve long repeats in different genes and branching within the same gene due to alternative splicing, the graph can be decomposed into small components, each corresponding to a gene. The most likely isoforms with sufficient support from the paired-end reads are then found heuristically. In practice, our de novo transcriptome assembler, T-IDBA, substantially outperforms Abyss in terms of sensitivity and precision for both simulated and real data. T-IDBA is available at http://www.cs.hku.hk/~alse/tidba/. © 2011 Springer-Verlag.
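
    The branching caused by shared exons can be seen in a plain de Bruijn graph. The sketch below is a generic textbook construction, not T-IDBA: it uses a single fixed k, no iteration over k values, and no paired-end information, and the two toy "isoform" sequences are made up for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers seen in the reads."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer links its (k-1)-length prefix to its (k-1)-length suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Two hypothetical isoforms that share the middle segment "ACGT".
isoforms = ["AAACGTTT", "CCACGTGG"]
graph = de_bruijn_graph(isoforms, k=4)

# Nodes with more than one successor mark where the isoforms diverge.
branch_nodes = sorted(node for node, nxt in graph.items() if len(nxt) > 1)
```

    Here the shared segment collapses into a single path and the node at its end has two successors, which is the kind of ambiguity the abstract says paired-end reads are used to resolve.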

    Filtering of false positive microRNA candidates by a clustering-based approach

    BMC Bioinformatics.
    Background: MicroRNAs are small non-coding RNA gene products that play diverse roles from species to species. The explosive growth of microRNA research in recent years demonstrates the importance of microRNAs in biological systems, and microRNAs are believed to have valuable therapeutic potential in human diseases. Continued efforts are therefore required to locate and verify unknown microRNAs in various genomes. As many miRNAs are found to be arranged in clusters, meaning that they are in close proximity to their neighboring miRNAs, we are interested in applying the concept of microRNA clustering to computational microRNA prediction.
    Results: We first validate the microRNA clustering phenomenon in the human, mouse and rat genomes. In the three genomes, 45.45%, 51.86% and 48.67% of all miRNAs are clustered, respectively. We then conduct sequence and secondary structure similarity analyses among clustered miRNAs, non-clustered miRNAs, neighboring sequences of clustered miRNAs and random sequences, and find that clustered miRNAs are structurally more similar to one another, and that the RNAdistance score can be used to assess the structural similarity between two sequences. We therefore design a clustering-based approach which uses this observation to filter false positives from a list of candidates generated by a selected microRNA prediction program, and raise the positive predictive value by 15.23% to 23.19% in the human, mouse and rat genomes, while keeping a reasonably high sensitivity.
    Conclusion: Our clustering-based approach is able to increase the effectiveness of currently available microRNA prediction programs by raising the positive predictive value while maintaining a high sensitivity, and hence can serve as a filtering step. We believe that it is worthwhile to carry out further experiments and tests with our approach using data from other genomes and other prediction software tools. Better results may be achieved with fine-tuning of parameters. © 2008 Leung et al; licensee BioMed Central Ltd.
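
    The clustered percentages reported above depend on a definition of "close proximity" that the abstract does not state. The sketch below shows one plausible way to compute such a fraction, assuming a miRNA counts as clustered when another miRNA on the same chromosome lies within a chosen distance. The function name, the `max_gap` parameter, and the coordinates are all hypothetical and are not taken from the paper.

```python
def clustered_fraction(positions_by_chrom, max_gap):
    """Fraction of miRNAs that have a neighbor on the same chromosome within max_gap bases."""
    total = 0
    clustered = 0
    for positions in positions_by_chrom.values():
        pos = sorted(positions)
        total += len(pos)
        for i, p in enumerate(pos):
            # After sorting, only the immediate neighbors can be the closest ones.
            near_prev = i > 0 and p - pos[i - 1] <= max_gap
            near_next = i + 1 < len(pos) and pos[i + 1] - p <= max_gap
            if near_prev or near_next:
                clustered += 1
    return clustered / total if total else 0.0


# Made-up start coordinates; the 1000-base threshold is an illustrative choice.
toy_positions = {"chr1": [100, 900, 50000], "chr2": [10, 20]}
fraction = clustered_fraction(toy_positions, max_gap=1000)
```

    The structural-similarity filtering step relies on RNAdistance scores, which this sketch does not attempt to reproduce.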

    MetaCluster 4.0: A novel binning algorithm for NGS reads and huge number of species

    Next-generation sequencing (NGS) technologies allow the sequencing of microbial communities directly from the environment without prior culturing. The output of environmental DNA sequencing consists of many reads from genomes of different unknown species, making the clustering of reads from the same (or similar) species, also known as binning, a crucial step. The difficulties of the binning problem are due to the following four factors: (1) the lack of reference genomes; (2) uneven abundance ratios of species; (3) short NGS reads; and (4) a large number of species (can be more than a hundred). None of the existing binning tools can handle all four factors. No tools, including both AbundanceBin and MetaCluster 3.0, have demonstrated reasonable performance on a sample with more than 20 species. In this article, we introduce MetaCluster 4.0, an unsupervised binning algorithm that can accurately (with about 80% precision and sensitivity in all cases and at least 90% in some cases) and efficiently bin short reads with varying abundance ratios and is able to handle datasets with 100 species. The novelty of MetaCluster 4.0 stems from solving a few important problems: how to divide reads into groups by a probabilistic approach, how to estimate the 4-mer distribution of each group, how to estimate the number of species, and how to modify MetaCluster 3.0 to handle a large number of species. We show that MetaCluster 4.0 is effective for both simulated and real datasets. Supplementary Material is available at www.liebertonline.com/cmb. © 2012 Mary Ann Liebert, Inc.
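
    The 4-mer distribution mentioned above is a composition signature over the 256 possible DNA 4-mers. The sketch below computes it for a single sequence in the most direct way. It is an illustration only: how MetaCluster 4.0 estimates the distribution for a group of reads, and whether it merges reverse complements, is not described in the abstract, and the function name is an assumption.

```python
from collections import Counter
from itertools import product

# All 256 DNA 4-mers in a fixed (lexicographic) order, so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def four_mer_distribution(seq):
    """Normalized 4-mer frequency vector of a sequence (length 256, sums to 1 if non-empty)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Windows containing characters other than A, C, G, T (e.g. N) are ignored.
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


dist = four_mer_distribution("ACGTACGT")
acgt_frequency = dist[KMERS.index("ACGT")]
```

    A single 100 bp read yields fewer than 100 windows spread over 256 bins, so per-read vectors are very sparse; this may be one reason the abstract speaks of estimating the distribution per group rather than per read.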