
    OGRE: Overlap Graph-based metagenomic Read clustEring

    The microbes that live in an environment can be identified from the genomic material that is present, also referred to as the metagenome. Using Next Generation Sequencing techniques, this genomic material can be obtained from the environment, resulting in a large set of sequencing reads. A proper assembly of these reads into contigs or even full genomes allows one to identify the microbial species and strains that live in the environment. Assembling a metagenome is a challenging task and can benefit from clustering the reads into species-specific bins prior to assembly. In this paper we propose OGRE, an Overlap Graph-based Read clustEring procedure for metagenomic read data. OGRE is the only method that can successfully cluster reads into species-specific bins for large metagenomic datasets without running into computation-time or memory issues.

    MALVA: Genotyping by Mapping-free ALlele Detection of Known VAriants

    The amount of genetic variation discovered in human populations is growing rapidly, leading to challenging computational tasks such as variant calling. Standard methods for addressing this problem include read mapping, a computationally expensive procedure; thus, mapping-free tools have been proposed in recent years. These tools focus on isolated, biallelic SNPs, providing limited support for multi-allelic SNPs and short insertions and deletions of nucleotides (indels). Here we introduce MALVA, a mapping-free method to genotype an individual from a sample of reads. MALVA is the first mapping-free tool able to genotype multi-allelic SNPs and indels, even in high-density genomic regions, and to effectively handle a huge number of variants. MALVA requires one order of magnitude less time to genotype a donor than alignment-based pipelines, while providing similar accuracy. Remarkably, on indels, MALVA provides even better results than the most widely adopted variant discovery tools. Biological Sciences; Genetics; Genomics; Bioinformatics

    An image representation based convolutional network for DNA classification

    The folding structure of the DNA molecule combined with helper molecules, also referred to as chromatin, is highly relevant for the functional properties of DNA. The chromatin structure is largely determined by the underlying primary DNA sequence, though the interaction is not yet fully understood. In this paper we develop a convolutional neural network that takes an image representation of the primary DNA sequence as its input and predicts key determinants of chromatin structure. The method is designed to detect interactions between distal elements in the DNA sequence, which are known to be highly relevant. Our experiments show that the method outperforms several existing methods in terms of both prediction accuracy and training time.

    Using the structure of genome data in the design of deep neural networks for predicting amyotrophic lateral sclerosis from genotype

    Motivation: Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease caused by aberrations in the genome. While several disease-causing variants have been identified, a major part of heritability remains unexplained. ALS is believed to have a complex genetic basis in which non-additive combinations of variants constitute disease, which cannot be picked up using the linear models employed in classical genotype-phenotype association studies. Deep learning, on the other hand, is highly promising for identifying such complex relations. We therefore developed a deep-learning-based approach for the classification of ALS patients versus healthy individuals from the Dutch cohort of the Project MinE dataset. Based on the recent insight that regulatory regions harbor the majority of disease-associated variants, we employ a two-step approach: first, promoter regions that are likely associated with ALS are identified, and second, individuals are classified based on their genotype in the selected genomic regions. Both steps employ a deep convolutional neural network. The network architecture accounts for the structure of genome data by applying convolution only to parts of the data where this makes sense from a genomics perspective. Results: Our approach identifies potentially ALS-associated promoter regions and generally outperforms other classification methods. Test results support the hypothesis that non-additive combinations of variants contribute to ALS. The architectures and protocols developed are tailored toward processing population-scale, whole-genome data. We consider this a relevant first step toward deep-learning-assisted genotype-phenotype association in whole-genome-sized data.

    OGRE: Overlap Graph-based metagenomic Read clustEring

    MOTIVATION: The microbes that live in an environment can be identified from the combined genomic material, also referred to as the metagenome. Sequencing a metagenome can result in large volumes of sequencing reads. A promising approach to reducing the size of metagenomic datasets is to cluster reads into groups based on their overlaps. Clustering reads is valuable for facilitating downstream analyses, including computationally intensive strain-aware assembly. As current read clustering approaches cannot handle the large datasets arising from high-throughput metagenome sequencing, a novel read clustering approach is needed. In this article, we propose OGRE, an Overlap Graph-based Read clustEring procedure for high-throughput sequencing data, with a focus on shotgun metagenomes. RESULTS: We show that for small datasets OGRE outperforms other read binners in terms of the number of species included in a cluster, also referred to as cluster purity, and the fraction of all reads that is placed in one of the clusters. Furthermore, OGRE is able to process metagenomic datasets that are too large for other read binners into clusters with high cluster purity. CONCLUSION: OGRE is the only method that can successfully cluster reads into species-specific clusters for large metagenomic datasets without running into computation-time or memory issues. AVAILABILITY AND IMPLEMENTATION: Code is made available on GitHub (https://github.com/Marleen1/OGRE). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    The metaphorical world picture and its place in the system of worlds

    The article deals with the notion of the metaphorical world picture, connected with the general principle of conceptualization. The author justifies singling out this notion by analogy with the linguistic and conceptual world pictures: the term "metaphorical world picture" carries information about a complex structure of multi-layered meanings that combine harmoniously owing to their metaphorical nature.

    Asymptotic properties of quantum Markov chains

    The asymptotic dynamics of quantum Markov chains generated by the most general physically relevant quantum operations is investigated. The dynamics is shown to be confined to an attractor space on which the resulting quantum Markov chain is diagonalizable. A procedure for constructing a basis of this attractor space and its associated dual basis is presented. It applies whenever there exists a strictly positive quantum state that is contracted or left invariant by the generating quantum operation. Moreover, algebraic relations between the attractor space and the Kraus operators involved in the definition of a quantum Markov chain are derived. This construction is expected to offer significant computational advantages when the dimension of the Hilbert space is large and the dimension of the attractor space is small, and it also sheds new light on the relation between the asymptotic dynamics of quantum Markov chains and the fixed points of their generating quantum operations.