OGRE: Overlap Graph-based metagenomic Read clustEring

Author: Balvert M. (Marleen)
Dutilh B.E. (Bas)
Hauptfeld T. (Ernestina)
Schönhuth A. (Alexander)
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 03/01/2019

The microbes that live in an environment can be identified from the genomic material that is present, also referred to as the metagenome. Using Next Generation Sequencing techniques, this genomic material can be obtained from the environment, resulting in a large set of sequencing reads. A proper assembly of these reads into contigs or even full genomes allows one to identify the microbial species and strains that live in the environment. Assembling a metagenome is a challenging task and can benefit from clustering the reads into species-specific bins prior to assembly. In this paper we propose OGRE, an Overlap Graph-based Read clustEring procedure for metagenomic read data. OGRE is the only method that can successfully cluster reads into species-specific bins for large metagenomic datasets without running into computation time or memory issues.

CWI's Institutional Repository

MALVA: Genotyping by Mapping-free ALlele Detection of Known VAriants

Author: Bernardini G. (Giulia)
Bonizzoni P. (Paola)
Denti L. (Luca)
Previtali M. (Marco)
Schönhuth A. (Alexander)
Publication venue: 'Elsevier BV'
Publication date: 30/08/2019

The amount of genetic variation discovered in human populations is growing rapidly, leading to challenging computational tasks such as variant calling. Standard methods for addressing this problem include read mapping, a computationally expensive procedure; thus, mapping-free tools have been proposed in recent years. These tools focus on isolated, biallelic SNPs, providing limited support for multi-allelic SNPs and short insertions and deletions of nucleotides (indels). Here we introduce MALVA, a mapping-free method to genotype an individual from a sample of reads. MALVA is the first mapping-free tool able to genotype multi-allelic SNPs and indels, even in high-density genomic regions, and to effectively handle a huge number of variants. MALVA requires one order of magnitude less time to genotype a donor than alignment-based pipelines, providing similar accuracy. Remarkably, on indels, MALVA provides even better results than the most widely adopted variant discovery tools. Biological Sciences; Genetics; Genomics; Bioinformatic

CWI's Institutional Repository

An image representation based convolutional network for DNA classification

Author: Balvert M. (Marleen)
Bohte S.M. (Sander)
Schönhuth A. (Alexander)
Yin B. (Bojian)
Zambrano D. (Davide)
Publication date: 30/05/2018

The folding structure of the DNA molecule combined with helper molecules, also referred to as the chromatin, is highly relevant for the functional properties of DNA. The chromatin structure is largely determined by the underlying primary DNA sequence, though the interaction is not yet fully understood. In this paper we develop a convolutional neural network that takes an image representation of primary DNA sequence as its input and predicts key determinants of chromatin structure. The method is developed such that it is capable of detecting interactions between distal elements in the DNA sequence, which are known to be highly relevant. Our experiments show that the method outperforms several existing methods in terms of both prediction accuracy and training time.

CWI's Institutional Repository

Discovering motifs that induce sequencing errors

Author: Allhoff M.C. (Manuel)
Costa I.G.
Marschall T. (Tobias)
Martin M. (Marcel)
Rahmann S. (Sven)
Schönhuth A. (Alexander)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

CWI's Institutional Repository

Using the structure of genome data in the design of deep neural networks for predicting amyotrophic lateral sclerosis from genotype

Author: Balvert M. (Marleen)
Bohte S.M. (Sander)
Dutilh B.E. (Bas)
Schönhuth A. (Alexander)
Spek R.A.A. (Rick) van der
Veldink J. (Jan)
Yin B. (Bojian)
Publication venue: 'Oxford University Press (OUP)'
Publication date: 29/01/2019

Motivation: Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease caused by aberrations in the genome. While several disease-causing variants have been identified, a major part of heritability remains unexplained. ALS is believed to have a complex genetic basis where non-additive combinations of variants constitute disease, which cannot be picked up using the linear models employed in classical genotype-phenotype association studies. Deep learning, on the other hand, is highly promising for identifying such complex relations. We therefore developed a deep learning-based approach for the classification of ALS patients versus healthy individuals from the Dutch cohort of the Project MinE dataset. Based on the recent insight that regulatory regions harbor the majority of disease-associated variants, we employ a two-step approach: first, promoter regions that are likely associated with ALS are identified, and second, individuals are classified based on their genotype in the selected genomic regions. Both steps employ a deep convolutional neural network. The network architecture accounts for the structure of genome data by applying convolution only to parts of the data where this makes sense from a genomics perspective. Results: Our approach identifies potentially ALS-associated promoter regions, and generally outperforms other classification methods. Test results support the hypothesis that non-additive combinations of variants contribute to ALS. Architectures and protocols developed are tailored toward processing population-scale, whole-genome data. We consider this a relevant first step toward deep learning-assisted genotype-phenotype association in whole genome-sized data.

CWI's Institutional Repository

Using the structure of genome data in the design of deep neural networks for predicting amyotrophic lateral sclerosis from genotype

Author: Balvert M. (Marleen)
Bohte S.M. (Sander)
Dutilh B.E. (Bas)
Schönhuth A. (Alexander)
Spek R.A.A. (Rick) van der
Veldink J. (Jan)
Yin B. (Bojian)
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 29/01/2019

Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease caused by aberrations in the genome. While several disea

CWI's Institutional Repository

OGRE: Overlap Graph-based metagenomic Read clustEring

Author: Balvert M. (Marleen)
Dutilh B.E. (Bas)
Hauptfeld T. (Ernestina)
Luo X. (Vincent)
Schönhuth A. (Alexander)
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/04/2021

MOTIVATION: The microbes that live in an environment can be identified from the combined genomic material, also referred to as the metagenome. Sequencing a metagenome can result in large volumes of sequencing reads. A promising approach to reduce the size of metagenomic datasets is to cluster reads into groups based on their overlaps. Clustering reads is valuable for facilitating downstream analyses, including computationally intensive strain-aware assembly. As current read clustering approaches cannot handle the large datasets arising from high-throughput metagenome sequencing, a novel read clustering approach is needed. In this article, we propose OGRE, an Overlap Graph-based Read clustEring procedure for high-throughput sequencing data, with a focus on shotgun metagenomes. RESULTS: We show that for small datasets OGRE outperforms other read binners in terms of the number of species included in a cluster, also referred to as cluster purity, and the fraction of all reads that is placed in one of the clusters. Furthermore, OGRE is able to process metagenomic datasets that are too large for other read binners into clusters with high cluster purity. CONCLUSION: OGRE is the only method that can successfully cluster reads into species-specific clusters for large metagenomic datasets without running into computation time or memory issues. AVAILABILITY AND IMPLEMENTATION: Code is made available on GitHub (https://github.com/Marleen1/OGRE). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

CWI's Institutional Repository
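
The general idea of clustering reads by their overlaps, as described in the OGRE abstract above, can be illustrated with a toy sketch. This is not OGRE's implementation: the function names `suffix_prefix_overlap` and `cluster_reads`, the `min_overlap` threshold, the exact-match overlap test, and the use of connected components as clusters are all assumptions made for illustration. The all-pairs comparison is quadratic and would not scale to the large datasets the abstract targets.

```python
from collections import defaultdict


def suffix_prefix_overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that exactly equals a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` (assumed >= 1) exists.
    """
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def cluster_reads(reads: list[str], min_overlap: int = 4) -> list[list[int]]:
    """Group read indices into connected components of a toy overlap graph.

    Two reads are joined by an edge when one has a suffix that matches a
    prefix of the other over at least `min_overlap` bases (hypothetical rule).
    """
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Naive all-pairs overlap detection; each overlap merges two components.
    for i in range(len(reads)):
        for j in range(len(reads)):
            if i != j and suffix_prefix_overlap(reads[i], reads[j], min_overlap):
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i in range(len(reads)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    toy_reads = ["ACGTACGT", "ACGTTTGA", "TTGACCAA", "GGGGCCCC", "CCCCAAAT"]
    print(cluster_reads(toy_reads, min_overlap=4))
```

In this toy input the first three reads chain together through 4-base overlaps and the last two overlap each other, so two clusters result. A practical tool would need an indexed overlap search and tolerance for sequencing errors, neither of which is attempted here.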

The metaphorical picture of the world and its place in the system of worlds

Author: Boomsma Dorret I
de Bakker Paul I W
Deelen Patrick
Hottenga Jouke Jan
Kanterakis Alexandros
Karssen Lennart C
Kattenberg Mathijs V
Schönhuth Alexander
Slagboom P Eline
Swertz Morris A
van Duijn Cornelia M
van Leeuwen Elisabeth M
Wijmenga Cisca
Publication venue: Кримський науковий центр НАН України і МОН України
Publication date: 01/01/2007

The article examines the notion of the metaphorical world picture, which is connected with the general principle of conceptualization. The author justifies distinguishing this notion on the grounds that, by analogy with the linguistic and conceptual world pictures, the term "metaphorical world picture" carries information about a complex structure of multi-layered meanings that combine harmoniously owing to their metaphorical nature.

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Publications at Bielefeld University

Utrecht University Repository

Dissertations of the University of Groningen

Asymptotic properties of quantum Markov chains

Author: Baumgartner B Narnhofer H
Bhatia R
Bratteli O
Bruß D
Faigle U Schönhuth A
G Alber
Holevo A S
I Jex
J Novotný
Liu Ch
Nielsen M
Novotný J
Paulsen V
Werner R F
Wiseman H M
Ying M Yu N Feng Y Duan R
Publication venue: 'IOP Publishing'
Publication date: 03/08/2012

The asymptotic dynamics of quantum Markov chains generated by the most general physically relevant quantum operations is investigated. It is shown that it is confined to an attractor space on which the resulting quantum Markov chain is diagonalizable. A construction procedure of a basis of this attractor space and its associated dual basis is presented. It applies whenever a strictly positive quantum state exists which is contracted or left invariant by the generating quantum operation. Moreover, algebraic relations between the attractor space and Kraus operators involved in the definition of a quantum Markov chain are derived. This construction is not only expected to offer significant computational advantages in cases in which the dimension of the Hilbert space is large and the dimension of the attractor space is small but it also sheds new light onto the relation between the asymptotic dynamics of quantum Markov chains and fixed points of their generating quantum operations. Comment: 10 page

arXiv.org e-Print Archive

TUbiblio

Crossref