32 research outputs found
Resolving the complete genome of Kuenenia stuttgartiensis from a membrane bioreactor enrichment using Single-Molecule Real-Time sequencing
Contains fulltext: 190074.pdf (publisher's version) (Open Access), 10 p.
Interspecies Translation of Disease Networks Increases Robustness and Predictive Accuracy
Gene regulatory networks give important insights into the mechanisms underlying physiology and pathophysiology. The derivation of gene regulatory networks from high-throughput expression data via machine learning strategies is problematic as the reliability of these models is often compromised by limited and highly variable samples, heterogeneity in transcript isoforms, noise, and other artifacts. Here, we develop a novel algorithm, dubbed Dandelion, in which we construct and train intraspecies Bayesian networks that are translated and assessed on independent test sets from other species in a reiterative procedure. The interspecies disease networks are subjected to multiple layers of analysis and evaluation, leading to the identification of the most consistent relationships within the network structure. In this study, we demonstrate the performance of our algorithms on datasets from animal models of oculopharyngeal muscular dystrophy (OPMD) and patient materials. We show that the interspecies network of genes coding for the proteasome provides highly accurate predictions of gene expression levels and disease phenotype. Moreover, the cross-species translation increases the stability and robustness of these networks. Unlike existing modeling approaches, our algorithms do not require assumptions on the notoriously difficult one-to-one mapping of protein orthologues or alternative transcripts and can deal with missing data. We show that the identified key components of the OPMD disease network can be confirmed in an unseen and independent disease model. This study presents a state-of-the-art strategy for constructing interspecies disease networks that provide crucial information on regulatory relationships among genes, leading to a better understanding of the molecular mechanisms of the disease.
Determining the quality and complexity of next-generation sequencing data without a reference genome
Genomics, epigenetics, population genetics and bioinformatics
Full-length mRNA sequencing uncovers a widespread coupling between transcription initiation and mRNA processing
Background: The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at the transcriptional and post-transcriptional levels. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing. Results: In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtain evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells. Conclusions: Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing. Electronic supplementary material: The online version of this article (10.1186/s13059-018-1418-0) contains supplementary material, which is available to authorized users.
Aging as Accelerated Accumulation of Somatic Variants: Whole-Genome Sequencing of Centenarian and Middle-Aged Monozygotic Twin Pairs
Mapping 123 million neonatal, infant and child deaths between 2000 and 2017
Since 2000, many countries have achieved considerable success in improving child survival, but localized progress remains unclear. To inform efforts towards United Nations Sustainable Development Goal 3.2—to end preventable child deaths by 2030—we need consistently estimated data at the subnational level regarding child mortality rates and trends. Here we quantified, for the period 2000–2017, the subnational variation in mortality rates and number of deaths of neonates, infants and children under 5 years of age within 99 low- and middle-income countries using a geostatistical survival model. We estimated that 32% of children under 5 in these countries lived in districts that had attained rates of 25 or fewer child deaths per 1,000 live births by 2017, and that 58% of child deaths between 2000 and 2017 in these countries could have been averted in the absence of geographical inequality. This study enables the identification of high-mortality clusters, patterns of progress and geographical inequalities to inform appropriate investments and implementations that will help to improve the health of all populations
Characterizing IonTorrent PGM Error Profiles using TSSV
This dataset contains sequencing reads used to characterize systematic errors of the IonTorrent PGM sequencer, as well as all of the analysis results in separate files.
Allele-specific Characterization of STR Structures in Pure and Mixed Forensic Samples using TSSV
This dataset contains sequencing reads used to characterize allelic STR structures in pure and mixed forensic samples, as well as all of the analysis results in separate files.
Identification of SNPs to determine associated Y-chromosome Haplogroup using TSSV
This dataset contains sequencing reads used to identify SNPs in order to determine associated Y-chromosomal haplogroups, as well as all of the analysis results in separate files.
Characterization of DeNovo Structural Variations Induced by TALENs Targeting hDMD in Mouse ES Cells using TSSV
This dataset contains sequencing reads used to characterize de novo variations induced by TALENs targeting intron 52-53 of the hDMD gene in mouse ES cells, as well as all of the analysis results in separate files.