43 research outputs found

    TransRate: reference-free quality assessment of de novo transcriptome assemblies.

    TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only the sequenced reads and the assembly as input, we show that multiple common artifacts of de novo transcriptome assembly can be readily detected. These include chimeras, structural errors, incomplete assembly, and base errors. TransRate evaluates these errors to produce a diagnostic quality score for each contig, and these contig scores are integrated to evaluate whole assemblies. Thus, TransRate can be used for de novo assembly filtering and optimization as well as comparison of assemblies generated using different methods from the same input reads. Applying the method to a data set of 155 published de novo transcriptome assemblies, we deconstruct the contribution that assembly method, read length, read quantity, and read quality make to the accuracy of de novo transcriptome assemblies and reveal that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies. Because TransRate is reference-free, it is suitable for assessment of assemblies of all types of RNA, including assemblies of long noncoding RNA, rRNA, mRNA, and mixed RNA samples

    AMI-diagram: Mining Facts from Images


    Transcriptomics of C4 photosynthesis in rice paddy

    This is the author accepted manuscript. The final version is available from the American Society of Plant Biologists via http://dx.doi.org/10.1104/pp.15.00889
    The C₄ pathway is a highly complex trait that increases photosynthetic efficiency in over sixty plant lineages. Although the majority of C₄ plants occupy disturbed, arid and nutrient-poor habitats, some grow in high-nutrient, waterlogged conditions. One such example is Echinochloa glabrescens, which is an aggressive weed of rice paddies. We generated comprehensive transcriptome datasets for C₄ E. glabrescens and C₃ rice to identify genes associated with adaption to waterlogged, nutrient-replete conditions, but also used the data to better understand how C₄ photosynthesis operates in these conditions. Leaves of E. glabrescens exhibited classical Kranz anatomy with lightly lobed mesophyll cells having low chloroplast coverage. As with rice and other hygrophytic C₃ species, leaves of E. glabrescens accumulated a chloroplastic phosphoenolpyruvate carboxylase protein, albeit at reduced amounts relative to rice. The arid-grown species Setaria italica (C₄) and Brachypodium distachyon (C₃) were also found to accumulate chloroplastic PEPC. We identified a molecular signature associated with C₄ photosynthesis in nutrient-replete, waterlogged conditions that is highly similar to those previously reported from C₄ plants that grow in more arid conditions. We also identified a cohort of genes that have been subjected to a selective sweep associated with growth in paddy conditions. Overall, this approach highlights the value of using wild species such as weeds to identify adaptions to specific conditions associated with high-yielding crops in agriculture.

    Shared characteristics underpinning C4 leaf maturation derived from analysis of multiple C3 and C4 species of Flaveria

    Most terrestrial plants use C3 photosynthesis to fix carbon. In multiple plant lineages a modified system known as C4 photosynthesis has evolved. To better understand the molecular patterns associated with induction of C4 photosynthesis, the genus Flaveria that contains C3 and C4 species was used. A base to tip maturation gradient of leaf anatomy was defined, and RNA sequencing was undertaken along this gradient for two C3 and two C4 Flaveria species. Key C4 traits including vein density, mesophyll and bundle sheath cross-sectional area, chloroplast ultrastructure, and abundance of transcripts encoding proteins of C4 photosynthesis were quantified. Candidate genes underlying each of these C4 characteristics were identified. Principal components analysis indicated that leaf maturation and the photosynthetic pathway were responsible for the greatest amount of variation in transcript abundance. Photosynthesis genes were over-represented for a prolonged period in the C4 species. Through comparison with publicly available data sets, we identify a small number of transcriptional regulators that have been up-regulated in diverse C4 species. The analysis identifies similar patterns of expression in independent C4 lineages and so indicates that the complex C4 pathway is associated with parallel as well as convergent evolution.

    Evolutionary convergence of cell-specific gene expression in independent lineages of C4 grasses

    Leaves of almost all C(4) lineages separate the reactions of photosynthesis into the mesophyll (M) and bundle sheath (BS). The extent to which messenger RNA profiles of M and BS cells from independent C(4) lineages resemble each other is not known. To address this, we conducted deep sequencing of RNA isolated from the M and BS of Setaria viridis and compared these data with publicly available information from maize (Zea mays). This revealed a high correlation (r = 0.89) between the relative abundance of transcripts encoding proteins of the core C(4) pathway in M and BS cells in these species, indicating significant convergence in transcript accumulation in these evolutionarily independent C(4) lineages. We also found that the vast majority of genes encoding proteins of the C(4) cycle in S. viridis are syntenic to homologs used by maize. In both lineages, 122 and 212 homologous transcription factors were preferentially expressed in the M and BS, respectively. Sixteen shared regulators of chloroplast biogenesis were identified, 14 of which were syntenic homologs in maize and S. viridis. In sorghum (Sorghum bicolor), a third C(4) grass, we found that 82% of these trans-factors were also differentially expressed in either M or BS cells. Taken together, these data provide, to our knowledge, the first quantification of convergence in transcript abundance in the M and BS cells from independent lineages of C(4) grasses. Furthermore, the repeated recruitment of syntenic homologs from large gene families strongly implies that parallel evolution of both structural genes and trans-factors underpins the polyphyletic evolution of this highly complex trait in the monocotyledons

    Deep evolutionary comparison of gene expression identifies parallel recruitment of trans-factors in two independent origins of C4 photosynthesis

    With at least 60 independent origins spanning monocotyledons and dicotyledons, the C(4) photosynthetic pathway represents one of the most remarkable examples of convergent evolution. The recurrent evolution of this highly complex trait involving alterations to leaf anatomy, cell biology and biochemistry allows an increase in productivity by ∼50% in tropical and subtropical areas. The extent to which separate lineages of C(4) plants use the same genetic networks to maintain C(4) photosynthesis is unknown. We developed a new informatics framework to enable deep evolutionary comparison of gene expression in species lacking reference genomes. We exploited this to compare gene expression in species representing two independent C(4) lineages (Cleome gynandra and Zea mays) whose last common ancestor diverged ∼140 million years ago. We define a cohort of 3,335 genes that represent conserved components of leaf and photosynthetic development in these species. Furthermore, we show that genes encoding proteins of the C(4) cycle are recruited into networks defined by photosynthesis-related genes. Despite the wide evolutionary separation and independent origins of the C(4) phenotype, we report that these species use homologous transcription factors to both induce C(4) photosynthesis and to maintain the cell specific gene expression required for the pathway to operate. We define a core molecular signature associated with leaf and photosynthetic maturation that is likely shared by angiosperm species derived from the last common ancestor of the monocotyledons and dicotyledons. We show that deep evolutionary comparisons of gene expression can reveal novel insight into the molecular convergence of highly complex phenotypes and that parallel evolution of trans-factors underpins the repeated appearance of C(4) photosynthesis. Thus, exploitation of extant natural variation associated with complex traits can be used to identify regulators. Moreover, the transcription factors that are shared by independent C(4) lineages are key targets for engineering the C(4) pathway into C(3) crops such as rice.

    An Open Science Peer Review Oath

    One of the foundations of the scientific method is to be able to reproduce experiments and corroborate the results of research that has been done before. However, with the increasing complexities of new technologies and techniques, coupled with the specialisation of experiments, reproducing research findings has become a growing challenge. Clearly, scientific methods must be conveyed succinctly, and with clarity and rigour, in order for research to be reproducible. Here, we propose steps to help increase the transparency of the scientific method and the reproducibility of research results: specifically, we introduce a peer-review oath and accompanying manifesto. These have been designed to offer guidelines to enable reviewers (with the minimum friction or bias) to follow and apply open science principles, and support the ideas of transparency, reproducibility and ultimately greater societal impact. Introducing the oath and manifesto at the stage of peer review will help to check that the research being published includes everything that other researchers would need to successfully repeat the work. Peer review is the lynchpin of the publishing system: encouraging the community to consciously (and conscientiously) uphold these principles should help to improve published papers, increase confidence in the reproducibility of the work and, ultimately, provide strategic benefits to authors and their institutions

    Genomic Relationships, Novel Loci, and Pleiotropic Mechanisms across Eight Psychiatric Disorders

    Genetic influences on psychiatric disorders transcend diagnostic boundaries, suggesting substantial pleiotropy of contributing loci. However, the nature and mechanisms of these pleiotropic effects remain unclear. We performed analyses of 232,964 cases and 494,162 controls from genome-wide studies of anorexia nervosa, attention-deficit/hyperactivity disorder, autism spectrum disorder, bipolar disorder, major depression, obsessive-compulsive disorder, schizophrenia, and Tourette syndrome. Genetic correlation analyses revealed a meaningful structure within the eight disorders, identifying three groups of inter-related disorders. Meta-analysis across these eight disorders detected 109 loci associated with at least two psychiatric disorders, including 23 loci with pleiotropic effects on four or more disorders and 11 loci with antagonistic effects on multiple disorders. The pleiotropic loci are located within genes that show heightened expression in the brain throughout the lifespan, beginning prenatally in the second trimester, and play prominent roles in neurodevelopmental processes. These findings have important implications for psychiatric nosology, drug development, and risk prediction.

    Analysis of shared heritability in common disorders of the brain

    INTRODUCTION: Brain disorders may exhibit shared symptoms and substantial epidemiological comorbidity, inciting debate about their etiologic overlap. However, detailed study of phenotypes with different ages of onset, severity, and presentation poses a considerable challenge. Recently developed heritability methods allow us to accurately measure correlation of genome-wide common variant risk between two phenotypes from pools of different individuals and assess how connected they, or at least their genetic risks, are on the genomic level. We used genome-wide association data for 265,218 patients and 784,643 control participants, as well as 17 phenotypes from a total of 1,191,588 individuals, to quantify the degree of overlap for genetic risk factors of 25 common brain disorders.
    RATIONALE: Over the past century, the classification of brain disorders has evolved to reflect the medical and scientific communities' assessments of the presumed root causes of clinical phenomena such as behavioral change, loss of motor function, or alterations of consciousness. Directly observable phenomena (such as the presence of emboli, protein tangles, or unusual electrical activity patterns) generally define and separate neurological disorders from psychiatric disorders. Understanding the genetic underpinnings and categorical distinctions for brain disorders and related phenotypes may inform the search for their biological mechanisms.
    RESULTS: Common variant risk for psychiatric disorders was shown to correlate significantly, especially among attention deficit hyperactivity disorder (ADHD), bipolar disorder, major depressive disorder (MDD), and schizophrenia. By contrast, neurological disorders appear more distinct from one another and from the psychiatric disorders, except for migraine, which was significantly correlated to ADHD, MDD, and Tourette syndrome. We demonstrate that, in the general population, the personality trait neuroticism is significantly correlated with almost every psychiatric disorder and migraine. We also identify significant genetic sharing between disorders and early life cognitive measures (e.g., years of education and college attainment) in the general population, demonstrating positive correlation with several psychiatric disorders (e.g., anorexia nervosa and bipolar disorder) and negative correlation with several neurological phenotypes (e.g., Alzheimer's disease and ischemic stroke), even though the latter are considered to result from specific processes that occur later in life. Extensive simulations were also performed to inform how statistical power, diagnostic misclassification, and phenotypic heterogeneity influence genetic correlations.
    CONCLUSION: The high degree of genetic correlation among many of the psychiatric disorders adds further evidence that their current clinical boundaries do not reflect distinct underlying pathogenic processes, at least on the genetic level. This suggests a deeply interconnected nature for psychiatric disorders, in contrast to neurological disorders, and underscores the need to refine psychiatric diagnostics. Genetically informed analyses may provide important "scaffolding" to support such restructuring of psychiatric nosology, which likely requires incorporating many levels of information. By contrast, we find limited evidence for widespread common genetic risk sharing among neurological disorders or across neurological and psychiatric disorders. We show that both psychiatric and neurological disorders have robust correlations with cognitive and personality measures. Further study is needed to evaluate whether overlapping genetic contributions to psychiatric pathology may influence treatment choices. Ultimately, such developments may pave the way toward reduced heterogeneity and improved diagnosis and treatment of psychiatric disorders.