173 research outputs found

    SNPdat: easy and rapid annotation of results from de novo SNP discovery projects for model and non-model organisms

    Peer-reviewed. Background: Single nucleotide polymorphisms (SNPs) are the most abundant genetic variant found in vertebrates and invertebrates. SNP discovery has become a highly automated, robust and relatively inexpensive process allowing the identification of many thousands of mutations for model and non-model organisms. Annotating large numbers of SNPs can be a difficult and complex process. Many tools available are optimised for use with organisms densely sampled for SNPs, such as humans. There are currently few tools available that are species non-specific or support non-model organism data. Results: Here we present SNPdat, a high-throughput analysis tool that can provide a comprehensive annotation of both novel and known SNPs for any organism with a draft sequence and annotation. Using a dataset of 4,566 SNPs identified in cattle using high-throughput DNA sequencing we demonstrate the annotations performed and the statistics that can be generated by SNPdat. Conclusions: SNPdat provides users with a simple tool for annotation of genomes that are either not supported by other tools or have a small number of annotated SNPs available. SNPdat can also be used to analyse datasets from organisms which are densely sampled for SNPs. As a command line tool it can easily be incorporated into existing SNP discovery pipelines and fills a niche for analyses involving non-model organisms that are not supported by many available SNP annotation tools. SNPdat will be of great interest to scientists involved in SNP discovery and analysis projects, particularly those with limited bioinformatics experience.

    Spherical: an iterative workflow for assembling metagenomic datasets

    BACKGROUND: The consensus emerging from the study of microbiomes is that they are far more complex than previously thought, requiring better assemblies and increasingly deeper sequencing. However, current metagenomic assembly techniques regularly fail to incorporate all, or even the majority in some cases, of the sequence information generated for many microbiomes, negating this effort. This can especially bias the information gathered and the perceived importance of the minor taxa in a microbiome. RESULTS: We propose a simple but effective approach, implemented in Python, to address this problem. Based on an iterative methodology, our workflow (called Spherical) carries out successive rounds of assemblies with the sequencing reads not yet utilised. This approach also allows the user to reduce the resources required for very large datasets, by assembling random subsets of the whole in a "divide and conquer" manner. CONCLUSIONS: We demonstrate the accuracy of Spherical using simulated data based on completely sequenced genomes and the effectiveness of the workflow at retrieving lost information for taxa in three published metagenomics studies of varying sizes. Our results show that Spherical increased the number of reads utilised in the assembly by up to 109% compared to the base assembly. The additional contigs assembled by the Spherical workflow resulted in significant (P < 0.05) changes in the predicted taxonomic profile of all datasets analysed. Spherical is implemented in Python 2.7 and freely available for use under the MIT license. Source code and documentation are hosted publicly at https://github.com/thh32/Spherical. Peer reviewed.
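
The iterative idea described in this abstract can be illustrated with a minimal Python sketch. It is not Spherical's actual code: the function name `iterative_assembly`, its parameters, and the toy `assemble` and `maps_to` stand-ins are all hypothetical, and the step that decides which reads count as "utilised" (here, a substring check against the new contigs) is an assumption made only so the example runs without an external assembler or read mapper.

```python
import random

def iterative_assembly(reads, assemble, maps_to,
                       subset_fraction=1.0, max_rounds=10, seed=0):
    """Repeatedly assemble a random subset of the reads not yet utilised.

    assemble(list_of_reads) -> list of contigs   (hypothetical stand-in)
    maps_to(read, contig)   -> bool              (hypothetical stand-in)
    """
    rng = random.Random(seed)
    remaining = list(reads)
    contigs = []
    for _ in range(max_rounds):
        if not remaining:
            break
        # "Divide and conquer": assemble only a random subset each round.
        k = max(1, int(len(remaining) * subset_fraction))
        subset = rng.sample(remaining, k)
        new_contigs = assemble(subset)
        if not new_contigs:
            break
        contigs.extend(new_contigs)
        # Keep only reads that the new contigs do not account for.
        remaining = [r for r in remaining
                     if not any(maps_to(r, c) for c in new_contigs)]
    return contigs, remaining

# Toy stand-ins: the "assembler" returns the longest read as one contig,
# and a read "maps" to a contig if it is a substring of it.
def toy_assemble(subset):
    return [max(subset, key=lambda r: (len(r), r))]

def toy_maps_to(read, contig):
    return read in contig

contigs, leftover = iterative_assembly(
    ["ACGT", "CG", "GT", "TTTT", "TT"], toy_assemble, toy_maps_to)
```

In a real pipeline the two stand-ins would wrap an external assembler and a read mapper; a `subset_fraction` below 1.0 trades completeness per round for lower memory use.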

    Transcriptomics of liver and muscle in Holstein cows genetically divergent for fertility highlight differences in nutrient partitioning and inflammation processes.

    Peer-reviewed. BM and SC were funded by the Teagasc Walsh Fellowship Scheme. This project was supported by Teagasc (RMIS 6075) and DAFM grant 13/S/528 led by SB. CJC was funded under the Science Foundation Ireland (SFI) Stokes lecturer scheme (07/SK/B1236A). Background: The transition between pregnancy and lactation is a major physiological change for dairy cows. Complex systemic and local processes involving regulation of energy balance, galactopoiesis, utilisation of body reserves, insulin resistance, resumption of oestrous cyclicity and involution of the uterus can affect animal productivity and hence farm profitability. Here we used an established Holstein dairy cow model of fertility that displayed genetic and phenotypic divergence in calving interval. Cows had similar genetic merit for milk production traits, but either very good genetic merit for fertility traits (‘Fert+’; n = 8) or very poor genetic merit for fertility traits (‘Fert-’; n = 8). We used RNA sequencing to investigate gene expression profiles in both liver and muscle tissue biopsies at three distinct time-points: late pregnancy, early lactation and mid lactation (-18, 1 and 147 days relative to parturition, respectively). Results: We found 807 and 815 unique genes to be differentially expressed in at least one time-point in liver and muscle respectively, of which 79 % and 83 % were only found in a single time-point; 40 and 41 genes were found differentially expressed at every time-point, indicating possible systemic or chronic dysregulation. Functional annotation of all differentially expressed genes highlighted two physiological processes that were impacted at every time-point in the study. These were immune and inflammation, and metabolic, lipid and carbohydrate-binding. Conclusion: These pathways have previously been identified by other researchers. We show that several specific genes which are differentially regulated, including IGF-1, might impact dairy fertility. We postulate that an increased burden of reactive oxidation species, coupled with a chronic inflammatory state, might reduce dairy cow fertility in our model.

    Whole genome association study identifies regions of the bovine genome and biological pathways involved in carcass trait performance in Holstein-Friesian cattle

    Peer-reviewed. Background: Four traits related to carcass performance have been identified as economically important in beef production: carcass weight, carcass fat, carcass conformation of progeny and cull cow carcass weight. Although Holstein-Friesian cattle are primarily utilized for milk production, they are also an important source of meat for beef production and export. Because of this, there is great interest in understanding the underlying genomic structure influencing these traits. Several genome-wide association studies have identified regions of the bovine genome associated with growth or carcass traits; however, little is known about the mechanisms or underlying biological pathways involved. This study aims to detect regions of the bovine genome associated with carcass performance traits (employing a panel of 54,001 SNPs) using measures of genetic merit (as predicted transmitting abilities) for 5,705 Irish Holstein-Friesian animals. Candidate genes and biological pathways were then identified for each trait under investigation. Results: Following adjustment for false discovery (q-value  0.5) with at least one of the four traits. In total, 557 unique bovine genes, which mapped to 426 human orthologs, were within 500 kb of QTL found associated with a trait using the Bayesian approach. Using this information, 24 significantly over-represented pathways were identified across all traits. The most significantly over-represented biological pathway was the peroxisome proliferator-activated receptor (PPAR) signaling pathway. Conclusions: A large number of genomic regions putatively associated with bovine carcass traits were detected using two different statistical approaches. Notably, several significant associations were detected in close proximity to genes with a known role in animal growth such as glucagon and leptin. Several biological pathways, including PPAR signaling, were shown to be involved in various aspects of bovine carcass performance. These core genes and biological processes may form the foundation for further investigation to identify causative mutations involved in each trait. Results reported here support previous findings suggesting conservation of key biological processes involved in growth and metabolism.

    Genome Phylogenies Indicate a Meaningful α-Proteobacterial Phylogeny and Support a Grouping of the Mitochondria with the Rickettsiales

    Placement of the mitochondrial branch on the tree of life has been problematic. Sparse sampling, the uncertainty of how lateral gene transfer might overwrite phylogenetic signals, and the uncertainty of phylogenetic inference have all contributed to the issue. Here we address this issue using a supertree approach and completed genomic sequences. We first determine that a sensible α-proteobacterial phylogenetic tree exists and that it can confidently be inferred using orthologous genes. We show that congruence across these orthologous gene trees is significantly better than might be expected by random chance. There is some evidence of horizontal gene transfer within the α-proteobacteria, but it appears to be restricted to a minority of genes (~23%), most of which (~74%) can be categorized as operational. This means that placement of the mitochondrion should not be excessively hampered by interspecies gene transfer. We then show that there is a consistently strong signal for placement of the mitochondrion on this tree and that this placement is relatively insensitive to methodological approach or data set. A concatenated alignment was created consisting of 15 mitochondrion-encoded proteins that are unlikely to have undergone any lateral gene transfer in the timeline under consideration. This alignment infers that the sister group of the mitochondria, for the taxa that have been sampled, is the order Rickettsiales.

    The Opisthokonta and the Ecdysozoa May Not Be Clades: Stronger Support for the Grouping of Plant and Animal than for Animal and Fungi and Stronger Support for the Coelomata than Ecdysozoa

    In considering the best possible solutions for answering phylogenetic questions from genomic sequences, we have chosen a strategy that we suggest is superior to others that have gone previously. We have ignored multigene families and instead have used single-gene families. This minimizes the inadvertent analysis of paralogs. We have employed strict data controls and have reasoned that if a protein is not capable of recovering the uncontroversial parts of a phylogenetic tree, then why should we use it for the more controversial parts? We have sliced and diced the data in as many ways as possible in order to uncover the signals in that data. Using this strategy, we have tested two controversial hypotheses concerning eukaryotic phylogenetic relationships: the placement of arthropods and nematodes and the relationships of animals, plants, and fungi. We have constructed phylogenetic trees from 780 single-gene families from 10 completed genomes and amalgamated these into a single supertree. We have also carried out a total evidence analysis on the only universally distributed protein families that can accurately reconstruct the uncontroversial parts of the phylogenetic tree: a total of five families. In doing so, we ignore the majority of single-gene families that are universally distributed as they do not have the appropriate signals to recover the uncontroversial parts of the tree. We have also ignored every protein that has ever been used previously to address this issue, simply because none of them meet our strict criteria. Using these data controls, site stripping, and multiple analyses, 24 out of 26 analyses strongly support the grouping of vertebrates with arthropods (Coelomata hypothesis) and plants with animals. In the other two analyses, the data were ambivalent. The latter finding overturns an 11-year theory of eukaryotic evolution; the first confirms what has already been said by others. In the light of this new tree, we reanalyze the evolution of intron gain and loss in the rpL14 gene and find that it is much more compatible with the hypothesis presented here than with the Opisthokonta hypothesis.

    L.U.St: a tool for approximated maximum likelihood supertree reconstruction

    BACKGROUND: Supertrees combine disparate, partially overlapping trees to generate a synthesis that provides a high level perspective that cannot be attained from the inspection of individual phylogenies. Supertrees can be seen as meta-analytical tools that can be used to make inferences based on results of previous scientific studies. Their meta-analytical application has increased in popularity since it was realised that the power of statistical tests for the study of evolutionary trends critically depends on the use of taxon-dense phylogenies. Further to that, supertrees have found applications in phylogenomics where they are used to combine gene trees and recover species phylogenies based on genome-scale data sets. RESULTS: Here, we present the L.U.St package, a Python tool for approximate maximum likelihood supertree inference and illustrate its application using a genomic data set for the placental mammals. L.U.St allows the calculation of the approximate likelihood of a supertree, given a set of input trees, performs heuristic searches to look for the supertree of highest likelihood, and performs statistical tests of two or more supertrees. To this end, L.U.St implements a winning sites test allowing ranking of a collection of a priori selected hypotheses, given as a collection of input supertree topologies. It also outputs a file of input-tree-wise likelihood scores that can be used as input to CONSEL for calculation of standard tests of two trees (e.g. Kishino-Hasegawa, Shimodaira-Hasegawa and Approximately Unbiased tests). CONCLUSION: This is the first fully parametric implementation of a supertree method; it has clearly understood properties, and provides several advantages over currently available supertree approaches. It is easy to implement and works on any platform that has Python installed. Availability: Bitbucket page - https://[email protected]/afro-juju/l.u.st.git. Contact: [email protected]
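
A winning sites test of the kind mentioned in this abstract is, in its generic form, a sign test over per-item scores. The sketch below shows that generic form in Python for two candidate supertrees; it is not taken from L.U.St, and the function name `winning_sites_test`, the input layout (one log-likelihood per input tree under each supertree) and the example scores are hypothetical.

```python
from math import comb

def winning_sites_test(scores_a, scores_b):
    """Generic two-sided sign test on paired per-input-tree scores.

    scores_a[i], scores_b[i]: log-likelihood of input tree i under
    supertree A and supertree B (higher is better). Ties are dropped.
    Returns (wins_a, wins_b, p_value).
    """
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(b > a for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b
    if n == 0:
        return wins_a, wins_b, 1.0
    k = max(wins_a, wins_b)
    # Probability of k or more wins out of n under a fair coin, doubled
    # for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)

# Hypothetical scores for six input trees under two supertrees.
scores_a = [-10.0, -12.5, -9.0, -11.0, -8.0, -7.5]
scores_b = [-11.0, -13.0, -9.5, -10.0, -8.5, -8.0]
wins_a, wins_b, p_value = winning_sites_test(scores_a, scores_b)
```

Ranking more than two hypotheses would apply the same comparison pairwise; how L.U.St itself handles ties and multiple comparisons is not stated in the abstract.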