17 research outputs found

    Assessing soil health benefits of forage grasses - A review of methods

    Seq-ing improved gene expression estimates from microarrays using machine learning

    Background: Quantifying gene expression by RNA-Seq has several advantages over microarrays, including greater dynamic range and gene expression estimates on an absolute, rather than a relative, scale. Nevertheless, microarrays remain in widespread use, as demonstrated by the ever-growing numbers of samples deposited in public repositories. Results: We propose a novel approach to microarray analysis that attains many of the advantages of RNA-Seq. This method, called Machine Learning of Transcript Expression (MaLTE), leverages samples for which both microarray and RNA-Seq data are available, using a Random Forest to learn the relationship between the fluorescence intensity of sets of microarray probes and RNA-Seq transcript expression estimates. We trained MaLTE on data from the Genotype-Tissue Expression (GTEx) project, consisting of Affymetrix gene arrays and RNA-Seq from over 700 samples across a broad range of human tissues. Conclusion: This approach can be used to accurately estimate absolute expression levels from microarray data, at both gene and transcript level, which has not previously been possible. This methodology will facilitate re-analysis of archived microarray data and broaden the utility of the vast quantities of data still being generated.

    A mutation in a splicing factor that causes retinitis pigmentosa has a transcriptome-wide effect on mRNA splicing

    Background: Substantial progress has been made in the identification of sequence elements that control mRNA splicing and of the genetic variants in these elements that alter mRNA splicing (referred to as splicing quantitative trait loci, or sQTLs). Genetic variants that affect mRNA splicing in trans are harder to identify because their effects can be more subtle and diffuse, and the variants are not co-located with their targets. Using exon microarrays, we carried out a transcriptome-wide analysis of the effects on mRNA splicing of a mutation in a ubiquitous splicing factor that causes retinitis pigmentosa (RP). Results: Exon microarray data were generated from whole blood samples obtained from four individuals with a mutation in the splicing factor PRPF8 and four sibling controls. Although the mutation has no known phenotype in blood, there was evidence of widespread differences in splicing between cases and controls (affecting approximately 20% of exons). Most probesets with significantly different inclusion (defined as the expression intensity of the exon divided by the expression of the corresponding transcript) between cases and controls had higher inclusion in cases and corresponded to exons that were shorter than average, AT-rich, located towards the 5' end of the gene and flanked by long introns. Introns flanking affected probesets were particularly depleted for the shortest category of introns, which is associated with splicing via intron definition. Conclusions: Our results show that a mutation in a splicing factor, with a phenotype that is restricted to retinal tissue, acts as a trans-sQTL cluster in whole blood samples. Characteristics of the affected exons suggest that they are spliced co-transcriptionally and via exon definition. However, due to the small sample size available for this study, further studies are required to confirm the widespread impact of this PRPF8 mutation on mRNA splicing outside the retina.
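    The inclusion measure defined in this abstract (exon expression intensity divided by the expression of the corresponding transcript) can be illustrated with a minimal sketch. The intensity values, the function names and the assumption of linear-scale intensities are all hypothetical and chosen for illustration; the abstract does not describe the authors' actual pipeline or statistical test.

```python
def inclusion_ratio(exon_intensity, transcript_intensity):
    """Exon expression intensity divided by the expression of its transcript.

    Assumes linear-scale (not log-transformed) intensities; this is an
    assumption of the sketch, not something stated in the abstract.
    """
    if transcript_intensity <= 0:
        raise ValueError("transcript intensity must be positive")
    return exon_intensity / transcript_intensity


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical (exon, transcript) intensities for one probeset,
# measured in four cases and four sibling controls.
cases = [(120.0, 400.0), (150.0, 500.0), (90.0, 300.0), (132.0, 440.0)]
controls = [(80.0, 400.0), (100.0, 500.0), (60.0, 300.0), (88.0, 440.0)]

case_inclusion = [inclusion_ratio(e, t) for e, t in cases]
control_inclusion = [inclusion_ratio(e, t) for e, t in controls]

# A positive difference means higher inclusion in cases than in controls.
difference = mean(case_inclusion) - mean(control_inclusion)
print(f"mean inclusion difference (cases - controls): {difference:.3f}")
```

    Dividing by transcript expression separates a change in exon usage from a change in overall expression of the gene, which is why a ratio is compared rather than the raw exon intensity.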

    Evaluation of chickpea genotypes for resistance to Ascochyta blight (Ascochyta rabiei) disease in the dry highlands of Kenya

    Chickpea (Cicer arietinum) is an edible legume grown widely for its nutritious seed, which is rich in protein, minerals, vitamins and dietary fibre. It is a new crop in Kenya whose potential has not been fully utilized because abiotic and biotic stresses limit its productivity. The crop is affected mainly by Ascochyta blight (AB), which is widespread in the cool dry highlands and causes up to 100% yield loss. The objective of this study was to evaluate the resistance of selected chickpea genotypes to AB in the dry highlands of Kenya. The study was conducted at two sites (Egerton University, Njoro, and the Agricultural Training Centre (ATC), Koibatek) for one season during the long rains of the 2010/2011 growing season. Thirty-six genotypes from reference sets and mini-core samples introduced from ICRISAT were evaluated. There were significant (P<0.001) differences in AB responses and grain yield performance among the test genotypes at both sites. AB was more severe at Egerton-Njoro (mean score 5.7) than at ATC-Koibatek (mean score 4.25), resulting in low grain yield. Genotypes ICC7052, ICC4463, ICC4363, ICC2884, ICC7150, ICC15294 and ICC11627 had both the highest grain yields, listed in decreasing order (mean range 1790 to 1053 kg ha⁻¹), and the best resistance to AB. These genotypes need further evaluation at multiple locations, and their use in breeding programs remains to be determined, especially because of their undesirable black seed color. Commercial varieties (LDT068, LDT065, Chania desi 1 and Saina K1) were all susceptible to AB, but had grain yields above 1200 kg ha⁻¹. The findings of the study showed that chickpea should be sown during the short rains (summer) in the dry highlands of Kenya, when conditions are drier, warmer and less favorable for AB infection. However, yield could be increased by shifting the sowing date from the dry season to the long rains (winter), thus avoiding terminal drought, if AB-resistant cultivars with acceptable agronomic traits could be identified.

    Evidence for intron length conservation in a set of mammalian genes associated with embryonic development

    Background: We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples of genes that showed the most extreme conservation of total intron content in mammals. Results: Gene sets annotated as being involved in pattern specification in the early embryo, or as containing the homeobox DNA-binding domain, were significantly enriched among genes with highly conserved intron content. We used ancestral sequences reconstructed with probabilistic models that account for insertion and deletion mutations to distinguish insertion and deletion events on lineages leading to human and mouse from their last common ancestor. Using a randomization procedure, we show that genes containing the homeobox domain show less change in intron content than expected, given the number of insertion and deletion events within their introns. Conclusions: Our results suggest selection for gene expression precision or the existence of additional development-associated genes for which transcriptional delay is functionally significant.

    PDBe: improved accessibility of macromolecular structure data from PDB and EMDB

    © 2015 The Authors. Published by OUP. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher's website: https://doi.org/10.1093/nar/gkv1047. The Protein Data Bank in Europe (http://pdbe.org) accepts and annotates depositions of macromolecular structure data in the PDB and EMDB archives and enriches, integrates and disseminates structural information in a variety of ways. The PDBe website has been redesigned based on an analysis of user requirements, and now offers intuitive access to improved and value-added macromolecular structure information. Unique value-added information includes lists of reviews and research articles that cite or mention PDB entries as well as access to figures and legends from full-text open-access publications that describe PDB entries. A powerful new query system not only shows all the PDB entries that match a given query, but also shows the 'best structures' for a given macromolecule, ligand complex or sequence family using data-quality information from the wwPDB validation reports. A PDBe RESTful API has been developed to provide unified access to macromolecular structure data available in the PDB and EMDB archives as well as value-added annotations, e.g. regarding structure quality and up-to-date cross-reference information from the SIFTS resource. Taken together, these new developments facilitate unified access to macromolecular structure data in an intuitive way for non-expert users and support expert users in analysing macromolecular structure data. Funding: The Wellcome Trust [88944, 104948]; UK Biotechnology and Biological Sciences Research Council [BB/J007471/1, BB/K016970/1, BB/M013146/1, BB/M011674/1]; National Institutes of Health [GM079429]; UK Medical Research Council [MR/L007835/1]; European Union [284209]; CCP4; European Molecular Biology Laboratory (EMBL). Funding for open access charge: The Wellcome Trust.

    The Rural Household Multiple Indicator Survey, data from 13,310 farm households in 21 countries

    The Rural Household Multiple Indicator Survey (RHoMIS) is a standardized farm household survey approach which collects information on 758 variables covering household demographics, farm area, crops grown and their production, livestock holdings and their production, agricultural product use and variables underlying standard socio-economic and food security indicators such as the Probability of Poverty Index, the Household Food Insecurity Access Scale, and household dietary diversity. These variables are used to quantify more than 40 different indicators on farm and household characteristics, welfare, productivity, and economic performance. Between 2015 and the beginning of 2018, the survey instrument was applied in 21 countries in Central America, sub-Saharan Africa and Asia. The data presented here include the raw survey response data, the indicator calculation code, and the resulting indicator values. These data can be used to quantify on- and off-farm pathways to food security, diverse diets, and changes in poverty for rural smallholder farm households.

