193 research outputs found

    Combining transcriptional datasets using the generalized singular value decomposition

    Background: Both microarrays and quantitative real-time PCR are convenient tools for studying the transcriptional levels of genes. The former is preferable for large-scale studies while the latter is a more targeted technique. Because of platform-dependent systematic effects, simple comparisons or merging of datasets obtained by these technologies are difficult, even though they may often be desirable. These difficulties are exacerbated if there is only partial overlap between the experimental conditions and genes probed in the two datasets. Results: We show here that the generalized singular value decomposition provides a practical tool for merging a small, targeted dataset obtained by quantitative real-time PCR of specific genes with a much larger microarray dataset. The technique permits, for the first time, the identification of genes present in only one dataset co-expressed with a target gene present exclusively in the other dataset, even when experimental conditions for the two datasets are not identical. With the rapidly increasing number of publicly available large-scale microarray datasets, the latter is frequently the case. The method enables us to discover putative candidate genes involved in the biosynthesis of the (1,3;1,4)-β-D-glucan polysaccharide found in plant cell walls. Conclusion: We show that the generalized singular value decomposition provides a viable tool for a combined analysis of two gene expression datasets with only partial overlap of both gene sets and experimental conditions. We illustrate how the decomposition can be optimized self-consistently by using a judicious choice of genes to define it. The ability of the technique to seamlessly define a concept of "co-expression" across both datasets provides an avenue for meaningful data integration. We believe that it will prove to be particularly useful for exploiting large, publicly available microarray datasets for species with unsequenced genomes by complementing them with more limited in-house expression measurements.
    Andreas W Schreiber, Neil J Shirley, Rachel A Burton and Geoffrey B Finche

    A genome wide association scan for (1,3;1,4)-β-glucan content in the grain of contemporary 2-row Spring and Winter barleys

    Published: 17 October 2014
    BACKGROUND: (1,3;1,4)-β-Glucan is an important component of the cell walls of barley grain as it affects processability during the production of alcoholic beverages and has significant human health benefits when consumed above recommended threshold levels. This leads to diametrically opposed quality requirements for different applications as low levels of (1,3;1,4)-β-glucan are required for brewing and distilling and high levels for positive impacts on human health. RESULTS: We quantified grain (1,3;1,4)-β-glucan content in a collection of 399 2-row Spring-type and 204 2-row Winter-type elite barley cultivars originating mainly from north-western Europe. We combined these data with genotypic information derived using a 9K Illumina iSelect SNP platform and subsequently carried out a Genome Wide Association Scan (GWAS). Statistical analysis accounting for residual genetic structure within the germplasm collection allowed us to identify significant associations between molecular markers and the phenotypic data. By anchoring the regions that contain these associations to the barley genome assembly we catalogued genes underlying the associations. Based on gene annotations and transcript abundance data we identified candidate genes. CONCLUSIONS: We show that a region of the genome on chromosome 2 containing a cluster of CELLULOSE SYNTHASE-LIKE (Csl) genes, including CslF3, CslF4, CslF8, CslF10, CslF12 and CslH, as well as a region on chromosome 1H containing CslF9, are associated with the phenotype in this germplasm. We also observed that several regions identified by GWAS contain glycoside hydrolases that are possibly involved in (1,3;1,4)-β-glucan breakdown, together with other genes that might participate in (1,3;1,4)-β-glucan synthesis, re-modelling or regulation. This analysis provides new opportunities for understanding the genes related to the regulation of (1,3;1,4)-β-glucan content in cereal grains.
    Kelly Houston, Joanne Russell, Miriam Schreiber, Claire Halpin, Helena Oakey, Jennifer M Washington, Allan Booth, Neil Shirley, Rachel A Burton, Geoffrey B Fincher and Robbie Waug

    The dynamics of transcript abundance during cellularization of developing barley endosperm

    Within the cereal grain, the endosperm and its nutrient reserves are critical for successful germination and in the context of grain utilization. The identification of molecular determinants of early endosperm development, particularly regulators of cell division and cell wall deposition, would help predict end-use properties such as yield, quality, and nutritional value. Custom microarray data have been generated using RNA isolated from developing barley grain endosperm 3 d to 8 d after pollination (DAP). Comparisons of transcript abundance over time revealed 47 gene expression modules that can be clustered into 10 broad groups. Superimposing these modules upon cytological data allowed patterns of transcript abundance to be linked with key stages of early grain development. Here, attention was focused on how the datasets could be mined to explore and define the processes of cell wall biosynthesis, remodeling, and degradation. Using a combination of spatial molecular network and gene ontology enrichment analyses, it is shown that genes involved in cell wall metabolism are found in multiple modules, but cluster into two main groups that exhibit peak expression at 3 DAP to 4 DAP and 5 DAP to 8 DAP. The presence of transcription factor genes in these modules allowed candidate genes for the control of wall metabolism during early barley grain development to be identified. The data are publicly available through a dedicated web interface (https://ics.hutton.ac.uk/barseed/), where they can be used to interrogate co- and differential expression for any other genes, groups of genes, or transcription factors expressed during early endosperm development.
    Runxuan Zhang, Matthew R. Tucker, Rachel A Burton, Neil J. Shirley, Alan Little, Jenny Morris, Linda Milne, Kelly Houston, Pete E. Hedley, Robbie Waugh, and Geoffrey B. Finche

    An ammonia spectral map of the L1495-B218 filaments in the Taurus molecular cloud. I. Physical properties of filaments and dense cores

    We present deep NH3 observations of the L1495-B218 filaments in the Taurus molecular cloud covering over a 3° angular range using the K-band focal plane array on the 100 m Green Bank Telescope. The L1495-B218 filaments form an interconnected, nearby, large complex extending over 8 pc. We observed NH3 (1, 1) and (2, 2) with a spectral resolution of 0.038 km s^{-1} and a spatial resolution of 31''. Most of the ammonia peaks coincide with intensity peaks in dust continuum maps at 350 and 500 μm. We deduced physical properties by fitting a model to the observed spectra. We find gas kinetic temperatures of 8–15 K, velocity dispersions of 0.05–0.25 km s^{-1}, and NH3 column densities of 5 × 10^{12} to 1 × 10^{14} cm^{-2}. The CSAR algorithm, which is a hybrid of seeded-watershed and binary dendrogram algorithms, identifies a total of 55 NH3 structures, including 39 leaves and 16 branches. The masses of the NH3 sources range from 0.05 to 9.5 M_{\odot}. The masses of NH3 leaves are mostly smaller than their corresponding virial mass estimated from their internal and gravitational energies, which suggests that these leaves are gravitationally unbound structures. Nine out of 39 NH3 leaves are gravitationally bound, and seven out of nine gravitationally bound NH3 leaves are associated with star formation. We also found that 12 out of 30 gravitationally unbound leaves are pressure confined. Our data suggest that a dense core may form as a pressure-confined structure, evolve to a gravitationally bound core, and undergo collapse to form a protostar.

    Fine-mapping of the HNF1B multicancer locus identifies candidate variants that mediate endometrial cancer risk.

    Common variants in the hepatocyte nuclear factor 1 homeobox B (HNF1B) gene are associated with the risk of Type II diabetes and multiple cancers. Evidence to date indicates that cancer risk may be mediated via genetic or epigenetic effects on HNF1B gene expression. We previously found single-nucleotide polymorphisms (SNPs) at the HNF1B locus to be associated with endometrial cancer, and now report extensive fine-mapping and in silico and laboratory analyses of this locus. Analysis of 1184 genotyped and imputed SNPs in 6608 Caucasian cases and 37 925 controls, and 895 Asian cases and 1968 controls, revealed the best signal of association for SNP rs11263763 (P = 8.4 × 10^{-14}, odds ratio = 0.86, 95% confidence interval = 0.82-0.89), located within HNF1B intron 1. Haplotype analysis and conditional analyses provide no evidence of further independent endometrial cancer risk variants at this locus. SNP rs11263763 genotype was associated with HNF1B mRNA expression but not with HNF1B methylation in endometrial tumor samples from The Cancer Genome Atlas. Genetic analyses prioritized rs11263763 and four other SNPs in high-to-moderate linkage disequilibrium as the most likely causal SNPs. Three of these SNPs map to the extended HNF1B promoter based on chromatin marks extending from the minimal promoter region. Reporter assays demonstrated that this extended region reduces activity in combination with the minimal HNF1B promoter, and that the minor alleles of rs11263763 or rs8064454 are associated with decreased HNF1B promoter activity. Our findings provide evidence for a single signal associated with endometrial cancer risk at the HNF1B locus, and that risk is likely mediated via altered HNF1B gene expression.
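    The odds ratio, confidence interval, and P-value quoted for rs11263763 are linked by simple arithmetic on the log-odds scale. The sketch below is a rough consistency check, not the authors' analysis: it assumes a Wald-type interval that is symmetric in ln(OR), and the function name `wald_from_ci` is hypothetical. Only the three inputs (0.86, 0.82, 0.89) come from the abstract.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-odds standard error, z statistic, and two-sided
    P-value from an odds ratio and its 95% confidence interval.

    Assumes a Wald interval: ln(OR) +/- z_crit * SE.
    """
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    # Wald z statistic for the null hypothesis OR = 1 (ln OR = 0).
    z = math.log(odds_ratio) / se
    # Two-sided standard normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values as rounded in the abstract; the inputs are therefore approximate.
se, z, p = wald_from_ci(0.86, 0.82, 0.89)
print(f"SE = {se:.4f}, z = {z:.2f}, P = {p:.1e}")
```

    Because the abstract rounds the odds ratio and interval limits to two decimals, the P-value recovered this way should agree with the reported 8.4 × 10^{-14} only to within roughly an order of magnitude.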

    The Baryon Oscillation Spectroscopic Survey of SDSS-III

    The Baryon Oscillation Spectroscopic Survey (BOSS) is designed to measure the scale of baryon acoustic oscillations (BAO) in the clustering of matter over a larger volume than the combined efforts of all previous spectroscopic surveys of large-scale structure. BOSS uses 1.5 million luminous galaxies as faint as i=19.9 over 10,000 square degrees to measure BAO to redshifts z<0.7. Observations of neutral hydrogen in the Lyman alpha forest in more than 150,000 quasar spectra (g<22) will constrain BAO over the redshift range 2.15<z<3.5. Early results from BOSS include the first detection of the large-scale three-dimensional clustering of the Lyman alpha forest and a strong detection from the Data Release 9 data set of the BAO in the clustering of massive galaxies at an effective redshift z = 0.57. We project that BOSS will yield measurements of the angular diameter distance D_A to an accuracy of 1.0% at redshifts z=0.3 and z=0.57 and measurements of H(z) to 1.8% and 1.7% at the same redshifts. Forecasts for Lyman alpha forest constraints predict a measurement of an overall dilation factor that scales the highly degenerate D_A(z) and H^{-1}(z) parameters to an accuracy of 1.9% at z~2.5 when the survey is complete. Here, we provide an overview of the selection of spectroscopic targets, planning of observations, and analysis of data and data quality of BOSS.
    Comment: 49 pages, 16 figures, accepted by A

    From Data to Software to Science with the Rubin Observatory LSST

    The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) dataset will dramatically alter our understanding of the Universe, from the origins of the Solar System to the nature of dark matter and dark energy. Much of this research will depend on the existence of robust, tested, and scalable algorithms, software, and services. Identifying and developing such tools ahead of time has the potential to significantly accelerate the delivery of early science from LSST. Developing these collaboratively, and making them broadly available, can enable more inclusive and equitable collaboration on LSST science. To facilitate such opportunities, a community workshop entitled "From Data to Software to Science with the Rubin Observatory LSST" was organized by the LSST Interdisciplinary Network for Collaboration and Computing (LINCC) and partners, and held at the Flatiron Institute in New York, March 28-30, 2022. The workshop included over 50 in-person attendees invited from over 300 applications. It identified seven key software areas of need: (i) scalable cross-matching and distributed joining of catalogs, (ii) robust photometric redshift determination, (iii) software for determination of selection functions, (iv) frameworks for scalable time-series analyses, (v) services for image access and reprocessing at scale, (vi) object image access (cutouts) and analysis at scale, and (vii) scalable job execution systems. This white paper summarizes the discussions of this workshop. It considers the motivating science use cases, identified cross-cutting algorithms, software, and services, their high-level technical specifications, and the principles of inclusive collaborations needed to develop them. We provide it as a useful roadmap of needs, as well as to spur action and collaboration between groups and individuals looking to develop reusable software for early LSST science.
    Comment: White paper from "From Data to Software to Science with the Rubin Observatory LSST" worksho