
    f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq.

    Single-cell RNA-sequencing (scRNA-seq) allows heterogeneity in gene expression to be studied in large cell populations. Such heterogeneity can arise from technical or biological factors, which makes it difficult to decompose the sources of variation. Here we describe f-scLVM (factorial single-cell latent variable model), a method based on factor analysis that uses pathway annotations to guide the inference of interpretable factors underpinning the heterogeneity. Our model jointly estimates the relevance of individual factors, refines gene set annotations, and infers factors without annotation. In applications to multiple scRNA-seq datasets, we find that f-scLVM robustly decomposes scRNA-seq datasets into interpretable components, thereby facilitating the identification of novel subpopulations.

    Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data.

    By profiling the transcriptomes of individual cells, single-cell RNA sequencing provides unparalleled resolution to study cellular heterogeneity. However, this comes at the cost of high technical noise, including cell-specific biases in capture efficiency and library generation. One strategy for removing these biases is to add a constant amount of spike-in RNA to each cell and to scale the observed expression values so that the coverage of spike-in transcripts is constant across cells. This approach has previously been criticized as its accuracy depends on the precise addition of spike-in RNA to each sample. Here, we perform mixture experiments using two different sets of spike-in RNA to quantify the variance in the amount of spike-in RNA added to each well in a plate-based protocol. We also obtain an upper bound on the variance due to differences in behavior between the two spike-in sets. We demonstrate that both factors are small contributors to the total technical variance and have only minor effects on downstream analyses, such as detection of highly variable genes and clustering. Our results suggest that scaling normalization using spike-in transcripts is reliable enough for routine use in single-cell RNA sequencing data analyses. This work was supported by Cancer Research UK (core funding to JCM, award no. A17197), the University of Cambridge and Hutchison Whampoa Limited. JCM was also supported by core funding from EMBL. LHV was supported by an EMBL Interdisciplinary Postdoctoral fellowship. Work in the Göttgens group was supported by Cancer Research UK, Bloodwise, the National Institute of Diabetes and Digestive and Kidney Diseases, the Leukemia and Lymphoma Society and core infrastructure grants from the Wellcome Trust and the Medical Research Council to the Cambridge Stem Cell Institute.
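
The scaling step described in the abstract above (dividing each cell's counts by a factor so that total spike-in coverage becomes constant across cells) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `spike_in_size_factors` and `normalize_counts`, the toy count matrices, and the choice to centre the size factors at a mean of 1 are all assumptions made for illustration.

```python
import numpy as np


def spike_in_size_factors(spike_counts):
    """Return one scaling factor per cell from a (spike-ins x cells) count matrix.

    Assumption: the factor is the cell's total spike-in count divided by the
    mean total across cells, so the factors average to 1.
    """
    totals = np.asarray(spike_counts, dtype=float).sum(axis=0)
    if np.any(totals <= 0):
        raise ValueError("every cell needs a positive total spike-in count")
    return totals / totals.mean()


def normalize_counts(counts, size_factors):
    """Divide each cell (column) of a (features x cells) matrix by its size factor."""
    return np.asarray(counts, dtype=float) / np.asarray(size_factors, dtype=float)


# Toy data (hypothetical): 2 spike-in transcripts and 2 genes measured in 3 cells.
spike = np.array([[50, 100, 25],
                  [50, 100, 25]])
genes = np.array([[10, 20, 5],
                  [0, 8, 2]])

size_factors = spike_in_size_factors(spike)
norm_spike = normalize_counts(spike, size_factors)
norm_genes = normalize_counts(genes, size_factors)

# After scaling, every cell has the same total spike-in coverage.
print(norm_spike.sum(axis=0))
```

In this toy example the first gene differs between cells only by the same factor as the spike-ins, so its normalized values coincide; any error in the amount of spike-in RNA added to a well would propagate directly into the size factor, which is the source of variance the study quantifies.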

    CTCF maintains regulatory homeostasis of cancer pathways

    Background: CTCF binding to DNA helps partition the mammalian genome into discrete structural and regulatory domains. Complete removal of CTCF from mammalian cells causes catastrophic genome dysregulation, likely due to widespread collapse of 3D chromatin looping and alterations to inter- and intra-TAD interactions within the nucleus. In contrast, Ctcf hemizygous mice with lifelong reduction of CTCF expression are viable, albeit with increased cancer incidence. Here, we exploit chronic Ctcf hemizygosity to reveal its homeostatic roles in maintaining genome function and integrity. Results: We find that Ctcf hemizygous cells show modest but robust changes in almost a thousand sites of genomic CTCF occupancy; these are enriched for lower affinity binding events with weaker evolutionary conservation across the mouse lineage. Furthermore, we observe dysregulation of the expression of several hundred genes, which are concentrated in cancer-related pathways, and are caused by changes in transcriptional regulation. Chromatin structure is preserved but some loop interactions are destabilized; these are often found around differentially expressed genes and their enhancers. Importantly, the transcriptional alterations identified in vitro are recapitulated in mouse tumors and also in human cancers. Conclusions: This multi-dimensional genomic and epigenomic profiling of a Ctcf hemizygous mouse model system shows that chronic depletion of CTCF dysregulates steady-state gene expression by subtly altering transcriptional regulation, changes which can also be observed in primary tumors.

    Network analysis of canine brain morphometry links tumour risk to oestrogen deficiency and accelerated brain ageing.

    Structural 'brain age' is a valuable but complex biomarker for several brain disorders. The dog is an unrivalled comparator for neurological disease modeling; however, canine brain morphometric diversity creates computational and statistical challenges. Using a data-driven approach, we explored complex interactions between patient metadata, brain morphometry, and neurological disease. Twenty-four morphometric parameters measured from 286 canine brain magnetic resonance imaging scans were combined with clinical parameters to generate 9,438 data points. Network analysis was used to cluster patients according to their brain morphometry profiles. An 'aged-brain' profile, defined by a small brain width and volume combined with ventriculomegaly, was revealed in the Boxer breed. Key features of this profile were paralleled in neutered female dogs which, relative to un-neutered females, had an 11-fold greater risk of developing brain tumours. Boxer dog and geriatric dog groups were both enriched for brain tumour diagnoses, despite a lack of geriatric Boxers within the cohort. Our findings suggest that advanced brain ageing enhances brain tumour risk in dogs and may be influenced by oestrogen deficiency, a risk factor for dementia and brain tumours in humans. Morphometric features of brain ageing in dogs, as in humans, might better predict neurological disease risk than patient chronological age.

    The pitfalls of platform comparison: DNA copy number array technologies assessed

    Background: The accurate and high resolution mapping of DNA copy number aberrations has become an important tool by which to gain insight into the mechanisms of tumourigenesis. There are various commercially available platforms for such studies, but there remains no general consensus as to the optimal platform. There have been several previous platform comparison studies, but they have either described older technologies, used less-complex samples, or have not addressed the issue of the inherent biases in such comparisons. Here we describe a systematic comparison of data from four leading microarray technologies (the Affymetrix Genome-wide SNP 5.0 array, Agilent High-Density CGH Human 244A array, Illumina HumanCNV370-Duo DNA Analysis BeadChip, and the Nimblegen 385 K oligonucleotide array). We compare samples derived from primary breast tumours and their corresponding matched normals, well-established cancer cell lines, and HapMap individuals. By careful consideration and avoidance of potential sources of bias, we aim to provide a fair assessment of platform performance. Results: By performing a theoretical assessment of the reproducibility, noise, and sensitivity of each platform, notable differences were revealed. Nimblegen exhibited between-replicate array variances an order of magnitude greater than the other three platforms, with Agilent slightly outperforming the others, and a comparison of self-self hybridizations revealed similar patterns. An assessment of the single probe power revealed that Agilent exhibits the highest sensitivity. Additionally, we performed an in-depth visual assessment of the ability of each platform to detect aberrations of varying sizes. As expected, all platforms were able to identify large aberrations in a robust manner. However, some focal amplifications and deletions were only detected in a subset of the platforms. Conclusion: Although there are substantial differences in the design, density, and number of replicate probes, the comparison indicates a generally high level of concordance between platforms, despite differences in the reproducibility, noise, and sensitivity. In general, Agilent tended to be the best aCGH platform and Affymetrix the superior SNP-CGH platform, but for specific decisions the results described herein provide a guide for platform selection and study design, and the dataset a resource for more tailored comparisons.

    Evaluation of Sentinel-3A and Sentinel-3B ocean land colour instrument green instantaneous fraction of absorbed photosynthetically active radiation

    This article presents the evaluation of the Copernicus Sentinel-3 Ocean Land Colour Instrument (OLCI) operational terrestrial products corresponding to the green instantaneous Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) and its associated rectified channels. These products are estimated using OLCI spectral measurements acquired at the top of the atmosphere by a physically-based approach and are available operationally at full (300 m) and reduced (1.2 km) spatial resolution daily. The evaluation of the quality of the FAPAR OLCI values was based on the availability of data acquired over several years by Sentinel-3A (S3A) and Sentinel-3B (S3B). The evaluation exercise consisted of several stages: first, an overall comparison of the two S3 platform products was carried out during the tandem phase; second, comparison with a FAPAR climatology derived from the Medium Resolution Imaging Spectrometer (MERIS) provided information on the seasonality of various types of land cover. Then, direct comparisons were made with the same type of FAPAR products retrieved from two sensors, the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Sentinel-2 (S2) Multispectral Instrument (MSI), and with several ground-based estimates. In addition, an analysis of the efficiency of the retrieval algorithm with 3D radiative transfer simulations was performed. The results indicated that the consistency between daily and monthly S3A and S3B on a global scale was very good during the tandem phase (RMSD = 0.01 and a correlation R2 of 0.99 with a bias of 0.003); we found an agreement with a correlation of 0.95 and 0.93 (RMSD = 0.07 and 0.09) with JRC FAPAR S2 and JRC FAPAR MODIS, respectively. Compatibility with the ground-based data was between 0.056 and 0.24 in terms of RMSD depending on the type of vegetation, with an overall R2 of 0.89. Immler diagrams demonstrate that their variances were lower than the total uncertainties. The quality assurance using a 3D radiative transfer model has shown that the apparent performance of the algorithm depends strongly on the type of in-situ measurement and canopy type.
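
The agreement statistics quoted in the abstract above (RMSD, bias and R2 between two FAPAR products) can be computed along the following lines. This is a hedged sketch rather than the authors' procedure: the function name `agreement_metrics`, the toy values, the sign convention for bias (second product minus first) and the definition of R2 as the squared Pearson correlation are assumptions, since the abstract does not specify them.

```python
import numpy as np


def agreement_metrics(product_a, product_b):
    """Compare two co-located arrays of FAPAR values.

    Pairs where either value is missing (NaN or infinite) are dropped.
    Assumptions: bias = mean(b - a); r2 = squared Pearson correlation.
    """
    a = np.asarray(product_a, dtype=float).ravel()
    b = np.asarray(product_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("inputs must contain the same number of values")
    valid = np.isfinite(a) & np.isfinite(b)
    a, b = a[valid], b[valid]
    diff = b - a
    rmsd = float(np.sqrt(np.mean(diff ** 2)))  # root-mean-square deviation
    bias = float(np.mean(diff))                # mean signed difference
    r = float(np.corrcoef(a, b)[0, 1])         # Pearson correlation
    return {"rmsd": rmsd, "bias": bias, "r2": r ** 2, "n": int(valid.sum())}


# Toy example (hypothetical values): product B is product A shifted by 0.01,
# with one missing retrieval that is excluded from the statistics.
fapar_a = np.array([0.10, 0.30, 0.50, 0.70, 0.90])
fapar_b = np.array([0.11, 0.31, 0.51, 0.71, np.nan])

metrics = agreement_metrics(fapar_a, fapar_b)
print(metrics)
```

A constant offset, as in the toy data, yields an RMSD equal to the absolute bias and a perfect correlation, which shows why the three statistics are reported together: correlation alone cannot reveal a systematic shift between sensors.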

    DNA methylation of blood cells is associated with prevalent type 2 diabetes in a meta-analysis of four European cohorts

    Background: Type 2 diabetes (T2D) is a heterogeneous disease with well-known genetic and environmental risk factors contributing to its prevalence. Epigenetic mechanisms related to changes in DNA methylation (DNAm) may also contribute to T2D risk, but larger studies are required to discover novel markers and to confirm existing ones. Results: We performed a large meta-analysis of individual epigenome-wide association studies (EWAS) of prevalent T2D conducted in four European studies using peripheral blood DNAm. Analysis of differentially methylated regions (DMR) was also undertaken, based on the meta-analysis results. We found three novel CpGs associated with prevalent T2D in Europeans at cg00144180 (HDAC4), cg16765088 (near SYNM) and cg24704287 (near MIR23A) and confirmed three CpGs previously identified (mapping to TXNIP, ABCG1 and CPT1A). We also identified 77 T2D associated DMRs, most of them hypomethylated in T2D cases versus controls. In adjusted regressions among diabetes-free participants in ALSPAC, we found that all six CpGs identified in the meta-EWAS were associated with white cell-types. We estimated that these six CpGs captured 11% of the variation in T2D, which was similar to the variation explained by the model including only the common risk factors of BMI, sex, age and smoking (R2 = 10.6%). Conclusions: This study identifies novel loci associated with T2D in Europeans. We also demonstrate associations of the same loci with other traits. Future studies should investigate whether our findings are generalizable to non-European populations, and the potential roles of these epigenetic markers in T2D etiology or in determining long-term consequences of T2D.

    scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells.

    Parallel single-cell sequencing protocols represent powerful methods for investigating regulatory relationships, including epigenome-transcriptome interactions. Here, we report a single-cell method for parallel chromatin accessibility, DNA methylation and transcriptome profiling. scNMT-seq (single-cell nucleosome, methylation and transcription sequencing) uses a GpC methyltransferase to label open chromatin followed by bisulfite and RNA sequencing. We validate scNMT-seq by applying it to differentiating mouse embryonic stem cells, finding links between all three molecular layers and revealing dynamic coupling between epigenomic layers during differentiation.