134 research outputs found

    ISOpureR: an R implementation of a computational purification algorithm of mixed tumour profiles

    Background: Tumour samples containing distinct sub-populations of cancer and normal cells present challenges in the development of reproducible biomarkers, as these biomarkers are based on bulk signals from mixed tumour profiles. ISOpure is the only mRNA computational purification method to date that does not require a paired tumour-normal sample, provides a personalized cancer profile for each patient, and has been tested on clinical data. Replacing mixed tumour profiles with ISOpure-preprocessed cancer profiles led to better prognostic gene signatures for lung and prostate cancer. Results: To simplify the integration of ISOpure into standard R-based bioinformatics analysis pipelines, the algorithm has been implemented as an R package. The ISOpureR package performs analogously to the original code in estimating the fraction of cancer cells and the patient cancer mRNA abundance profile from tumour samples in four cancer datasets. Conclusions: The ISOpureR package estimates the fraction of cancer cells and personalized patient cancer mRNA abundance profile from a mixed tumour profile. This open-source R implementation enables integration into existing computational pipelines, as well as easy testing, modification and extension of the model. Funding: Prostate Cancer Canada; Movember Foundation (Grant RS2014-01

    Genome-wide analysis of the maternal-to-zygotic transition in Drosophila primordial germ cells

    Background: During the maternal-to-zygotic transition (MZT) vast changes in the embryonic transcriptome are produced by a combination of two processes: elimination of maternally provided mRNAs and synthesis of new transcripts from the zygotic genome. Previous genome-wide analyses of the MZT have been restricted to whole embryos. Here we report the first such analysis for primordial germ cells (PGCs), the progenitors of the germ-line stem cells. Results: We purified PGCs from Drosophila embryos, defined their proteome and transcriptome, and assessed the content, scale and dynamics of their MZT. Transcripts encoding proteins that implement particular types of biological functions group into nine distinct expression profiles, reflecting coordinate control at the transcriptional and posttranscriptional levels. mRNAs encoding germ-plasm components and cell-cell signaling molecules are rapidly degraded while new transcription produces mRNAs encoding the core transcriptional and protein synthetic machineries. The RNA-binding protein Smaug is essential for the PGC MZT, clearing transcripts encoding proteins that regulate stem cell behavior, transcriptional and posttranscriptional processes. Computational analyses suggest that Smaug and AU-rich element binding proteins function independently to control transcript elimination. Conclusions: The scale of the MZT is similar in the soma and PGCs. However, the timing and content of their MZTs differ, reflecting the distinct developmental imperatives of these cell types. 
    The PGC MZT is delayed relative to that in the soma, likely because relief of PGC-specific transcriptional silencing is required for zygotic genome activation as well as for efficient maternal transcript clearance.

    Similarity Regression predicts evolution of transcription factor sequence specificity

    Transcription factor (TF) binding specificities (motifs) are essential to the analysis of noncoding DNA and gene regulation. Accurate prediction of the sequence specificities of TFs is critical, because the hundreds of sequenced eukaryotic genomes encompass hundreds of thousands of TFs, and assaying each is currently infeasible. There is ongoing controversy regarding the efficacy of motif prediction methods, as well as the degree of motif diversification among related species. Here, we describe Similarity Regression (SR), a significantly improved method for predicting motifs. We have updated and expanded the Cis-BP database using SR, and validate its predictive capacity with new data from diverse eukaryotic TFs. SR inherently quantifies TF motif evolution, and we show that previous claims of near-complete conservation of motifs between human and Drosophila are grossly inflated, with nearly half the motifs in each species absent from the other. We conclude that diversification in DNA binding motifs is pervasive, and present a new tool and updated resource to study TF diversity and gene regulation across eukaryotes

    Cytoscape Web: an interactive web-based network browser

    Summary: Cytoscape Web is a web-based network visualization tool, modeled after Cytoscape, which is open source, interactive, customizable and easily integrated into web sites. Multiple file exchange formats can be used to load data into Cytoscape Web, including GraphML, XGMML and SIF.

    Conservation of core gene expression in vertebrate tissues

    Background: Vertebrates share the same general body plan and organs, possess related sets of genes, and rely on similar physiological mechanisms, yet show great diversity in morphology, habitat and behavior. Alteration of gene regulation is thought to be a major mechanism in phenotypic variation and evolution, but relatively little is known about the broad patterns of conservation in gene expression in non-mammalian vertebrates. Results: We measured expression of all known and predicted genes across twenty tissues in chicken, frog and pufferfish. By combining the results with human and mouse data and considering only ten common tissues, we have found evidence of conserved expression for more than a third of unique orthologous genes. We find that, on average, transcription factor gene expression is neither more nor less conserved than that of other genes. Strikingly, conservation of expression correlates poorly with the amount of conserved nonexonic sequence, even using a sequence alignment technique that accounts for non-collinearity in conserved elements. Many genes show conserved human/fish expression despite having almost no nonexonic conserved primary sequence. Conclusions: There are clearly strong evolutionary constraints on tissue-specific gene expression. A major challenge will be to understand the precise mechanisms by which many gene expression patterns remain similar despite extensive cis-regulatory restructuring.

    The TRIM-NHL protein NHL-2 is a Novel Co-Factor of the CSR-1 and HRDE-1 22G-RNA Pathways [preprint]

    Proper regulation of germline gene expression is essential for fertility and maintaining species integrity. In the C. elegans germline, a diverse repertoire of regulatory pathways promote the expression of endogenous germline genes and limit the expression of deleterious transcripts to maintain genome homeostasis. Here we show that the conserved TRIM-NHL protein, NHL-2, plays an essential role in the C. elegans germline, modulating germline chromatin and meiotic chromosome organization. We uncover a role for NHL-2 as a co-factor in both positively (CSR-1) and negatively (HRDE-1) acting germline 22G-small RNA pathways and the somatic nuclear RNAi pathway. Furthermore, we demonstrate that NHL-2 is a bona fide RNA binding protein and, along with RNA-seq data point to a small RNA independent role for NHL-2 in regulating transcripts at the level of RNA stability. Collectively, our data implicate NHL-2 as an essential hub of gene regulatory activity in both the germline and soma

    PERT: A Method for Expression Deconvolution of Human Blood Samples from Varied Microenvironmental and Developmental Conditions

    The cellular composition of heterogeneous samples can be predicted using an expression deconvolution algorithm to decompose their gene expression profiles based on pre-defined, reference gene expression profiles of the constituent populations in these samples. However, the expression profiles of the actual constituent populations are often perturbed from those of the reference profiles due to gene expression changes in cells associated with microenvironmental or developmental effects. Existing deconvolution algorithms do not account for these changes and give incorrect results when benchmarked against those measured by well-established flow cytometry, even after batch correction was applied. We introduce PERT, a new probabilistic expression deconvolution method that detects and accounts for a shared, multiplicative perturbation in the reference profiles when performing expression deconvolution. We applied PERT and three other state-of-the-art expression deconvolution methods to predict cell frequencies within heterogeneous human blood samples that were collected under several conditions (uncultured mono-nucleated and lineage-depleted cells, and culture-derived lineage-depleted cells). Only PERT's predicted proportions of the constituent populations matched those assigned by flow cytometry. Genes associated with cell cycle processes were highly enriched among those with the largest predicted expression changes between the cultured and uncultured conditions. We anticipate that PERT will be widely applicable to expression deconvolution strategies that use profiles from reference populations that vary from the corresponding constituent populations in cellular state but not cellular phenotypic identity

    Decreased body mass index in the preclinical stage of autosomal dominant Alzheimer’s disease

    The relationship between body-mass index (BMI) and Alzheimer’s disease (AD) has been extensively investigated. However, BMI alterations in preclinical individuals with autosomal dominant AD (ADAD) have not yet been investigated. We analyzed cross-sectional data from 230 asymptomatic members of families with ADAD participating in the Dominantly Inherited Alzheimer Network (DIAN) study including 120 preclinical mutation carriers (MCs) and 110 asymptomatic non-carriers (NCs). Differences in BMI and their relation with cerebral amyloid load and episodic memory as a function of estimated years to symptom onset (EYO) were analyzed. Preclinical MCs showed significantly lower BMIs compared to NCs, starting 11.2 years before expected symptom onset. However, the BMI curves began to diverge already at 17.8 years before expected symptom onset. Lower BMI in preclinical MCs was significantly associated with fewer years before estimated symptom onset, higher global Aβ brain burden, and lower delayed total recall scores in the logical memory test. The study provides cross-sectional evidence that weight loss starts one to two decades before expected symptom onset of ADAD. Our findings point toward a link between the pathophysiology of ADAD and disturbance of weight control mechanisms. Longitudinal follow-up studies are warranted to investigate BMI changes over time.

    The functional landscape of mouse gene expression

    BACKGROUND: Large-scale quantitative analysis of transcriptional co-expression has been used to dissect regulatory networks and to predict the functions of new genes discovered by genome sequencing in model organisms such as yeast. Although the idea that tissue-specific expression is indicative of gene function in mammals is widely accepted, it has not been objectively tested nor compared with the related but distinct strategy of correlating gene co-expression as a means to predict gene function. RESULTS: We generated microarray expression data for nearly 40,000 known and predicted mRNAs in 55 mouse tissues, using custom-built oligonucleotide arrays. We show that quantitative transcriptional co-expression is a powerful predictor of gene function. Hundreds of functional categories, as defined by Gene Ontology 'Biological Processes', are associated with characteristic expression patterns across all tissues, including categories that bear no overt relationship to the tissue of origin. In contrast, simple tissue-specific restriction of expression is a poor predictor of which genes are in which functional categories. As an example, the highly conserved mouse gene PWP1 is widely expressed across different tissues but is co-expressed with many RNA-processing genes; we show that the uncharacterized yeast homolog of PWP1 is required for rRNA biogenesis. CONCLUSIONS: We conclude that 'functional genomics' strategies based on quantitative transcriptional co-expression will be as fruitful in mammals as they have been in simpler organisms, and that transcriptional control of mammalian physiology is more modular than is generally appreciated. Our data and analyses provide a public resource for mammalian functional genomics

    Reconstructing evolutionary trajectories of mutation signature activities in cancer using TrackSig

    The type and genomic context of cancer mutations depend on their causes. These causes have been characterized using signatures that represent mutation types that co-occur in the same tumours. However, it remains unclear how mutation processes change during cancer evolution due to the lack of reliable methods to reconstruct evolutionary trajectories of mutational signature activity. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we present TrackSig, a new method that reconstructs these trajectories using optimal, joint segmentation and deconvolution of mutation type and allele frequencies from a single tumour sample. In simulations, we find TrackSig has a 3-5% activity reconstruction error and a 12% false detection rate. It outperforms an aggressive baseline in situations with branching evolution, CNA gain, and neutral mutations. Applied to data from 2658 tumours and 38 cancer types, TrackSig permits pan-cancer insight into evolutionary changes in mutational processes.