22 research outputs found

    Definition, conservation and epigenetics of housekeeping and tissue-enriched genes

    Background: Housekeeping genes (HKG) are constitutively expressed in all tissues, while tissue-enriched genes (TEG) are expressed at a much higher level in a single tissue type than in others. HKGs serve as valuable experimental controls in gene and protein expression experiments, while TEGs tend to represent distinct physiological processes and are frequently candidates for biomarkers or drug targets. The genomic features of these two groups of genes expressed in opposing patterns may shed light on the mechanisms by which cells maintain basic and tissue-specific functions.
    Results: Here, we generate gene expression profiles of 42 normal human tissues on custom high-density microarrays to systematically identify 1,522 HKGs and 975 TEGs and compile a small subset of 20 housekeeping genes which are highly expressed in all tissues with lower variance than many commonly used HKGs. Cross-species comparison shows that both the functions and expression patterns of HKGs are conserved. TEGs are enriched with respect to both segmental duplication and copy number variation, while no such enrichment is observed for HKGs, suggesting that the high expression of HKGs is not due to high copy numbers. Analysis of genomic and epigenetic features of HKGs and TEGs reveals that the high expression of HKGs across different tissues is associated with decreased nucleosome occupancy at the transcription start site, as indicated by enhanced DNase hypersensitivity. Additionally, we systematically and quantitatively demonstrate the enrichment of CpG islands at HKG transcription start sites (TSS) and their depletion at TEG TSSs. Histone methylation patterns differ significantly between HKGs and TEGs, suggesting that methylation contributes to the differential expression patterns as well.
    Conclusion: We have compiled a set of high-quality HKGs that should provide higher and more consistent expression when used as references in laboratory experiments than currently used HKGs. The comparison of genomic features between HKGs and TEGs shows that HKGs are more conserved than TEGs in terms of functions, expression pattern and polymorphisms. In addition, our results identify chromatin structure and epigenetic features of HKGs and TEGs that are likely to play an important role in regulating their strikingly different expression patterns.

    DNA copy number, including telomeres and mitochondria, assayed using next-generation sequencing

    Background: DNA copy number variations occur within populations and aberrations can cause disease. We sought to develop an improved lab-automatable, cost-efficient, accurate platform to profile DNA copy number.
    Results: We developed a sequencing-based assay of nuclear, mitochondrial, and telomeric DNA copy number that draws on the unbiased nature of next-generation sequencing and incorporates techniques developed for RNA expression profiling. To demonstrate this platform, we assayed UMC-11 cells using 5 million 33 nt reads and found tremendous copy number variation, including regions of single and homogeneous deletions and amplifications to 29 copies; 5 times more mitochondria and 4 times less telomeric sequence than a pool of non-diseased, blood-derived DNA; and that UMC-11 was derived from a male individual.
    Conclusion: The described assay outputs absolute copy number, outputs an error estimate (p-value), and is more accurate than array-based platforms at high copy number. The platform enables profiling of mitochondrial levels and telomeric length. The assay is lab-automatable and has a genomic resolution and cost that are tunable based on the number of sequence reads.

    Knowledge based identification of essential signaling from genome-scale siRNA experiments

    Background: A systems biology interpretation of genome-scale RNA interference (RNAi) experiments is complicated by scope, experimental variability and network signaling robustness. Over-representation approaches (ORA), such as the hypergeometric test or z-score, are an established statistical framework used to associate RNA interference effectors with biologically annotated gene sets or pathways. These methods, however, do not directly take advantage of our growing understanding of the interactome. Furthermore, these methods can miss partial pathway activation and may be biased by protein complexes. Here we present a novel ORA, protein interaction permutation analysis (PIPA), that takes advantage of canonical pathways and established protein interactions to identify pathways enriched for protein interactions connecting RNAi hits.
    Results: We use PIPA to analyze genome-scale siRNA cell growth screens performed in HeLa and TOV cell lines. First, we show that interacting gene pair siRNA hits are more reproducible than single gene hits. Using protein interactions, PIPA identifies enriched pathways not found using the standard hypergeometric analysis, including the FAK cytoskeletal remodeling pathway. Different branches of the FAK pathway are distinctly essential in HeLa versus TOV cell lines, while other portions are unaffected by siRNA perturbations. Enriched hits belong to protein interactions associated with cell cycle regulation, anti-apoptosis, and signal transduction.
    Conclusion: PIPA provides an analytical framework to interpret siRNA screen data by merging biologically annotated gene sets with the human interactome. As a result, we identify pathways and signaling hypotheses that are statistically enriched to affect cell growth in human cell lines. This method provides a complementary approach to standard gene set enrichment that utilizes the additional knowledge of specific interactions within biological gene sets.

    Digital Genome-Wide ncRNA Expression, Including SnoRNAs, across 11 Human Tissues Using PolyA-Neutral Amplification

    Non-coding RNAs (ncRNAs) are an essential class of molecular species that have been difficult to monitor on high-throughput platforms due to frequent lack of polyadenylation. Using a polyadenylation-neutral amplification protocol and next-generation sequencing, we explore ncRNA expression in eleven human tissues. ncRNAs 7SL, U2, 7SK, and HBII-52 are expressed at levels far exceeding mRNAs. C/D and H/ACA box snoRNAs are associated with rRNA methylation and pseudouridylation, respectively: spleen expresses both, hypothalamus expresses mainly C/D box snoRNAs, and testes show enriched expression of both H/ACA box snoRNAs and the telomerase RNA TERC. Within the snoRNA 14q cluster, 14q(I-6) is expressed at much higher levels than other cluster members. More reads align to mitochondrial than nuclear tRNAs. Many lincRNAs are actively transcribed, particularly those overlapping known ncRNAs. Within the Prader-Willi syndrome loci, the snoRNA HBII-85 (group I) cluster is highly expressed in hypothalamus, greater than in other tissues and greater than group II or III. Additionally, within the disease locus we find novel transcription across a 400,000 nt span in ovaries. This genome-wide polyA-neutral expression compendium demonstrates the richness of ncRNA expression, their high expression patterns, their function-specific expression patterns, and is publicly available.

    Identification of a Novel Class of Farnesylation Targets by Structure-Based Modeling of Binding Specificity

    Farnesylation is an important post-translational modification catalyzed by farnesyltransferase (FTase). Until recently it was believed that a C-terminal CaaX motif is required for farnesylation, but recent experiments have revealed larger substrate diversity. In this study, we propose a general structural modeling scheme to account for peptide binding specificity and recapitulate the experimentally derived selectivity profile of FTase in vitro. In addition to highly accurate recovery of known FTase targets, we also identify a range of novel potential targets in the human genome, including a new substrate class with an acidic C-terminal residue (CxxD/E). In vitro experiments verified farnesylation of 26/29 tested peptides, including both novel human targets and peptides predicted to tightly bind FTase. This study extends the putative range of biological farnesylation substrates. Moreover, it suggests that the ability of a peptide to bind FTase is a main determinant for the farnesylation reaction. Finally, simple adaptation of our approach can contribute to more accurate and complete elucidation of peptide-mediated interactions and modifications in the cell.

    Modeling structurally variable regions in homologous proteins with rosetta

    A major limitation of current comparative modeling methods is the accuracy with which regions that are structurally divergent from homologues of known structure can be modeled. Because structural differences between homologous proteins are responsible for variations in protein function and specificity, the ability to model these differences has important functional consequences. Although existing methods can provide reasonably accurate models of short loop regions, modeling longer structurally divergent regions is an unsolved problem. Here we describe a method based on the de novo structure prediction algorithm, Rosetta, for predicting conformations of structurally divergent regions in comparative models.