
    Expression profiling of snoRNAs in normal hematopoiesis and AML

    Key Points: A subset of snoRNAs is expressed in a developmental- and lineage-specific manner during human hematopoiesis. Neither host gene expression nor alternative splicing accounted for the observed differential expression of snoRNAs in a subset of AML.

    PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers

    BACKGROUND: Long thought "relics" of evolution, not until recently have pseudogenes been of medical interest regarding regulation in cancer. Often, these regulatory roles are a direct by-product of their close sequence homology to protein-coding genes. Novel pseudogene-gene (PGG) functional associations can be identified through the integration of biomedical data, such as sequence homology, functional pathways, gene expression, pseudogene expression, and microRNA expression. However, not all of the information has been integrated, and almost all previous pseudogene studies relied on 1:1 pseudogene-parent gene relationships without leveraging other homologous genes/pseudogenes. RESULTS: We produce PGG families that expand beyond the current 1:1 paradigm. First, we construct expansive PGG databases by (i) CUDAlign graphics processing unit (GPU) accelerated local alignment of all pseudogenes to gene families (totaling 1.6 billion individual local alignments and >40,000 GPU hours) and (ii) BLAST-based assignment of pseudogenes to gene families. Second, we create an open-source web application (PseudoFuN [Pseudogene Functional Networks]) to search for integrative functional relationships of sequence homology, microRNA expression, gene expression, pseudogene expression, and gene ontology. We produce four "flavors" of CUDAlign-based databases (>462,000,000 PGG pairwise alignments and 133,770 PGG families) that can be queried and downloaded using PseudoFuN. These databases are consistent with previous 1:1 PGG annotation and also are much more powerful including millions of de novo PGG associations. For example, we find multiple known (e.g., miR-20a-PTEN-PTENP1) and novel (e.g., miR-375-SOX15-PPP4R1L) microRNA-gene-pseudogene associations in prostate cancer. PseudoFuN provides a "one stop shop" for identifying and visualizing thousands of potential regulatory relationships related to pseudogenes in The Cancer Genome Atlas cancers. 
    CONCLUSIONS: Thousands of new PGG associations can be explored in the context of microRNA-gene-pseudogene co-expression and differential expression with a simple-to-use online tool by bioinformaticians and oncologists alike.

    Systematic genetic analysis of the MHC region reveals mechanistic underpinnings of HLA type associations with disease.

    The MHC region is highly associated with autoimmune and infectious diseases. Here we conduct an in-depth interrogation of associations between genetic variation, gene expression and disease. We create a comprehensive map of regulatory variation in the MHC region using WGS from 419 individuals to call eight-digit HLA types and RNA-seq data from matched iPSCs. Building on this regulatory map, we explored GWAS signals for 4083 traits, detecting colocalization for 180 disease loci with eQTLs. We show that eQTL analyses taking HLA type haplotypes into account have substantially greater power compared with only using single variants. We examined the association between the 8.1 ancestral haplotype and delayed colonization in cystic fibrosis, postulating that downregulation of RNF5 expression is the likely causal mechanism. Our study provides insights into the genetic architecture of the MHC region and pinpoints disease associations that are due to differential expression of HLA genes and non-HLA genes.

    Down-regulatory effects of miR-211 on long non-coding RNA SOX2OT and SOX2 genes in esophageal squamous cell carcinoma

    Objective: MicroRNAs (miRNAs) are a class of non-coding RNAs (ncRNAs) that transcriptionally or post-transcriptionally regulate gene expression through degradation of their mRNA targets and/or translational suppression. However, there are few reports on miRNA-mediated expression regulation of long ncRNAs (lncRNAs). We have previously reported a significant upregulation of the lncRNA SOX2OT and its intronic coding gene, SOX2, in esophageal squamous cell carcinoma (ESCC) tissue samples. In this study, we aimed to evaluate the effect of induced overexpression of miR-211 on SOX2OT and SOX2 expression in vitro. Materials and Methods: In this experimental study, we performed both bioinformatic and experimental analyses to examine whether these transcripts are regulated by miRNAs. From the list of potential candidate miRNAs, miR-211 was found to have complementary sequences to SOX2OT and SOX2 transcripts. To validate our finding experimentally, we transfected the NT-2 pluripotent cell line (an embryonal carcinoma stem cell) with an expression vector overexpressing miR-211. The expression changes of miR-211, SOX2OT, and SOX2 were then quantified by a real-time polymerase chain reaction (RT-PCR) approach. Results: Compared with mock-transfected cells, overexpression of miR-211 caused a significant down-regulation of both genes (P<0.05). Furthermore, flow-cytometry analysis revealed a significant elevation in the sub-G1 cell population following ectopic expression of miR-211 in NT-2 cells. Conclusion: We report here, for the first time, the down-regulation of SOX2OT and SOX2 genes by an miRNA. Considering the vital role of SOX2OT and SOX2 genes in pluripotency and tumorigenesis, our data suggest an important and inhibitory role for miR-211 in the aforementioned processes.

    Discovery and genotyping of structural variation from long-read haploid genome sequence data

    In an effort to more fully understand the spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we systematically assessed each genome independently for structural variants (SVs) and indels, resolving the sequence structure of 461,553 genetic variants from 2 bp to 28 kbp in length. We find that >89% of these variants have been missed as part of analysis of the 1000 Genomes Project even after adjusting for more common variants (MAF > 1%). We estimate that this theoretical human diploid differs by as much as ∼16 Mbp with respect to the human reference, with long-read sequencing data providing a fivefold increase in sensitivity for genetic variants ranging in size from 7 bp to 1 kbp compared with short-read sequence data. Although a large fraction of genetic variants were not detected by short-read approaches, once the alternate allele is sequence-resolved, we show that 61% of SVs can be genotyped in short-read sequence data sets with high accuracy. Uncoupling discovery from genotyping thus allows for the majority of this missed common variation to be genotyped in the human population. Interestingly, when we repeat SV detection on a pseudodiploid genome constructed in silico by merging the two haploids, we find that ∼59% of the heterozygous SVs are no longer detected by SMRT-SV. These results indicate that haploid resolution of long-read sequencing data will significantly increase sensitivity of SV detection.