21,591 research outputs found

    PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers

    BACKGROUND: Long thought "relics" of evolution, not until recently have pseudogenes been of medical interest regarding regulation in cancer. Often, these regulatory roles are a direct by-product of their close sequence homology to protein-coding genes. Novel pseudogene-gene (PGG) functional associations can be identified through the integration of biomedical data, such as sequence homology, functional pathways, gene expression, pseudogene expression, and microRNA expression. However, not all of the information has been integrated, and almost all previous pseudogene studies relied on 1:1 pseudogene-parent gene relationships without leveraging other homologous genes/pseudogenes. RESULTS: We produce PGG families that expand beyond the current 1:1 paradigm. First, we construct expansive PGG databases by (i) CUDAlign graphics processing unit (GPU) accelerated local alignment of all pseudogenes to gene families (totaling 1.6 billion individual local alignments and >40,000 GPU hours) and (ii) BLAST-based assignment of pseudogenes to gene families. Second, we create an open-source web application (PseudoFuN [Pseudogene Functional Networks]) to search for integrative functional relationships of sequence homology, microRNA expression, gene expression, pseudogene expression, and gene ontology. We produce four "flavors" of CUDAlign-based databases (>462,000,000 PGG pairwise alignments and 133,770 PGG families) that can be queried and downloaded using PseudoFuN. These databases are consistent with previous 1:1 PGG annotation and also are much more powerful including millions of de novo PGG associations. For example, we find multiple known (e.g., miR-20a-PTEN-PTENP1) and novel (e.g., miR-375-SOX15-PPP4R1L) microRNA-gene-pseudogene associations in prostate cancer. PseudoFuN provides a "one stop shop" for identifying and visualizing thousands of potential regulatory relationships related to pseudogenes in The Cancer Genome Atlas cancers. 
CONCLUSIONS: Thousands of new PGG associations can be explored in the context of microRNA-gene-pseudogene co-expression and differential expression with a simple-to-use online tool by bioinformaticians and oncologists alike

    Differentially-Expressed Pseudogenes in HIV-1 Infection.

    Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these "functional" pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit

    Clinical exome performance for reporting secondary genetic findings.

    BACKGROUND: Reporting clinically actionable incidental genetic findings in the course of clinical exome testing is recommended by the American College of Medical Genetics and Genomics (ACMG). However, the performance of clinical exome methods for reporting small subsets of genes has not been previously reported. METHODS: In this study, 57 exome data sets performed as clinical (n = 12) or research (n = 45) tests were retrospectively analyzed. Exome sequencing data were examined for adequacy in the detection of potentially pathogenic variant locations in the 56 genes described in the ACMG incidental findings recommendation. All exons of the 56 genes were examined for adequacy of sequencing coverage. In addition, nucleotide positions annotated in HGMD (Human Gene Mutation Database) were examined. RESULTS: The 56 ACMG genes have 18336 nucleotide variants annotated in HGMD. None of the 57 exome data sets possessed a HGMD variant. The clinical exome test had inadequate coverage for >50% of HGMD variant locations in 7 genes. Six exons from 6 different genes had consistent failure across all 3 test methods; these exons had high GC content (76%–84%). CONCLUSIONS: The use of clinical exome sequencing for the interpretation and reporting of subsets of genes requires recognition of the substantial possibility of inadequate depth and breadth of sequencing coverage at clinically relevant locations. Inadequate depth of coverage may contribute to false-negative clinical exome results.

    Hominin evolution was caused by introgression from Gorilla

    The discovery of Paranthropus deyiremeda in 3.3-3.5 million year old fossil sites in Afar, together with 30% of the gorilla genome showing lineage sorting between humans and chimpanzees, and a NUMT ("nuclear mitochondrial DNA segment") on chromosome 5 that is shared by both gorillas, humans and chimpanzees, and shown to have diverged at the time of the Pan-Homo split rather than the Gorilla/Pan-Homo split, provides conclusive evidence that introgression from the gorilla lineage caused the Pan-Homo split, and the speciation of both the Australopithecus lineage and the Paranthropus lineage. Comment: arXiv admin note: text overlap with arXiv:1808.0630

    Automated DNA Motif Discovery

    Ensembl's human non-coding and protein coding genes are used to automatically find DNA pattern motifs. The Backus-Naur form (BNF) grammar for regular expressions (RE) is used by genetic programming to ensure the generated strings are legal. The evolved motif suggests that the presence of Thymine followed by one or more Adenines etc. early in transcripts indicates a non-protein coding gene. Keywords: pseudogene, short and microRNAs, non-coding transcripts, systems biology, machine learning, Bioinformatics, motif, regular expression, strongly typed genetic programming, context-free grammar. Comment: 12 pages, 2 figures
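
    The motif described in this abstract can be illustrated with a minimal Python sketch. The pattern `TA+` is only a literal reading of "Thymine followed by one or more Adenines"; the abstract's "etc." means the evolved expression is longer, and it is not reproduced here. The function name `has_early_motif` and the 20-base window standing in for "early in transcripts" are assumptions made for illustration, not values from the paper.

```python
import re

# Assumed simplification: thymine followed by one or more adenines.
# The evolved regular expression in the paper is not given in the abstract.
MOTIF = re.compile(r"TA+")


def has_early_motif(transcript: str, window: int = 20) -> bool:
    """Return True if the motif occurs within the first `window` bases.

    `window` is a hypothetical cutoff for "early in the transcript".
    """
    early_region = transcript.upper()[:window]
    return MOTIF.search(early_region) is not None
```

    Called as `has_early_motif("ggtaaacc")`, the sketch upper-cases the sequence and searches only the leading window, so a match that begins beyond the cutoff is ignored.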

    Large-scale and significant expression from pseudogenes in Sodalis glossinidius – a facultative bacterial endosymbiont

    The majority of bacterial genomes have high coding efficiencies, but there are some genomes of intracellular bacteria that have low gene density. The genome of the endosymbiont Sodalis glossinidius contains almost 50 % pseudogenes containing mutations that putatively silence them at the genomic level. We have applied multiple ‘omic’ strategies, combining Illumina and Pacific Biosciences Single-Molecule Real-Time DNA sequencing and annotation, stranded RNA sequencing and proteome analysis to better understand the transcriptional and translational landscape of Sodalis pseudogenes, and potential mechanisms for their control. Between 53 and 74 % of the Sodalis transcriptome remains active in cell-free culture. The mean sense transcription from coding domain sequences (CDSs) is four times greater than that from pseudogenes. Comparative genomic analysis of six Illumina-sequenced Sodalis isolates from different host Glossina species shows pseudogenes make up ~40 % of the 2729 genes in the core genome, suggesting that they are stable and/or that Sodalis is a recent introduction across the genus Glossina as a facultative symbiont. These data shed further light on the importance of transcriptional and translational control in deciphering host–microbe interactions. The combination of genomics, transcriptomics and proteomics gives a multidimensional perspective for studying prokaryotic genomes with a view to elucidating evolutionary adaptation to novel environmental niches

    Neuronal Expression of Neural Nitric Oxide Synthase (nNOS) Protein is Suppressed by an Antisense RNA Transcribed from an NOS Pseudogene

    Here, we show that a nitric oxide synthase (NOS) pseudogene is expressed in the CNS of the snail Lymnaea stagnalis. The pseudo-NOS transcript includes a region of significant antisense homology to a previously reported neuronal NOS (nNOS)-encoding mRNA. This suggested that the pseudo-NOS transcript acts as a natural antisense regulator of nNOS protein synthesis. In support of this, we show that both the nNOS-encoding and the pseudo-NOS transcripts are coexpressed in giant identified neurons (the cerebral giant cells) in the cerebral ganglion. Moreover, reverse transcription-PCR experiments on RNA isolated from the CNS establish that stable RNA-RNA duplex molecules do form between the two transcripts in vivo. Using an in vitro translation assay, we further demonstrate that the antisense region of the pseudogene transcript prevents the translation of nNOS protein from the nNOS-encoding mRNA. By analyzing NOS RNA and nNOS protein expression in two different identified neurons, we find that when both the nNOS-encoding and the pseudo-NOS transcripts are present in the same neuron, nNOS enzyme activity is substantially suppressed. Importantly, these results show that a natural antisense mechanism can mediate the translational control of nNOS expression in the Lymnaea CNS. Our findings also suggest that transcribed pseudogenes are not entirely without purpose and are a potential source of a new class of regulatory gene in the nervous system

    Chromosome mapping of dragline silk genes in the genomes of widow spiders (Araneae, Theridiidae)

    With its incredible strength and toughness, spider dragline silk is widely lauded for its impressive material properties. Dragline silk is composed of two structural proteins, MaSp1 and MaSp2, which are encoded by members of the spidroin gene family. While previous studies have characterized the genes that encode the constituent proteins of spider silks, nothing is known about the physical location of these genes. We determined karyotypes and sex chromosome organization for the widow spiders, Latrodectus hesperus and L. geometricus (Araneae, Theridiidae). We then used fluorescence in situ hybridization to map the genomic locations of the genes for the silk proteins that compose the remarkable spider dragline. These genes included three loci for the MaSp1 protein and the single locus for the MaSp2 protein. In addition, we mapped a MaSp1 pseudogene. All the MaSp1 gene copies and pseudogene localized to a single chromosomal region while MaSp2 was located on a different chromosome of L. hesperus. Using probes derived from L. hesperus, we comparatively mapped all three MaSp1 loci to a single region of a L. geometricus chromosome. As with L. hesperus, MaSp2 was found on a separate L. geometricus chromosome, thus again unlinked to the MaSp1 loci. These results indicate orthology of the corresponding chromosomal regions in the two widow genomes. Moreover, the occurrence of multiple MaSp1 loci in a conserved gene cluster across species suggests that MaSp1 proliferated by tandem duplication in a common ancestor of L. geometricus and L. hesperus. Unequal crossover events during recombination could have given rise to the gene copies and could also maintain sequence similarity among gene copies over time. Further comparative mapping with taxa of increasing divergence from Latrodectus will pinpoint when the MaSp1 duplication events occurred and the phylogenetic distribution of silk gene linkage patterns. © 2010 Zhao et al

    Transcriptionally inactive oocyte-type 5S RNA genes of Xenopus laevis are complexed with TFIIIA in vitro

    An extract from whole oocytes of Xenopus laevis was shown to transcribe somatic-type 5S RNA genes approximately 100-fold more efficiently than oocyte-type 5S RNA genes. This preference was at least 10-fold greater than the preference seen upon microinjection of 5S RNA genes into oocyte nuclei or upon in vitro transcription in an oocyte nuclear extract. The approximately 100-fold transcriptional bias in favor of the somatic-type 5S RNA genes observed in vitro in the whole oocyte extract was similar to the transcriptional bias observed in developing Xenopus embryos. We also showed that in the whole oocyte extract, a promoter-binding protein required for 5S RNA gene transcription, TFIIIA, was bound both to the actively transcribed somatic-type 5S RNA gene and to the largely inactive oocyte-type 5S RNA genes. These findings suggest that the mechanism for the differential expression of 5S RNA genes during Xenopus development does not involve differential binding of TFIIIA to 5S RNA genes