Large-scale and significant expression from pseudogenes in Sodalis glossinidius – a facultative bacterial endosymbiont
The majority of bacterial genomes have high coding efficiencies, but there are some genomes of intracellular bacteria that have low gene density. The genome of the endosymbiont Sodalis glossinidius contains almost 50 % pseudogenes containing mutations that putatively silence them at the genomic level. We have applied multiple ‘omic’ strategies, combining Illumina and Pacific Biosciences Single-Molecule Real-Time DNA sequencing and annotation, stranded RNA sequencing and proteome analysis to better understand the transcriptional and translational landscape of Sodalis pseudogenes, and potential mechanisms for their control. Between 53 and 74 % of the Sodalis transcriptome remains active in cell-free culture. The mean sense transcription from coding domain sequences (CDSs) is four times greater than that from pseudogenes. Comparative genomic analysis of six Illumina-sequenced Sodalis isolates from different host Glossina species shows pseudogenes make up ~40 % of the 2729 genes in the core genome, suggesting that they are stable and/or that Sodalis is a recent introduction across the genus Glossina as a facultative symbiont. These data shed further light on the importance of transcriptional and translational control in deciphering host–microbe interactions. The combination of genomics, transcriptomics and proteomics gives a multidimensional perspective for studying prokaryotic genomes with a view to elucidating evolutionary adaptation to novel environmental niches
PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers
BACKGROUND:
Long thought "relics" of evolution, not until recently have pseudogenes been of medical interest regarding regulation in cancer. Often, these regulatory roles are a direct by-product of their close sequence homology to protein-coding genes. Novel pseudogene-gene (PGG) functional associations can be identified through the integration of biomedical data, such as sequence homology, functional pathways, gene expression, pseudogene expression, and microRNA expression. However, not all of the information has been integrated, and almost all previous pseudogene studies relied on 1:1 pseudogene-parent gene relationships without leveraging other homologous genes/pseudogenes.
RESULTS:
We produce PGG families that expand beyond the current 1:1 paradigm. First, we construct expansive PGG databases by (i) CUDAlign graphics processing unit (GPU) accelerated local alignment of all pseudogenes to gene families (totaling 1.6 billion individual local alignments and >40,000 GPU hours) and (ii) BLAST-based assignment of pseudogenes to gene families. Second, we create an open-source web application (PseudoFuN [Pseudogene Functional Networks]) to search for integrative functional relationships of sequence homology, microRNA expression, gene expression, pseudogene expression, and gene ontology. We produce four "flavors" of CUDAlign-based databases (>462,000,000 PGG pairwise alignments and 133,770 PGG families) that can be queried and downloaded using PseudoFuN. These databases are consistent with previous 1:1 PGG annotation and also are much more powerful including millions of de novo PGG associations. For example, we find multiple known (e.g., miR-20a-PTEN-PTENP1) and novel (e.g., miR-375-SOX15-PPP4R1L) microRNA-gene-pseudogene associations in prostate cancer. PseudoFuN provides a "one stop shop" for identifying and visualizing thousands of potential regulatory relationships related to pseudogenes in The Cancer Genome Atlas cancers.
CONCLUSIONS:
Thousands of new PGG associations can be explored by bioinformaticians and oncologists alike, in the context of microRNA-gene-pseudogene co-expression and differential expression, using a simple-to-use online tool.
Differentially-Expressed Pseudogenes in HIV-1 Infection.
Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these "functional" pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit
Rate variation during molecular evolution: creationism and the cytochrome c molecular clock
Molecular clocks based upon amino acid sequences in proteins have played a major role in the clarification of evolutionary phylogenies. Creationist criticisms of these methods sometimes rely upon data that might initially seem to be paradoxical. For example, human cytochrome c differs from that of an alligator by 13 amino acids but differs by 14 amino acids from a much more closely related primate, Otolemur garnettii. The apparent anomaly is resolved by taking into consideration the variable substitution rate of cytochrome c, particularly among primates. This paper traces some of the history of extensive research into the topic of rate heterogeneity in cytochrome c including data from cytochrome c pseudogenes.
Network analysis of pseudogene-gene relationships: from pseudogene evolution to their functional potentials
Pseudogenes are fossil relatives of genes. Pseudogenes have long been thought of as "junk DNAs", since they do not code proteins in normal tissues. Although most of the human pseudogenes do not have noticeable functions, ∼20% of them exhibit transcriptional activity. There has been evidence showing that some pseudogenes adopted functions as lncRNAs and work as regulators of gene expression. Furthermore, pseudogenes can even be "reactivated" in some conditions, such as cancer initiation. Some pseudogenes are transcribed in specific cancer types, and some are even translated into proteins as observed in several cancer cell lines. All the above have shown that pseudogenes could have functional roles or potentials in the genome. Evaluating the relationships between pseudogenes and their gene counterparts could help us reveal the evolutionary path of pseudogenes and associate pseudogenes with functional potentials. It also provides an insight into the regulatory networks involving pseudogenes with transcriptional and even translational activities. In this study, we develop a novel approach integrating graph analysis, sequence alignment and functional analysis to evaluate pseudogene-gene relationships, and apply it to human gene homologs and pseudogenes. We generated a comprehensive set of 445 pseudogene-gene (PGG) families from the original 3,281 gene families (13.56%). Of these, 438 (98.4% PGG, 13.3% total) were non-trivial (containing more than one pseudogene). Each PGG family contains multiple genes and pseudogenes with high sequence similarity. For each family, we generate a sequence alignment network and phylogenetic trees recapitulating the evolutionary paths. We find evidence supporting the evolutionary history of the olfactory family (both genes and pseudogenes) in human, which also supports the validity of our analysis method. Next, we evaluate these networks with respect to the gene ontology, from which we identify functions enriched in these pseudogene-gene families and infer the functional impact of pseudogenes involved in the networks. This demonstrates the application of our PGG network database in the study of pseudogene function in a disease context.
The emergence and fate of horizontally acquired genes in Escherichia coli
Bacterial species, and even strains within species, can vary greatly in their gene contents and metabolic capabilities. We examine the evolution of this diversity by assessing the distribution and ancestry of each gene in 13 sequenced isolates of Escherichia coli and Shigella. We focus on the emergence and demise of two specific classes of genes, ORFans (genes with no homologs in present databases) and HOPs (genes with distant homologs), since these genes, in contrast to most conserved ancestral sequences, are known to be a major source of the novel features in each strain. We find that the rates of gain and loss of these genes vary greatly among strains as well as through time, and that ORFans and HOPs show very different behavior with respect to their emergence and demise. Although HOPs, which mostly represent gene acquisitions from other bacteria, originate more frequently, ORFans are much more likely to persist. This difference suggests that many adaptive traits are conferred by completely novel genes that do not originate in other bacterial genomes. With respect to the demise of these acquired genes, we find that strains of Shigella lose genes, both by disruption events and by complete removal, at accelerated rates
Complex evolutionary dynamics of massively expanded chemosensory receptor families in an extreme generalist chelicerate herbivore
While mechanisms to detoxify plant produced, anti-herbivore compounds have been associated with plant host use by herbivores, less is known about the role of chemosensory perception in their life histories. This is especially true for generalists, including chelicerate herbivores that evolved herbivory independently from the more studied insect lineages. To shed light on chemosensory perception in a generalist herbivore, we characterized the chemosensory receptors (CRs) of the chelicerate two-spotted spider mite, Tetranychus urticae, an extreme generalist. Strikingly, T. urticae has more CRs than reported in any other arthropod to date. Including pseudogenes, 689 gustatory receptors were identified, as were 136 degenerin/Epithelial Na+ Channels (ENaCs) that have also been implicated as CRs in insects. The genomic distribution of T. urticae gustatory receptors indicates recurring bursts of lineage-specific proliferations, with the extent of receptor clusters reminiscent of those observed in the CR-rich genomes of vertebrates or C. elegans. Although pseudogenization of many gustatory receptors within clusters suggests relaxed selection, a subset of receptors is expressed. Consistent with functions as CRs, the genomic distribution and expression of ENaCs in lineage-specific T. urticae expansions mirrors that observed for gustatory receptors. The expansion of ENaCs in T. urticae to > 3-fold that reported in other animals was unexpected, raising the possibility that ENaCs in T. urticae have been co-opted to fulfill a major role performed by unrelated CRs in other animals. More broadly, our findings suggest an elaborate role for chemosensory perception in generalist herbivores that are of key ecological and agricultural importance
