18 research outputs found

    Global and unbiased detection of splice junctions from RNA-seq data

    SplitSeek can be used to detect novel splicing events in SOLiD RNA-seq data without the need for a pre-defined library

    Genome-wide analysis of chimpanzee genes with premature termination codons

    Background: Premature termination codons (PTCs) cause mRNA degradation or a truncated protein and thereby contribute to the transcriptome and proteome divergence between species. Here we present the first genome-wide study of PTCs in the chimpanzee. By comparing the human and chimpanzee genome sequences we identify and characterize genes with PTCs, in order to understand the contribution of these mutations to the transcriptome diversity between the species. Results: We have studied a total of 13,487 human-chimpanzee gene pairs and found that ~8% were affected by PTCs in the chimpanzee. A majority (764/1,109) of PTCs were caused by insertions or deletions and the remaining part was caused by substitutions. The distribution of PTC genes varied between chromosomes, with Y having the highest proportion. Furthermore, the density of PTC genes varied on a megabasepair scale within chromosomes and we found the density to be correlated both with indel divergence and proximity to the telomere. Within genes, PTCs were more common close to the 5' and 3' ends of the amino acid sequence. Gene Ontology classification revealed that olfactory receptor genes were over represented among the PTC genes. Conclusion: Our results showed that the density of PTC genes fluctuated across the genome depending on the local genomic context. PTCs were preferentially located in the terminal parts of the transcript, which generally have a lower frequency of functional domains, indicating that selection was operating against PTCs at sites central to protein function. The enrichment of GO terms associated with olfaction suggests that PTCs may have influenced the difference in the repertoire of olfactory genes between humans and chimpanzees. In summary, 8% of the chimpanzee genes were affected by PTCs and this type of variation is likely to have an important effect on the transcript and proteomic divergence between humans and chimpanzees.

    Identification of novel exons and transcribed regions by chimpanzee transcriptome sequencing

    Background: We profile the chimpanzee transcriptome by using deep sequencing of cDNA from brain and liver, aiming to quantify expression of known genes and to identify novel transcribed regions. Results: Using stringent criteria for transcription, we identify 12,843 expressed genes, with a majority being found in both tissues. We further identify 9,826 novel transcribed regions that are not overlapping with annotated exons, mRNAs or ESTs. Over 80% of the novel transcribed regions map within or in the vicinity of known genes, and by combining sequencing data with de novo splice predictions we predict several of the novel transcribed regions to be new exons or 3' UTRs. For approximately 350 novel transcribed regions, the corresponding DNA sequence is absent in the human reference genome. The presence of novel transcribed regions in five genes and in one intergenic region is further validated with RT-PCR. Finally, we describe and experimentally validate a putative novel multi-exon gene that belongs to the ATP-cassette transporter gene family. This gene does not appear to be functional in human since one exon is absent from the human genome. In addition to novel exons and UTRs, novel transcribed regions may also stem from different types of noncoding transcripts. We note that expressed repeats and introns from unspliced mRNAs are especially common in our data. Conclusions: Our results extend the chimpanzee gene catalogue with a large number of novel exons and 3' UTRs an

    Characterization of the Viral Microbiome in Patients with Severe Lower Respiratory Tract Infections, Using Metagenomic Sequencing

    The human respiratory tract is heavily exposed to microorganisms. Viral respiratory tract pathogens, such as RSV, influenza and rhinoviruses, cause major morbidity and mortality from respiratory tract disease. Furthermore, as viruses have limited means of transmission, viruses that cause pathogenicity in other tissues may be transmitted through the respiratory tract. It is therefore important to chart the human virome in this compartment. We have studied nasopharyngeal aspirate samples submitted to the Karolinska University Laboratory, Stockholm, Sweden from March 2004 to May 2005 for diagnosis of respiratory tract infections. We have used a metagenomic sequencing strategy to characterize viruses, as this provides the most unbiased view of the samples. Virus enrichment followed by 454 sequencing yielded a total of 703,790 reads, of which 110,931 were found to be of viral origin by an automated classification pipeline. The snapshot of the respiratory tract virome of these 210 patients revealed 39 species and many more strains of viruses. Most of the viral sequences were classified into one of three major families: Paramyxoviridae, Picornaviridae or Orthomyxoviridae. The study also identified one novel type of Rhinovirus C, and identified a number of previously undescribed viral genetic fragments of unknown origin.

    Genome and Transcriptome Comparisons between Human and Chimpanzee

    The chimpanzee is humankind’s closest living relative and the two species diverged ~6 million years ago. Comparative studies of the human and chimpanzee genomes and transcriptomes are of great interest to understand the molecular mechanisms of speciation and the development of species-specific traits. The aim of this thesis is to characterize differences between the two species with regard to their genome sequences and the resulting transcript profiles. The first two papers focus on indel divergence and, in particular, indels causing premature termination codons (PTCs) in 8% of the chimpanzee genes. The density of PTC genes is correlated with both the distance to the telomere and the indel divergence. Many PTC genes have several associated transcripts and since not all are affected by the PTC we propose that PTCs may affect the pattern of expressed isoforms. In the third paper, we investigate the transcriptome divergence in cerebellum, heart and liver, using high-density exon arrays. The results show that gene expression differs more between tissues than between species. Approximately 15% of the genes are differentially expressed between species, and half of the genes show different splicing patterns. We identify 28 cassette exons which are only included in one of the species, often in a tissue-specific manner. In the fourth paper, we use massively parallel sequencing to study the chimpanzee transcriptome in frontal cortex and liver. We estimate gene expression and search for novel transcribed regions (TRs). The majority of TRs are located close to genes and possibly extend the annotations. A subset of TRs are not found in the human genome. The brain transcriptome differs substantially from that of the liver and we identify a subset of genes enriched with TRs in frontal cortex. In conclusion, this thesis provides evidence of extensive genomic and transcriptomic variability between human and chimpanzee. The findings provide a basis for further studies of the underlying differences affecting phenotypic divergence between human and chimpanzee.

    A flow-chart describing the classification pipeline.

    A flow-chart describing the entire process from sample collection through the various data-analysis steps. Steps 1 and 2 illustrate sample collection, preparation and sequencing, while steps 3 through 6 illustrate in silico efforts.

    An overview of the viral part of the sample.

    The virus part of the sample, where sequences were defined by closest homolog, split into families (panel A), species found in the Paramyxoviridae family (panel B) and species found in the Picornaviridae family (panel C). The numbers are the derived number of reads; the family and species designations were manually curated, and only alignments with an e-value < 1e-5 were considered.
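The tallying described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The `Hit` record, the `reads_per_family` helper and the toy hits are hypothetical. The sketch assumes that "closest homolog" means the alignment with the lowest e-value for each read. Only the 1e-5 cutoff comes from the caption.

```python
from collections import Counter
from typing import Iterable, NamedTuple

E_VALUE_CUTOFF = 1e-5  # threshold stated in the figure caption


class Hit(NamedTuple):
    """One alignment of a read against a reference (hypothetical record)."""
    read_id: str
    family: str
    e_value: float


def reads_per_family(hits: Iterable[Hit], cutoff: float = E_VALUE_CUTOFF) -> Counter:
    """Count reads per viral family, using each read's best qualifying hit."""
    best = {}
    for hit in hits:
        # Discard alignments that do not pass the e-value cutoff (e-value < cutoff).
        if hit.e_value >= cutoff:
            continue
        # Keep the lowest e-value hit per read (assumed meaning of "closest homolog").
        current = best.get(hit.read_id)
        if current is None or hit.e_value < current.e_value:
            best[hit.read_id] = hit
    return Counter(hit.family for hit in best.values())


# Toy data for illustration only; these are not values from the study.
toy_hits = [
    Hit("r1", "Paramyxoviridae", 1e-30),
    Hit("r1", "Picornaviridae", 1e-8),
    Hit("r2", "Picornaviridae", 1e-6),
    Hit("r3", "Orthomyxoviridae", 1e-3),  # fails the cutoff
    Hit("r4", "Picornaviridae", 1e-5),    # equal to the cutoff, so excluded
]
counts = reads_per_family(toy_hits)
```

Each read is counted once, under the family of its best hit, so a read with weaker alignments to a second family does not inflate that family's total.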