Polyadenylation sites.
a<p>Replicate number 2 (P33).</p>b<p>PolyA tags mapped with mapping quality >40.</p>
Raw and mapped sequence data.
a<p>Number of 2×100-nt read pairs resulting from HiSeq 2000 sequencing.</p>b<p>Read pairs containing the sequencing adapter. The left and right reads overlap because the sequenced DNA fragments were shorter than 200 nt. These reads (left and right) were merged and mapped as single-end reads.</p>c<p>Reads without the adapter sequence. These resulted from sequencing of DNA fragments longer than 200 nt. The left and right mates did not overlap and were mapped as pairs.</p>d<p>The average sequence length after merging left and right reads.</p>e<p>The combined number of nucleotides after merging of left and right reads but before mapping.</p>
Transcription levels at various classes of protein-coding features.
<p>(A) Fraction of the genome occupied by various protein-coding features (WB isolate): uncharacterized genes (hypothetical genes), Protein 21.1, cysteine-rich membrane protein genes (<i>vsp</i> and HCMP genes), Kinase NEK and other genes. The x-axis shows the total protein-coding capacity of the genome for each group of genes. (B) Fraction of the RNA-seq data that mapped to the categories in (A). (C) Smooth scatter plot of the relationship between gene expression and ORF length. The x- and y-axes show log<sub>10</sub>-scaled FPKM and ORF length (bp). ORFs longer than 8000 bp were not plotted (n = 71). More intense blue indicates higher plotting density. (D) Box plots of gene expression for the different categories of genes. Black dots represent outliers. One-way ANOVA indicated a significant difference between the groups (<i>p</i> < 2.2e-16). The Protein 21.1/HCMP pairwise comparison was significant at <i>p</i> = 0.0008972 and <i>vsp</i>/HCMP was significant at <i>p</i> = 0.04642 (Tukey's HSD test). The groups ‘uncharacterized’ and ‘others’ were ignored in the pairwise statistics. (E) Box plots of genes grouped according to Gene Ontology (GO). Genes were categorized into broader groups by Biological Process using the generic slimmed Gene Ontology. Black dots represent outliers. Numbers to the right indicate how many genes were in each category. Groups are sorted by median expression. The following Gene Ontology (GO) categories are shown: 0006412, 0009056, 0006457, 0055085, 0006520, 0005975, 0044281, 0006810, 0006399, 0007165, 0034641, 0008150, 0009058, 0016192, 0006464, 0006629, 0006950, 0006259.</p>
A non-transcribed region on chromosome 5.
<p>(A) Correlation of RNA-seq and RT-qPCR gene expression measurements for 49 genes in WB. Black dots are genes. The x- and y-axes show log<sub>10</sub> FPKM and −log<sub>10</sub> Ct (cycle threshold). Values were incremented by 1 before log<sub>10</sub>-transformation. Each RT-qPCR reaction was performed in triplicate, and the average Ct was used. The blue line is the linear regression (y = 0.07008x − 1.50276). Included genes (prefix GL50803_): 7766, 9662, 6744, 7760, 112103, 11654, 11118, 2661, 24321, 17121, 17585, 14993, 16924, 13272, 6564, 5800, 3367, 17570, 16343, 93548, 11540, 5435, 15000, 21423, 10297, 114210, 86681, 7573, 7715, 102438, 7243, 16438, 17291, 1903, 17495, 102978, 11642, 17539, 90575, 32674, 13091, 137688, 3666, 25075, 16690, 2633, 92664, 13627, 4431. (B) Sliding window analysis of RNA-seq coverage on scaffold CH991767 (part of chromosome 5; WB). Analyzed windows were 500 bp wide and non-overlapping. (X-axis) Position along the genomic segment (start position of the analyzed window). (Y-axis) RNA-seq depth in the window on a logarithmic scale. A drop in RNA-seq coverage is seen at positions 1,340,000 to 1,381,000.</p>
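The sliding window analysis in panel (B) can be illustrated with a minimal sketch. It assumes per-base RNA-seq depth is available as a plain list and that each window is summarized by its mean depth; the caption does not say whether mean or total depth was used, and the function name `window_depth` is hypothetical.

```python
from typing import List, Tuple


def window_depth(depth: List[int], window: int = 500) -> List[Tuple[int, float]]:
    """Summarize per-base depth in consecutive, non-overlapping windows.

    Returns (1-based window start, mean depth) pairs. Using the mean is an
    assumption made for illustration; the caption only says "depth in the window".
    """
    summary = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]  # last window may be shorter
        summary.append((start + 1, sum(chunk) / len(chunk)))
    return summary


# Toy input: a covered stretch, an uncovered stretch, and a short final window.
toy_depth = [2] * 500 + [0] * 500 + [4] * 250
print(window_depth(toy_depth))
```

A window with a mean depth of zero cannot be drawn on a logarithmic axis without an offset, so a plotting step would need a pseudocount or a floor value.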
Analyses of allele-specific expression.
<p>(A) An example of a heterozygous locus identified from genomic Roche 454 reads (horizontal bars). Colors represent alignments in the forward and reverse directions. The arrow indicates a heterozygous locus. (B) Density plots of allelic expression ratios calculated from simulated reads and RNA-seq reads. The black and red lines correspond to simulated reads containing 0.01 and 0.02 errors/base. The blue line represents RNA-seq data (GS isolate). (C) Histogram of allele ratios (genomic) of heterozygous loci. The superimposed curve (brown) shows the density of the underlying data. The allele ratio was calculated from genomic reads as the fraction of allele A among allele A+B. Grey bars represent loci containing two presumed copies of each allele and blue bars three copies (or vice versa) of each allele. (D) Boxplots of allele expression ratios (y-axis) according to the allele ratio (x-axis). The black line of each box is the median. Dots represent outliers. (E) Allelic expression ratios of heterozygous loci of the same haplotype phase. Each dot represents one pair of linked heterozygous loci (SNP pairs). The allelic expression ratios of SNPs 1 and 2 are shown on the x- and y-axes. Red dots indicate discordant heterozygous loci and black dots represent concordant heterozygous loci, i.e., loci for which the direction of the gene expression change is the same.</p>
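The allele ratio described for panel (C), the fraction of allele A among alleles A+B, amounts to a one-line calculation. The sketch below assumes read counts per allele are already available; the function name `allele_ratio` and the example counts are hypothetical and not taken from the data set.

```python
def allele_ratio(count_a: int, count_b: int) -> float:
    """Fraction of reads supporting allele A among reads supporting A or B."""
    total = count_a + count_b
    if total == 0:
        # No read covers the locus, so the ratio is undefined.
        raise ValueError("no reads cover this locus")
    return count_a / total


# Hypothetical counts: equal support for both alleles, then a 3:1 imbalance.
print(allele_ratio(10, 10))
print(allele_ratio(30, 10))
```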
RNA-seq technical details.
<p>(A) Insert size histogram of sequenced cDNA fragments inferred from mapped paired-end reads. The plotted data are from the WB isolate. The x- and y-axes show the fragment size in nucleotides and the frequency, respectively. The median length was 250 nt. (B) The relationship between detected transcripts and mapped paired-end reads. The x- and y-axes show the number of mapped reads and the number of detected transcripts, respectively. Colors correspond to: violet (WB), blue (P15), green (GS), yellow (AS175<sub>P4</sub>), and dotted (AS175<sub>P33</sub>). The plateau indicates saturation (deeper sequencing does not lead to detection of new transcripts). Because the reference genomes differ slightly in finishing, the plateau y-values differ. (C) Gene expression correlation of technical replicates (WB isolate). Technical replicates 1 and 2 are from the same sequencing library (biological sample) but were sequenced independently on different lanes. Dots represent genes. The x- and y-axes show log<sub>10</sub>-scaled FPKM of technical replicates 1 and 2, respectively (values were incremented by 1 before transformation). The blue line corresponds to equal expression. Colors represent overlap in the plot; i.e., black means a single gene and red means higher plotting density. (D) Correlation of gene expression between <i>in vitro</i> passages 4 and 33 of the AS175 isolate, i.e., correlation of biological replicates.</p>
Polyadenylation sites.
<p>(A) Histogram of clustered polyadenylation sites (PACs) and their normalized position on ORFs. On the x-axis, 0 and 1 refer to the first and last base of the ORF. Only polyadenylation sites in the sense direction with respect to the ORF are shown. The y-axis shows the frequency. The data are from the WB isolate. (B) Bar plot of the number of sense PACs per ORF. (C) Example of alternative polyadenylation of the gene encoding 3-hydroxy-3-methylglutaryl-coenzyme A reductase (GL50803_7573). Arrows indicate locations of the polyA sites. Blue arrows indicate polyA sites in the same direction as the gene, and the orange arrow indicates a polyA site on the reverse strand. Numbers of supporting reads (polyA tags) are shown on top of the arrows. The polyA sites are separated by 275 bp. (D) Histogram of 3′ UTR length (nt) inferred from mapped polyA sites. Only the WB isolate is shown. (E) Relationship between gene expression signal (GES; log<sub>10</sub> FPKM) and 3′ UTR length. The y-axis shows the log<sub>10</sub> GES and the x-axis shows the 3′ UTR length (nt). Only 3′ UTRs shorter than 500 nt are plotted. (F) Nucleotide length differences between orthologous 3′ untranslated regions.</p>
PolyA sites determined using 3′ RACE.
a<p>Length in nucleotides, determined from the most common polyA site. Values in parentheses show alternative polyA sites.</p>b<p>Length in nucleotides. Values in parentheses show results from individual experimental replicates.</p>
Additional file 6: of Comparative genomic analyses of freshly isolated Giardia intestinalis assemblage A isolates
Nucleotide diversity between assemblage A orthologs. List of assemblage AI and AII orthologs and the nucleotide diversity of the alignment. The most diverse genes are at the top, with decreasing diversity. (XLS 690 kb)
Additional file 7: of Comparative genomic analyses of freshly isolated Giardia intestinalis assemblage A isolates
Structural modelling of two Giardia BPI-like proteins. (DOCX 486 kb)