Pairwise comparison of expression levels.
<p>We compared the levels of expression of 15,861 genes with nonzero expression levels in both liver and testes, expressed in terms of average coverage per base. Each point represents one gene. A) Data prior to normalization. Housekeeping genes are highlighted as green points and labeled. The blue and red diagonals represent the relative correction factors computed based on total counts or the NCS method, relative to no normalization (black). The magenta and orange curves depict the percentiles when considering all genes or genes with nonzero values, respectively. B) Values after correction by NCS. Points in black or red denote genes with positive weights, which therefore guided the scaling. Points in red denote the 39 genes with weight >0.5.</p>
Comparison of performance of various normalization methods.
<p>Each method is evaluated by the number of genes observed to be consistently expressed across samples (abscissa); different methods also yield different numbers of genes identified as specific to one sample. The numbers in the orange circles denote the number of housekeeping genes combined using the geNorm algorithm. The dashed arrows show one stochastic path of the ES from the data prior to normalization (white square, “None”) to the best approximation to the optimal solution (gray square, ES). Brown squares represent the results obtained via the TMM method, using each of the 16 samples as reference.</p>
Optimal Scaling of Digital Transcriptomes
<div><p>Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also defined four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which are a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers.</p></div>
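<p>The first metric above (the number of "uniform" genes) and the total-count scaling it is used to evaluate can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy count matrix, and the coefficient-of-variation cutoff of 0.25 are all assumptions chosen for illustration, since the abstract does not state the threshold used.</p>

```python
import numpy as np


def total_count_factors(counts):
    """Per-sample factors that scale every column to the same total.

    counts: 2-D array, rows = genes, columns = samples.
    """
    totals = counts.sum(axis=0)
    return totals.mean() / totals


def count_uniform_genes(counts, factors, cv_cutoff=0.25):
    """Count genes whose scaled expression has a low coefficient of variation.

    cv_cutoff is a hypothetical threshold, not a value from the paper.
    Genes with zero mean expression are ignored.
    """
    scaled = counts * factors          # broadcast factors across columns
    mean = scaled.mean(axis=1)
    std = scaled.std(axis=1)
    expressed = mean > 0
    cv = std[expressed] / mean[expressed]
    return int((cv <= cv_cutoff).sum())


# Toy data (assumed): three samples sequenced at relative depths 1x, 2x, 4x.
# Genes 0 and 1 track depth exactly; gene 2 is highly expressed and
# dominates the column totals; gene 3 is unexpressed.
counts = np.array([
    [10.0, 20.0, 40.0],
    [5.0, 10.0, 20.0],
    [100.0, 100.0, 100.0],
    [0.0, 0.0, 0.0],
])

n_none = count_uniform_genes(counts, np.ones(3))
n_total = count_uniform_genes(counts, total_count_factors(counts))
n_depth = count_uniform_genes(counts, np.array([4.0, 2.0, 1.0]))
```

<p>In this toy matrix, a single highly expressed gene skews the column totals, so total-count scaling recovers fewer uniform genes than scaling by the assumed true depth ratios. A method that explicitly maximizes the number of uniform genes would search over the factors instead of deriving them from the totals.</p>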
Comparison between the scaling factors suggested by the different methods.
<p>Lower left: the resulting scaling factors for the heart sample. Upper right: Pairwise correlations between the methods, for all samples. Red shades denote high correlation values (above 0.75), blue denotes low correlation (or anticorrelation). The column to the right indicates the number of uniform genes identified by the method. The Quantile Normalization method is not included in this analysis since it does not produce scaling factors.</p>
Conceptual taxonomy of scaling methods.
<p>Blue: published methods. Pink: variations on published methods. Red: novel methods. Dashed lines connect related methods.</p>
Density distribution of Spearman correlations of sample rankings for some normalization methods.
A Systems Approach to Rheumatoid Arthritis
<div><p>Rheumatoid arthritis (RA) is a chronic autoimmune disease that primarily attacks synovial joints. Despite advances in the diagnosis and treatment of RA, novel molecular targets are still needed to improve diagnostic accuracy and therapeutic outcomes. Here, we present a systems approach that can effectively 1) identify core RA-associated genes (RAGs), 2) reconstruct RA-perturbed networks, and 3) select potential targets for the diagnosis and treatment of RA. By integrating multiple previously reported gene expression datasets, we first identified 983 core RAGs that show RA-dominant differential expression, compared to osteoarthritis (OA), across the datasets. Using the core RAGs, we then reconstructed RA-perturbed networks that delineate key RA-associated cellular processes and transcriptional regulation. The networks revealed that synovial fibroblasts play major roles in defining RA-perturbed processes, that anti-TNF-α therapy restored many RA-perturbed processes, and that 19 transcription factors (TFs) contribute substantially to the deregulation of the core RAGs in the RA-perturbed networks. Finally, we selected a list of potential molecular targets that can act as metrics or modulators of the RA-perturbed networks. These network models therefore identify a panel of potential targets that will serve as an important resource for the discovery of therapeutic targets and diagnostic markers, as well as providing novel insights into RA pathogenesis.</p></div>
Signatures of anti-TNF inhibitors in the RA-perturbed network.
<p>A) An RA-perturbed network showing the recovery of the elevated RAGs to normality by anti-TNF therapy. Green borders represent decreases in the expression levels of 136 elevated RAGs. B) and C) Module enrichment scores representing the significance of the overlaps between the genes belonging to the network modules and the genes decreased by anti-TNF therapy (B) or the genes whose expression levels are elevated by IL1B and TNF treatments (C). AP = Antigen processing & presentation; TC = T-cell activation; BC = B-cell activation; IG = Immunoglobulins; CA = Complement activation; NK = Natural killer cell mediated cytotoxicity; IC = Inflammatory cytokines; CK = Chemokines; CMH = Cell migration & adhesion; TLR = Toll-like receptor signaling; AF = Angiogenic factors; JS = JAK-STAT signaling; CC = Cell cycle & DNA repair; CDS = Cell death & survival; ECM = ECM organization; MR = Matrix remodeling.</p>
Novel molecular target candidates for diagnosis and therapy of RA.
An RA-perturbed network in the RA synovium and signatures of FLS and PBMC in the RA tissue network.
<p>A) An RA-perturbed network describing the RA-associated cellular processes in which 242 up-regulated RAGs are involved, together with their interactions. The network nodes are arranged into sixteen modules based on the GOBPs and KEGG pathways to which they belong. Nodes with a red border represent DEGs in RA FLS. B) and C) Module enrichment scores (see text for definition) representing the significance of the overlaps between the genes belonging to the sixteen network modules and the DEGs in RA FLS (B) or PBMC (C). See text for detailed discussion. AP = Antigen processing & presentation; TC = T-cell activation; BC = B-cell activation; IG = Immunoglobulins; CA = Complement activation; NK = Natural killer cell mediated cytotoxicity; IC = Inflammatory cytokines; CK = Chemokines; CMH = Cell migration & adhesion; TLR = Toll-like receptor signaling; AF = Angiogenic factors; JS = JAK-STAT signaling; CC = Cell cycle & DNA repair; CDS = Cell death & survival; ECM = ECM organization; MR = Matrix remodeling.</p>
