Gene Expression Profiles Distinguish the Carcinogenic Effects of Aristolochic Acid in Target (Kidney) and Non-target (Liver) Tissues in Rats

Author: A Wang
BA Schwetz
BM Hendry
CP Fall
E White
HH Schmeiser
HJ Varghese
Hong Fang
IARC
James C Fuscoe
JL Nortier
JL Vanherweghem
JP Cosyns
Lei Guo
Leming Shi
LJ Vanherweghem
LS Gold
Lu Zhang
M Chen
M Depierreux
M Stiborova
M Vanhaelen
MA Harris
MC Liu
Nan Mei
R Gao
RC Fernando
Tao Chen
U Mengs
VG Tusher
VM Arlt
W Pfau
W Tong
Y Shav-Tal
Yongming Sun
Z Su
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Aristolochic acid (AA) is the active component of herbal drugs derived from Aristolochia species that have been used for medicinal purposes since antiquity. AA, however, induced nephropathy and urothelial cancer in people and malignant tumors in the kidney and urinary tract of rodents. Although AA is bioactivated in both kidney and liver, it induces tumors only in kidney. To evaluate whether microarray analysis can be used for distinguishing the tissue-specific carcinogenicity of AA, we examined gene expression profiles in kidney and liver of rats treated with carcinogenic doses of AA. RESULTS: Microarray analysis was performed using the Rat Genome Survey Microarray and data analysis was carried out with ArrayTrack software. Principal components analysis and hierarchical cluster analysis of the expression profiles showed that samples were grouped together according to the tissues and treatments. The gene expression profiles were significantly altered by AA treatment in both kidney and liver (p < 0.01; fold change > 1.5). Functional analysis with Ingenuity Pathways Analysis showed that there were many more significantly altered genes involved in cancer-related pathways in kidney than in liver. Also, analysis with Gene Ontology for Functional Analysis (GOFFA) software indicated that the biological processes related to defense response, apoptosis and immune response were significantly altered by AA exposure in kidney, but not in liver. CONCLUSION: Our results suggest that microarray analysis is a useful tool for detecting AA exposure; that analysis of the gene expression profiles can define the differential responses of kidney and liver to the toxicity and carcinogenicity of AA; and that significant alteration of genes associated with defense response, apoptosis and immune response in kidney, but not in liver, may be responsible for the tissue-specific toxicity and carcinogenicity of AA.
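
The selection criterion quoted in the abstract (p < 0.01; fold change > 1.5) can be illustrated with a minimal sketch. The abstract does not say how fold change was computed, so the code below assumes a linear-scale treated/control ratio and treats up- and down-regulation symmetrically. The function name, the tuple layout and the example values are hypothetical and are not taken from ArrayTrack.

```python
def select_altered_genes(results, p_cutoff=0.01, fc_cutoff=1.5):
    """Return the ids of genes that pass both thresholds.

    results: iterable of (gene_id, p_value, ratio), where ratio is the
    treated/control expression ratio on a linear scale (must be > 0).
    Assumption: a ratio of 1/1.5 counts as a 1.5-fold decrease.
    """
    selected = []
    for gene_id, p_value, ratio in results:
        # Fold change as a magnitude, independent of direction.
        fold_change = ratio if ratio >= 1 else 1 / ratio
        if p_value < p_cutoff and fold_change > fc_cutoff:
            selected.append(gene_id)
    return selected


# Made-up values for illustration only.
example = [
    ("gene_a", 0.001, 2.0),  # significant, 2-fold up
    ("gene_b", 0.200, 3.0),  # large change but not significant
    ("gene_c", 0.005, 0.5),  # significant, 2-fold down
    ("gene_d", 0.005, 1.2),  # significant but change too small
]
print(select_altered_genes(example))
```

Requiring both a p-value and a fold-change threshold keeps genes whose change is statistically supported and also large enough to be of practical interest.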

Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform.

Author: Krajewski Paweł
Madrigal Pedro
Publication venue: BioData Min
Publication date: 01/01/2015

BACKGROUND: Larger variation exists in epigenomes than in genomes, as a single genome shapes the identity of multiple cell types. With the advent of next-generation sequencing, one of the key problems in computational epigenomics is the poor understanding of correlations and quantitative differences between large scale data sets. RESULTS: Here we bring to genomics a scenario of functional principal component analysis, a finite Karhunen-Loève transform, and explicitly decompose the variation in the coverage profiles of 27 chromatin mark ChIP-seq datasets at transcription start sites for H1, one of the most used human embryonic stem cell lines. Using this approach we identify positive correlations between H3K4me3 and H3K36me3, as well as between H3K9ac and H3K36me3, so far undetected by the most commonly used Pearson correlation between read enrichment coverages. We uncover highly negative correlations between H2A.Z, H3K4me3, and several histone acetylation marks, but these occur only between principal components of first and second order. We also demonstrate that levels of gene expression correlate significantly with scores of components of order higher than one, demonstrating that transcriptional regulation by histone marks escapes simple one-to-one relationships. These correlations were higher in significance and magnitude in protein coding genes than in non-coding RNAs. CONCLUSIONS: In summary, we present a methodology to explore and uncover novel patterns of epigenomic variability and covariability in genomic data sets by using a functional eigenvalue decomposition of genomic data. R code is available at: http://github.com/pmb59/KLTepigenome
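
The idea of decomposing coverage profiles into principal components and then correlating component scores between marks can be sketched as follows. This is a simplified illustration, not the authors' R implementation: it uses a plain discrete PCA via SVD rather than functional PCA with smoothed basis functions, and all data are synthetic. The function name `kl_transform` and the profile shapes are hypothetical.

```python
import numpy as np


def kl_transform(profiles, n_components=2):
    """Discrete Karhunen-Loeve transform (PCA) of coverage profiles.

    profiles: array of shape (n_sites, n_positions), one coverage
    profile per transcription start site.
    Returns (components, scores, explained_variance_fraction).
    """
    data = np.asarray(profiles, dtype=float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the eigenvectors of the sample covariance matrix.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    explained = singular_values**2 / np.sum(singular_values**2)
    return components, scores, explained[:n_components]


# Synthetic stand-ins for two chromatin marks over the same 200 sites.
rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 1))              # per-site signal common to both marks
shape_a = np.exp(-np.linspace(-2, 2, 41) ** 2)  # hypothetical peak centred on the TSS
shape_b = np.linspace(0.0, 1.0, 41)             # hypothetical ramp into the gene body
mark_a = shared * shape_a + 0.1 * rng.normal(size=(200, 41))
mark_b = shared * shape_b + 0.1 * rng.normal(size=(200, 41))

_, scores_a, _ = kl_transform(mark_a)
_, scores_b, _ = kl_transform(mark_b)

# The sign of each component is arbitrary, so only the magnitude of this
# correlation is meaningful without fixing a sign convention first.
r = np.corrcoef(scores_a[:, 0], scores_b[:, 0])[0, 1]
print(f"correlation of first-component scores: {r:.2f}")
```

Correlating component scores rather than raw coverage lets two marks with differently shaped profiles still register as related, which is the kind of relationship the abstract reports a direct Pearson correlation of coverages as missing.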

arXiv.org e-Print Archive

Apollo (Cambridge)

Visualizing dimensionality reduction of systems biology data

Author: A Hyvaerinen
A Inselberg
A Saeed
Albert Pritzkau
Andreas Lehrmann
Aydin C. Polatkan
DH Jeong
DJ Lockhart
F Battke
GH Golub
H Abdi
H Hotelling
HF Kaiser
J Shendure
JB Tenenbaum
K Pearson
Kay Nieselt
KQ Weinberger
LK Saul
M Fontes
M Harrower
M Schena
Michael Huber
P Mannfolk
R Karbauskaite
R Tarjan
S Roweis
Z Zhang
Ö Altug-Teber
Publication venue: Springer Science and Business Media LLC
Publication date: 04/06/2012

One of the challenges in analyzing high-dimensional expression data is the detection of important biological signals. A common approach is to apply a dimension reduction method, such as principal component analysis. Typically, after application of such a method the data is projected and visualized in the new coordinate system, using scatter plots or profile plots. These methods provide good results if the data have certain properties which become visible in the new coordinate system and which were hard to detect in the original coordinate system. Often, however, the application of only one method does not suffice to capture all important signals. Therefore, several methods addressing different aspects of the data need to be applied. We have developed a framework for linear and non-linear dimension reduction methods within our visual analytics pipeline SpRay. This includes measures that assist the interpretation of the factorization result. Different visualizations of these measures can be combined with functional annotations that support the interpretation of the results. We show an application to high-resolution time series microarray data in the antibiotic-producing organism Streptomyces coelicolor as well as to microarray data measuring expression of cells with normal karyotype and cells with trisomies of human chromosomes 13 and 21.

Publikationsserver der Universität Tübingen

Spectral gene set enrichment (SGSE)

Author: Frost H. Robert
Li Zhigang
Moore Jason H.
Publication venue: Springer Science and Business Media LLC
Publication date: 13/05/2014

Motivation: Gene set testing is typically performed in a supervised context to quantify the association between groups of genes and a clinical phenotype. In many cases, however, a gene set-based interpretation of genomic data is desired in the absence of a phenotype variable. Although methods exist for unsupervised gene set testing, they predominantly compute enrichment relative to clusters of the genomic variables with performance strongly dependent on the clustering algorithm and number of clusters. Results: We propose a novel method, spectral gene set enrichment (SGSE), for unsupervised competitive testing of the association between gene sets and empirical data sources. SGSE first computes the statistical association between gene sets and principal components (PCs) using our principal component gene set enrichment (PCGSE) method. The overall statistical association between each gene set and the spectral structure of the data is then computed by combining the PC-level p-values using the weighted Z-method with weights set to the PC variance scaled by Tracy-Widom test p-values. Using simulated data, we show that the SGSE algorithm can accurately recover spectral features from noisy data. To illustrate the utility of our method on real data, we demonstrate the superior performance of the SGSE method relative to standard cluster-based techniques for testing the association between MSigDB gene sets and the variance structure of microarray gene expression data. Availability: http://cran.r-project.org/web/packages/PCGSE/index.html
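
The combination step named in the abstract, the weighted Z-method, can be sketched in a few lines. This is a generic illustration of Stouffer's weighted Z-method and not the PCGSE package code. The weights are passed in as plain numbers: the abstract says they are derived from PC variance scaled by Tracy-Widom test p-values, but computing them is outside this sketch, and the example values below are made up.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z_combine(p_values, weights):
    """Combine one-sided p-values with the weighted Z-method.

    Each p-value must lie strictly between 0 and 1, because the
    inverse normal CDF is undefined at the endpoints.
    """
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    std_normal = NormalDist()
    # Convert each p-value to a Z-score (small p gives large positive Z).
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    combined_z = numerator / denominator
    return 1.0 - std_normal.cdf(combined_z)


# Hypothetical PC-level p-values for one gene set, with hypothetical weights.
pc_p_values = [0.01, 0.20, 0.60]
pc_weights = [0.5, 0.3, 0.2]
print(weighted_z_combine(pc_p_values, pc_weights))
```

Weighting lets components that carry more of the data's structure contribute more to the combined evidence than components that are mostly noise.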

arXiv.org e-Print Archive

CiteSeerX

Dartmouth Digital Commons (Dartmouth College)

Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors

Author: A Clare
A McCallum
AS Weigend
B Rost
B Schoikowski
B Shahbaba
Babak Shahbaba
BE Engelhardt
D Koller
EM Marcotte
FR Blattner
H Blockeel
I Tsochantaridis
IUBMB
J DeRisi
J Fox
J Goodman
J Struyf
J Zhang
JA Eisen
JR Guest
K Sjölander
L Cai
L Dehaspe
M Brown
M Deng
M Eisen
M Riley
N Cesa-Bianchi
O Dekel
P Pavlidis
R Caruana
R Eisner
Radford M Neal
RD King
RM Neal
S Rison
S Sattath
S Spiro
SF Altschul
ST Dumais
WR Pearson
Z Barutcuoglu
Publication date: 01/01/2006

We investigate the application of hierarchical classification schemes to the annotation of gene function based on several characteristics of protein sequences including phylogenic descriptors, sequence based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit (MNL) model, a hierarchical model based on a set of nested MNL models, and an MNL model with a prior that introduces correlations between the parameters for classes that are nearby in the hierarchy. We also provide a new scheme for combining different sources of information. We use these models to predict the functional class of Open Reading Frames (ORFs) from the E. coli genome. The results from all three models show substantial improvement over previous methods, which were based on the C5 algorithm. The MNL model using a prior based on the hierarchy outperforms both the non-hierarchical MNL model and the nested MNL model. In contrast to previous attempts at combining these sources of information, our approach results in a higher accuracy rate when compared to models that use each data source alone. Together, these results show that gene function can be predicted with higher accuracy than previously achieved, using Bayesian models that incorporate suitable prior information.
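
To make the second of the three models concrete, the sketch below shows how a set of nested MNL models can assign class probabilities in a two-level hierarchy: one MNL (softmax) chooses a top-level branch, a second MNL chooses a class within that branch, and the two probabilities are multiplied. Only this forward computation is shown. The priors and the Bayesian inference described in the abstract are omitted, and the two-level layout, function names and weights are hypothetical.

```python
from math import exp


def softmax(scores):
    """Multinomial logit probabilities for a list of linear scores."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_scores(x, weight_rows):
    """One dot product of the feature vector x per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight_rows]


def nested_mnl_probabilities(x, top_weights, child_weights):
    """Class probabilities under a two-level nested MNL.

    top_weights: one weight row per top-level branch.
    child_weights: for each branch, one weight row per class in it.
    Returns a dict mapping (branch_index, class_index) to a probability.
    """
    branch_probs = softmax(linear_scores(x, top_weights))
    probs = {}
    for b, (p_branch, rows) in enumerate(zip(branch_probs, child_weights)):
        class_probs = softmax(linear_scores(x, rows))
        for k, p_class in enumerate(class_probs):
            probs[(b, k)] = p_branch * p_class
    return probs


# Hypothetical hierarchy: two branches holding two and three classes.
features = [1.0, 0.5]
top = [[0.2, -0.1], [0.0, 0.3]]
children = [
    [[0.1, 0.0], [-0.1, 0.2]],
    [[0.0, 0.0], [0.3, -0.2], [0.1, 0.1]],
]
for label, p in nested_mnl_probabilities(features, top, children).items():
    print(label, round(p, 3))
```

Because each level is a proper probability distribution, the products over all leaf classes sum to one.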

arXiv.org e-Print Archive

University of Toronto Research Repository

Directory of Open Access Journals