
    Development of a blood-based gene expression algorithm for assessment of obstructive coronary artery disease in non-diabetic patients

    Background: Alterations in gene expression in peripheral blood cells have been shown to be sensitive to the presence and extent of coronary artery disease (CAD). A non-invasive blood test that could reliably assess obstructive CAD likelihood would have diagnostic utility.
    Results: Microarray analysis of RNA samples from a 195-patient Duke CATHGEN registry case:control cohort yielded 2,438 genes with significant CAD association (p < 0.05), and identified the clinical/demographic factors with the largest effects on gene expression as age, sex, and diabetic status. RT-PCR analysis of 88 CAD classifier genes confirmed that diabetic status was the largest clinical factor affecting CAD-associated gene expression changes. A second microarray cohort analysis limited to non-diabetics from the multi-center PREDICT study (198 patients; 99 case:control pairs matched for age and sex) evaluated gene expression, clinical, and cell population predictors of CAD and yielded 5,935 CAD genes (p < 0.05) with an intersection of 655 genes with the CATHGEN results. Biological pathway (gene ontology and literature) and statistical analyses (hierarchical clustering and logistic regression) were used in combination to select 113 genes for RT-PCR analysis, including CAD classifiers, cell-type-specific markers, and normalization genes.
    RT-PCR analysis of these 113 genes in a PREDICT cohort of 640 non-diabetic subject samples was used for algorithm development. Gene expression correlations identified clusters of CAD classifier genes, which were reduced to meta-genes using LASSO. The final classifier for assessment of obstructive CAD was derived by Ridge Regression and contained sex-specific age functions and 6 meta-gene terms, comprising 23 genes. This algorithm showed a cross-validated estimated AUC = 0.77 (95% CI 0.73-0.81) in ROC analysis.
    Conclusions: We have developed a whole blood classifier based on gene expression, age and sex for the assessment of obstructive CAD in non-diabetic patients from a combination of microarray and RT-PCR data derived from studies of patients clinically indicated for invasive angiography.
    Clinical trial registration information: PREDICT, Personalized Risk Evaluation and Diagnosis in the Coronary Tree, http://www.clinicaltrials.gov, NCT00500617 (http://www.clinicaltrials.gov/ct2/show/NCT00500617).
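
    The AUC reported in this abstract can be read as the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control. The following is a minimal sketch of that computation only, not the authors' classifier or their cross-validation procedure; the function name `roc_auc` and the scores and labels are hypothetical illustration values.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has the
    higher score; tied pairs count as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical held-out scores (not data from the study): 1 = case, 0 = control.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

    In a cross-validated setting, the scores passed to such a function would come from folds that were held out when the model was fitted.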

    Techniques for clustering gene expression data

    Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.

    Predictive response-relevant clustering of expression data provides insights into disease processes

    This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of 'response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Predictions are subsequently made using an appropriate statistical summary of each gene cluster, which we call the 'meta-covariate' representation of the cluster, in a probit regression model. We first illustrate this method by analysing a leukaemia expression dataset, before focusing closely on the meta-covariate analysis of a renal gene expression dataset in a rat model of salt-sensitive hypertension. We explore the biological insights provided by our analysis of these data. In particular, we identify a highly influential cluster of 13 genes, including three transcription factors (Arntl, Bhlhe41 and Npas2), that is implicated as being protective against hypertension in response to increased dietary sodium. Functional and canonical pathway analysis of this cluster using Ingenuity Pathway Analysis implicated transcriptional activation and circadian rhythm signalling, respectively. Although we illustrate our method using only expression data, the method is applicable to any high-dimensional dataset.
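
    The idea of summarising each gene cluster and feeding the summaries into a probit model can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the cluster mean is assumed as the summary statistic, the cluster memberships are taken as given rather than learned jointly with the classifier, and the gene names, expression values and coefficients are all hypothetical.

```python
import math
from statistics import mean


def meta_covariate(expression, cluster):
    """Summarise one cluster for one sample as the mean expression of its genes."""
    return mean(expression[gene] for gene in cluster)


def probit_probability(intercept, coefs, covariates):
    """P(response = 1) under a probit model: Phi(intercept + sum(coef * covariate))."""
    eta = intercept + sum(b * x for b, x in zip(coefs, covariates))
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


# Hypothetical expression values for a single sample (not data from the paper).
sample = {"gene_a": 1.2, "gene_b": 0.8, "gene_c": 1.0, "gene_d": -0.5, "gene_e": -1.5}
# Hypothetical cluster memberships; the paper obtains these by model-based clustering.
clusters = [["gene_a", "gene_b", "gene_c"], ["gene_d", "gene_e"]]

metas = [meta_covariate(sample, cluster) for cluster in clusters]
# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
p = probit_probability(0.0, [1.0, 1.0], metas)
print(metas, p)
```

    Working with a handful of cluster summaries instead of thousands of individual genes keeps the regression low-dimensional, which is the motivation the abstract gives for the meta-covariate representation.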

    Essential guidelines for computational method benchmarking

    In computational biology and other sciences, researchers are frequently faced with a choice between several computational methods for performing data analyses. Benchmarking studies aim to rigorously compare the performance of different methods using well-characterized benchmark datasets, to determine the strengths of each method or to provide recommendations regarding suitable choices of methods for an analysis. However, benchmarking studies must be carefully designed and implemented to provide accurate, unbiased, and informative results. Here, we summarize key practical guidelines and recommendations for performing high-quality benchmarking analyses, based on our experiences in computational biology.

    Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle

    The effort to identify genes with periodic expression during the cell cycle from genome-wide microarray time series data has been ongoing for a decade. However, the lack of rigorous modeling of periodic expression, as well as the lack of a comprehensive model for integrating information across genes and experiments, has impaired the accurate identification of periodically expressed genes. To address the problem, we introduce a Bayesian model to integrate multiple independent microarray data sets from three recent genome-wide cell cycle studies on fission yeast. A hierarchical model was used for data integration. In order to facilitate efficient Monte Carlo sampling from the joint posterior distribution, we develop a novel Metropolis-Hastings group move. A surprising finding from our integrated analysis is that more than 40% of the genes in fission yeast are significantly periodically expressed, well above the 10-15% of genes reported in the current literature. It calls for a reconsideration of the periodically expressed gene detection problem. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/09-AOAS300.
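
    For readers unfamiliar with Metropolis-Hastings sampling, the basic accept/reject step can be sketched as below. This is a generic random-walk sampler on a one-dimensional target, not the group move or the hierarchical model described in the abstract; the function names, the standard normal target, the step size and the seed are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_density, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # The proposal is symmetric, so the acceptance probability reduces to
        # min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


def standard_normal_log_density(x):
    """Log density of N(0, 1) up to an additive constant (hypothetical target)."""
    return -0.5 * x * x


samples = metropolis_hastings(standard_normal_log_density, start=0.0, n_samples=20000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(sample_mean, sample_var)
```

    Only the ratio of target densities enters the acceptance step, so the normalising constant of the posterior never has to be computed; this is what makes the approach usable for joint posteriors such as the one the abstract describes.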