67 research outputs found

    Identification of Differentially Expressed Genes through Integrated Study of Alzheimer’s Disease Affected Brain Regions

    No full text
    <div><p>Background</p><p>Alzheimer’s disease (AD) is the most common form of dementia in older adults; it damages the brain and results in impaired memory, thinking and behaviour. The identification of differentially expressed genes and related pathways among affected brain regions can provide more information on the mechanisms of AD. In the past decade, several studies have reported many genes that are associated with AD. This wealth of information has become difficult to follow and interpret, as most of the results are conflicting. It is therefore worth conducting an integrated study of multiple datasets, which increases the total number of samples and the statistical power to detect biomarkers. In this study, we present an integrated analysis of five different brain region datasets and introduce new genes that warrant further investigation.</p><p>Methods</p><p>The aim of our study is to apply a novel combinatorial optimisation based meta-analysis approach to identify differentially expressed genes that are associated with AD across brain regions. In this study, microarray gene expression data from 161 samples (74 non-demented controls, 87 AD) from the Entorhinal Cortex (EC), Hippocampus (HIP), Middle temporal gyrus (MTG), Posterior cingulate cortex (PC), Superior frontal gyrus (SFG) and visual cortex (VCX) brain regions were integrated and analysed using our method. The results are then compared to two popular meta-analysis methods, RankProd and GeneMeta, and to what can be obtained by analysing the individual datasets.</p><p>Results</p><p>We find genes related to AD that are consistent with existing studies, and new candidate genes not previously related to AD. Our study confirms the up-regulation of <i>INFAR2</i> and <i>PTMA</i> along with the down-regulation of <i>GPHN, RAB2A, PSMD14</i> and <i>FGF</i>. Novel genes <i>PSMB2, WNK1, RPL15, SEMA4C, RWDD2A</i> and <i>LARGE</i> are found to be differentially expressed across all brain regions. 
Further investigation of these genes may provide new insights into the development of AD. In addition, we identified 23 non-coding features, including four miRNA precursors (miR-7, miR-570, miR-1229 and miR-6821), that are dysregulated across the brain regions. Furthermore, we compared our results with two popular meta-analysis methods, RankProd and GeneMeta, to validate our findings, and performed a sensitivity analysis by removing one dataset at a time to assess the robustness of our results. These findings may provide new insights into the disease mechanisms and thus contribute to the understanding, prevention and cure of AD.</p></div>

    Heatmap for the 540 probes with BF-value < 0.0001 of combined analysis.

    No full text
    <p>There are 540 up- and down-regulated probes that are differentially expressed between control and AD samples. The first colour bar at the bottom indicates AD (blue) and control (red) samples. The second colour bar shows each sample group in a different colour: EC (blue), HIP (red), MTG (orange), PC (grey) and SFG (cyan). The colour bar at the side of the heatmap represents the range of fold-changes with respect to the mean value of the control samples, using a colour gradient ranging from green (log<sub>2</sub>(<i>FC</i>) = −5, down-regulation) to red (log<sub>2</sub>(<i>FC</i>) = 5, up-regulation). See <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0152342#pone.0152342.s002" target="_blank">S2 Fig</a> for a full size version of this figure.</p>
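
    As a rough illustration of the colour scale described in this caption, the sketch below computes log2 fold-changes of sample values relative to the control mean and clips them to the displayed range. The function names, the assumption of strictly positive non-log intensities, and the clipping step are illustrative choices, not taken from the paper.

```python
import math


def log2_fold_changes(sample_values, control_values):
    """Return log2(value / control mean) for each sample value.

    Assumption: values are strictly positive, non-log-scale intensities
    for a single probe. The paper's own preprocessing may differ.
    """
    control_mean = sum(control_values) / len(control_values)
    return [math.log2(v / control_mean) for v in sample_values]


def clip(value, low=-5.0, high=5.0):
    """Clamp a log2 fold-change to the range shown on the colour bar
    (the -5 to 5 limits come from the caption; clipping is an assumption)."""
    return max(low, min(high, value))


if __name__ == "__main__":
    # Hypothetical intensities for one probe.
    controls = [2.0, 2.0]
    samples = [8.0, 1.0]
    print([clip(fc) for fc in log2_fold_changes(samples, controls)])
```

    Positive values correspond to the red (up-regulation) end of the gradient and negative values to the green (down-regulation) end.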

    Overlapping genes in t-test, Coloured <i>(α</i>,<i>β)-k</i> and <i>(α</i>,<i>β)-k</i>-Feature Selection.

    No full text
    <p><b>Number of Datasets</b> shows the number of datasets considered when computing the overlap. <b><i>t</i>-test</b> gives the number of overlapping genes in the <i>t</i>-test results for the considered datasets. <b><i>(α</i>,<i>β)-k</i> Feature Selection</b> gives the number of overlapping genes between the individual <i>(α</i>,<i>β)-k</i> feature selection results in each case. <b>Coloured <i>(α</i>,<i>β)-k</i>-Feature Selection</b> gives the number of common genes in the Coloured <i>(α</i>,<i>β)-k</i>-Feature Selection result for the considered datasets. For method details refer to Section 2.</p>
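
    The overlap counts described in this caption can be thought of as set intersections over per-dataset gene signatures. The following is a minimal sketch under that assumption; the function name and the placeholder gene labels are hypothetical and do not come from the paper.

```python
def overlapping_genes(signatures):
    """Return the genes present in every per-dataset signature.

    signatures: a list of iterables of gene symbols, one per dataset.
    An empty list yields an empty set.
    """
    if not signatures:
        return set()
    common = set(signatures[0])
    for signature in signatures[1:]:
        common &= set(signature)  # keep only genes seen in every dataset so far
    return common


if __name__ == "__main__":
    # Placeholder gene labels, not real results.
    per_dataset = [
        {"GENE_A", "GENE_B", "GENE_C"},
        {"GENE_B", "GENE_C"},
        {"GENE_C", "GENE_D"},
    ]
    print(len(overlapping_genes(per_dataset)))
```

    The size of the returned set corresponds to one cell of such a table for a given choice of datasets.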

    A New Combinatorial Optimization Approach for Integrated Feature Selection Using Different Datasets: A Prostate Cancer Transcriptomic Study

    No full text
    <div><p>Background</p><p>The joint study of multiple datasets has become a common technique for increasing statistical power in detecting biomarkers obtained from smaller studies. The approach generally followed is based on the fact that as the total number of samples increases, we expect to have greater power to detect associations of interest. This methodology has been applied to genome-wide association and transcriptomic studies due to the availability of datasets in the public domain. While this approach is well established in biostatistics, the introduction of new combinatorial optimization models to address this issue has not been explored in depth. In this study, we introduce a new model for the integration of multiple datasets and we show its application in transcriptomics.</p><p>Methods</p><p>We propose a new combinatorial optimization problem that addresses the core issue of biomarker detection in integrated datasets. Optimal solutions for this model deliver a feature selection from a panel of prospective biomarkers. The model we propose is a generalised version of the <i>(α</i>,<i>β)-k</i>-Feature Set problem. We illustrate the performance of this new methodology via a challenging meta-analysis task involving six prostate cancer microarray datasets. The results are then compared to the popular RankProd meta-analysis tool and to what can be obtained by analysing the individual datasets by statistical and combinatorial methods alone.</p><p>Results</p><p>Application of the integrated method resulted in a more informative signature than the rank-based meta-analysis or individual dataset results, and overcomes problems arising from real world datasets. The set of genes identified is highly significant in the context of prostate cancer. The method used does not rely on homogenisation or transformation of values to a common scale, and at the same time is able to capture markers associated with subgroups of the disease.</p></div>

    Heatmap for the 23 probes from the combined analysis result that annotate to the non coding features.

    No full text
    <p>There are 23 up- and down-regulated probes that are differentially expressed between control and AD samples. The first colour bar at the bottom indicates AD (blue) and control (red) samples. The second colour bar shows each sample group in a different colour: EC (blue), HIP (red), MTG (orange), PC (grey) and SFG (cyan). The colour bar at the side of the heatmap represents the range of fold-changes with respect to the mean value of the control samples, using a colour gradient ranging from green (log<sub>2</sub>(<i>FC</i>) = −6, down-regulation) to red (log<sub>2</sub>(<i>FC</i>) = 6, up-regulation).</p>

    Summary of datasets used in this study.

    No full text
    <p><b>Name</b> is the name assigned to the study throughout this paper. <b>Plat</b> is the platform details of each dataset. <b>Series</b> is the Gene Expression Omnibus Series identifier for the dataset. <b>NS</b> is the original number of samples in the study, of which <b>Norm</b> is the number of healthy tissue samples, <b>PT</b> is the number of primary tumour samples and <b>Met</b> is the number of metastasis samples. <b>Probes</b> is the number of probes present in each dataset and <b>EF</b> is the number of probes remaining after entropy filtering.</p>

    <i>t</i>-test results on individual dataset.

    No full text
    <p><b>Dataset</b> is the short name used in this paper for the dataset. <b>Feat.No</b> is the number of features (probes) present in the dataset before applying the <i>t</i>-test, and <b>Signature size</b> is the number of genes in the resulting solution for each dataset. For method details refer to Section 2.</p>

    Heatmap for the Coloured <i>(α</i>,<i>β)-k</i>-Feature Selection resulted genes that cover six datasets.

    No full text
    <p>There are 120 up- and down-regulated genes (columns) that are differentially expressed between the normal and tumour classes. The two colour bars at the right represent the ordering of samples and sample groups, respectively, as explained in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127702#pone.0127702.g001" target="_blank">Fig 1</a>.</p>

    Result of Coloured <i>(α</i>,<i>β)-k-</i>Feature Set selection methodology.

    No full text
    <p><b>No of Datasets</b> is the number of datasets considered when computing the coverage. <b>No of Combined Probes</b> is the number of features that result from applying the Coloured <i>(α</i>,<i>β)-k</i>-Feature Set selection methodology, and <b>No of Genes</b> is the number of genes corresponding to those combined probes.</p>