22 research outputs found

    Identification of SERPINA1 as single marker for papillary thyroid carcinoma through microarray meta analysis and quantification of its discriminatory power in independent validation

    Background: Several DNA microarray-based expression signatures for the different clinically relevant thyroid tumor entities have been described over the past few years. However, reproducibility of these signatures is generally low, mainly due to study biases, small sample sizes and the highly multivariate nature of microarrays. While new technologies are available for more accurate high-throughput expression analysis, we show that there is still a lot of information to be gained from data deposited in public microarray databases. In this study we aimed (1) to identify potential markers for papillary thyroid carcinomas through meta-analysis of public microarray data and (2) to confirm these markers in an independent dataset using an independent technology.
    Methods: We adopted a meta-analysis approach for four publicly available microarray datasets on papillary thyroid carcinoma (PTC) nodules versus nodular goitre (NG) from N2-frozen tissue. The methodology included merging of datasets, bias removal using distance weighted discrimination (DWD), feature selection/inference statistics, classification/cross-validation and gene set enrichment analysis (GSEA). External validation was performed in our laboratory on an independent dataset using an independent technology, quantitative RT-PCR (RT-qPCR).
    Results: From the meta-analysis we identified one gene (SERPINA1) which distinguishes papillary thyroid carcinoma from benign nodules with 99% accuracy (n = 99, sensitivity = 0.98, specificity = 1, PPV = 1, NPV = 0.98). In the independent validation data, which included not only PTC and NG but all major histological thyroid entities plus a few variants, SERPINA1 was again markedly upregulated (36-fold, p = 1.3 × 10^-10) in PTC, and identification of papillary carcinoma was possible with 93% accuracy (n = 82, sensitivity = 1, specificity = 0.90, PPV = 0.76, NPV = 1). We also show that the extracellular matrix pathway is strongly activated in the meta-analysis data, suggesting an important role of tumor-stroma interaction in the carcinogenesis of papillary thyroid carcinoma.
    Conclusions: We show that valuable new information can be gained from meta-analysis of existing microarray data deposited in public repositories. While single microarray studies rarely have a sample number that allows robust feature selection, this can be achieved by combining published data using DWD. This approach is not only efficient but also very cost-effective. Independent validation supports the results of this meta-analysis and confirms SERPINA1 as a potent mRNA marker for PTC in a total (meta-analysis plus validation) of 181 samples.
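
The sensitivity, specificity, PPV, NPV and accuracy quoted in this abstract all derive from a 2×2 confusion matrix. The following is a minimal sketch of that arithmetic. The counts are hypothetical: they were chosen only because they reproduce the rounded validation figures (n = 82) and are not taken from the paper. The function name `diagnostic_metrics` is likewise illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts (assumption), picked to be consistent with n = 82.
metrics = diagnostic_metrics(tp=19, fp=6, tn=57, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts the rounded values agree with the abstract, which also shows why PPV can be much lower than specificity when the positive class is small: six false positives weigh heavily against only 19 true positives.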

    Analysis of Alzheimer's disease severity across brain regions by topological analysis of gene co-expression networks

    Background: Alzheimer's disease (AD) is a progressive neurodegenerative disorder involving variations in the transcriptome of many genes. AD does not affect all brain regions simultaneously. Identifying the differences among the affected regions may shed more light on disease progression. We developed a novel method involving the differential topology of gene coexpression networks to understand the association between affected regions and disease severity.
    Methods: We analysed microarray data from four regions, the entorhinal cortex (EC), hippocampus (HIP), posterior cingulate cortex (PCC) and middle temporal gyrus (MTG), in AD-affected and normal subjects. A coexpression network was built for each region and the topological overlap between them was examined. Genes with zero topological overlap between two region-specific networks were used to characterise the differences between the two regions.
    Results and conclusion: Results indicate that the MTG shows early AD pathology compared to the other regions. We postulate that if the MTG is affected later in the disease, post-mortem analyses of individuals with end-stage AD will show signs of early AD in the MTG, while the EC, HIP and PCC will have severe pathology. Such knowledge is useful for data collection in clinical studies where sample selection is a limiting factor, as well as for highlighting the underlying biology of disease progression.
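
To make "topological overlap" concrete, here is a minimal sketch of one widely used definition for an unweighted network: TOM(i, j) = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where a_ij is the adjacency, l_ij the number of shared neighbours and k_i the degree. The abstract does not state which definition the authors used or how overlap was compared across region-specific networks, so both the formula choice and the toy network below are assumptions.

```python
def topological_overlap(adj):
    """Topological overlap matrix for an unweighted, undirected network.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with a zero diagonal.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                tom[i][j] = 1.0  # by convention a node fully overlaps with itself
                continue
            # Number of neighbours shared by nodes i and j.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n))
            tom[i][j] = (shared + adj[i][j]) / (
                min(degree[i], degree[j]) + 1 - adj[i][j]
            )
    return tom


# Toy network (made up): edges 0-1, 0-2, 1-2, 2-3; node 4 is isolated.
toy_adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
tom = topological_overlap(toy_adj)
print(tom[0][3], tom[0][4])
```

In this toy case nodes 0 and 3 are not directly connected but share one neighbour, so their overlap is non-zero, whereas the isolated node 4 has zero overlap with every other node.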

    Mathematical models for immunology: current state of the art and future research directions

    The advances in genetics and biochemistry that have taken place over the last 10 years have led to significant advances in experimental and clinical immunology. In turn, this has led to the development of new mathematical models to investigate, qualitatively and quantitatively, various open questions in immunology. In this study we present a review of some research areas in mathematical immunology that have evolved over the last 10 years. To this end, we take a step-by-step approach in discussing a range of models derived to study the dynamics of both the innate and immune responses at the molecular, cellular and tissue scales. To emphasise the use of mathematics in modelling in this area, we also review some of the mathematical tools used to investigate these models. Finally, we discuss some future trends in both experimental immunology and mathematical immunology for the upcoming years.

    Planning identification experiments for cell signaling pathways: An NFκB case study

    Mathematical modeling of cell signaling pathways has become a very important and challenging problem in recent years. Its importance comes from the possible applications of the resulting models: they may help us to understand phenomena occurring in single cells and cell populations at the molecular level, and they may help with the discovery of new drug therapies. Mathematical models of cell signaling pathways take different forms. The most popular approach is to use a set of nonlinear ordinary differential equations (ODEs). Obtaining a proper model is very difficult. There are many hypotheses about the structure of the model (sets of variables and phenomena) that should be verified. The next step, fitting the parameters of the model, is also very complicated because of the nature of the measurements. The blotting technique usually gives only semi-quantitative observations, which are very noisy and collected at only a limited number of time points. The accuracy of parameter estimation may be significantly improved by proper experiment design. Recently, we proposed a gradient-based algorithm for the optimization of a sampling schedule. In this paper we use the algorithm to optimize a sampling schedule for the identification of the mathematical model of the NFκB regulatory module, known from the literature. We propose a two-stage optimization approach: a gradient-based procedure to find all stationary points, followed by pair-wise replacement to find the optimal numbers of measurement replicates. Convergence properties of the presented algorithm are examined.

    Stability of gene selection methods for multiclass classification

    A major problem in applying DNA microarrays to classification is the dimension of the dataset. Recently we proposed a gene selection method based on Partial Least Squares (PLS) for finding the best genes for classification. The new idea is to use PLS not only as a multiclass approach, but also to construct additional binary selections that use one-versus-rest and one-versus-one approaches. Ranked gene lists are highly unstable in the sense that a small change in the data set often leads to a big change in the resulting ordered list. In this article, we assess the stability of our approaches. We compare the variability of the ordered lists obtained from the proposed methods with the well-known Recursive Feature Elimination (RFE) method and the classical t-test method. This paper focuses on effective identification of informative genes. As a result, a new strategy for finding a small subset of significant genes is designed. Our results on real cancer data show that our approach has a very high accuracy rate for different combinations of classification methods while at the same time giving very stable feature rankings.

    The analysis of chromatin condensation state and transcriptional activity using DNA microarrays

    A DNA microarray-based technique has been developed to semi-quantitatively measure the in vivo global chromatin condensation state at the resolution of a gene. Chromatin was fractionated on the basis of the differential solubility of histone H1-containing and histone H1-free nucleosomes. A set of genes non-randomly distributed between histone H1-free (uncondensed or open) and histone H1-containing (condensed or closed) chromatin fractions has been identified. The transcript levels have been measured for the same group of genes. The correlation between transcriptional activity and chromatin fraction distribution of particular genes has been established.

    Selecting Differentially Expressed Genes for Colon Tumor Classification

    DNA microarrays provide a new technique for measuring gene expression, which has attracted a lot of research interest in recent years. It has been suggested that gene expression data from microarrays (biochips) can be employed in many biomedical areas, e.g., in cancer classification. Although several classification methods, both new and existing, have been tested, selecting a proper (optimal) set of genes whose expression levels can be used for classification is still an open problem. Recently we proposed a new recursive feature replacement (RFR) algorithm for choosing a suboptimal set of genes. The algorithm uses the support vector machine (SVM) technique. In this paper we use the RFR method to find suboptimal gene subsets for tumor/normal colon tissue classification. The results are compared with those of other methods recently proposed in the literature. The comparison shows that the RFR method is able to find the smallest gene subset (only six genes) that gives no misclassifications in leave-one-out cross-validation for a tumor/normal colon data set. In this sense the RFR algorithm outperforms all other investigated methods.
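
The evaluation criterion mentioned here, the number of misclassifications in leave-one-out cross-validation, can be sketched as follows. This is not the RFR algorithm and it does not use an SVM: a simple nearest-centroid classifier is swapped in to keep the example dependency-free, and the expression values and function names are made up for illustration.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to the sample."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    best_label, best_dist = None, float("inf")
    for label, total in sums.items():
        centroid = [s / counts[label] for s in total]
        # Squared Euclidean distance from the sample to the class centroid.
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_errors(xs, ys):
    """Count misclassifications when each sample is held out once."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors


# Made-up expression values for two hypothetical genes (assumption).
toy_x = [[1.0, 0.2], [0.9, 0.1], [1.1, 0.3], [0.1, 1.0], [0.2, 0.9], [0.0, 1.1]]
toy_y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
print(loocv_errors(toy_x, toy_y))
```

In a real analysis the gene subset would also need to be reselected inside each fold; selecting genes on the full data set first would make the error count optimistic.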