82 research outputs found

    Classification of microarrays; synergistic effects between normalization, gene selection and machine learning

    Background. Machine learning is a powerful approach for describing and predicting classes in microarray data. Although several comparative studies have investigated the relative performance of various machine learning methods, these often do not account for the fact that performance (e.g. error rate) is a result of a series of analysis steps, of which the most important are data normalization, gene selection and machine learning. Results. In this study, we used seven previously published cancer-related microarray data sets to compare the effects on classification performance of five normalization methods, three gene selection methods with 21 different numbers of selected genes and eight machine learning methods. Performance in terms of error rate was rigorously estimated by repeatedly employing a double cross validation approach. Since performance varies greatly between data sets, we devised an analysis method that first compares methods within individual data sets and then visualizes the comparisons across data sets. We discovered both well performing individual methods and synergies between different methods. Conclusion. Support Vector Machines with a radial basis kernel, linear kernel or polynomial kernel of degree 2 all performed consistently well across data sets. We show that there is a synergistic relationship between these methods and gene selection based on the T-test and the selection of a relatively high number of genes. Also, we find that these methods benefit significantly from using normalized data, although it is hard to draw general conclusions about the relative performance of different normalization procedures.

    Gene array identification of Ipf1/Pdx1-/- regulated genes in pancreatic progenitor cells

    Background. The homeodomain transcription factor IPF1/PDX1 exerts a dual role in the pancreas; Ipf1/Pdx1 global null mutants fail to develop a pancreas whereas conditional inactivation of Ipf1/Pdx1 in β-cells leads to impaired β-cell function and diabetes. Although several putative target genes have been linked to the β-cell function of Ipf1/Pdx1, relatively little is known with respect to genes regulated by IPF1/PDX1 in early pancreatic progenitor cells. Results. Microarray analyses identified a total of 111 genes that were differentially expressed in e10.5 pancreatic buds of Ipf1/Pdx1-/- embryos. The expression of one of these, Spondin 1, which encodes an extracellular matrix protein, has not previously been described in the pancreas. Quantitative real-time RT-PCR analyses and immunohistochemical analyses also revealed that the expression of FgfR2IIIb, which encodes the receptor for FGF10, was down-regulated in Ipf1/Pdx1-/- pancreatic progenitor cells. Conclusion. This microarray analysis has identified a number of candidate genes that are differentially expressed in Ipf1/Pdx1-/- pancreatic buds. Several of the differentially expressed genes were known to be important for pancreatic progenitor cell proliferation and differentiation whereas others have not previously been associated with pancreatic development.

    Immortalization of T-cells is accompanied by gradual changes in CpG methylation resulting in a profile resembling a subset of T-cell leukemias

    We have previously described gene expression changes during spontaneous immortalization of T-cells, thereby identifying cellular processes important for cell growth crisis escape and unlimited proliferation. Here, we analyze the same model to investigate the role of genome-wide methylation in the immortalization process at different time points pre-crisis and post-crisis using high-resolution arrays. We show that over time in culture there is an overall accumulation of methylation alterations, with preferential increased methylation close to transcription start sites (TSSs), islands, and shore regions. Methylation and gene expression alterations did not correlate for the majority of genes, but for the fraction that correlated, gain of methylation close to TSS was associated with decreased gene expression. Interestingly, the pattern of CpG site methylation observed in immortal T-cell cultures was similar to clinical T-cell acute lymphoblastic leukemia (T-ALL) samples classified as CpG island methylator phenotype positive. These sites were highly overrepresented by polycomb target genes and involved in developmental, cell adhesion, and cell signaling processes. The presence of non-random methylation events in in vitro immortalized T-cell cultures and diagnostic T-ALL samples indicates altered methylation of CpG sites with a possible role in malignant hematopoiesis

    SNX10 gene mutation leading to osteopetrosis with dysfunctional osteoclasts

    Acknowledgements. We sincerely thank the patients and family members who participated in this study. We would also like to thank Stefan Esher, Umeå University, for help with genealogy, and Anna Westerlund for excellent technical assistance. This work was supported by grants from the FOU, at the Umeå University Hospital, and the Medical Faculty at Umeå University. The work at University of Gothenburg was supported by grants from The Swedish Research Council, the Swedish Rheumatism Association, the Royal 80-Year Fund of King Gustav V, ALF/LUA research grant from Sahlgrenska University Hospital in Gothenburg and the Lundberg Foundation. The work at the University of Gothenburg and the University of Aberdeen was supported by Euroclast, a Marie Curie FP7-People-2013-ITN: # 607446. Peer reviewed.

    Outbreaks of Tularemia in a Boreal Forest Region Depends on Mosquito Prevalence

    Background. We aimed to evaluate the potential association of mosquito prevalence in a boreal forest area with transmission of the bacterial disease tularemia to humans, and model the annual variation of disease using local weather data.

    Buffering of Segmental and Chromosomal Aneuploidies in Drosophila melanogaster

    Chromosomal instability, which involves the deletion and duplication of chromosomes or chromosome parts, is a common feature of cancers, and deficiency screens are commonly used to detect genes involved in various biological pathways. However, despite their importance, the effects of deficiencies, duplications, and chromosome losses on the regulation of whole chromosomes and large chromosome domains are largely unknown. Therefore, to explore these effects, we examined expression patterns of genes in several Drosophila deficiency hemizygotes and a duplication hemizygote using microarrays. The results indicate that genes expressed in deficiency hemizygotes are significantly buffered, and that the buffering effect is general rather than being mainly mediated by feedback regulation of individual genes. In addition, differentially expressed genes in haploid condition appear to be generally more strongly buffered than ubiquitously expressed genes in haploid condition, but, among genes present in triploid condition, ubiquitously expressed genes are generally more strongly buffered than differentially expressed genes. Furthermore, we show that the 4th chromosome is compensated in response to dose differences. Our results suggest general mechanisms have evolved that stimulate or repress gene expression of aneuploid regions as appropriate, and on the 4th chromosome of Drosophila this compensation is mediated by Painting of Fourth (POF)

    Estimation of the reliability of systems described by the Daniels Load-Sharing Model

    We consider the problem of estimating the failure stresses of bundles (i.e. the tensile forces that destroy the bundles), constructed of several statistically similar fibres, given a particular kind of censored data. Each bundle consists of several fibres which have their own independent identically distributed failure stresses, and where the force applied on a bundle at any moment is distributed equally between the unbroken fibres in the bundle. A bundle with these properties is an example of an equal load-sharing system, often referred to as the Daniels failure model. The testing of several bundles generates a special kind of censored data, which is complexly structured. Strongly consistent non-parametric estimators of the distribution laws of bundles are obtained by applying the theory of martingales, and by using the observed data. It is proved that random sampling, with replacement from the statistical data related to each tested bundle, can be used to obtain asymptotically correct estimators for the distribution functions of deviations of non-parametric estimators from true values. In the case when the failure stresses of the fibres are described by a Weibull distribution, we obtain strongly consistent parametric maximum likelihood estimators of the distribution functions of failure stresses of bundles, by using the complexly structured data. Numerical examples illustrate the behavior of the obtained estimators.
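
    The equal load-sharing rule described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `bundle_strength` and the sample values are hypothetical, and the sketch only computes the load at which a single bundle fails when the individual fibre strengths are known, not the estimators the authors derive. It assumes each fibre strength is expressed as the load that one fibre can carry.

```python
def bundle_strength(fibre_strengths):
    """Total load at which an equal load-sharing bundle fails.

    fibre_strengths: load each individual fibre can carry before breaking
    (illustrative input; the paper works with censored test data instead).
    """
    if not fibre_strengths:
        raise ValueError("a bundle needs at least one fibre")
    ordered = sorted(fibre_strengths)
    n = len(ordered)
    # Once the k weakest fibres have broken, the remaining n - k fibres share
    # the load equally, so the bundle holds up to (n - k) * ordered[k].
    # The bundle fails at the largest of these sustainable loads.
    return max((n - k) * s for k, s in enumerate(ordered))


if __name__ == "__main__":
    print(bundle_strength([1.0, 2.0, 5.0]))
    print(bundle_strength([2.0, 2.5, 3.0]))
```

    In the first call the strongest fibre alone outlasts the intact bundle, while in the second the bundle is strongest before any fibre has broken, which shows why the bundle strength is not simply the sum of the fibre strengths.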