601 research outputs found

    Optimally splitting cases for training and testing high dimensional classifiers

    Background: We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what proportion of the samples should be devoted to the training set. How does this proportion impact the mean squared error (MSE) of the prediction accuracy estimate? Results: We develop a nonparametric algorithm for determining an optimal splitting proportion that can be applied with a specific dataset and classifier algorithm. We also perform a broad simulation study for the purpose of better understanding the factors that determine the best split proportions and to evaluate commonly used splitting strategies (1/2 training or 2/3 training) under a wide variety of conditions. These methods are based on a decomposition of the MSE into three intuitive component parts. Conclusions: By applying these approaches to a number of synthetic and real microarray datasets we show that for linear classifiers the optimal proportion depends on the overall number of samples available and the degree of differential expression between the classes. The optimal proportion was found to depend on the full dataset size (n) and classification accuracy, with higher accuracy and smaller n resulting in more cases assigned to the training set. The commonly used strategy of allocating 2/3 of cases for training was close to optimal for reasonably sized datasets (n ≥ 100) with strong signals (i.e. 85% or greater full dataset accuracy). In general, we recommend use of our nonparametric resampling approach for determining the optimal split. This approach can be applied to any dataset, using any predictor development method, to determine the best split.
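
    The trade-off described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm: it assumes synthetic Gaussian data, a nearest-centroid classifier, and a reference accuracy obtained from fresh synthetic draws (something only a simulation allows). All function names and parameter values are hypothetical. For each candidate training proportion it repeats random splits and averages the squared gap between the test-set accuracy and the reference accuracy.

```python
import random
import statistics


def make_data(n, dim, shift, rng):
    """Draw n labelled points; class 1 is shifted by `shift` in every coordinate."""
    data = []
    for i in range(n):
        label = i % 2  # alternate labels so the classes are balanced
        x = [rng.gauss(shift * label, 1.0) for _ in range(dim)]
        data.append((x, label))
    return data


def fit_centroids(train):
    """Nearest-centroid classifier: compute the mean vector of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def accuracy(centroids, test):
    """Fraction of test points whose closest centroid matches their label."""
    def predict(x):
        dist = {c: sum((a - b) ** 2 for a, b in zip(x, m))
                for c, m in centroids.items()}
        return min(dist, key=dist.get)
    return sum(predict(x) == y for x, y in test) / len(test)


def split_mse(data, proportion, reference_acc, n_splits, rng):
    """Mean squared gap between split-based accuracy and the reference accuracy."""
    n_train = int(round(proportion * len(data)))
    errors = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        acc = accuracy(fit_centroids(train), test)
        errors.append((acc - reference_acc) ** 2)
    return statistics.fmean(errors)


rng = random.Random(0)
data = make_data(n=60, dim=20, shift=0.5, rng=rng)

# Assumption: in a simulation, the accuracy of the classifier built on all
# n cases can be approximated with a large fresh sample from the same model.
fresh = make_data(n=2000, dim=20, shift=0.5, rng=rng)
reference_acc = accuracy(fit_centroids(data), fresh)

# Estimated MSE for several candidate training proportions.
results = {p: split_mse(data, p, reference_acc, 200, rng)
           for p in (0.3, 0.5, 0.67, 0.8)}
best = min(results, key=results.get)
print(results, best)
```

    With real data no fresh sample is available, so the reference accuracy would have to be estimated by other means; the sketch only shows why small training sets (biased accuracy) and small test sets (noisy accuracy) both inflate the MSE.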

    A dedicated haem lyase is required for the maturation of a novel bacterial cytochrome c with unconventional covalent haem binding

    In bacterial c-type cytochromes, the haem cofactor is covalently attached via two cysteine residues organized in a haem c-binding motif. Here, a novel octa-haem c protein, MccA, is described that contains only seven conventional haem c-binding motifs (CXXCH), in addition to several single cysteine residues and a conserved CH signature. Mass spectrometric analysis of purified MccA from Wolinella succinogenes suggests that two of the single cysteine residues are actually part of an unprecedented CX15CH sequence involved in haem c binding. Spectroscopic characterization of MccA identified an unusual high-potential haem c with a red-shifted absorption maximum, not unlike that of certain eukaryotic cytochromes c that exceptionally bind haem via only one thioether bridge. A haem lyase gene was found to be specifically required for the maturation of MccA in W. succinogenes. Equivalent haem lyase-encoding genes belonging to either the bacterial cytochrome c biogenesis system I or II are present in the vicinity of every known mccA gene, suggesting a dedicated cytochrome c maturation pathway. The results necessitate reconsideration of computer-based prediction of putative haem c-binding motifs in bacterial proteomes.

    Stoics against stoics in Cudworth's "A Treatise of Freewill"

    In his 'A Treatise of Freewill', Ralph Cudworth argues against Stoic determinism by drawing on what he takes to be other concepts found in Stoicism, notably the claim that some things are ‘up to us’ and that these things are the product of our choice. These concepts are central to the late Stoic Epictetus and it appears at first glance as if Cudworth is opposing late Stoic voluntarism against early Stoic determinism. This paper argues that in fact, despite his claim to be drawing on Stoic doctrine, Cudworth uses these terms with a meaning first articulated only later, by the Peripatetic commentator Alexander of Aphrodisias

    Trace Metal Exposure is Associated with Increased Exhaled Nitric Oxide in Asthmatic Children

    Background: Children with asthma experience increased susceptibility to airborne pollutants. Exposure to traffic and industrial activity has been positively associated with exacerbation of symptoms as well as emergency room visits and hospitalisations. The effect of trace metals contained in fine particulate matter (aerodynamic diameter 2.5 μm and lower, PM2.5) on acute health effects amongst asthmatic children has not been well investigated. The objective of this panel study in asthmatic children was to determine the association between personal daily exposure to ambient trace metals and airway inflammation, as measured by fractional exhaled nitric oxide (FeNO). Methods: Daily concentrations of trace metals contained on PM2.5 were determined from personal samples (n = 217) collected from 70 asthmatic school aged children in Montreal, Canada, over ten consecutive days. FeNO was measured daily using standard techniques. Results: A positive association was found between FeNO and children’s exposure to an indicator of vehicular non-tailpipe emissions (8.9 % increase for an increase in the interquartile range (IQR) in barium, 95 % confidence interval (CI): 2.8, 15.4) as well as exposure to an indicator of industrial emissions (7.6 % increase per IQR increase in vanadium, 95 % CI: 0.1, 15.8). Elevated FeNO was also suggested for other metals on the day after the exposure: 10.3 % increase per IQR increase in aluminium (95 % CI: 4.2, 16.6) and 7.5 % increase per IQR increase in iron (95 % CI: 1.5, 13.9) at a 1-day lag period. Conclusions: Exposures to ambient PM2.5 containing trace metals that are markers of traffic and industrial-derived emissions were associated in asthmatic children with an enhanced FeNO response.

    Particulate oxidative burden as a predictor of exhaled nitric oxide in children with asthma

    Background: Epidemiological studies have provided strong evidence that fine particulate matter (PM2.5; aerodynamic diameter ≤ 2.5 μm) can exacerbate asthmatic symptoms in children. Pro-oxidant components of PM2.5 are capable of directly generating reactive oxygen species. Oxidative burden is used to describe the capacity of PM2.5 to generate reactive oxygen species in the lung. Objective: In this study we investigated the association between airway inflammation in asthmatic children and oxidative burden of PM2.5 personal exposure. Methods: Daily PM2.5 personal exposure samples (n = 249) of 62 asthmatic school-aged children in Montreal were collected over 10 consecutive days. The oxidative burden of PM2.5 samples was determined in vitro as the depletion of low-molecular-weight antioxidants (ascorbate and glutathione) from a synthetic model of the fluid lining the respiratory tract. Airway inflammation was measured daily as fractional exhaled nitric oxide (FeNO). Results: A positive association was identified between FeNO and glutathione-related oxidative burden exposure in the previous 24 hr (6.0% increase per interquartile range change in glutathione). Glutathione-related oxidative burden was further found to be positively associated with FeNO over 1-day lag and 2-day lag periods. Results further demonstrate that corticosteroid use may reduce the FeNO response to elevated glutathione-related oxidative burden exposure (no use, 15.8%; irregular use, 3.8%), whereas mold (22.1%), dust (10.6%), or fur (13.1%) allergies may increase FeNO in children with versus children without these allergies (11.5%). No association was found between PM2.5 mass or ascorbate-related oxidative burden and FeNO levels. Conclusions: Exposure to PM2.5 with elevated glutathione-related oxidative burden was associated with increased FeNO

    Strain-dependent host transcriptional responses to toxoplasma infection are largely conserved in mammalian and avian hosts

    Toxoplasma gondii has a remarkable ability to infect an enormous variety of mammalian and avian species. Given this, it is surprising that three strains (Types I/II/III) account for the majority of isolates from Europe/North America. The selective pressures that have driven the emergence of these particular strains, however, remain enigmatic. We hypothesized that strain selection might be partially driven by adaptation of strains for mammalian versus avian hosts. To test this, we examine in vitro, strain-dependent host responses in fibroblasts of a representative avian host, the chicken (Gallus gallus). Using gene expression profiling of infected chicken embryonic fibroblasts and pathway analysis to assess host response, we show here that chicken cells respond with distinct transcriptional profiles upon infection with Type II versus III strains that are reminiscent of profiles observed in mammalian cells. To identify the parasite drivers of these differences, chicken fibroblasts were infected with individual F1 progeny of a Type II x III cross and host gene expression was assessed for each by microarray. QTL mapping of transcriptional differences suggested, and deletion strains confirmed, that, as in mammalian cells, the polymorphic rhoptry kinase ROP16 is the major driver of strain-specific responses. We originally hypothesized that comparing avian versus mammalian host response might reveal an inversion in parasite strain-dependent phenotypes; specifically, for polymorphic effectors like ROP16, we hypothesized that the allele with most activity in mammalian cells might be less active in avian cells. Instead, we found that activity of ROP16 alleles appears to be conserved across host species; moreover, additional parasite loci that were previously mapped for strain-specific effects on mammalian response showed similar strain-specific effects in chicken cells. These results indicate that if different hosts select for different parasite genotypes, the selection operates downstream of the signaling occurring during the beginning of the host's immune response. © 2011 Ong et al.

    Hydroxypyridinone and 5-Aminolaevulinic Acid Conjugates for Photodynamic Therapy

    Photodynamic therapy (PDT) is a promising treatment strategy for malignant and nonmalignant lesions. 5-Aminolaevulinic acid (ALA) is used as a precursor of the photosensitizer, protoporphyrin IX (PpIX), in dermatology and urology. However, the effectiveness of ALA–PDT is limited by the relatively poor bioavailability of ALA and rapid conversion of PpIX to haem. The main goal of this study was to prepare and investigate a library of single conjugates designed to coadminister the bioactive agents ALA and hydroxypyridinone (HPO) iron chelators. A significant increase in intracellular PpIX levels was observed in all cell lines tested when compared to the administration of ALA alone. The higher PpIX levels observed using the conjugates correlated well with the observed phototoxicity following exposure of cells to light. Passive diffusion appears to be the main mechanism for the majority of ALA–HPOs investigated. This study demonstrates that ALA–HPOs significantly enhance phototherapeutic metabolite formation and phototoxicity

    Factors Influencing the Statistical Power of Complex Data Analysis Protocols for Molecular Signature Development from Microarray Data

    Critical to the development of molecular signatures from microarray and other high-throughput data is testing the statistical significance of the produced signature in order to ensure its statistical reproducibility. While current best practices emphasize sufficiently powered univariate tests of differential expression, little is known about the factors that affect the statistical power of complex multivariate analysis protocols for high-dimensional molecular signature development. We show that choices of specific components of the analysis (i.e., error metric, classifier, error estimator and event balancing) have large and compounding effects on statistical power. The effects are demonstrated empirically by an analysis of 7 of the largest microarray cancer outcome prediction datasets and supplementary simulations, and by contrasting them to prior analyses of the same data. The findings of the present study have two important practical implications. First, high-throughput studies, by avoiding under-powered data analysis protocols, can achieve substantial economies in the sample size required to demonstrate statistical significance of predictive signal. Factors that affect power are identified and studied. Much smaller samples than previously thought may be sufficient for exploratory studies as long as these factors are taken into consideration when designing and executing the analysis. Second, previous highly-cited claims that microarray assays may not be able to predict disease outcomes better than chance are shown by our experiments to be due to under-powered data analysis combined with inappropriate statistical tests.