
    Software defect prediction: do different classifiers find the same defects?

    Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
    During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix, and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.
    Peer reviewed. Final published version.
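The central observation of this abstract, that classifiers with similar recall can still detect different defects, can be illustrated with a minimal sketch. The module IDs, the prediction sets, and the helper names `recall` and `unique_true_positives` are hypothetical and chosen only for illustration; this is not the authors' analysis code or data.

```python
from typing import Dict, Set


def recall(predicted: Set[str], actual: Set[str]) -> float:
    """Fraction of the actual defects that were predicted as defective."""
    return len(predicted & actual) / len(actual)


def unique_true_positives(
    predicted: Dict[str, Set[str]], actual: Set[str]
) -> Dict[str, Set[str]]:
    """For each classifier, return the true defects that no other classifier found."""
    # True positives per classifier: predicted defective and actually defective.
    hits = {name: preds & actual for name, preds in predicted.items()}
    unique = {}
    for name, found in hits.items():
        # Union of the true positives of every other classifier.
        others = set().union(*(h for n, h in hits.items() if n != name))
        unique[name] = found - others
    return unique


# Hypothetical data: five defective modules and four classifiers' predictions.
actual_defects = {"m1", "m2", "m3", "m4", "m5"}
predictions = {
    "RandomForest": {"m1", "m2", "m3", "m9"},  # m9 is a false positive
    "NaiveBayes": {"m1", "m2", "m4"},
    "RPart": {"m1", "m3", "m5"},
    "SVM": {"m1", "m2", "m3"},
}

recalls = {name: recall(p, actual_defects) for name, p in predictions.items()}
unique = unique_true_positives(predictions, actual_defects)
```

In this toy data every classifier has the same recall, yet two of them each find a defect that none of the others do, which is the situation where a majority vote would discard a correct prediction.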

    Stochastic Modeling of Expression Kinetics Identifies Messenger Half-Lives and Reveals Sequential Waves of Co-ordinated Transcription and Decay

    The transcriptome in a cell is finely regulated by a large number of molecular mechanisms able to control the balance between mRNA production and degradation. Recent experimental findings have shown that fine and specific regulation of degradation is needed for proper orchestration of a global cell response to environmental conditions. We developed a computational technique based on stochastic modeling to infer condition-specific individual mRNA half-lives directly from gene expression time-courses. Predictions from our method were validated by experimentally measured mRNA decay rates during the intraerythrocytic developmental cycle of Plasmodium falciparum. We then applied our methodology to publicly available data on the reproductive and metabolic cycle of budding yeast. Strikingly, our analysis revealed, in all cases, the presence of periodic changes in decay rates of sequentially induced genes and co-ordination strategies between transcription and degradation, thus suggesting a general principle for the proper coordination of transcription and degradation machinery in response to internal and/or external stimuli.
    Citation: Cacace F, Paci P, Cusimano V, Germani A, Farina L (2012) Stochastic Modeling of Expression Kinetics Identifies Messenger Half-Lives and Reveals Sequential Waves of Co-ordinated Transcription and Decay. PLoS Comput Biol 8(11): e1002772. doi:10.1371/journal.pcbi.100277
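As a point of reference for the two quantities this abstract names, half-life and decay rate, the relation below holds under plain first-order decay. This is an assumption made for illustration; the abstract does not state the paper's stochastic model, and the symbols $m$ (mRNA abundance), $m_0$ (initial abundance) and $k$ (decay rate) are introduced here.

```latex
\frac{dm}{dt} = -k\,m(t)
\quad\Longrightarrow\quad
m(t) = m_0\,e^{-kt},
\qquad
t_{1/2} = \frac{\ln 2}{k}
```

Under this assumption a periodic change in the decay rate $k$ corresponds directly to a periodic change in the half-life $t_{1/2}$.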

    The circular SiZer, inferred persistence of shape parameters and application to early stem cell differentiation

    We generalize the SiZer of Chaudhuri and Marron (J. Amer. Statist. Assoc. 94 (1999) 807-823, Ann. Statist. 28 (2000) 408-428) for the detection of shape parameters of densities on the real line to the case of circular data. It turns out that only the wrapped Gaussian kernel gives a symmetric, strongly Lipschitz semi-group satisfying "circular" causality, that is, not introducing possibly artificial modes with increasing levels of smoothing. Some notable differences between Euclidean and circular scale space theory are highlighted. Based on this, we provide an asymptotic theory to make inference about the persistence of shape features. The resulting circular mode persistence diagram is applied to the analysis of early mechanically-induced differentiation in adult human stem cells from their actin-myosin filament structure. As a consequence, the circular SiZer based on the wrapped Gaussian kernel (WiZer) allows the verification at a controlled error level of the observation reported by Zemel et al. (Nat. Phys. 6 (2010) 468-473): Within early stem cell differentiation, polarizations of stem cells exhibit preferred directions in three different micro-environments.
    Comment: Published at http://dx.doi.org/10.3150/15-BEJ722 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
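The role of the wrapped Gaussian kernel can be made concrete with a small sketch: smooth circular data at two bandwidths and count the modes of the resulting density. This is only a kernel density estimate with a mode counter, not the SiZer or WiZer inference procedure itself. The function names, the truncation of the wrapping sum, the grid size, and the sample angles are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def wrapped_gaussian(theta: float, sigma: float, wraps: int = 5) -> float:
    """Wrapped Gaussian density at angle theta (radians).

    The infinite wrapping sum is truncated to 2 * wraps + 1 terms, which is
    an approximation that is adequate for the bandwidths used below.
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-((theta + 2.0 * math.pi * k) ** 2) / (2.0 * sigma ** 2))
        for k in range(-wraps, wraps + 1)
    )


def circular_kde(
    grid: Sequence[float], data: Sequence[float], sigma: float
) -> List[float]:
    """Kernel density estimate on the circle with a wrapped Gaussian kernel."""
    return [
        sum(wrapped_gaussian(t - x, sigma) for x in data) / len(data)
        for t in grid
    ]


def count_modes(density: Sequence[float]) -> int:
    """Count strict local maxima, treating the grid as circular."""
    n = len(density)
    return sum(
        1
        for i in range(n)
        if density[i - 1] < density[i] > density[(i + 1) % n]
    )


# Hypothetical angles (radians) forming two clusters on the circle.
angles = [0.4, 0.5, 0.6, 3.4, 3.5, 3.6]
grid = [2.0 * math.pi * i / 360 for i in range(360)]

modes_small = count_modes(circular_kde(grid, angles, sigma=0.2))
modes_large = count_modes(circular_kde(grid, angles, sigma=3.0))
```

With these sample angles the small bandwidth resolves both clusters, while the large bandwidth merges them; the number of modes does not increase with more smoothing, which is the causality property the abstract attributes to this kernel.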