Software defect prediction: do different classifiers find the same defects?
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.
Peer reviewed. Final published version.
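The idea that classifiers with similar recall can still find different defects can be illustrated with a minimal sketch. The labels, the classifier names `clf_a` and `clf_b`, and the helper functions below are hypothetical toy constructs for illustration only; they are not the datasets, classifiers, or analysis code used in the article.

```python
from typing import Dict, List, Set


def confusion_counts(truth: List[int], pred: List[int]) -> Dict[str, int]:
    """Count TP, FP, TN and FN for binary labels (1 = defective)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(truth, pred):
        if t == 1 and p == 1:
            counts["tp"] += 1
        elif t == 0 and p == 1:
            counts["fp"] += 1
        elif t == 0 and p == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def detected_defects(truth: List[int], pred: List[int]) -> Set[int]:
    """Indices of the true defects that a classifier flagged."""
    return {i for i, (t, p) in enumerate(zip(truth, pred)) if t == 1 and p == 1}


def unique_detections(truth: List[int], preds: Dict[str, List[int]]) -> Dict[str, Set[int]]:
    """For each classifier, the defects that no other classifier found."""
    found = {name: detected_defects(truth, p) for name, p in preds.items()}
    unique = {}
    for name, hits in found.items():
        others = set().union(*(s for n, s in found.items() if n != name))
        unique[name] = hits - others
    return unique


# Hypothetical toy data: six modules, four of them defective.
truth = [1, 1, 1, 0, 0, 1]
preds = {
    "clf_a": [1, 0, 1, 0, 1, 0],
    "clf_b": [1, 1, 0, 0, 0, 0],
}

counts = {name: confusion_counts(truth, p) for name, p in preds.items()}
recall = {name: c["tp"] / (c["tp"] + c["fn"]) for name, c in counts.items()}
unique = unique_detections(truth, preds)
```

In this toy case both classifiers reach the same recall, yet each one flags a defect the other misses, which is the kind of disagreement a majority vote would discard.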
Stochastic Modeling of Expression Kinetics Identifies Messenger Half-Lives and Reveals Sequential Waves of Co-ordinated Transcription and Decay
The transcriptome in a cell is finely regulated by a large number of molecular mechanisms able to control the balance between mRNA production and degradation. Recent experimental findings have shown that fine and specific regulation of degradation is needed for proper orchestration of a global cell response to environmental conditions. We developed a computational technique based on stochastic modeling to infer condition-specific individual mRNA half-lives directly from gene expression time-courses. Predictions from our method were validated by experimentally measured mRNA decay rates during the intraerythrocytic developmental cycle of Plasmodium falciparum. We then applied our methodology to publicly available data on the reproductive and metabolic cycle of budding yeast. Strikingly, our analysis revealed, in all cases, the presence of periodic changes in decay rates of sequentially induced genes and co-ordination strategies between transcription and degradation, thus suggesting a general principle for the proper coordination of transcription and degradation machinery in response to internal and/or external stimuli.
Citation: Cacace F, Paci P, Cusimano V, Germani A, Farina L (2012) Stochastic Modeling of Expression Kinetics Identifies Messenger Half-Lives and Reveals Sequential Waves of Co-ordinated Transcription and Decay. PLoS Comput Biol 8(11): e1002772. doi:10.1371/journal.pcbi.100277
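The link between a decay rate and a half-life can be sketched under the standard assumption of first-order decay, where the half-life equals ln(2) divided by the decay rate. The function names below are hypothetical, and this two-point estimate is a textbook simplification, not the stochastic inference method described in the abstract.

```python
import math


def half_life_from_decay_rate(k: float) -> float:
    """Half-life under first-order decay: t_half = ln(2) / k."""
    if k <= 0:
        raise ValueError("decay rate must be positive")
    return math.log(2) / k


def decay_rate_from_two_points(m0: float, m1: float, dt: float) -> float:
    """Decay rate from two abundance measurements taken dt apart,
    assuming pure exponential decay with no new transcription."""
    if m0 <= 0 or m1 <= 0 or dt <= 0:
        raise ValueError("abundances and time step must be positive")
    return math.log(m0 / m1) / dt


# Hypothetical example: abundance halves over 10 time units.
k = decay_rate_from_two_points(100.0, 50.0, 10.0)
t_half = half_life_from_decay_rate(k)
```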
The circular SiZer, inferred persistence of shape parameters and application to early stem cell differentiation
We generalize the SiZer of Chaudhuri and Marron (J. Amer. Statist. Assoc. 94
(1999) 807-823, Ann. Statist. 28 (2000) 408-428) for the detection of shape
parameters of densities on the real line to the case of circular data. It turns
out that only the wrapped Gaussian kernel gives a symmetric, strongly Lipschitz
semi-group satisfying "circular" causality, that is, not introducing possibly
artificial modes with increasing levels of smoothing. Some notable differences
between Euclidean and circular scale space theory are highlighted. Based on
this, we provide an asymptotic theory to make inference about the persistence
of shape features. The resulting circular mode persistence diagram is applied
to the analysis of early mechanically-induced differentiation in adult human
stem cells from their actin-myosin filament structure. As a consequence, the
circular SiZer based on the wrapped Gaussian kernel (WiZer) allows the
verification at a controlled error level of the observation reported by Zemel
et al. (Nat. Phys. 6 (2010) 468-473): Within early stem cell differentiation,
polarizations of stem cells exhibit preferred directions in three different
micro-environments.
Comment: Published at http://dx.doi.org/10.3150/15-BEJ722 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
- …
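A kernel density estimate on the circle using a wrapped Gaussian kernel can be sketched as follows. This is a minimal illustration assuming angles in radians and a truncated wrapping sum; the function names and the `n_wraps` parameter are hypothetical, and the sketch covers only the smoothing step, not the scale-space inference of the circular SiZer.

```python
import math
from typing import Sequence


def wrapped_gaussian_kernel(theta: float, h: float, n_wraps: int = 10) -> float:
    """Wrapped Gaussian density with bandwidth h, evaluated at angle theta.

    The infinite sum over wraps is truncated to 2 * n_wraps + 1 terms.
    """
    norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
    total = sum(
        math.exp(-((theta + 2.0 * math.pi * k) ** 2) / (2.0 * h * h))
        for k in range(-n_wraps, n_wraps + 1)
    )
    return norm * total


def circular_kde(theta: float, data: Sequence[float], h: float) -> float:
    """Average of wrapped Gaussian kernels centred at each observed angle."""
    return sum(wrapped_gaussian_kernel(theta - x, h) for x in data) / len(data)


# Hypothetical sample of directions (radians) and bandwidth.
angles = [0.1, 0.4, 3.0, 3.2, 6.0]
bandwidth = 0.5

# Riemann sum of the estimate over one full turn.
n_grid = 2000
step = 2.0 * math.pi / n_grid
integral = sum(circular_kde(i * step, angles, bandwidth) for i in range(n_grid)) * step
```

Increasing `bandwidth` smooths the estimate further; the abstract states that the wrapped Gaussian is the kernel for which such smoothing does not introduce artificial modes.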