Additional file 1 of Minimum redundancy maximum relevance feature selection approach for temporal gene expression data
Supplementary materials. The supplementary PDF file contains relevant information omitted from the main manuscript, such as: (1) the ranked list of the top 50 genes selected by the TMRMR-C approach for the H3N2, HRV, and RSV datasets, respectively; and (2) error bars for the two groups, symptomatic and asymptomatic, for the top genes selected from the three datasets. (DOCX 240 KB)
Additional file 1 of Structured feature selection using coordinate descent optimization
Supplementary materials. The supplementary PDF file contains relevant information omitted from the main manuscript, such as: (1) other applications for the proposed feature selection method; (2) derivations for two loss functions used in the experiments; (3) implementation details for BCGD; (4) the synthetic data generation process; and (5) scalability results that are not reported in the main manuscript. (PDF 271 KB)
Risk of readmission with and without the interaction term.
Surface plot of the response (risk of readmission) from the model without (left) and with the interaction between length of stay (LOS_LOG) and number of chronic diseases (NCHRONIC).
Ranked list of the most frequent positive coefficients including comorbidity terms for both proposed approaches.
Ranked list of the most frequent positive coefficients including comorbidity terms for both proposed approaches.
Ranked list of the most frequent negative coefficients including comorbidity terms for both proposed approaches.
Ranked list of the most frequent negative coefficients including comorbidity terms for both proposed approaches.
Complexity of the three observed approaches.
Comparison of model complexity, measured as the number of selected features, for the three compared approaches and four different settings of “Number of Discovered Interactions” (NDI).
Classification performance of the three observed approaches.
Four sets of boxplots show predictive performance, measured as area under the ROC curve (AUC), for 1-Standard Error (1SE), boosted C5.0 decision trees (C5.0), glinternet (GLI), and a model using the optimal lambda (OPT) setting obtained by cross-validation. Each set corresponds to a different setting of “Number of Discovered Interactions” (NDI): 5, 10, 15, or 20 interactions.
Reinventing Biostatistics Education for Basic Scientists
Numerous studies demonstrating that statistical errors are common in basic science publications have led to calls to improve statistical training for basic scientists. In this article, we sought to evaluate statistical requirements for PhD training and to identify opportunities for improving biostatistics education in the basic sciences. We provide recommendations for improving statistics training for basic biomedical scientists, including: (1) encouraging departments to require statistics training, (2) tailoring coursework to the students’ fields of research, and (3) developing tools and strategies to promote education and dissemination of statistical knowledge. We also provide a list of statistical considerations that should be addressed in statistics education for basic scientists.
Statistics usage and education in physiology.
A: A recent systematic review [4] demonstrated that 97.2% of papers published in the top 25% of physiology journals included statistical analyses. B: Statistics courses are not always required for PhD students in top NIH-funded physiology departments. Detailed methodology for panels A and B is described in S1 Text (http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1002430#pbio.1002430.s003).