Evaluating probabilistic forecasts with scoringRules
Probabilistic forecasts in the form of probability distributions over future
events have become popular in several fields including meteorology, hydrology,
economics, and demography. In typical applications, many alternative
statistical models and data sources can be used to produce probabilistic
forecasts. Hence, evaluating and selecting among competing methods is an
important task. The scoringRules package for R provides functionality for
comparative evaluation of probabilistic models based on proper scoring rules,
covering a wide range of situations in applied work. This paper discusses
implementation and usage details, presents case studies from meteorology and
economics, and points to the relevant background literature.
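As an illustration of the proper-scoring-rule idea behind the package, the continuous ranked probability score (CRPS) for a Gaussian forecast has a well-known closed form. The sketch below re-implements that formula in Python; scoringRules itself is an R package, so this is not the package's code, only the underlying mathematics:

```python
import math

def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS for a Gaussian forecast N(mu, sigma^2) at observation y.

    Lower is better; as a proper scoring rule, CRPS rewards forecasts that
    are both calibrated and sharp.
    """
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# A correctly centred forecast scores better (lower) than a biased one:
print(crps_gaussian(0.0, 0.0, 1.0))  # ≈ 0.2337
print(crps_gaussian(0.0, 2.0, 1.0))  # larger, i.e. worse
```

This is the same quantity scoringRules exposes for the normal family; comparing average CRPS across competing forecasters is the comparative evaluation the paper describes.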
How to discriminate between computer-aided and computer-hindered decisions: a case study in mammography
Background. Computer aids can affect decisions in complex ways, potentially even making them worse; common assessment methods may miss these effects. We developed a method for estimating the quality of decisions, as well as how computer aids affect it, and applied it to computer-aided detection (CAD) of cancer, reanalyzing data from a published study where 50 professionals (“readers”) interpreted 180 mammograms, both with and without computer support.
Method. We used stepwise regression to estimate how CAD affected the probability of a reader making a correct screening decision on a patient with cancer (sensitivity), thereby taking into account the effects of the difficulty of the cancer (proportion of readers who missed it) and the reader’s discriminating ability (Youden’s determinant). Using regression estimates, we obtained thresholds for classifying a posteriori the cases (by difficulty) and the readers (by discriminating ability).
Results. Use of CAD was associated with a 0.016 increase in sensitivity (95% confidence interval [CI], 0.003–0.028) for the 44 least discriminating radiologists for 45 relatively easy, mostly CAD-detected cancers. However, for the 6 most discriminating radiologists, with CAD, sensitivity decreased by 0.145 (95% CI, 0.034–0.257) for the 15 relatively difficult cancers.
Conclusions. Our exploratory analysis method reveals unexpected effects. It indicates that, despite the original study detecting no significant average effect, CAD helped the less discriminating readers but hindered the more discriminating readers. Such differential effects, although subtle, may be clinically significant and important for improving both computer algorithms and protocols for their use. They should be assessed when evaluating CAD and similar warning systems.
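To illustrate the discriminating-ability measure underlying the reader classification, here is a minimal sketch of Youden's index computed from screening counts; the counts are invented for illustration and are not taken from the study:

```python
def youden_j(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1.

    A simple summary of a reader's ability to discriminate cancer cases
    (true positives vs. false negatives) from normal cases (true negatives
    vs. false positives); J = 1 is perfect, J = 0 is chance level.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

# Hypothetical readers interpreting the same 180 mammograms:
print(youden_j(40, 5, 120, 15))   # more discriminating reader
print(youden_j(30, 15, 100, 35))  # less discriminating reader
```

Thresholding readers on a statistic like this is what allows the differential effect of CAD on the two groups to surface.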
Assessment of Source Code Obfuscation Techniques
Obfuscation techniques are a general category of software protections widely
adopted to prevent malicious tampering of the code by making applications more
difficult to understand and thus harder to modify. Obfuscation techniques are
divided into code and data obfuscation, depending on the protected asset. While
preliminary empirical studies have been conducted to determine the impact of
code obfuscation, our work assesses the effectiveness and efficiency of a
specific data obfuscation technique, VarMerge, in preventing attacks. We
conducted an experiment with student participants performing two attack tasks
on clear and obfuscated versions of two applications written in C. The
experiment showed a significant effect of data obfuscation on both the time
required to complete an attack and the efficiency of successful attacks: an
application protected with VarMerge sees a sixfold reduction in the number of
successful attacks per unit of time. This outcome provides a practical clue
that can be used when applying software protections based on data obfuscation. Comment: Post-print, SCAM 201
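A toy sketch of the VarMerge idea (our illustration; the study's applications were written in C, and real VarMerge transforms program variables at the source level): two independent variables are merged into one, so an attacker inspecting the merged variable sees neither original value directly:

```python
# VarMerge concept: pack two independent 8-bit variables into one 16-bit
# variable. Every read/write of `a` or `b` in the obfuscated program becomes
# a masked access on `merged`, obscuring the variables' individual roles.

def merge(a, b):
    # Pack two 8-bit values into a single 16-bit variable.
    return (a << 8) | b

def get_a(merged):
    # Recover the first variable from the high byte.
    return (merged >> 8) & 0xFF

def get_b(merged):
    # Recover the second variable from the low byte.
    return merged & 0xFF

m = merge(42, 7)
assert get_a(m) == 42 and get_b(m) == 7
```

The attacker's difficulty comes from the fact that the merged variable's raw value (here 10759) is meaningful only through the access functions, which the obfuscator scatters through the code.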
Used-habitat calibration plots: a new procedure for validating species distribution, resource selection, and step-selection models
“Species distribution modeling” was recently ranked as one of the top five “research fronts” in ecology and the environmental sciences by ISI's Essential Science Indicators (Renner and Warton 2013), reflecting the importance of predicting how species distributions will respond to anthropogenic change. Unfortunately, species distribution models (SDMs) often perform poorly when applied to novel environments. Compounding this problem is the shortage of methods for evaluating SDMs (hence, we may be getting our predictions wrong and not even know it). Traditional methods for validating SDMs quantify a model's ability to classify locations as used or unused. Instead, we propose to focus on how well SDMs can predict the characteristics of used locations. This subtle shift in viewpoint leads to a more natural and informative evaluation and validation of models across the entire spectrum of SDMs. Through a series of examples, we show how simple graphical methods can help with three fundamental challenges of habitat modeling: identifying missing covariates, non-linearity, and multicollinearity. Identifying habitat characteristics that are not well-predicted by the model can provide insights into variables affecting the distribution of species, suggest appropriate model modifications, and ultimately improve the reliability and generality of conservation and management recommendations.
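The proposed shift in viewpoint can be sketched with simulated data (our illustration, not the authors' code): rather than scoring used/unused classification, summarize the covariate values at used locations and compare them with what the fitted model would predict for used locations:

```python
import math
import random

random.seed(1)

# Simulated landscape: one standardized covariate (call it "elevation").
n = 5000
elevation = [random.gauss(0.0, 1.0) for _ in range(n)]

def true_use_prob(x):
    # True (non-linear) selection: animals prefer intermediate elevations.
    return math.exp(-x * x)

# "Used" locations are sampled in proportion to the true use probability.
used = [x for x in elevation if random.random() < true_use_prob(x)]

# A model that captures the quadratic preference predicts used elevations
# centred near 0 with reduced spread; a linear-in-x model cannot. Large
# observed-vs-predicted gaps in summaries like these flag missing
# covariates or unmodeled non-linearity.
obs_mean = sum(used) / len(used)
obs_sd = math.sqrt(sum((u - obs_mean) ** 2 for u in used) / len(used))
print(round(obs_mean, 2), round(obs_sd, 2))
```

Plotting the full observed vs. predicted distributions of used-habitat covariates, rather than just these two moments, is the graphical calibration check the paper develops.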
An Exploratory Study of Patient Falls
Debate continues over the relative contributions of education level and clinical expertise in the nursing practice environment. Research suggests a link between Baccalaureate of Science in Nursing (BSN) nurses and positive patient outcomes such as lower mortality, decreased falls, and fewer medication errors. Purpose: To examine whether there is a negative correlation between patient falls and the level of nurse education at an urban hospital located in Midwest Illinois during the years 2010-2014. Methods: A retrospective cross-sectional cohort analysis was conducted using data from the National Database of Nursing Quality Indicators (NDNQI) for the years 2010-2014. Sample: Inpatients aged ≥ 18 years who experienced an unintentional sudden descent, with or without injury, that resulted in the patient striking the floor or an object and occurred on inpatient nursing units. Results: The regression model was constructed with annual patient falls as the dependent variable, and formal education and a log-transformed variable for the percentage of certified nurses as the independent variables. The model overall is a good fit, F(2,22) = 9.014, p = .001, adj. R2 = .40. Conclusion: Annual patient falls will decrease by increasing the number of nurses with baccalaureate degrees and/or certifications from a professional nursing governing body.
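The reported model structure can be sketched on synthetic data (illustrative numbers only, not the NDNQI data): annual falls regressed on a log-transformed certification percentage, recovering the expected negative slope:

```python
import math
import random

random.seed(0)

# Synthetic unit-year records (25 of them, matching the model's df of 2,22
# plus intercept in spirit, though these values are invented):
pct_cert = [random.uniform(5, 60) for _ in range(25)]   # % certified nurses
falls = [100 - 10 * math.log(p) + random.gauss(0, 5) for p in pct_cert]

# Simple least-squares slope of falls on log(% certified), echoing the
# abstract's log transform of the certification predictor.
x = [math.log(p) for p in pct_cert]
mx = sum(x) / len(x)
my = sum(falls) / len(falls)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, falls)) / \
        sum((xi - mx) ** 2 for xi in x)
print(slope)  # negative: more certified nurses, fewer falls
```

The log transform encodes diminishing returns: moving from 5% to 15% certified nurses is predicted to reduce falls more than moving from 45% to 55%.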
Using Random Forests to Describe Equity in Higher Education: A Critical Quantitative Analysis of Utah’s Postsecondary Pipelines
The following work examines the Random Forest (RF) algorithm as a tool for predicting student outcomes and interrogating the equity of postsecondary education pipelines. The RF model, created using longitudinal data of 41,303 students from Utah's 2008 high school graduation cohort, is compared to logistic and linear models, which are commonly used to predict college access and success. Substantively, this work finds High School GPA to be the best predictor of postsecondary GPA, whereas the commonly used ACT and AP test scores are not nearly as important. Each model identified several demographic disparities in higher education access, most significantly the effects of individual-level economic disadvantage. District- and school-level factors such as the proportion of Low Income students and the proportion of Underrepresented Racial Minority (URM) students were important and negatively associated with postsecondary success. Methodologically, the RF model was able to capture non-linearity in the predictive power of school- and district-level variables, a key finding which was undetectable using linear models. The RF algorithm outperforms logistic models in predicting student enrollment, performs similarly to linear models in predicting postsecondary GPA, and surpasses both in describing non-linear variable relationships. RF provides novel interpretations of data, challenges conclusions from linear models, and has enormous potential to further the literature on equity in postsecondary pipelines.
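The kind of non-linearity a random forest can capture but a linear model cannot is easy to demonstrate with a decision stump, the building block of RF (illustrative data, not the Utah cohort):

```python
import random

random.seed(2)

# Hypothetical school-level covariate (e.g., proportion low-income) whose
# effect on student success is a step, not a line: success probability is
# flat, then drops past a threshold of 0.6.
n = 2000
x = [random.uniform(0, 1) for _ in range(n)]
y = [1 if random.random() < (0.7 if xi < 0.6 else 0.3) else 0 for xi in x]

def best_stump(x, y):
    """Exhaustively pick the single split that minimizes squared error,
    exactly as one node of a regression tree does."""
    best_cut, best_err = None, float("inf")
    for cut in [i / 100 for i in range(1, 100)]:
        left = [yi for xi, yi in zip(x, y) if xi < cut]
        right = [yi for xi, yi in zip(x, y) if xi >= cut]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = sum((yi - ml) ** 2 for yi in left) + \
              sum((yi - mr) ** 2 for yi in right)
        if err < best_err:
            best_cut, best_err = cut, err
    return best_cut

print(best_stump(x, y))  # recovers a cut near the true change point 0.6
```

A straight line fit to the same data smears the drop across the whole range; an ensemble of such stumps (a forest) localizes it, which is how threshold effects of school- and district-level variables become visible.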
Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer
Quantitative extraction of high-dimensional mineable data from medical images
is a process known as radiomics. Radiomics is foreseen as an essential
prognostic tool for cancer risk assessment and the quantification of
intratumoural heterogeneity. In this work, 1615 radiomic features (quantifying
tumour image intensity, shape, texture) extracted from pre-treatment FDG-PET
and CT images of 300 patients from four different cohorts were analyzed for the
risk assessment of locoregional recurrences (LR) and distant metastases (DM) in
head-and-neck cancer. Prediction models combining radiomic and clinical
variables were constructed via random forests and imbalance-adjustment
strategies using two of the four cohorts. Independent validation of the
prediction and prognostic performance of the models was carried out on the
other two cohorts (LR: AUC = 0.69 and CI = 0.67; DM: AUC = 0.86 and CI = 0.88).
Furthermore, the results obtained via Kaplan-Meier analysis demonstrated the
potential of radiomics for assessing the risk of specific tumour outcomes using
multiple stratification groups. This could have important clinical impact,
notably by allowing for a better personalization of chemo-radiation treatments
for head-and-neck cancer patients from different risk groups. Comment: (1) Paper: 33 pages, 4 figures, 1 table; (2) SUPP info: 41 pages, 7
figures, 8 tables
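The validation metric reported above, AUC, has a direct empirical definition: the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. A minimal sketch with hypothetical risk scores (invented for illustration, not the study's predictions):

```python
def auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of positive/negative pairs in which the
    positive case outranks the negative one, counting ties as half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model risk scores for patients who did / did not develop
# distant metastases:
print(auc([0.9, 0.8, 0.7], [0.6, 0.8, 0.2]))  # ≈ 0.833
```

An AUC of 0.86, as reported for distant metastases, means the radiomic model ranks a metastasizing patient above a non-metastasizing one about 86% of the time on the independent validation cohorts.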