1,228 research outputs found

    An Improved Method for Bandwidth Selection when Estimating ROC Curves

    The receiver operating characteristic (ROC) curve is used to describe the performance of a diagnostic test which classifies observations into two groups. We introduce a new method for selecting bandwidths when computing kernel estimates of ROC curves. Our technique allows for interaction between the distributions of each group of observations and gives substantial improvement in MISE over other proposed methods, especially when the two distributions are very different. Keywords: bandwidth selection; binary classification; kernel estimator; ROC curve.
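For orientation, a kernel estimate of the ROC curve can be sketched as follows. This is a minimal illustration that assumes Gaussian kernels and a generic normal-reference bandwidth rule. That rule is only a placeholder and is not the bandwidth selector proposed in the abstract, and all function names are hypothetical.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal CDF, used as the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_cdf(t, sample, h):
    """Kernel-smoothed estimate of P(X <= t) with bandwidth h."""
    return sum(normal_cdf((t - x) / h) for x in sample) / len(sample)


def placeholder_bandwidth(sample):
    """Normal-reference rule of thumb (an assumption, NOT the paper's method)."""
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-1.0 / 5.0)


def kernel_roc(fpr, negatives, positives, h_neg, h_pos):
    """Smoothed ROC(fpr) = 1 - F_pos(G_neg^{-1}(1 - fpr)).

    G_neg and F_pos are the kernel CDF estimates for the two groups;
    the quantile of G_neg is found by bisection.
    """
    if not 0.0 < fpr < 1.0:
        raise ValueError("fpr must lie strictly between 0 and 1")
    target = 1.0 - fpr
    lo = min(negatives) - 10.0 * h_neg
    hi = max(negatives) + 10.0 * h_neg
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, negatives, h_neg) < target:
            lo = mid
        else:
            hi = mid
    threshold = 0.5 * (lo + hi)
    return 1.0 - kernel_cdf(threshold, positives, h_pos)
```

The two bandwidths are passed separately because each group's distribution is smoothed on its own; a selector that lets the two groups interact, as the abstract describes, would replace `placeholder_bandwidth`.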

    A Bayesian Approach to Graphical Record Linkage and De-duplication

    We propose an unsupervised approach for linking records across arbitrarily many files, while simultaneously detecting duplicate records within files. Our key innovation involves the representation of the pattern of links between records as a bipartite graph, in which records are directly linked to latent true individuals, and only indirectly linked to other records. This flexible representation of the linkage structure naturally allows us to estimate the attributes of the unique observable people in the population, calculate transitive linkage probabilities across records (and represent this visually), and propagate the uncertainty of record linkage into later analyses. Our method makes it particularly easy to integrate record linkage with post-processing procedures such as logistic regression, capture-recapture, etc. Our linkage structure lends itself to an efficient, linear-time, hybrid Markov chain Monte Carlo algorithm, which overcomes many obstacles encountered by previous record linkage approaches, despite the high-dimensional parameter space. We illustrate our method using longitudinal data from the National Long Term Care Survey and with data from the Italian Survey on Household and Wealth, where we assess the accuracy of our method and show it to be better in terms of error rates and empirical scalability than other approaches in the literature. Comment: 39 pages, 8 figures, 8 tables. Longer version of arXiv:1403.0211. In press, Journal of the American Statistical Association: Theory and Methods (2015).
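The bipartite representation can be illustrated with a small sketch: each posterior sample assigns every record to one latent individual, and two records count as linked in a sample when they share that individual. The data layout (one list of latent labels per sample) and the function names are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def colinkage_probability(samples, i, j):
    """Fraction of posterior samples in which records i and j
    point to the same latent individual."""
    hits = sum(1 for labels in samples if labels[i] == labels[j])
    return hits / len(samples)


def linked_sets(labels):
    """Group record indices by latent individual for one sample.

    Records are linked to each other only through the latent individual
    they share, so the grouping is transitive by construction.
    """
    groups = defaultdict(list)
    for record, individual in enumerate(labels):
        groups[individual].append(record)
    return sorted(groups.values())
```

Averaging over samples in this way is one simple route by which linkage uncertainty could be carried into a later analysis.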

    Non Parametric Confidence Intervals for Receiver Operating Characteristic Curves

    We study methods for constructing confidence intervals, and confidence bands, for estimators of receiver operating characteristics. Particular emphasis is placed on the way in which smoothing should be implemented, when estimating either the characteristic itself or its variance. We show that substantial undersmoothing is necessary if coverage properties are not to be impaired. A theoretical analysis of the problem suggests an empirical, plug-in rule for bandwidth choice, optimising the coverage accuracy of interval estimators. The performance of this approach is explored. Our preferred technique is based on asymptotic approximation, rather than a more sophisticated approach using the bootstrap, since the latter requires a multiplicity of smoothing parameters all of which must be chosen in nonstandard ways. It is shown that the asymptotic method can give very good performance. Keywords: bandwidth selection, binary classification, kernel estimator, receiver operating characteristic curve.

    Development and Validation of the Robust - Pandemic Coping Scale (R-PCS)

    The psychological consequences of epidemics/pandemics, such as the COVID-19 pandemic, include an increase in psychopathological symptoms, such as depression, anxiety, and stress, and negative emotions, such as fear. However, relatively little attention has been paid to how people cope with the pandemic. Coping is a multi-component process, helping to diminish the traumatic impact of stressful events in a variety of ways. We studied how university students coped with the first wave of the COVID-19 pandemic, by developing the Robust - Pandemic Coping Scale (R-PCS), a new scale for measuring coping strategies related to epidemics/pandemics. The scale is based on a classification of coping strategies related to the needs of competence, relatedness, and autonomy. To create a robust scale, such that the item values would be independent of the sample used for developing it, we employed Rasch modeling. We used a sample of 2,987 Italian university students who participated in an online survey including the R-PCS and the Power to Live with Disasters Questionnaire (PLDQ), during March 2020. First, we applied a dual approach combining exploratory and confirmatory factor analyses, which supported the fit of a 4-factor model (i.e., Despair, Adjustment, Proactivity, and Aversion) for the R-PCS, invariant across gender and age of respondents (23 years or younger versus older than 23 years). We then transformed the raw scores of the R-PCS into interval logit scale scores by applying the Rasch model. Second, our findings supported the discriminant validity and the criterion validity of the R-PCS, examining the correlations with the PLDQ. They also confirmed its predictive validity: the R-PCS scores were related to enjoyment and anger measured 2 months later, indicating that Adjustment and Proactivity were adaptive while Despair and Aversion were maladaptive. Third, our study revealed gender and age differences: scores were higher for females on Despair, Adjustment, and Proactivity; higher for males on Aversion; and higher for students older than 23 years on Proactivity. The study suffers from limitations related to social desirability, gender imbalance, and self-selection effects in the recruitment.
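The conversion of raw scores to a logit scale can be illustrated with the simplest Rasch model. This sketch assumes dichotomous items with known difficulties, which is a simplification: a rating scale such as the R-PCS would normally call for a polytomous Rasch model, and the abstract does not state which variant or software was used. The function names and item difficulties are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """P(endorse item) under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def logit_score(raw_score, difficulties):
    """Person measure (in logits) whose expected raw score equals raw_score.

    The expected score sum_i P_i(theta) is increasing in theta, so the
    maximum-likelihood measure is found by bisection. Extreme raw scores
    (0 or the maximum) have no finite estimate and are rejected.
    """
    if not 0 < raw_score < len(difficulties):
        raise ValueError("raw_score must lie strictly between 0 and the item count")
    lo, hi = -20.0, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        expected = sum(rasch_probability(mid, b) for b in difficulties)
        if expected < raw_score:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Equal steps in raw score generally map to unequal steps in logits, which is why the transformed scores are treated as interval-level.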

    Internal tides in a dendritic submarine canyon

    Submarine canyons are a common geomorphological feature along continental slopes worldwide and are often found to be ‘hotspots’ of internal tide activity. However, the majority of well-studied submarine canyons are simple linear incisions or have meandering morphology; internal tide energetics in branching (dendritic) canyons have not previously been investigated. Here we present a high-resolution (500-m) numerical modelling study of the internal tide within Whittard Canyon, a large, dendritic submarine canyon system that incises the Celtic Sea continental slope. A modified version of the Princeton Ocean Model is used to simulate the M2 (semidiurnal) internal tide in the Whittard Canyon region, verified against a hydrographic dataset collected by an autonomous ocean glider. Much of the internal tide energy entering Whittard Canyon originates to the southeast, along the Celtic Sea shelf break. Internal tide generation also occurs within the canyon itself, but is in part compensated by areas of negative energy conversion. Depth-integrated internal tide energy fluxes exceed 8 kW m⁻¹ in the eastern limb of the canyon. The internal tide is topographically steered through the major limbs and along-canyon energy flux is bottom intensified, suggesting topographic focusing. The down-canyon extent of bottom intensification closely corresponds to the point at which the along-canyon slope becomes near-critical to the semidiurnal internal tide. Energetically, the multiple limbs of Whittard Canyon behave differently; some are net sources of internal tide energy whilst others are net sinks. Internal tide energy dissipation also varies between the canyon limbs; bulk dissipation rates are 2.1-7.7 × 10⁻⁸ W kg⁻¹. In addition, the effect of bathymetric resolution on internal tide generation and propagation is investigated by progressively smoothing the model domain. Decreasing the bathymetric resolution reduces internal tide generation and energy dissipation in both Whittard Canyon and the model domain as a whole; however, internal tide energy flux into the canyon is not consistently changed. At least 1.5-km resolution bathymetry is required to adequately resolve the semidiurnal internal tide field in this region of complex topography.
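The notion of a near-critical slope can be made concrete with the standard linear internal-wave relation: a bottom slope is critical when it equals the wave characteristic slope sqrt((ω² − f²)/(N² − ω²)). The sketch below uses illustrative inputs only; the latitude and buoyancy frequency in the usage lines are assumptions, not values taken from the study, and the function names are hypothetical.

```python
import math

EARTH_ROTATION_RATE = 7.2921e-5               # rad/s
M2_FREQUENCY = 2.0 * math.pi / (12.42 * 3600.0)  # rad/s, 12.42 h period


def coriolis_frequency(latitude_deg):
    """Inertial frequency f = 2 * Omega * sin(latitude)."""
    return 2.0 * EARTH_ROTATION_RATE * math.sin(math.radians(latitude_deg))


def wave_slope(omega, f, buoyancy_frequency):
    """Slope of internal-wave characteristics for frequency omega."""
    if not abs(f) < omega < buoyancy_frequency:
        raise ValueError("free internal waves require |f| < omega < N")
    return math.sqrt((omega**2 - f**2) / (buoyancy_frequency**2 - omega**2))


def criticality(topographic_slope, omega, f, buoyancy_frequency):
    """Ratio of bottom slope to wave slope; values near 1 are near-critical."""
    return topographic_slope / wave_slope(omega, f, buoyancy_frequency)


# Illustrative inputs (assumed): 48 degrees N and N = 2e-3 rad/s near the bed.
f_example = coriolis_frequency(48.0)
alpha_example = criticality(0.03, M2_FREQUENCY, f_example, 2.0e-3)
```

Slopes with a ratio below 1 are subcritical and tend to transmit energy up-canyon, while ratios above 1 are supercritical and tend to reflect it back towards deeper water.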

    Diaries or questionnaires for collecting self-reported healthcare utilisation and patient cost data? CHERE Project Report No 20

    The literature comparing diaries and questionnaires was reviewed in order to identify the most appropriate method of collecting patient self-reported data, on health service utilisation and out-of-pocket costs, for a longitudinal study. Nine published studies met the review inclusion criteria; four compared the diary method with a self-completed questionnaire and five with an interviewer-administered questionnaire. None of the eligible studies measured patient costs, and only two measured some aspects of health service utilisation. Most of the studies reported higher response rates for questionnaires than for diaries, and there was some evidence of selection bias. There was a tendency to report more symptoms, symptom intensity or health care utilisation by questionnaires compared to diaries, and compared to physician reports (included in only two studies). The review provides some information about the two approaches for collecting self-reported data, but does not provide sufficient evidence to favour either approach. Keywords: diaries, health care utilisation.