7,596 research outputs found
Relaxation Penalties and Priors for Plausible Modeling of Nonidentified Bias Sources
In designed experiments and surveys, known laws or design feat ures provide
checks on the most relevant aspects of a model and identify the target
parameters. In contrast, in most observational studies in the health and social
sciences, the primary study data do not identify and may not even bound target
parameters. Discrepancies between target and analogous identified parameters
(biases) are then of paramount concern, which forces a major shift in modeling
strategies. Conventional approaches are based on conditional testing of
equality constraints, which correspond to implausible point-mass priors. When
these constraints are not identified by available data, however, no such
testing is possible. In response, implausible constraints can be relaxed into
penalty functions derived from plausible prior distributions. The resulting
models can be fit within familiar full or partial likelihood frameworks. The
absence of identification renders all analyses part of a sensitivity analysis.
In this view, results from single models are merely examples of what might be
plausibly inferred. Nonetheless, just one plausible inference may suffice to
demonstrate inherent limitations of the data. Points are illustrated with
misclassified data from a study of sudden infant death syndrome. Extensions to
confounding, selection bias and more complex data structures are outlined.Comment: Published in at http://dx.doi.org/10.1214/09-STS291 the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Computational intelligence contributions to readmisision risk prediction in Healthcare systems
136 p.The Thesis tackles the problem of readmission risk prediction in healthcare systems from a machine learning and computational intelligence point of view. Readmission has been recognized as an indicator of healthcare quality with primary economic importance. We examine two specific instances of the problem, the emergency department (ED) admission and heart failure (HF) patient care using anonymized datasets from three institutions to carry real-life computational experiments validating the proposed approaches. The main difficulties posed by this kind of datasets is their high class imbalance ratio, and the lack of informative value of the recorded variables. This thesis reports the results of innovative class balancing approaches and new classification architectures
Deep Learning in Cardiology
The medical field is creating large amount of data that physicians are unable
to decipher and use efficiently. Moreover, rule-based expert systems are
inefficient in solving complicated medical tasks or for creating insights using
big data. Deep learning has emerged as a more accurate and effective technology
in a wide range of medical problems such as diagnosis, prediction and
intervention. Deep learning is a representation learning method that consists
of layers that transform the data non-linearly, thus, revealing hierarchical
relationships and structures. In this review we survey deep learning
application papers that use structured data, signal and imaging modalities from
cardiology. We discuss the advantages and limitations of applying deep learning
in cardiology that also apply in medicine in general, while proposing certain
directions as the most viable for clinical use.Comment: 27 pages, 2 figures, 10 table
Towards greater accuracy in individual-tree mortality regression
Background mortality is an essential component of any forest growth and yield model. Forecasts of mortality contribute largely to the variability and accuracy of model predictions at the tree, stand and forest level. In the present study, I implement and evaluate state-of-the-art techniques to increase the accuracy of individual tree mortality models, similar to those used in many of the current variants of the Forest Vegetation Simulator, using data from North Idaho and Montana. The first technique addresses methods to correct for bias induced by measurement error typically present in competition variables. The second implements survival regression and evaluates its performance against the traditional logistic regression approach.
I selected the regression calibration (RC) algorithm as a good candidate for addressing the measurement error problem. Two logistic regression models for each species were fitted, one ignoring the measurement error, which is the “naïve” approach, and the other applying RC. The models fitted with RC outperformed the naïve models in terms of discrimination when the competition variable was found to be statistically significant. The effect of RC was more obvious where measurement error variance was large and for more shade-intolerant species. The process of model fitting and variable selection revealed that past emphasis on DBH as a predictor variable for mortality, while producing models with strong metrics of fit, may make models less generalizable.
The evaluation of the error variance estimator developed by Stage and Wykoff (1998), and core to the implementation of RC, in different spatial patterns and diameter distributions, revealed that the Stage and Wykoff estimate notably overestimated the true variance in all simulated stands, but those that are clustered. Results show a systematic bias even when all the assumptions made by the authors are guaranteed. I argue that this is the result of the Poisson-based estimate ignoring the overlapping area of potential plots around a tree. Effects, especially in the application phase, of the variance estimate justify suggested future efforts of improving the accuracy of the variance estimate.
The second technique implemented and evaluated is a survival regression model that accounts for the time dependent nature of variables, such as diameter and competition variables, and the interval-censored nature of data collected from remeasured plots. The performance of the model is compared with the traditional logistic regression model as a tool to predict individual tree mortality. Validation of both approaches shows that the survival regression approach discriminates better between dead and alive trees for all species.
In conclusion, I showed that the proposed techniques do increase the accuracy of individual tree mortality models, and are a promising first step towards the next generation of background mortality models. I have also identified the next steps to undertake in order to advance mortality models further
- …