Comparison of the Bayesian and Randomised Decision Tree Ensembles within an Uncertainty Envelope Technique
Copyright © 2006 Springer. The final publication is available at link.springer.com. Multiple Classifier Systems (MCSs) allow evaluation of the uncertainty of classification outcomes that is of crucial importance for safety critical applications. The uncertainty of classification is determined by a trade-off between the amount of data available for training, the classifier diversity and the required performance. The interpretability of MCSs can also give useful information for experts responsible for making reliable classifications. For this reason Decision Trees (DTs) seem to be attractive classification models for experts. The required diversity of MCSs exploiting such classification models can be achieved by using two techniques, the Bayesian model averaging and the randomised DT ensemble. Both techniques have revealed promising results when applied to real-world problems. In this paper we experimentally compare the classification uncertainty of the Bayesian model averaging with a restarting strategy and the randomised DT ensemble on a synthetic dataset and some domain problems commonly used in the machine learning community. To make the Bayesian DT averaging feasible, we use a Markov Chain Monte Carlo technique. The classification uncertainty is evaluated within an Uncertainty Envelope technique dealing with the class posterior distribution and a given confidence probability. Exploring a full posterior distribution, this technique produces realistic estimates which can be easily interpreted in statistical terms. In our experiments we found out that the Bayesian DTs are superior to the randomised DT ensembles within the Uncertainty Envelope technique.
Experimental Comparison of Classification Uncertainty for Randomised and Bayesian Decision Tree Ensembles
Copyright © 2004 Springer-Verlag Berlin Heidelberg. The final publication is available at link.springer.com. Book title: Intelligent Data Engineering and Automated Learning – IDEAL 2004. 5th International Conference on Intelligent Data Engineering and Automated Learning – IDEAL 2004, Exeter, UK, August 25-27, 2004. In this paper we experimentally compare the classification uncertainty of the randomised Decision Tree (DT) ensemble technique and the Bayesian DT technique with a restarting strategy on a synthetic dataset as well as on some datasets commonly used in the machine learning community. For quantitative evaluation of classification uncertainty, we use an Uncertainty Envelope dealing with the class posterior distribution and a given confidence probability. Counting the classifier outcomes, this technique produces feasible evaluations of the classification uncertainty. Using this technique in our experiments, we found that the Bayesian DT technique is superior to the randomised DT ensemble technique.
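As a rough illustration of how counting classifier outcomes can yield an uncertainty estimate, the sketch below labels each ensemble prediction as confident or uncertain by comparing the share of ensemble members that vote for the majority class against a confidence threshold. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `uncertainty_envelope`, the threshold parameter `gamma0`, the three outcome labels and the example votes are hypothetical choices made for illustration.

```python
from collections import Counter


def uncertainty_envelope(votes, true_label, gamma0=0.95):
    """Classify one ensemble prediction by counting classifier outcomes.

    votes      -- class labels predicted by each ensemble member (assumed input)
    true_label -- known class of the test example
    gamma0     -- assumed confidence threshold on ensemble agreement
    """
    if not votes:
        raise ValueError("votes must not be empty")
    # The majority class and the share of members that voted for it.
    majority, count = Counter(votes).most_common(1)[0]
    consistency = count / len(votes)
    if consistency < gamma0:
        return "uncertain"
    if majority == true_label:
        return "confident and correct"
    return "confident but incorrect"


# Hypothetical votes from a 20-member ensemble for three test examples.
examples = [
    (["a"] * 20, "a"),
    (["a"] * 20, "b"),
    (["a"] * 12 + ["b"] * 8, "a"),
]
for votes, label in examples:
    print(uncertainty_envelope(votes, label))
```

Counting the three outcome types over a test set would then give the proportions of confident-correct, confident-incorrect and uncertain classifications for a given threshold.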
Bayesian averaging over Decision Tree models for trauma severity scoring
Health care practitioners analyse possible risks of misleading decisions and need to estimate and quantify uncertainty in predictions. We have examined the “gold standard” of screening a patient's condition for predicting survival probability, based on logistic regression modelling, which is used in trauma care for clinical purposes and quality audit. This methodology is based on theoretical assumptions about data and uncertainties. Models induced within such an approach have exposed a number of problems, including unexplained fluctuation of predicted survival and low accuracy in estimating the uncertainty intervals within which predictions are made. A Bayesian method, which in theory is capable of providing accurate predictions and uncertainty estimates, has been adopted in our study using Decision Tree models. Our approach has been tested on a large set of patients registered in the US National Trauma Data Bank and has outperformed the standard method in terms of prediction accuracy, thereby providing practitioners with accurate estimates of the predictive posterior densities of interest that are required for making risk-aware decisions.
The Analysis and Application of Artificial Neural Networks for Early Warning Systems in Hydrology and the Environment
Final PhD thesis submission. Artificial Neural Networks (ANNs) have been comprehensively researched, both from a computer scientific perspective and with regard to their use for predictive modelling in a wide variety of applications including hydrology and the environment. Yet their adoption for live, real-time systems remains on the whole sporadic and experimental. A plausible hypothesis is that this may be at least in part due to their treatment heretofore as “black boxes” that implicitly contain something that is unknown, or even unknowable. It is understandable that many of those responsible for delivering Early Warning Systems (EWS) might not wish to take the risk of implementing solutions perceived as containing unknown elements, despite the computational advantages that ANNs offer.
This thesis therefore builds on existing efforts to open the box and develop tools and techniques that visualise, analyse and use ANN weights and biases especially from the viewpoint of neural pathways from inputs to outputs of feedforward networks. In so doing, it aims to demonstrate novel approaches to self-improving predictive model construction for both regression and classification problems. This includes Neural Pathway Strength Feature Selection (NPSFS), which uses ensembles of ANNs trained on differing subsets of data and analysis of the learnt weights to infer degrees of relevance of the input features and so build simplified models with reduced input feature sets.
Case studies are carried out for prediction of flooding at multiple nodes in urban drainage networks located in three urban catchments in the UK, which demonstrate rapid, accurate prediction of flooding both for regression and classification. Predictive skill is shown to reduce beyond the time of concentration of each sewer node, when actual rainfall is used as input to the models.
Further case studies model and predict statutory bacteria count exceedances for bathing water quality compliance at 5 beaches in Southwest England. An illustrative case study using a forest fires dataset from the UCI machine learning repository is also included. Results from these model ensembles generally exhibit improved performance, when compared with single ANN models. Also ensembles with reduced input feature sets, using NPSFS, demonstrate as good or improved performance when compared with the full feature set models.
Conclusions are drawn about a new set of tools and techniques, including NPSFS and visualisation techniques for inspection of ANN weights, the adoption of which it is hoped may lead to improved confidence in the use of ANN for live real-time EWS applications. EPSRC; UKWIR; The Environment Agenc
Information technologies for pain management
Millions of people around the world suffer from acute or chronic pain, which raises the
importance of its screening, assessment and treatment. The importance of pain is attested by
the fact that it is considered the fifth vital sign for indicating basic bodily functions, health
and quality of life, together with the four other vital signs: blood pressure, body
temperature, pulse rate and respiratory rate. However, while these four signs represent
objective physical parameters, pain expresses an emotional status that
arises inside the mind of each individual and is therefore highly subjective, which makes
its management and evaluation difficult. For this reason, self-report is considered
the most accurate pain assessment method, in which patients are asked to periodically
rate their pain severity and related symptoms. In recent years, computerised systems
based on mobile and web technologies have been increasingly used to enable patients to
report their pain, which has led to the development of electronic pain diaries (ED). This approach
may give health care professionals (HCP) and patients the ability to interact with the
system anywhere and at any time, which thoroughly changes the coordinates of time and place and
offers invaluable opportunities for healthcare delivery. However, most of these systems
were designed to interact directly with patients without the presence of a healthcare professional
or without evidence of reliability and accuracy. In fact, observation of the existing
systems revealed a lack of integration with mobile devices, limited use of web-based interfaces
and reduced interaction with patients in terms of obtaining and viewing information. In
addition, the reliability and accuracy of computerised systems for pain management are
rarely proved, and their effects on HCP and patient outcomes remain understudied.
This thesis focuses on technology for pain management and proposes a monitoring
system with ubiquitous interfaces specifically oriented to either patients or HCP,
using mobile devices and the Internet, so as to allow decisions based on the knowledge obtained
from the analysis of the collected data. With interoperability and cloud computing
technologies in mind, this system uses web services (WS) to manage data, which are stored in a
Personal Health Record (PHR).
A Randomised Controlled Trial (RCT) was implemented to determine the effectiveness
of the proposed computerised monitoring system. The six-week RCT evidenced the
advantages of ubiquitous access for HCP and patients, who were able to
interact with the system anywhere and at any time, using WS to send and receive data. In
addition, the collected data were stored in a PHR, which offers integrity and security as well
as permanent online accessibility to both patients and HCP. The study evidenced not only
that the majority of participants recommend the system, but also that they recognize its
suitability for pain management without the requirement of advanced skills or experienced users.
Furthermore, the system enabled the definition and management of patient-oriented
treatments with reduced therapist time. The study also revealed that the guidance of HCP at
the beginning of the monitoring is crucial to patients' satisfaction and experience stemming
from the usage of the system, as evidenced by the high correlation between the
recommendation of the application and its suitability to improve pain management and to
provide medical information. There were no significant differences in
improvements in the quality of pain treatment between the intervention group and the control group.
Based on the data collected during the RCT, a clinical decision support system (CDSS) was
developed to offer tailored alarms, reports and clinical guidance. This
CDSS, called Patient Oriented Method of Pain Evaluation System (POMPES), is based on the
combination of several statistical models (one-way ANOVA, Kruskal-Wallis and Tukey-Kramer)
with an imputation model based on linear regression. The decisions suggested by this system were
fully accurate when compared with the medical diagnosis, which
revealed its suitability for managing pain. Lastly, drawing on the capability of aerospace systems
to deal with different complex data sources of varied complexities and
accuracies, an innovative model was proposed. This model is characterized by a qualitative
analysis stemming from the data fusion method, combined with a quantitative model based on
the comparison of the standard deviation together with the values of mathematical
expectations. This model aimed to compare the effects of technological and pen-and-paper
systems when applied to different dimensions of pain, such as pain intensity, anxiety,
catastrophizing, depression, disability and interference. It was observed that pen-and-paper
and technology produced equivalent effects on anxiety, depression, interference and pain
intensity. In contrast, technology showed favourable effects in terms of
catastrophizing and disability. The proposed method proved to be suitable, intelligible, easy
to implement and undemanding in time and resources. Further work is needed to evaluate the
proposed system by following up participants for longer periods of time, including a
complementary RCT encompassing patients with chronic pain symptoms. Finally, additional
studies are needed to determine the economic effects not only on patients but also
on the healthcare system.
Approaches to quantifying and reducing uncertainty in GCMs over Southern Africa
Includes abstract. Includes bibliographical references (p. 127-131). Global Circulation Models (GCMs) are the primary tool for simulating future climate changes. These models by necessity make use of various assumptions and simplifications due to computational constraints, and in so doing introduce biases and systematic error. Along with other sources of uncertainty regarding our understanding of the climate system and given the quasi-chaotic nature of the climate, climate projections differ between models whose climate simulation skill is poorly quantified. A new methodology is presented to assess the regional biases in GCMs and to, in part, compensate for some aspects of these biases. The study will focus on the Southern African region but could be replicated for other regions. Using Self-Organising Maps (SOMs), synoptic archetypal patterns are identified and the distribution and frequency of these patterns assessed. The use of synoptic archetypes to quantify model metrics presents a novel approach with many benefits over standard metrics, such as errors and means per variable. SOMs add a spatial and multi-variable dimension to the analysis as each metric is calculated based on its synoptic circulation pattern and associated to a set of atmospheric variables. Some persistent biases in the models are notable based on comparisons between the NCEP and GCM SOM node mapping, such as an overall cool bias in the models and a shift of the dominant high pressure cells and thus the westerly wave to the south. The weighting techniques provide insight into how much of the model bias is contributed by differences in synoptic frequency and what part is attributable to systematic biases in the models which result in a different mean state for a given synoptic process. The frequency correction enabled a correction of up to 25% of the difference between model and reanalysis data, but in most cases the change was far smaller than this.
The differences in mean conditions remained the largest component of the bias. To correct for this, the weighting was applied to the climate change anomaly (difference between future and control projections) per synoptic process to create a multi-model climate change component that is added to the NCEP baseline. This provides the most accurate depiction of future climate from the data provided. The models generally have different strengths; therefore the weighted multi-model solution allows models to give a greater contribution where they are skilful and less where they do not match the observed dynamics. Comparison of the magnitude of the climate change signal showed that winter states in the weighted multi-model composite had a smaller temperature increase and reduced rainfall compared to the unweighted results. In summer states there is greater warming and increased rainfall, especially over the oceans. This suggests the models are overestimating changes in temperature in winter and underestimating the increases in summer. Synoptic events are the primary driver of climate change impacts. Therefore, errors in synoptic state will have a notable influence on the climate change projections and need to be fully considered in any climate change impact study. The use of the weighting technique helped to identify and reduce uncertainties in the climate change projections over Southern Africa.
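The weighted multi-model step described in this abstract can be sketched roughly as follows. For one synoptic process, each model's climate change anomaly (future minus control) is weighted by a skill score, and the weighted mean anomaly is added to the reanalysis baseline. This is a minimal sketch under assumptions: the function name `weighted_projection`, the skill weights and all numbers are hypothetical, and the abstract does not specify how the thesis actually derives its weights.

```python
def weighted_projection(baseline, future, control, weights):
    """Add a skill-weighted multi-model anomaly to an observed baseline.

    baseline -- reanalysis value for one synoptic state (e.g. temperature)
    future   -- per-model future projections for that state
    control  -- per-model control-run values for that state
    weights  -- assumed non-negative skill weights, one per model
    """
    if not (len(future) == len(control) == len(weights)):
        raise ValueError("future, control and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Climate change anomaly per model: future minus control.
    anomalies = [f - c for f, c in zip(future, control)]
    # Skill-weighted mean anomaly across models.
    composite = sum(w * a for w, a in zip(weights, anomalies)) / total
    return baseline + composite


# Hypothetical values for one synoptic state and three models.
print(weighted_projection(
    baseline=15.0,
    future=[18.0, 19.0, 17.0],
    control=[16.0, 16.0, 16.0],
    weights=[2.0, 1.0, 1.0],
))
```

Repeating this per synoptic state lets a model contribute more where its weight is high and less where it is low, which is the behaviour the abstract attributes to the weighted solution.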