The Bayesian Decision Tree Technique with a Sweeping Strategy
The uncertainty of classification outcomes is of crucial importance for many
safety critical applications including, for example, medical diagnostics. In
such applications the uncertainty of classification can be reliably estimated
within a Bayesian model averaging technique that allows the use of prior
information. Decision Tree (DT) classification models used within such a
technique give experts additional information by making this classification
scheme observable. The use of the Markov Chain Monte Carlo (MCMC) methodology
of stochastic sampling makes the Bayesian DT technique feasible to perform.
However, in practice, the MCMC technique may become stuck in a particular DT
which is far away from a region with a maximal posterior. Sampling such DTs
causes bias in the posterior estimates, and as a result the evaluation of
classification uncertainty may be incorrect. In a particular case, the negative
effect of such sampling may be reduced by giving additional prior information
on the shape of DTs. In this paper we describe a new approach based on sweeping
the DTs without additional priors on the favorite shape of DTs. The
performances of Bayesian DT techniques with the standard and sweeping
strategies are compared on synthetic data as well as on real datasets.
Quantitatively evaluating the uncertainty in terms of entropy of class
posterior probabilities, we found that the sweeping strategy is superior to the
standard strategy.
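As an illustration of the uncertainty measure mentioned above, the following minimal sketch averages class probabilities over a set of sampled trees and computes the entropy of the resulting class posterior. The equal weighting of samples, the function names and the example probabilities are assumptions made for illustration; the abstract does not specify these details.

```python
import math

def averaged_class_posterior(tree_probs):
    """Average class probabilities over sampled trees (equal weights assumed)."""
    n_trees = len(tree_probs)
    n_classes = len(tree_probs[0])
    return [sum(p[c] for p in tree_probs) / n_trees for c in range(n_classes)]

def entropy(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical class probabilities from three sampled trees for one input.
samples = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]
posterior = averaged_class_posterior(samples)
uncertainty = entropy(posterior)
```

A lower entropy indicates a more confident averaged prediction; an entropy of 1 bit for two classes corresponds to a completely uncertain outcome.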
Integrating the promotion of physical activity within a smoking cessation programme: Findings from collaborative action research in UK Stop Smoking Services
Background: Within the framework of collaborative action research, the aim was to explore the feasibility of
developing and embedding physical activity promotion as a smoking cessation aid within UK 6/7-week National
Health Service (NHS) Stop Smoking Services.
Methods: In Phase 1 three initial cycles of collaborative action research (observation, reflection, planning,
implementation and re-evaluation), in an urban Stop Smoking Service, led to the development of an integrated
intervention in which physical activity was promoted as a cessation aid, with the support of a theoretically based
self-help guide, and self monitoring using pedometers. In Phase 2 advisors underwent training and offered the
intervention, and changes in physical activity promoting behaviour and beliefs were monitored. Also, changes in
clients’ stage of readiness to use physical activity as a cessation aid, physical activity beliefs and behaviour and
physical activity levels were assessed, among those who attended the clinic at 4-week post-quit. Qualitative data
were collected, in the form of clinic observation, informal interviews with advisors and field notes.
Results: The integrated intervention emerged through cycles of collaboration as something quite different to
previous practice. Based on field notes, there were many positive elements associated with the integrated
intervention in Phase 2. Self-reported advisors’ physical activity promoting behaviour increased as a result of
training and adapting to the intervention. There was a significant advancement in clients’ stage of readiness to use physical activity as a smoking cessation aid.
Conclusions: Collaboration with advisors was key in ensuring that a feasible intervention was developed as an aid to smoking cessation. There is scope to further develop tailored support for increasing physical activity and
smoking cessation, mediated through changes in perceptions about the benefits of, and confidence to do, physical activity.
Optimising decision trees using multi-objective particle swarm optimisation
Copyright © 2009 Springer-Verlag Berlin Heidelberg. The final publication is available at link.springer.com
Book title: Swarm Intelligence for Multi-objective Problems in Data Mining
Summary.
Although conceptually quite simple, decision trees are still among the most popular classifiers applied to real-world problems. Their popularity is due to a number of factors – core among these are their ease of comprehension, robust performance and fast data processing capabilities. Additionally, feature selection is implicit within the decision tree structure.
This chapter introduces the basic ideas behind decision trees, focusing on decision trees which only consider a rule relating to a single feature at a node (therefore making recursive axis-parallel slices in feature space to form their classification boundaries). The use of particle swarm optimization (PSO) to train near optimal decision trees is discussed, and PSO is applied both in a single objective formulation (minimizing misclassification cost), and multi-objective formulation (trading off misclassification rates across classes).
Empirical results are presented on popular classification data sets from the well-known UCI machine learning repository, and PSO is demonstrated as being fully capable of acting as an optimizer for trees on these problems. Results additionally support the argument that multi-objectification of a problem can improve uni-objective search in classification problems.
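To make the single-objective formulation concrete, here is a minimal sketch of a global-best PSO that tunes the threshold of a one-feature decision stump (a single axis-parallel split) to minimise misclassification rate. It is not the chapter's tree encoding, which is not described in this summary; the toy data, the function names and the PSO coefficients are all assumptions chosen for illustration.

```python
import random

def stump_error(threshold, xs, ys):
    """Misclassification rate of the rule: predict 1 if x > threshold, else 0."""
    wrong = sum((x > threshold) != bool(y) for x, y in zip(xs, ys))
    return wrong / len(xs)

def pso_minimise(cost, lo, hi, n_particles=30, n_iters=50, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Global-best PSO over one bounded real parameter (assumed coefficients)."""
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                      # personal best positions
    best_cost = [cost(p) for p in pos]     # personal best costs
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g], best_cost[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            # Velocity blends momentum, pull to personal best, pull to global best.
            vel[i] = (inertia * vel[i]
                      + c_personal * rng.random() * (best_pos[i] - pos[i])
                      + c_global * rng.random() * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # clamp to the bounds
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i], c
                if c < g_cost:
                    g_pos, g_cost = pos[i], c
    return g_pos, g_cost

# Hypothetical one-feature data set: class 0 below the gap, class 1 above it.
xs = [0, 1, 2, 3, 4, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
threshold, error = pso_minimise(lambda t: stump_error(t, xs, ys), 0.0, 10.0)
```

A multi-objective variant would replace the scalar cost with a vector of per-class error rates and keep an archive of non-dominated solutions instead of a single global best.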
Coexistence and critical behaviour in a lattice model of competing species
In the present paper we study a lattice model of two species competing for
the same resources. Monte Carlo simulations for d=1, 2, and 3 show that when
resources are easily available both species coexist. However, when the supply
of resources is on an intermediate level, the species with slower metabolism
becomes extinct. On the other hand, when resources are scarce it is the species
with faster metabolism that becomes extinct. The range of coexistence of the
two species increases with dimension. We suggest that our model might describe
some aspects of the competition between normal and tumor cells. With such an
interpretation, examples of tumor remission, recurrence and of different
morphologies are presented. In the d=1 and d=2 models, we analyse the nature of
phase transitions: they are either discontinuous or belong to the
directed-percolation universality class, and in some cases they have an active
subcritical phase. In the d=2 case, one of the transitions seems to be
characterized by critical exponents different from directed-percolation ones,
but this transition could also be weakly discontinuous. In the d=3 version,
Monte Carlo simulations are in good agreement with the solution of the
mean-field approximation. This approximation predicts that oscillatory
behaviour occurs in the present model, but only for d>2. For d>=2, a steady
state depends on the initial configuration in some cases.
Comment: 11 pages, 14 figures
Setting a precautionary catch limit for Antarctic krill
A revised precautionary catch limit for Antarctic krill (Euphausia superba) in the Scotia Sea of 4 million tons was recently adopted by the Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR). The limit was based on a total biomass of 44.3 million tons, as estimated from an acoustic and net survey of krill across the Scotia Sea sector of the Southern Ocean, and a harvest rate of 9.1%, as determined from an analysis of the risks of exceeding defined conservation criteria. We caution, however, that before the fishery can expand to the 4-million-ton level it will be necessary to establish mechanisms to avoid concentration of fishing effort, particularly in proximity to colonies of land-breeding krill predators, and to consider the effects of krill immigrating into the region from multiple sources.
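The arithmetic behind the limit can be checked directly: the biomass estimate multiplied by the harvest rate gives roughly 4 million tons. The variable names below are illustrative only.

```python
biomass_mt = 44.3      # total biomass in million tons, as stated above
harvest_rate = 0.091   # harvest rate of 9.1%, as stated above

catch_limit_mt = biomass_mt * harvest_rate
print(f"{catch_limit_mt:.2f} million tons")  # prints "4.03 million tons"
```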
A new method to measure Bowen ratios using high-resolution vertical dry and wet bulb temperature profiles
The Bowen ratio surface energy balance method is a relatively simple method
to determine the latent heat flux and the actual land surface evaporation.
The Bowen ratio method is based on the measurement of air temperature and
vapour pressure gradients. If these measurements are performed at only two
heights, correctness of data becomes critical. In this paper we present the
concept of a new measurement method to estimate the Bowen ratio based on
vertical dry and wet bulb temperature profiles with high spatial resolution.
A short field experiment with distributed temperature sensing (DTS) in a
fibre optic cable with 13 measurement points in the vertical was undertaken.
A dry and a wetted section of a fibre optic cable were suspended on a 6 m
high tower installed over a sugar beet trial plot near Pietermaritzburg
(South Africa). Using the DTS cable as a psychrometer, a near continuous
observation of vapour pressure and air temperature at 0.20 m intervals was
established. These data allowed the computation of the Bowen ratio with a
high spatial and temporal precision. The daytime latent and sensible heat
fluxes were estimated by combining the Bowen ratio values from the DTS-based
system with independent measurements of net radiation and soil heat flux. The
sensible heat flux, which is the relevant term to evaluate, derived from the
DTS-based Bowen ratio (BR-DTS) was compared with that derived from co-located
eddy covariance (R² = 0.91), surface layer scintillometer
(R² = 0.81) and surface renewal (R² = 0.86) systems. By using
multiple measurement points instead of two, more confidence in the derived Bowen
ratio values is obtained.
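The calculation chain described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' processing code: vapour pressure is obtained from dry and wet bulb temperatures with the standard psychrometer equation and a Tetens-type saturation curve, the Bowen ratio is taken as the psychrometric constant times the least-squares slope of air temperature against vapour pressure over the profile (the abstract does not say how the profile is reduced to a single ratio), and the psychrometric constant of 0.066 kPa/K is a placeholder that in practice depends on air pressure.

```python
import math

GAMMA = 0.066  # psychrometric constant in kPa/K; assumed placeholder value

def saturation_vapour_pressure(temp_c):
    """Tetens-type approximation of saturation vapour pressure in kPa."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def vapour_pressure(t_dry, t_wet, gamma=GAMMA):
    """Psychrometer equation: e = e_s(T_wet) - gamma * (T_dry - T_wet)."""
    return saturation_vapour_pressure(t_wet) - gamma * (t_dry - t_wet)

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def bowen_ratio(t_dry_profile, t_wet_profile, gamma=GAMMA):
    """Bowen ratio as gamma * dT/de, with dT/de fitted over all heights."""
    e_profile = [vapour_pressure(td, tw, gamma)
                 for td, tw in zip(t_dry_profile, t_wet_profile)]
    return gamma * slope(e_profile, t_dry_profile)

def partition_fluxes(net_radiation, soil_heat_flux, beta):
    """Split available energy (W/m^2) into latent and sensible heat flux."""
    available = net_radiation - soil_heat_flux
    latent = available / (1.0 + beta)
    sensible = available - latent
    return latent, sensible

# Hypothetical profile (deg C), lowest measurement height first.
t_dry = [25.0, 24.5, 24.0]
t_wet = [20.0, 19.5, 19.0]
beta = bowen_ratio(t_dry, t_wet)
latent, sensible = partition_fluxes(500.0, 100.0, beta)
```

Fitting a slope through many heights, rather than differencing two, reduces the influence of an error at any single measurement point, which is the motivation given in the abstract.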
Visualising high-dimensional Pareto relationships in two-dimensional scatterplots
Copyright © 2013 Springer-Verlag Berlin Heidelberg. The final publication is available via the DOI in this record.
Book title: Evolutionary Multi-Criterion Optimization
7th International Conference on Evolutionary Multi-Criterion Optimization (EMO 2013), Sheffield, UK, March 19-22, 2013
The codebase for this paper is available at https://github.com/fieldsend/emo_2013_viz
In this paper two novel methods for projecting high dimensional data into two dimensions for visualisation are introduced, which aim to limit the loss of dominance and Pareto shell relationships between solutions to multi-objective optimisation problems. It has already been shown that, in general, it is impossible to completely preserve the dominance relationship when mapping from a higher to a lower dimension – however, approaches that attempt this projection with minimal loss of dominance information are useful for a number of reasons. (1) They may represent the data to the user of a multi-objective optimisation problem in an intuitive fashion, (2) they may help provide insights into the relationships between solutions which are not immediately apparent through other visualisation methods, and (3) they may offer a useful visual medium for interactive optimisation. We are concerned here with examining (1) and (2), and developing relatively rapid methods to achieve visualisations, rather than generating an entirely new search/optimisation problem which has to be solved to achieve the visualisation, which may prove infeasible in an interactive environment for real time use. Results are presented on randomly generated data, and the search population of an optimiser as it progresses. Structural insights into the evolution of a set-based optimiser that can be derived from this visualisation are also discussed.
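For readers unfamiliar with the two relationships the projections try to preserve, the following minimal sketch defines dominance and assigns points to Pareto shells by repeatedly peeling off the non-dominated set. It assumes all objectives are minimised and uses hypothetical example points; it does not reproduce the projection methods of the paper or the code in the linked repository.

```python
def dominates(a, b):
    """True if a dominates b: no worse in every objective, better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_shells(points):
    """Return lists of point indices, one list per successive Pareto shell."""
    remaining = list(range(len(points)))
    shells = []
    while remaining:
        # A point belongs to the current shell if nothing remaining dominates it.
        shell = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        shells.append(shell)
        remaining = [i for i in remaining if i not in shell]
    return shells

# Hypothetical three-objective solutions.
points = [(1, 4, 2), (2, 2, 2), (3, 3, 3), (4, 4, 4), (1, 5, 1)]
shells = pareto_shells(points)
print(shells)  # [[0, 1, 4], [2], [3]]
```

A two-dimensional projection loses dominance information whenever a pair that is related by `dominates` in the original space is no longer related, or becomes related in the wrong direction, after mapping.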
Closed-Form Bayesian Inferences for the Logit Model via Polynomial Expansions
Articles in Marketing and choice literatures have demonstrated the need for
incorporating person-level heterogeneity into behavioral models (e.g., logit
models for multiple binary outcomes as studied here). However, the logit
likelihood extended with a population distribution of heterogeneity doesn't
yield closed-form inferences, and therefore numerical integration techniques
are relied upon (e.g., MCMC methods).
We present here an alternative: closed-form Bayesian inferences for the logit
model, which we obtain by approximating the logit likelihood via a polynomial
expansion, and then positing a distribution of heterogeneity from a flexible
family that is now conjugate and integrable. For problems where the response
coefficients are independent, choosing the Gamma distribution leads to rapidly
convergent closed-form expansions; if there are correlations among the
coefficients one can still obtain rapidly convergent closed-form expansions by
positing a distribution of heterogeneity from a Multivariate Gamma
distribution. The solution then comes from the moment generating function of
the Multivariate Gamma distribution or in general from the multivariate
heterogeneity distribution assumed.
Closed-form Bayesian inferences, derivatives (useful for elasticity
calculations), population distribution parameter estimates (useful for
summarization) and starting values (useful for complicated algorithms) are
hence directly available. Two simulation studies demonstrate the efficacy of
our approach.
Comment: 30 pages, 2 figures, corrected some typos. Appears in Quantitative
Marketing and Economics vol 4 (2006), no. 2, 173--20