
    Classification using distance nearest neighbours

    This paper proposes a new probabilistic classification algorithm using a Markov random field approach. The joint distribution of class labels is explicitly modelled using the distances between feature vectors. Intuitively, a class label should depend more on class labels that are closer in the feature space than on those that are further away. Our approach builds on previous work by Holmes and Adams (2002, 2003) and Cucala et al. (2008), and shares many of the advantages of these approaches in providing a probabilistic basis for statistical inference. In comparison to previous work, we present a more efficient computational algorithm to overcome the intractability of the Markov random field model. The results of our algorithm are encouraging in comparison to the k-nearest neighbour algorithm.

    Comment: 12 pages, 2 figures. To appear in Statistics and Computing.
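    The intuition that nearby class labels should matter more than distant ones can be sketched with a simple distance-weighted nearest-neighbour classifier. This is a minimal stand-in, not the paper's Markov random field model; the function and toy data below are hypothetical:

    ```python
    import numpy as np

    def knn_class_probs(X_train, y_train, x, k=3):
        """Estimate class probabilities for point x by weighting the k nearest
        training labels inversely by their distance to x."""
        d = np.linalg.norm(X_train - x, axis=1)
        idx = np.argsort(d)[:k]
        w = 1.0 / (d[idx] + 1e-12)          # closer neighbours get larger weight
        probs = {}
        for label, weight in zip(y_train[idx], w):
            probs[label] = probs.get(label, 0.0) + weight
        total = sum(probs.values())
        return {label: p / total for label, p in probs.items()}

    # two well-separated clusters, one per class
    X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
    y = np.array([0, 0, 1, 1])
    p = knn_class_probs(X, y, np.array([0.05, 0.0]), k=3)
    ```

    Unlike plain k-nearest neighbours, the output is a probability over classes rather than a hard vote, which is the probabilistic flavour the abstract emphasises.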

    Fast calibrated additive quantile regression

    We propose a novel framework for fitting additive quantile regression models, which provides well-calibrated inference about the conditional quantiles and fast automatic estimation of the smoothing parameters, for model structures as diverse as those usable with distributional GAMs, while maintaining equivalent numerical efficiency and stability. The proposed methods are at once statistically rigorous and computationally efficient, because they are based on applying the general belief-updating framework of Bissiri et al. (2016) to loss-based inference, while computing by adapting the stable fitting methods of Wood et al. (2016). We show how the pinball loss is statistically suboptimal relative to a novel smooth generalisation, which also gives access to fast estimation methods. Further, we provide a novel calibration method for efficiently selecting the 'learning rate' balancing the loss with the smoothing priors during inference, thereby obtaining reliable quantile uncertainty estimates. Our work was motivated by a probabilistic electricity load forecasting application, used here to demonstrate the proposed approach. The methods described here are implemented by the qgam R package, available on the Comprehensive R Archive Network (CRAN).
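    The pinball (check) loss at quantile level tau, and one way to smooth its kink at zero, can be sketched as follows. The log-sum-exp smoothing below is only illustrative; the paper's actual smooth generalisation differs in detail:

    ```python
    import numpy as np

    def pinball(r, tau):
        """Standard pinball (check) loss at quantile level tau; residual r = y - q."""
        return np.where(r >= 0, tau * r, (tau - 1) * r)

    def smooth_pinball(r, tau, lam=0.1):
        """A smooth log-sum-exp approximation: as lam -> 0 it converges to the
        pinball loss, but it is differentiable everywhere (illustrative only)."""
        return tau * r + lam * np.log1p(np.exp(-r / lam))

    r = np.linspace(-2, 2, 401)
    exact = pinball(r, 0.9)
    smooth = smooth_pinball(r, 0.9, lam=0.05)
    max_gap = np.max(np.abs(exact - smooth))   # largest gap occurs at r = 0
    ```

    The smooth version upper-bounds the exact loss, with maximum gap lam * log(2) at the kink, so the smoothing parameter lam directly controls the approximation error while restoring the differentiability that fast gradient-based fitting methods need.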

    Geo-additive models of Childhood Undernutrition in three Sub-Saharan African Countries

    We investigate the geographical and socioeconomic determinants of childhood undernutrition in Malawi, Tanzania and Zambia, three neighboring countries in Southern Africa, using the 1992 Demographic and Health Surveys. We estimate models of undernutrition jointly for the three countries to explore regional patterns of undernutrition that transcend boundaries, while allowing for country-specific interactions. We use semiparametric models to flexibly model the effects of selected socioeconomic covariates and spatial effects. Our spatial analysis is based on a flexible geo-additive model using the district as the geographic unit of analysis, which allows us to separate smooth structured spatial effects from random effects. Inference is fully Bayesian and uses recent Markov chain Monte Carlo techniques. While the socioeconomic determinants generally confirm what is known in the literature, we find distinct residual spatial patterns that are not explained by the socioeconomic determinants. In particular, there appears to be a belt running from Southern Tanzania to Northeastern Zambia which exhibits much worse undernutrition, even after controlling for socioeconomic effects. These effects do transcend borders between the countries, but to a varying degree. These findings have important implications for targeting policy as well as the search for left-out variables that might account for these residual spatial patterns.

    A comparison of block and semi-parametric bootstrap methods for variance estimation in spatial statistics

    Efron (1979) introduced the bootstrap method for independent data, but it cannot be applied directly to spatial data because of their dependence. For spatial data that are correlated according to their locations in the underlying space, the moving block bootstrap method is usually used to estimate the precision measures of the estimators. The precision of the moving block bootstrap estimators depends on the block size, which is difficult to select, and the moving block bootstrap also tends to underestimate the variance. In this paper, we first use the semi-parametric bootstrap to estimate the precision measures of estimators in spatial data analysis; the semi-parametric bootstrap method exploits an estimate of the spatial correlation structure. We then compare the semi-parametric bootstrap with the moving block bootstrap for variance estimation in a simulation study. Finally, we use the semi-parametric bootstrap to analyze the coal-ash data.
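    The moving block bootstrap discussed above can be sketched for a one-dimensional series: overlapping blocks are resampled with replacement to preserve short-range dependence, and the bootstrap replicates give a variance estimate for the sample mean. This is a hypothetical helper for illustration; real spatial applications resample two-dimensional blocks:

    ```python
    import numpy as np

    def moving_block_bootstrap(x, block_len, n_boot, rng=None):
        """Estimate the variance of the sample mean of a dependent 1-D series
        by resampling overlapping blocks of length block_len."""
        rng = np.random.default_rng(rng)
        n = len(x)
        starts = np.arange(n - block_len + 1)        # all overlapping block starts
        n_blocks = int(np.ceil(n / block_len))       # blocks needed to refill the series
        means = np.empty(n_boot)
        for b in range(n_boot):
            chosen = rng.choice(starts, size=n_blocks, replace=True)
            sample = np.concatenate([x[s:s + block_len] for s in chosen])[:n]
            means[b] = sample.mean()
        return means.var(ddof=1)

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    v = moving_block_bootstrap(x, block_len=10, n_boot=500, rng=1)
    ```

    The abstract's point is visible in this design: the result depends on `block_len`, and because resampled blocks are glued together independently, long-range dependence across block boundaries is lost, which is the source of the variance underestimation.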

    Twelve (not so) angry men: jurors work better in small groups. Lorraine Hope and Bridget Waller propose a simple modification to jury deliberations

    Twelve-person juries are often regarded as one of the cornerstones of democracy. In the UK, the right to a trial by jury is considered an important feature of the criminal justice system. Indeed, it has been rated as more important than a number of other rights, including the right to protest against the government, the right not to be detained for an extended period without charge and the right to free speech in public (Roberts and Hough, 2009). The public also trusts juries comprising randomly selected ordinary people and relies on the contribution of 12 individuals to eliminate bias and prejudice from the decision-making process.

    Classifying shape of internal pores within AlSi10Mg alloy manufactured by laser powder bed fusion using 3D X-ray micro computed tomography : influence of processing parameters and heat treatment

    The authors gratefully acknowledge the support provided by the EPSRC (grant EP/R021694/1). The authors also wish to thank Rosie Bird at the University of Aberdeen for assisting with Avizo.

    Fast stable direct fitting and smoothness selection for Generalized Additive Models

    Existing computationally efficient methods for penalized likelihood GAM fitting employ iterative smoothness selection on working linear models (or working mixed models). Such schemes fail to converge for a non-negligible proportion of models, with failure being particularly frequent in the presence of concurvity. If smoothness selection is performed by optimizing 'whole model' criteria these problems disappear, but until now attempts to do this have employed finite-difference-based optimization schemes which are computationally inefficient and can suffer from false convergence. This paper develops the first computationally efficient method for direct GAM smoothness selection. It is highly stable, but through careful structuring achieves a computational efficiency that leads, in simulations, to lower mean computation times than the schemes based on working-model smoothness selection. The method also offers a reliable way of fitting generalized additive mixed models.
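    The idea of 'whole model' smoothness selection can be illustrated by minimising a single criterion, here GCV, over smoothing parameters for a toy penalised regression. The monomial basis and grid search below are deliberate simplifications, not the paper's method, which uses rich spline bases and efficient direct optimisation:

    ```python
    import numpy as np

    def fit_penalized(x, y, lam, n_basis=8):
        """Fit a penalised polynomial regression and return its GCV score.
        The penalty is on second differences of the coefficients (illustrative)."""
        B = np.vander(x, n_basis, increasing=True)        # design matrix
        D = np.diff(np.eye(n_basis), n=2, axis=0)          # 2nd-difference penalty
        A = B.T @ B + lam * (D.T @ D)
        beta = np.linalg.solve(A, B.T @ y)
        H = B @ np.linalg.solve(A, B.T)                    # influence (hat) matrix
        edf = np.trace(H)                                  # effective degrees of freedom
        rss = np.sum((y - B @ beta) ** 2)
        gcv = len(y) * rss / (len(y) - edf) ** 2
        return beta, gcv

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 100)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=100)
    lams = 10.0 ** np.arange(-6, 3)
    gcvs = np.array([fit_penalized(x, y, lam)[1] for lam in lams])
    best = lams[int(np.argmin(gcvs))]
    ```

    The whole-model criterion is evaluated for the full fit at each candidate smoothing parameter, rather than being updated on a working linear model; the paper's contribution is doing this directly and efficiently with exact derivatives instead of a grid.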

    Cleaning sky survey databases using Hough Transform and Renewal String approaches

    Large astronomical databases obtained from sky surveys such as the SuperCOSMOS Sky Survey (SSS) invariably suffer from spurious records arising from artefactual effects of the telescope, from satellites and junk objects in orbit around the Earth, and from physical defects on the photographic plate or CCD. Though relatively small in number, these spurious records present a significant problem in many situations, where they can become a large proportion of the records potentially of interest to a given astronomer. Accurate and robust techniques are needed for locating and flagging such spurious objects, and we are undertaking a programme investigating the use of machine learning techniques in this context. In this paper we focus on the four most common causes of unwanted records in the SSS: satellite or aeroplane tracks; scratches, fibres and other linear phenomena introduced to the plate; circular halos around bright stars due to internal reflections within the telescope; and diffraction spikes near bright stars. Appropriate techniques are developed for the detection of each of these. The methods are applied to the SSS data to develop a dataset of spurious object detections, along with confidence measures, which can allow these unwanted data to be removed from consideration. These methods are general and can be adapted to other astronomical survey data.

    Comment: Accepted for MNRAS. 17 pages, latex2e, uses mn2e.bst, mn2e.cls, md706.bbl, shortbold.sty (all included). All figures included here as low-resolution jpegs. A version of this paper including the figures can be downloaded from http://www.anc.ed.ac.uk/~amos/publications.html and more details on this project can be found at http://www.anc.ed.ac.uk/~amos/sattrackres.htm
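    Detecting linear phenomena such as satellite tracks is classically done with a Hough transform: each bright pixel votes for all lines (rho, theta) passing through it, and a strong peak in the accumulator reveals a line. The sketch below is a minimal, hypothetical implementation, not the paper's code:

    ```python
    import numpy as np

    def hough_lines(points, img_size, n_theta=180, n_rho=200):
        """Accumulate votes in (rho, theta) space for a set of (x, y) pixel
        coordinates; the strongest cell indicates a linear feature such as
        a satellite or aeroplane track."""
        thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
        max_rho = np.hypot(*img_size)                 # largest possible |rho|
        acc = np.zeros((n_rho, n_theta), dtype=int)
        for x, y in points:
            rho = x * np.cos(thetas) + y * np.sin(thetas)  # rho for every theta
            bins = ((rho + max_rho) / (2 * max_rho) * (n_rho - 1)).astype(int)
            acc[bins, np.arange(n_theta)] += 1        # one vote per theta column
        peak = np.unravel_index(acc.argmax(), acc.shape)
        return acc, peak

    # 50 bright pixels lying on the line y = x inside a 100 x 100 image
    pts = [(i, i) for i in range(0, 100, 2)]
    acc, (rho_bin, theta_bin) = hough_lines(pts, (100, 100))
    ```

    All 50 collinear points vote into the same accumulator cell (here at theta = 135 degrees, the normal direction of the line y = x), so the peak height equals the number of pixels on the track, which also provides a natural confidence measure for flagging.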