
    The detection of globular clusters in galaxies as a data mining problem

    We present an application of self-adaptive supervised learning classifiers, derived from the Machine Learning paradigm, to the identification of candidate Globular Clusters in deep, wide-field, single-band HST images. Several methods provided by the DAME (Data Mining & Exploration) web application were tested and compared on the NGC1399 HST data described in Paolillo 2011. The best results were obtained using a Multi Layer Perceptron with a Quasi-Newton learning rule, which achieved a classification accuracy of 98.3%, with a completeness of 97.8% and 1.6% contamination. An extensive set of experiments revealed that the use of accurate structural parameters (effective radius, central surface brightness) does improve the final result, but only by 5%. It is also shown that the method is capable of retrieving even extreme sources (for instance, very extended objects) which are missed by more traditional approaches. Comment: Accepted 2011 December 12; received 2011 November 28; in original form 2011 October 1.
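    As a loose illustration of the kind of classifier described above, the sketch below trains a multi-layer perceptron with a quasi-Newton optimiser (scikit-learn's L-BFGS solver) on synthetic stand-in features; the feature set, labels and train/test split are assumptions for illustration, not the DAME workflow or the NGC1399 catalogue.

```python
# Hedged sketch: MLP with a quasi-Newton (L-BFGS) solver on synthetic data.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stand-in photometric/structural features (e.g. magnitude, colour,
# effective radius, central surface brightness) and GC/non-GC labels.
X = rng.normal(size=(2000, 4))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=2000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# solver="lbfgs" is a quasi-Newton optimiser, analogous in spirit to the
# Quasi-Newton learning rule mentioned in the abstract.
clf = MLPClassifier(hidden_layer_sizes=(16,), solver="lbfgs", max_iter=500)
clf.fit(X_tr, y_tr)

print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
# Completeness and contamination would follow from the confusion matrix
# restricted to the globular-cluster class.
```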

    Lecture notes on ridge regression

    The linear regression model cannot be fitted to high-dimensional data, as the high-dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmentation of the loss function with a penalty (i.e. a function of the regression coefficients). The ridge penalty is the sum of squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g. moments, mean squared error, its equivalence to constrained estimation, and its relation to Bayesian regression. Finally, its behaviour and use are illustrated in simulation and on omics data. Subsequently, ridge regression is generalized to allow for a more general penalty. The ridge penalization framework is then translated to logistic regression and its properties are shown to carry over. To contrast ridge penalized estimation, the final chapter introduces its lasso counterpart.
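    A minimal sketch of the ridge estimator the notes revolve around, beta_hat(lambda) = (X'X + lambda*I)^(-1) X'y, on synthetic high-dimensional data with p > n; the choice lambda = 1 is an arbitrary illustrative value.

```python
# Hedged sketch: closed-form ridge estimate on synthetic data with p > n.
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 200                         # high-dimensional: more covariates than samples
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0
y = X @ beta_true + rng.normal(scale=0.5, size=n)

lam = 1.0                              # arbitrary illustrative penalty
# Ridge estimate: the penalty lam*I makes X'X + lam*I invertible,
# restoring the identifiability that ordinary least squares lacks here.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

print("rank of X'X:", np.linalg.matrix_rank(X.T @ X), "(singular, since p > n)")
print("first ridge coefficients:", np.round(beta_ridge[:5], 2))
```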

    Local coherence and deflation of the low quark modes in lattice QCD

    The spontaneous breaking of chiral symmetry in QCD is known to be linked to a non-zero density of eigenvalues of the massless Dirac operator near the origin. Numerical studies of two-flavour QCD now suggest that the low quark modes are locally coherent to a certain extent. As a consequence, the modes can be simultaneously deflated, using local projectors, with a total computational effort proportional to the lattice volume (rather than its square). Deflation has potentially many uses in lattice QCD. The technique is here worked out for the case of quark propagator calculations, where large speed-up factors and a flat scaling behaviour with respect to the quark mass are achieved. Comment: Plain TeX, 23 pages, 4 figures included; minor text modifications; version published in JHEP.
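    The sketch below illustrates the basic idea of deflating low modes when solving a Hermitian positive system A x = b: the contribution of the lowest modes is computed exactly from a small projected problem, and the iterative solver only sees the well-conditioned remainder. It uses a random symmetric positive-definite stand-in with exactly known low modes, not a lattice Dirac operator or the paper's local projectors.

```python
# Hedged sketch: low-mode deflation for a symmetric positive-definite solve.
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(2)
n, k = 400, 20
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
eigvals = np.concatenate([np.linspace(1e-3, 1e-2, k),      # k low modes
                          np.linspace(1.0, 10.0, n - k)])  # bulk of the spectrum
A = (Q * eigvals) @ Q.T                                     # A = Q diag(eigvals) Q^T
b = rng.normal(size=n)

V = Q[:, :k]                 # low-mode subspace (assumed known exactly here)
lam_low = eigvals[:k]

# Exact low-mode contribution: V diag(1/lambda) V^T b.
x_low = V @ ((V.T @ b) / lam_low)

# Deflated remainder: project b onto the orthogonal complement of the low
# modes, where A is well conditioned, so conjugate gradients converges fast.
b_perp = b - V @ (V.T @ b)
x_high, info = cg(A, b_perp)

x = x_low + x_high
print("CG converged:", info == 0, " residual norm:", np.linalg.norm(A @ x - b))
```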

    Cumulative sum quality control charts design and applications

    Includes bibliographical references (pages 165-169). Classical Statistical Process Control charts are essential in statistical control exercises and thus constantly attract attention for quality improvement. However, the establishment of control charts requires large-sample data (say, no fewer than 1,000 data points). On the other hand, the small-sample-based Grey System Theory approach is well established and applied in many areas: social, economic, industrial, military and scientific research fields. In this research, the short-term trend curve given by the GM(1,1) model is merged into the Shewhart and two-sided CUSUM control charts to establish a Grey Predictive Shewhart control chart and a Grey Predictive CUSUM control chart. In addition, the GM(2,1) model is briefly checked for how accurate it could be compared with the GM(1,1) model in control charts. Industrial process data collected from the TBF Packaging Machine Company in Taiwan are analyzed with these new developments as an illustrative example of grey quality control charts.
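    For reference, a minimal sketch of the classical two-sided (tabular) CUSUM recursion that the grey-prediction variants described above build on; the target mean, reference value k, decision interval h and the synthetic data are illustrative assumptions, not the TBF Packaging Machine measurements.

```python
# Hedged sketch: two-sided tabular CUSUM on synthetic data with a mean shift.
import numpy as np

rng = np.random.default_rng(3)
mu0, sigma = 10.0, 1.0
x = np.concatenate([rng.normal(mu0, sigma, 30),
                    rng.normal(mu0 + 1.5, sigma, 20)])   # shift after sample 30

k = 0.5 * sigma        # reference value (half the shift to be detected)
h = 5.0 * sigma        # decision interval

c_plus = c_minus = 0.0
for i, xi in enumerate(x, start=1):
    c_plus = max(0.0, c_plus + (xi - mu0) - k)     # upper cumulative sum
    c_minus = max(0.0, c_minus + (mu0 - xi) - k)   # lower cumulative sum
    if c_plus > h or c_minus > h:
        print(f"out-of-control signal at sample {i}")
        break
```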

    Atmospheric extinction properties above Mauna Kea from the Nearby Supernova Factory spectro-photometric data set

    We present a new atmospheric extinction curve for Mauna Kea spanning 3200–9700 Å. It is the most comprehensive to date, being based on some 4285 standard star spectra obtained on 478 nights spread over a period of 7 years by the Nearby SuperNova Factory using the SuperNova Integral Field Spectrograph. This mean curve and its dispersion can be used as an aid in calibrating spectroscopic or imaging data from Mauna Kea, and in estimating the calibration uncertainty associated with the use of a mean extinction curve. Our method for decomposing the extinction curve into physical components, and the ability to determine the chromatic portion of the extinction even on cloudy nights, is described and verified over the wide range of conditions sampled by our large dataset. We demonstrate good agreement with atmospheric science data obtained at the nearby Mauna Loa Observatory, and with previously published measurements of the extinction above Mauna Kea. Comment: 22 pages, 24 figures, 6 tables.
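    A minimal sketch of how a mean extinction curve k(lambda), in magnitudes per airmass, would be applied to correct an observed spectrum for atmospheric extinction; the wavelength grid, extinction values and airmass below are illustrative placeholders, not the published Mauna Kea curve.

```python
# Hedged sketch: correcting an observed spectrum with an extinction curve.
import numpy as np

wavelength = np.linspace(3200.0, 9700.0, 6)                  # Angstrom
k_lambda = np.array([0.85, 0.45, 0.20, 0.12, 0.08, 0.05])    # mag per airmass (illustrative)
airmass = 1.3
flux_observed = np.array([1.0, 2.0, 3.0, 3.5, 3.2, 2.8])     # arbitrary units

# Extinction in magnitudes scales linearly with airmass, so the flux
# correction is the multiplicative factor 10^(0.4 * k(lambda) * airmass).
flux_corrected = flux_observed * 10.0 ** (0.4 * k_lambda * airmass)
print(np.round(flux_corrected, 3))
```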

    Data comparison schemes for Pattern Recognition in Digital Images using Fractals

    Pattern recognition in digital images is a common problem with applications in remote sensing, electron microscopy, medical imaging, seismic imaging and astrophysics, for example. Although this subject has been researched for over twenty years, there is still no general solution which can be compared with the human cognitive system, in which a pattern can be recognised subject to arbitrary orientation and scale. The application of Artificial Neural Networks can in principle provide a very general solution, provided suitable training schemes are implemented. However, this approach raises some major issues in practice. First, the CPU time required to train an ANN for a grey-level or colour image can be very large, especially if the object has a complex structure with no clear geometrical features, such as those that arise in remote sensing applications. Secondly, the core and file-space memory required to represent large images and their associated data leads to a number of problems in which the use of virtual memory is paramount. The primary goal of this research has been to assess methods of image data compression for pattern recognition using a range of different compression methods. In particular, this research has resulted in the design and implementation of a new algorithm for general pattern recognition based on the use of fractal image compression. This approach has, for the first time, allowed the pattern recognition problem to be solved in a way that is invariant to rotation and scale. It allows both ANNs and correlation to be used, subject to appropriate pre- and post-processing techniques for digital image processing, an aspect for which a dedicated programmer's workbench has been developed using X-Designer.
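    As a rough illustration of the core step in fractal image compression that such a scheme builds on, the sketch below searches, for one range block, the domain block and affine grey-level map (contrast s, brightness o) that best reproduce it; the block sizes, the random test image and the brute-force search are assumptions for illustration, not the thesis's rotation- and scale-invariant pipeline.

```python
# Hedged sketch: best affine domain-block match for one range block.
import numpy as np

def best_domain_match(range_block, domain_blocks):
    """Return (index, s, o, error) of the best affine match s*D + o ~ R."""
    r = range_block.ravel()
    best = (None, 0.0, 0.0, np.inf)
    for idx, D in enumerate(domain_blocks):
        d = D.ravel()
        s, o = np.polyfit(d, r, 1)            # least-squares contrast and brightness
        err = np.sum((s * d + o - r) ** 2)
        if err < best[3]:
            best = (idx, s, o, err)
    return best

rng = np.random.default_rng(4)
image = rng.random((32, 32))

# 4x4 range blocks; 8x8 domain blocks averaged down to 4x4 (2x2 pooling).
ranges = [image[i:i+4, j:j+4] for i in range(0, 32, 4) for j in range(0, 32, 4)]
domains = [image[i:i+8, j:j+8].reshape(4, 2, 4, 2).mean(axis=(1, 3))
           for i in range(0, 24, 8) for j in range(0, 24, 8)]

idx, s, o, err = best_domain_match(ranges[0], domains)
print(f"best domain {idx}: contrast {s:.3f}, brightness {o:.3f}, error {err:.4f}")
```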