6 research outputs found
Extreme deconvolution: Inferring complete distribution functions from noisy, heterogeneous and incomplete observations
We generalize the well-known mixtures of Gaussians approach to density
estimation and the accompanying Expectation--Maximization technique for finding
the maximum likelihood parameters of the mixture to the case where each data
point carries an individual d-dimensional uncertainty covariance and has
unique missing data properties. This algorithm reconstructs the
error-deconvolved or "underlying" distribution function common to all samples,
even when the individual data points are samples from different distributions,
obtained by convolving the underlying distribution with the heteroskedastic
uncertainty distribution of the data point and projecting out the missing data
directions. We show how this basic algorithm can be extended with conjugate
priors on all of the model parameters and a "split-and-merge" procedure
designed to avoid local maxima of the likelihood. We demonstrate the full
method by applying it to the problem of inferring the three-dimensional
velocity distribution of stars near the Sun from noisy two-dimensional,
transverse velocity measurements from the Hipparcos satellite.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS439 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
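The heart of the algorithm described above is an E-step in which each observation is compared against every mixture component after convolving with that observation's own noise covariance and projecting out its missing directions. A minimal NumPy sketch of that responsibility computation follows; the function name and argument layout are illustrative, not the authors' reference implementation, and the observation model is assumed to be w_i = R_i x + noise with noise covariance S_i:

```python
import numpy as np

def xd_responsibilities(w, R, S, alpha, m, V):
    """E-step of extreme deconvolution for a single data point.

    w     : observed (projected, noisy) data vector, shape (p,)
    R     : projection matrix from the d-dim. underlying space to the
            p observed directions, shape (p, d)
    S     : noise covariance of this observation, shape (p, p)
    alpha : mixture amplitudes, shape (K,)
    m, V  : component means (K, d) and covariances (K, d, d)

    Returns the posterior probability q_k that w was drawn from component k.
    """
    K = len(alpha)
    logq = np.empty(K)
    for k in range(K):
        mu = R @ m[k]                      # projected component mean
        T = R @ V[k] @ R.T + S             # component + noise covariance
        diff = w - mu
        _, logdet = np.linalg.slogdet(2 * np.pi * T)
        logq[k] = (np.log(alpha[k])
                   - 0.5 * logdet
                   - 0.5 * diff @ np.linalg.solve(T, diff))
    logq -= logq.max()                     # numerical stability
    q = np.exp(logq)
    return q / q.sum()
```

These responsibilities then weight the usual M-step updates of the amplitudes, means, and covariances of the underlying (deconvolved) mixture.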
The velocity distribution of nearby stars from Hipparcos data I. The significance of the moving groups
We present a three-dimensional reconstruction of the velocity distribution of
nearby stars (<~ 100 pc) using a maximum likelihood density estimation
technique applied to the two-dimensional tangential velocities of stars. The
underlying distribution is modeled as a mixture of Gaussian components. The
algorithm reconstructs the error-deconvolved distribution function, even when
the individual stars have unique error and missing-data properties. We apply
this technique to the tangential velocity measurements from a kinematically
unbiased sample of 11,865 main sequence stars observed by the Hipparcos
satellite. We explore various methods for validating the complexity of the
resulting velocity distribution function, including criteria based on Bayesian
model selection and how accurately our reconstruction predicts the radial
velocities of a sample of stars from the Geneva-Copenhagen survey (GCS). Using
this very conservative external validation test based on the GCS, we find that
there is little evidence for structure in the distribution function beyond the
moving groups established prior to the Hipparcos mission. This is in sharp
contrast with internal tests performed here and in previous analyses, which
point consistently to maximal structure in the velocity distribution. We
quantify the information content of the radial velocity measurements and find
that the mean amount of new information gained from a radial velocity
measurement of a single star is significant. This argues for complementary
radial velocity surveys to upcoming astrometric surveys.
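The radial-velocity prediction used for the external validation above amounts to conditioning a Gaussian-mixture model of the full 3-d velocity distribution on the observed tangential components. A small NumPy sketch of that conditional mean, under the assumption that the third coordinate is the radial direction (names are illustrative, not the paper's code):

```python
import numpy as np

def conditional_radial_mean(vt, alpha, m, V):
    """Predict a star's radial velocity from its tangential velocity,
    given a Gaussian-mixture model of the 3-d velocity distribution.

    vt    : observed tangential velocity, shape (2,) (first two coordinates)
    alpha : mixture amplitudes, shape (K,)
    m, V  : component means (K, 3) and covariances (K, 3, 3);
            coordinate 2 is taken to be the radial direction.

    Returns the posterior-mean radial velocity E[v_r | v_t].
    """
    K = len(alpha)
    logw = np.empty(K)
    cond_means = np.empty(K)
    for k in range(K):
        mt, mr = m[k][:2], m[k][2]
        Vtt = V[k][:2, :2]                 # tangential block
        Vrt = V[k][2, :2]                  # radial-tangential cross terms
        diff = vt - mt
        # weight of component k given the tangential data
        _, logdet = np.linalg.slogdet(2 * np.pi * Vtt)
        logw[k] = (np.log(alpha[k]) - 0.5 * logdet
                   - 0.5 * diff @ np.linalg.solve(Vtt, diff))
        # conditional mean of the radial coordinate under component k
        cond_means[k] = mr + Vrt @ np.linalg.solve(Vtt, diff)
    logw -= logw.max()
    w = np.exp(logw)
    return float(w @ cond_means / w.sum())
```

Comparing such predictions against measured GCS radial velocities is what makes the test "external": no radial information enters the fit itself.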
Riemannian metrics for neural networks II: recurrent networks and learning symbolic data sequences
Recurrent neural networks are powerful models for sequential data, able to
represent complex dependencies in the sequence that simpler models such as
hidden Markov models cannot handle. Yet they are notoriously hard to train.
Here we introduce a training procedure using a gradient ascent in a Riemannian
metric: this produces an algorithm independent from design choices such as the
encoding of parameters and unit activities. This metric gradient ascent is
designed to have an algorithmic cost close to backpropagation through time for
sparsely connected networks. We use this procedure on gated leaky neural
networks (GLNNs), a variant of recurrent neural networks with an architecture
inspired by finite automata and an evolution equation inspired by
continuous-time networks. GLNNs trained with a Riemannian gradient are
demonstrated to effectively capture a variety of structures in synthetic
problems: basic block nesting as in context-free grammars (an important feature
of natural languages, but difficult to learn), intersections of multiple
independent Markov-type relations, or long-distance relationships such as the
distant-XOR problem. This method does not require adjusting the network
structure or initial parameters: the network used is a sparse random graph and
the initialization is identical for all problems considered.Comment: 4th version: some changes in notation, more experiment
Application of the Gray Level Co-Occurrence Matrix (GLCM) and Learning Vector Quantization (LVQ) for the classification of retinal eye diseases
The eye, as the organ of sight, is a vital organ that plays an essential role in receiving visual information, and thus in personal development and quality of life. A survey conducted in Indonesia in 1993 found that eye diseases causing blindness affected 1.5% of the population; in 2003 the figure reached 2.2%, and in 2007 it was 1.67%. These figures make Indonesia the country with the second-highest blindness rate after Ethiopia. Manual identification of diabetic retinopathy tends to be time-consuming and prone to observational error. Applying the Gray Level Co-Occurrence Matrix (GLCM) as a second-order feature-extraction method captures features from diabetic-retinopathy retinal images, and the Generalized Learning Vector Quantization (GLVQ) method groups the resulting images, simplifying the classification of diabetic retinopathy. In this study, the best training results were obtained with α = 0.01 and a maximum of 1000 epochs, reaching a highest accuracy of 90%.
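The second-order features referred to above are statistics of the gray-level co-occurrence matrix: how often pairs of gray levels occur at a fixed pixel offset. A self-contained NumPy sketch of the GLCM and the classic texture features, assuming a pre-quantised image (the feature set shown is the standard Haralick-style one; the abstract does not list which features were used):

```python
import numpy as np

def glcm_features(img, dx=1, dy=0, levels=8):
    """Second-order texture features from a gray-level co-occurrence matrix.

    img    : 2-d integer array of gray levels in [0, levels)
    dx, dy : pixel offset defining the co-occurrence direction
    levels : number of quantised gray levels
    """
    glcm = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                glcm[img[y, x], img[y2, x2]] += 1
    glcm /= glcm.sum()                     # normalise to joint probabilities

    i, j = np.indices((levels, levels))
    contrast = np.sum(glcm * (i - j) ** 2)
    energy = np.sum(glcm ** 2)
    homogeneity = np.sum(glcm / (1.0 + np.abs(i - j)))
    mu_i, mu_j = np.sum(i * glcm), np.sum(j * glcm)
    sd_i = np.sqrt(np.sum(glcm * (i - mu_i) ** 2))
    sd_j = np.sqrt(np.sum(glcm * (j - mu_j) ** 2))
    correlation = np.sum(glcm * (i - mu_i) * (j - mu_j)) / (sd_i * sd_j)
    return {"contrast": contrast, "energy": energy,
            "homogeneity": homogeneity, "correlation": correlation}
```

Feature vectors computed this way (typically over several offsets/directions) then serve as the inputs to the LVQ/GLVQ classifier.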
Rejection and online learning with prototype-based classifiers in adaptive metrical spaces
Fischer L. Rejection and online learning with prototype-based classifiers in adaptive metrical spaces. Bielefeld: Universität Bielefeld; 2016.
The rising amount of digital data, which is available in almost every domain, creates the need for intelligent, automated data processing. Classification models constitute particularly popular techniques from the machine learning domain, with applications ranging from fraud detection up to advanced image classification tasks. Within this thesis, we will focus on so-called prototype-based classifiers as one prominent family of classifiers, since they offer a simple classification scheme, interpretability of the model in terms of prototypes, and good generalisation performance. We will face a few crucial questions which arise whenever such classifiers are used in real-life scenarios which require robustness and reliability of classification and the ability to deal with complex and possibly streaming data sets. Particularly, we will address the following problems:
- Deterministic prototype-based classifiers deliver a class label, but no confidence of the classification. The latter is particularly relevant whenever the costs of an error are higher than the costs to reject an example, e.g. in a safety critical system. We investigate ways to enhance prototype-based classifiers by a certainty measure which can efficiently be computed based on the given classifier only and which can be used to reject an unclear classification.
- For an efficient rejection, the choice of a suitable threshold is crucial. We investigate in which situations the performance of local rejection can surpass the choice of only a global one, and we propose efficient schemes how to optimally compute local thresholds on a given training set.
- For complex data and lifelong learning, the required classifier complexity can be unknown a priori. We propose an efficient, incremental scheme which adjusts the model complexity of a prototype-based classifier based on the certainty of the classification. Thereby, we put particular emphasis on the question how to adjust prototype locations and metric parameters, and how to insert and/or delete prototypes in an efficient way.
- As an alternative to the previous solution, we investigate a hybrid architecture which combines an offline classifier with an online classifier based on their certainty values, thus directly addressing the stability/plasticity dilemma. While this is straightforward for classical prototype-based schemes, it poses some challenges as soon as metric learning is integrated into the scheme due to the different inherent data representations.
- Finally, we investigate the performance of the proposed hybrid prototype-based classifier within a realistic visual road-terrain-detection scenario.
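The first two problems above combine in a simple way: compute a certainty value from the given prototypes only, then reject classifications whose certainty falls below a threshold. A minimal sketch using the GLVQ-style relative-similarity certainty with a single global threshold (the thesis also develops local, per-region thresholds; names here are illustrative):

```python
import numpy as np

def classify_with_reject(x, prototypes, labels, threshold=0.2):
    """Prototype-based classification with a certainty-based reject option.

    Certainty is the GLVQ-style relative similarity
        r(x) = (d_minus - d_plus) / (d_plus + d_minus),
    where d_plus is the distance to the closest prototype of the predicted
    class and d_minus the distance to the closest prototype of any other
    class. By construction r lies in [0, 1]; values near 0 indicate points
    close to the decision boundary, which are rejected.
    """
    d = np.linalg.norm(prototypes - x, axis=1)
    winner = int(np.argmin(d))
    pred = labels[winner]
    d_plus = d[winner]
    d_minus = d[np.asarray(labels) != pred].min()
    certainty = (d_minus - d_plus) / (d_plus + d_minus)
    if certainty < threshold:
        return None, certainty            # reject: too close to the boundary
    return pred, certainty
```

The same certainty value can drive the incremental schemes sketched above, e.g. inserting a prototype where certainties are persistently low, or arbitrating between an offline and an online classifier.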