Classification of geometrical objects by integrating currents and functional data analysis. An application to a 3D database of Spanish child population
This paper focuses on the application of Discriminant Analysis to a set of
geometrical objects (bodies) characterized by currents. A current is a relevant
mathematical object to model geometrical data, like hypersurfaces, through
integration of vector fields along them. As a consequence of the choice of a
vector-valued Reproducing Kernel Hilbert Space (RKHS) as a test space to
integrate hypersurfaces, it is possible to consider that hypersurfaces are
embedded in this Hilbert space. This embedding enables us to consider
classification algorithms of geometrical objects. A method to apply Functional
Discriminant Analysis in the obtained vector-valued RKHS is given. This method
is based on the eigenfunction decomposition of the kernel. So, the novelty of
this paper is the reformulation of a size and shape classification problem in
Functional Data Analysis terms using the theory of currents and vector-valued
RKHS. This approach is applied to a 3D database obtained from an anthropometric
survey of the Spanish child population with a potential application to online
sales of children's wear.
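The currents embedding described above can be illustrated with a minimal numerical sketch: a curve is discretized into segment centers and tangent vectors, and the RKHS inner product between two currents reduces to a double sum weighted by a scalar kernel. This is a generic illustration, not the paper's implementation; the Gaussian kernel, its bandwidth, and the toy curves are assumptions made for the example.

```python
import numpy as np

def curve_to_current(points):
    """Discretize a polygonal curve: segment centers and length-weighted tangents."""
    centers = 0.5 * (points[1:] + points[:-1])
    tangents = points[1:] - points[:-1]
    return centers, tangents

def currents_inner_product(ca, ta, cb, tb, sigma=1.0):
    """Discrete currents inner product <C_a, C_b> = sum_ij k(x_i, y_j) <t_i, u_j>,
    with a scalar Gaussian kernel k acting on the segment centers."""
    d2 = np.sum((ca[:, None, :] - cb[None, :, :]) ** 2, axis=-1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.sum(k * (ta @ tb.T))

# Squared RKHS distance between two curves: ||C_a - C_b||^2
t = np.linspace(0.0, np.pi, 50)
a = np.stack([np.cos(t), np.sin(t)], axis=1)   # a half circle
b = a + np.array([0.1, 0.0])                   # a slightly shifted copy
ca, ta = curve_to_current(a)
cb, tb = curve_to_current(b)
d2 = (currents_inner_product(ca, ta, ca, ta)
      + currents_inner_product(cb, tb, cb, tb)
      - 2.0 * currents_inner_product(ca, ta, cb, tb))
```

Because distinct discrete currents have distinct RKHS representatives, this distance is strictly positive for the shifted copy and exactly zero for a curve against itself, which is what makes discriminant analysis in the embedding space meaningful.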
Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
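The basis-function machinery this overview introduces can be sketched for the scalar-on-function case: expand the coefficient function in a small basis, reduce the functional regression to a finite-dimensional ridge problem, and recover the coefficient curve. The Fourier basis, simulated Brownian-like predictors, and ridge penalty below are illustrative assumptions, not any particular method from the survey.

```python
import numpy as np

def fourier_basis(grid, n_basis):
    """Constant plus paired sine/cosine terms on [0, 1], truncated to n_basis."""
    B = [np.ones_like(grid)]
    for k in range(1, (n_basis - 1) // 2 + 1):
        B.append(np.sin(2 * np.pi * k * grid))
        B.append(np.cos(2 * np.pi * k * grid))
    return np.stack(B[:n_basis], axis=1)          # shape (len(grid), n_basis)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
dt = grid[1] - grid[0]

# Scalar-on-function model: y_i = \int x_i(t) beta(t) dt + noise
n = 80
X = rng.normal(size=(n, grid.size)).cumsum(axis=1) * np.sqrt(dt)  # rough Brownian paths
beta_true = np.sin(2 * np.pi * grid)
y = X @ beta_true * dt + 0.01 * rng.normal(size=n)

# Regularize by expanding beta(t) = B(t) c, so y ~ (X B dt) c, then ridge-solve for c
B = fourier_basis(grid, 7)
Z = X @ B * dt
lam = 1e-6
c = np.linalg.solve(Z.T @ Z + lam * np.eye(B.shape[1]), Z.T @ y)
beta_hat = B @ c
```

This is the replication/regularization interplay in miniature: the n replicate curves identify the coefficient, while the basis truncation (and penalty) regularizes it.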
Eigenvector-based Dimensionality Reduction for Human Activity Recognition and Data Classification
In the context of appearance-based human motion compression, representation, and recognition, we have proposed a robust framework based on the eigenspace technique. First, we introduce a new appearance-based template matching approach, which we named the Motion Intensity Image, for compressing a human motion video into a simple and concise, yet very expressive representation. Second, a learning strategy based on the eigenspace technique is employed for dimensionality reduction using PCA and FDA, providing maximum data variance and maximum class separability, respectively. Third, a new compound eigenspace is introduced for multiple directed motion recognition that also accounts for possible changes in scale. This method extracts two additional features that are used to control the recognition process. A similarity measure based on Euclidean distance is employed for matching dimensionally reduced testing templates against a projected set of known motion templates. In the stream of nonlinear classification, we have introduced a new eigenvector-based recognition model built upon the idea of the kernel technique. A practical study on the use of the kernel technique with 18 different functions has been carried out, showing how crucial the choice of kernel function is to the success of the subsequent linear discrimination in the feature space for a particular problem. Building upon the theory of reproducing kernels, we have also proposed a new robust nonparametric discriminant analysis (NDA) approach with kernels, which can efficiently find a nonparametric kernel representation in which linear discriminants perform better. Data classification is achieved by integrating the linear version of NDA with the kernel mapping. Based on the kernel trick, we have provided a new formulation of Fisher's criterion, defined in terms of the Gram matrix only.
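A Gram-matrix-only formulation of Fisher's criterion can be sketched for the two-class case as a standard kernel Fisher discriminant: the dual direction is obtained from class means and within-class scatter computed from the kernel matrix alone. This is a generic textbook outline, not the thesis code; the RBF kernel, regularization value, and toy Gaussian blobs are illustrative assumptions.

```python
import numpy as np

def kfda_direction(K, labels, reg=1e-3):
    """Two-class kernel Fisher discriminant using only the Gram matrix K.

    Returns dual coefficients alpha; projections of the data are K @ alpha.
    """
    labels = np.asarray(labels)
    n = K.shape[0]
    idx0 = np.where(labels == 0)[0]
    idx1 = np.where(labels == 1)[0]
    m0 = K[:, idx0].mean(axis=1)          # dual representation of class means
    m1 = K[:, idx1].mean(axis=1)
    # Within-class scatter in dual form: N = sum_c K_c (I - 1/n_c) K_c^T
    N = np.zeros((n, n))
    for idx in (idx0, idx1):
        Kc = K[:, idx]
        nc = len(idx)
        N += Kc @ (np.eye(nc) - np.ones((nc, nc)) / nc) @ Kc.T
    N += reg * np.eye(n)                  # regularize the (singular) scatter
    # The between-class scatter is rank one, so the leading generalized
    # eigenvector reduces to alpha = N^{-1} (m1 - m0).
    return np.linalg.solve(N, m1 - m0)

# Toy example: two Gaussian blobs, RBF Gram matrix
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 0.3, (30, 2)), rng.normal(1.0, 0.3, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
d2 = np.sum((X[:, None] - X[None]) ** 2, axis=-1)
K = np.exp(-d2)
alpha = kfda_direction(K, y)
proj = K @ alpha                          # 1D discriminant scores
```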
Can we identify non-stationary dynamics of trial-to-trial variability?
Identifying sources of the apparent variability in non-stationary scenarios is a fundamental problem in many biological data analysis settings. For instance, neurophysiological responses to the same task often vary from one repetition of the same experiment (trial) to the next. The origin and functional role of this observed variability is one of the fundamental questions in neuroscience, yet the nature of such trial-to-trial dynamics remains largely elusive to current data analysis approaches. A range of strategies have been proposed in modalities such as electroencephalography, but gaining a fundamental insight into latent sources of trial-to-trial variability in neural recordings is still a major challenge. In this paper, we present a proof-of-concept study of the analysis of trial-to-trial variability dynamics founded on non-autonomous dynamical systems. At this initial stage, we evaluate the capacity of a simple statistic based on the behaviour of trajectories in classification settings, the trajectory coherence, to identify trial-to-trial dynamics. First, we derive the conditions leading to observable changes in datasets generated by a compact dynamical system (the Duffing equation); this canonical system plays the role of a ubiquitous model of non-stationary supervised classification problems. Second, we estimate the coherence of class trajectories in an empirically reconstructed space of system states, and show how this analysis can discern variations attributable to non-autonomous deterministic processes from stochastic fluctuations. The analyses are benchmarked using simulated data and two different real datasets which have been shown to exhibit attractor dynamics. As an illustrative example, we focus on the analysis of the rat's frontal cortex ensemble dynamics during a decision-making task.
Results suggest that, in line with recent hypotheses, rather than internal noise, it is the deterministic trend that most likely underlies the observed trial-to-trial variability. Thus, the empirical tool developed in this study potentially allows us to infer the source of variability in in-vivo neural recordings.
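The forced Duffing oscillator used as the paper's canonical system can be simulated with a few lines of fixed-step RK4 integration. The parameter values below are a standard double-well, periodically forced configuration chosen for illustration; they are not claimed to be the ones used in the study.

```python
import numpy as np

def duffing_rhs(state, t, delta=0.2, alpha=-1.0, beta=1.0, gamma=0.3, omega=1.2):
    """Forced Duffing equation: x'' + delta x' + alpha x + beta x^3 = gamma cos(omega t)."""
    x, v = state
    return np.array([v, -delta * v - alpha * x - beta * x ** 3 + gamma * np.cos(omega * t)])

def rk4_step(f, state, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state, t)
    k2 = f(state + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(state + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(state + dt * k3, t + dt)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Integrate a single trial trajectory in the (x, x') state space
dt = 0.01
state = np.array([0.5, 0.0])
traj = [state]
for i in range(5000):
    state = rk4_step(duffing_rhs, state, i * dt, dt)
    traj.append(state)
traj = np.array(traj)
```

Generating many such trajectories from perturbed initial conditions (or slowly drifting forcing) is one way to build the kind of non-stationary trial ensemble whose coherence the paper analyses.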
Generalized Linear Models for Geometrical Current predictors. An application to predict garment fit
The aim of this paper is to model an ordinal response variable in terms
of vector-valued functional data embedded in a vector-valued RKHS. In particular,
we focus on the vector-valued RKHS obtained when a geometrical object (body) is
characterized by a current and on the ordinal regression model. A common way to
solve this problem in functional data analysis is to express the data in the orthonormal
basis given by decomposition of the covariance operator. But our data present very important differences with respect to the usual functional data setting. On the one
hand, they are vector-valued functions, and on the other, they are functions in an
RKHS with a previously defined norm. We propose to use three different bases: the
orthonormal basis given by the kernel that defines the RKHS, a basis obtained from
decomposition of the integral operator defined using the covariance function, and a
third basis that combines the previous two. The three approaches are compared and
applied to an interesting problem: building a model to predict the fit of children’s
garment sizes, based on a 3D database of the Spanish child population. Our proposal
has been compared with alternative methods that explore the performance of other
classifiers (Support Vector Machine and k-NN), and with the results of applying
the classification method proposed in this work to different characterizations of
the objects (landmarks and multivariate anthropometric measurements instead of
currents), obtaining worse results in all these cases.
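Of the three bases the paper compares, the one obtained from the covariance-based integral operator has a simple sample analogue: eigendecompose the doubly centered Gram matrix and use the leading components as coordinates (kernel PCA scores). The sketch below is that generic construction on toy data, with an RBF kernel as an illustrative assumption, not the paper's currents kernel.

```python
import numpy as np

def kernel_pca_scores(K, n_components=3):
    """Coordinates of the sample in the eigenbasis of the centered Gram matrix."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                               # double centering
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Scale eigenvectors so the columns carry the component variances
    return vecs * np.sqrt(np.clip(vals, 0.0, None))

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 5))                     # stand-in for embedded objects
d2 = np.sum((X[:, None] - X[None]) ** 2, axis=-1)
K = np.exp(-0.5 * d2)                            # illustrative RBF Gram matrix
scores = kernel_pca_scores(K, 3)                 # features for a downstream model
```

The resulting score matrix is what an ordinal regression model (or the SVM and k-NN baselines mentioned above) would consume as a finite-dimensional design matrix.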
Positive Definite Kernels in Machine Learning
This survey is an introduction to positive definite kernels and the set of
methods they have inspired in the machine learning literature, namely kernel
methods. We first discuss some properties of positive definite kernels as well
as reproducing kernel Hilbert spaces, the natural extension of the set of
functions associated with a kernel defined on a given space. We discuss at length the construction of kernel
functions that take advantage of well-known statistical models. We provide an
overview of numerous data-analysis methods which take advantage of reproducing
kernel Hilbert spaces and discuss the idea of combining several kernels to
improve the performance on certain tasks. We also provide a short cookbook of
different kernels which are particularly useful for certain data-types such as
images, graphs or speech segments.
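The closure properties that make such a "cookbook" composable, that sums and products of positive definite kernels are again positive definite, can be checked numerically: a kernel is positive definite exactly when every Gram matrix it produces is positive semidefinite. The kernels and sample below are illustrative choices for this sanity check.

```python
import numpy as np

def min_gram_eig(kernel, X):
    """Smallest eigenvalue of the Gram matrix of `kernel` on the sample X.

    Positive definiteness of the kernel implies this is >= 0 (up to
    floating-point error) for any sample.
    """
    K = np.array([[kernel(a, b) for b in X] for a in X])
    return np.linalg.eigvalsh(K).min()

rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2))   # Gaussian (RBF) kernel
lin = lambda a, b: float(a @ b)                    # linear kernel

# Closure properties: sums and products of positive definite kernels
# are positive definite, so these combinations are valid kernels too.
ksum = lambda a, b: rbf(a, b) + lin(a, b)
kprod = lambda a, b: rbf(a, b) * (lin(a, b) + 1.0)

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 3))
```

Combining kernels this way (with learned weights) is the "multiple kernel" idea the survey discusses for improving performance on a given task.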
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
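A small example of why data on the circle need their own summaries, the basic ingredient of the exploratory methods reviewed above: the mean direction comes from the resultant of the unit vectors, while the naive arithmetic mean of wrapped angles can be wildly wrong. The simulated sample below is an illustration, not from any dataset in the review.

```python
import numpy as np

def circular_mean(theta):
    """Mean direction (radians) from the resultant of the unit vectors."""
    return np.arctan2(np.sin(theta).mean(), np.cos(theta).mean())

def resultant_length(theta):
    """Mean resultant length R in [0, 1]; values near 1 mean tight concentration."""
    return np.hypot(np.sin(theta).mean(), np.cos(theta).mean())

rng = np.random.default_rng(4)
# Angles concentrated around direction 0, stored wrapped to [0, 2*pi):
# roughly half the sample lands just below 2*pi, so the arithmetic mean
# ends up near pi, the opposite direction.
theta = rng.normal(0.0, 0.2, 500) % (2 * np.pi)
naive = theta.mean()          # misleading: far from the true direction 0
mu = circular_mean(theta)     # close to 0, the true mean direction
R = resultant_length(theta)   # close to 1 for this concentrated sample
```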
Nearest Neighbor Discriminant Analysis Based Face Recognition Using Ensembled Gabor Features
Thesis (M.Sc.) -- İstanbul Technical University, Institute of Informatics, 2009. In recent decades, Gabor-feature-based face representation has produced very promising results in face recognition, as it is robust to variations due to illumination and facial expression changes. The properties that make Gabor features effective are that they compute the local structure corresponding to spatial frequency (scale), spatial localization, and orientation selectivity, and that they require no manual annotation. The contribution of this thesis is an Ensemble-based Gabor Nearest Neighbor Classifier (EGNNC), proposed as an extension of the Gabor Nearest Neighbor Classifier (GNNC), which extracts important discriminant features by combining the power of Gabor filters and Nearest Neighbor Discriminant Analysis (NNDA). EGNNC is an ensemble classifier combining multiple NNDA-based component classifiers, each designed using a different segment of the reduced Gabor feature. Since the reduced dimension of the entire Gabor feature is extracted by a single component NNDA classifier, EGNNC makes better use of the discriminability implied in the reduced Gabor features, avoiding the 3S (small sample size) problem with minimal loss of discriminative information. The accuracy of EGNNC is demonstrated in a comparative performance study. On a 200-class subset of the FERET database covering illumination and expression variations, EGNNC achieved a 100% recognition rate with 65 features, outperforming its ancestor GNNC (98%) as well as standard methods such as GFC (Gabor Fisher Classifier) and GPC. On the YALE database, EGNNC outperformed GNNC for all (k, alpha) pairs, reaching 96% accuracy with 14 features and parameters step size = 5, k = 5, alpha = 3.
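The Gabor feature extraction this thesis builds on can be sketched as a small filter bank applied at several scales and orientations, pooling the response magnitudes into a feature vector. The filter sizes, wavelengths, and FFT-based filtering below are illustrative assumptions, not the thesis configuration.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a 2D Gabor filter: Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate to orientation theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)

def gabor_features(img, wavelengths=(4, 8), orientations=4):
    """Mean absolute filter response per (scale, orientation) pair."""
    feats = []
    for lam in wavelengths:
        for k in range(orientations):
            g = gabor_kernel(15, lam, k * np.pi / orientations, lam / 2.0)
            # Circular convolution via FFT (zero-padding g to the image size)
            resp = np.abs(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(g, img.shape)))
            feats.append(resp.mean())
    return np.array(feats)

# Toy image: vertical stripes of period 8 pixels, varying along the x axis
img = np.tile(np.cos(2.0 * np.pi * np.arange(32) / 8.0), (32, 1))
feats = gabor_features(img)
```

In a full pipeline, such per-image feature vectors would be split into segments and fed to the NNDA-based component classifiers of the ensemble.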