Optimal Bayes Classifiers for Functional Data and Density Ratios
Bayes classifiers for functional data pose a challenge, because probability
density functions do not exist for functional data. As a
consequence, the classical Bayes classifier using density quotients needs to be
modified. We propose to use density ratios of projections on a sequence of
eigenfunctions that are common to the groups to be classified. The density
ratios can then be factored into density ratios of individual functional
principal components whence the classification problem is reduced to a sequence
of nonparametric one-dimensional density estimates. This is an extension to
functional data of some of the very earliest nonparametric Bayes classifiers
that were based on simple density ratios in the one-dimensional case. By means
of the factorization of the density quotients the curse of dimensionality that
would otherwise severely affect Bayes classifiers for functional data can be
avoided. We demonstrate that in the case of Gaussian functional data, the
proposed functional Bayes classifier reduces to a functional version of the
classical quadratic discriminant. A study of the asymptotic behavior of the
proposed classifiers in the large sample limit shows that under certain
conditions the misclassification rate converges to zero, a phenomenon that has
been referred to as "perfect classification". The proposed classifiers also
perform favorably in finite sample applications, as we demonstrate in
comparisons with other functional classifiers in simulations and various data
applications, including wine spectral data, functional magnetic resonance
imaging (fMRI) data for attention deficit hyperactivity disorder (ADHD)
patients, and yeast gene expression data.
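The reduction to one-dimensional density ratios can be sketched in a few lines. The following is an illustrative sketch, not the authors' implementation: it assumes the functional observations have already been projected onto a common sequence of eigenfunctions, estimates each component's group densities with a one-dimensional Gaussian kernel, and classifies by the sign of the summed log density ratios. The names `kde` and `bayes_classify`, the bandwidth, and the priors are all assumed choices.

```python
import numpy as np

def kde(x, samples, bw=0.3):
    """One-dimensional Gaussian kernel density estimate of `samples` at scalar x."""
    d = (x - samples) / bw
    return np.exp(-0.5 * d**2).mean() / (bw * np.sqrt(2 * np.pi))

def bayes_classify(scores, scores_g0, scores_g1, prior0=0.5):
    """Assign a new observation's vector of functional principal component
    scores to group 0 or 1 via a product (sum of logs) of one-dimensional
    density ratios, one per component."""
    log_ratio = np.log(prior0) - np.log(1 - prior0)
    for k in range(len(scores)):
        f0 = kde(scores[k], scores_g0[:, k])
        f1 = kde(scores[k], scores_g1[:, k])
        # small guard avoids log(0) far in the tails
        log_ratio += np.log(f0 + 1e-300) - np.log(f1 + 1e-300)
    return 0 if log_ratio > 0 else 1
```

Because the ratio factors across components, each factor only requires a univariate density estimate, which is how the construction sidesteps the curse of dimensionality.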
DEVELOPMENTS IN NONPARAMETRIC REGRESSION METHODS WITH APPLICATION TO RAMAN SPECTROSCOPY ANALYSIS
Raman spectroscopy has been successfully employed in the classification of breast pathologies, using basis spectra for chemical constituents of breast tissue to achieve high sensitivity (94%) and specificity (96%) (Haka et al., 2005). Motivated by recent developments in nonparametric regression, in this work we adapt stacking, boosting, and dynamic ensemble learning into a nonparametric regression framework, with application to Raman spectroscopy analysis for breast cancer diagnosis. In Chapter 2, we apply compound estimation (Charnigo and Srinivasan, 2011) to Raman spectra to classify normal, benign, and malignant breast tissue, exploring both the spectral profiles and their derivatives to differentiate the tissue types. In Chapters 3-5 of this dissertation, we develop a novel paradigm for incorporating ensemble learning classification methodology into a nonparametric regression framework. Specifically, in Chapter 3 we set up a modified stacking framework that combines different classifiers to make better predictions in nonparametric regression settings. In Chapter 4 we develop a method that incorporates a modified AdaBoost algorithm into nonparametric regression settings to improve classification accuracy. In Chapter 5 we propose a dynamic ensemble integration based on multiple meta-learning strategies for nonparametric regression-based classification. In Chapter 6, we revisit the Raman spectroscopy data of Chapter 2 and make improvements based on the methodological developments of Chapters 3 and 4. Finally, we summarize the major findings and contributions of this work, identify opportunities for future research, and discuss their public health implications.
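The stacking idea can be illustrated with a generic sketch (not the dissertation's modified framework): combine two simple base classifiers by searching for convex combination weights that maximize accuracy on a held-out validation set. The two base learners, the grid search, and all names here are assumed for illustration.

```python
import numpy as np

def nearest_mean_predict(X, Xtr, ytr):
    """Base learner 1: assign each row of X to the closer class mean."""
    m0, m1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    return (np.linalg.norm(X - m1, axis=1) < np.linalg.norm(X - m0, axis=1)).astype(int)

def one_nn_predict(X, Xtr, ytr):
    """Base learner 2: 1-nearest-neighbour classification."""
    d = np.linalg.norm(X[:, None, :] - Xtr[None, :, :], axis=2)
    return ytr[d.argmin(axis=1)]

def stacked_predict(X, Xtr, ytr, Xval, yval):
    """Stack the two base learners: grid-search a convex weight on a
    validation set, then combine the base predictions with that weight."""
    p0, p1 = nearest_mean_predict(Xval, Xtr, ytr), one_nn_predict(Xval, Xtr, ytr)
    best_w, best_acc = 0.5, -1.0
    for w in np.linspace(0, 1, 11):
        combined = (w * p0 + (1 - w) * p1 >= 0.5).astype(int)
        acc = (combined == yval).mean()
        if acc > best_acc:
            best_w, best_acc = w, acc
    p = best_w * nearest_mean_predict(X, Xtr, ytr) + (1 - best_w) * one_nn_predict(X, Xtr, ytr)
    return (p >= 0.5).astype(int)
```

Real stacking frameworks typically feed the base predictions to a trained meta-learner rather than a one-parameter grid search; the sketch only shows the two-level structure.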
Optimal Estimation of Derivatives in Nonparametric Regression
We propose a simple framework for estimating derivatives without fitting the regression function in nonparametric regression. Unlike most existing methods that use symmetric difference quotients, our method is constructed as a linear combination of observations. It is hence very flexible and applicable to both interior and boundary points, and it includes most existing methods as special cases. Within this framework, we define the variance-minimizing estimators for any order derivative of the regression function at a fixed bias-reduction level. For the equidistant design, we derive the asymptotic variance and bias of these estimators. We also show that our new method, for the first time, achieves the asymptotically optimal convergence rate for difference-based estimators. Finally, we provide an effective criterion for selecting the tuning parameters and demonstrate the usefulness of the proposed method through extensive simulation studies of the first- and second-order derivative estimators.
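A minimal instance of a difference-based estimator written as a linear combination of observations: at interior points of an equidistant design, average k symmetric difference quotients with the variance-minimizing weights w_j = j^2 / (sum of j^2), a known construction from the difference-based literature. The sketch below is illustrative, not the paper's estimator, and it leaves boundary points undefined.

```python
import numpy as np

def first_derivative(y, x, k=3):
    """Estimate f'(x_i) at interior points as a weighted average of k
    symmetric difference quotients (equidistant design assumed)."""
    n = len(y)
    h = x[1] - x[0]
    j = np.arange(1, k + 1)
    w = j**2 / np.sum(j**2)          # variance-minimizing weights, sum to 1
    d = np.full(n, np.nan)           # boundary points left as NaN
    for i in range(k, n - k):
        quotients = (y[i + j] - y[i - j]) / (2 * j * h)
        d[i] = np.sum(w * quotients)
    return d
```

For noiseless quadratic data the symmetric quotients are exact, so the estimator recovers the derivative exactly at interior points; with noise, the weights trade bias against variance.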
Bayesian hierarchical modelling of growth curve derivatives via sequences of quotient differences
Growth curve studies are typically conducted to evaluate differences between group or treatment-specific curves. Most analyses focus solely on the growth curves, but it has been argued that the derivative of growth curves can highlight differences between groups that may be masked when considering the raw curves only. Motivated by the desire to estimate derivative curves hierarchically, we introduce a new sequence of quotient differences (empirical derivatives) which, among other things, are well behaved near the boundaries compared with other sequences in the literature. Using the sequence of quotient differences, we develop a Bayesian method to estimate curve derivatives in a multilevel setting (a common scenario in growth studies) and show how the method can be used to estimate individual and group derivative curves and to make comparisons. We apply the new methodology to data collected from a study conducted to explore the effect that radiation-based therapies have on growth in female children diagnosed with acute lymphoblastic leukaemia.
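A generic empirical-derivative sequence built from difference quotients, with one-sided quotients at the two boundary points, looks like the following. This is a standard construction shown for illustration only, not the authors' new sequence.

```python
import numpy as np

def empirical_derivative(y, x):
    """Empirical first derivative of observations y at design points x:
    central difference quotients in the interior, one-sided quotients
    at the two boundary points."""
    d = np.empty(len(y), dtype=float)
    d[1:-1] = (y[2:] - y[:-2]) / (x[2:] - x[:-2])   # interior: central quotients
    d[0] = (y[1] - y[0]) / (x[1] - x[0])            # left boundary: forward quotient
    d[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])       # right boundary: backward quotient
    return d
```

The one-sided quotients at the ends have higher variance and lower-order bias than the central ones, which is exactly the boundary behaviour that improved sequences aim to address.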
On a Projection Estimator of the Regression Function Derivative
In this paper, we study the estimation of the derivative of a regression
function in a standard univariate regression model. The estimators are defined
either by differentiating nonparametric least-squares estimators of the
regression function or by estimating the projection of the derivative. We prove
two simple risk bounds that allow us to compare our estimators. More elaborate
bounds under a stability assumption are then provided. The bases and spaces on
which we illustrate our assumptions and first results are of both compact and
non-compact type, and we discuss the rates reached by our estimators. These
rates turn out to be optimal in the compact case. Lastly, we propose a model
selection procedure and prove the associated risk bound. Considering bases with
non-compact support makes the problem difficult.
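The first of the two approaches can be illustrated with a polynomial basis: fit the regression function by least squares in the basis, then differentiate the fitted expansion term by term. This sketch is an assumed illustration (the basis, degree, and function name are not from the paper) using NumPy's polynomial utilities.

```python
import numpy as np

def projection_derivative(x, y, degree=4):
    """Estimate f' by least-squares fitting a polynomial basis expansion
    to the data (x, y) and differentiating the expansion term by term."""
    coefs = np.polynomial.polynomial.polyfit(x, y, degree)   # basis coefficients
    dcoefs = np.polynomial.polynomial.polyder(coefs)         # differentiate expansion
    return np.polynomial.polynomial.polyval(x, dcoefs)       # evaluate f' estimate
```

A monomial basis has compact-support behaviour only on a bounded design; bases with non-compact support (e.g. Hermite functions on the real line) require the more delicate analysis the abstract alludes to.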
Classification of functional data: a weighted distance approach
A popular approach for classifying functional data is based on the distances
from the function or its derivatives to group representative functions (usually
the means) or their derivatives. In this paper, we propose using a combination
of those distances. Simulation studies show that our procedure performs very
well, resulting in smaller test classification errors. Applications to real
data show that our procedure performs as well as, and in some cases better
than, other classification methods.
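A sketch of the weighted-distance idea, assuming a single fixed weight w between the curve distance and the derivative distance. The function name, the use of np.gradient as the empirical derivative, and the fixed weight are all illustrative simplifications, not the paper's procedure.

```python
import numpy as np

def weighted_distance_classify(f, groups, w=0.5):
    """Classify curve f (sampled on a common grid) by a weighted combination
    of its distance to each group mean curve and the distance between the
    empirical derivatives (unit grid spacing assumed)."""
    df = np.gradient(f)
    best_label, best_score = None, np.inf
    for label, curves in groups.items():
        m = curves.mean(axis=0)                     # group representative: the mean
        score = (w * np.linalg.norm(f - m)
                 + (1 - w) * np.linalg.norm(df - np.gradient(m)))
        if score < best_score:
            best_label, best_score = label, score
    return best_label
```

In practice the weight would be chosen by cross-validation rather than fixed, so that the data decide how much the derivative information contributes.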