
    Representing functional data in reproducing Kernel Hilbert Spaces with applications to clustering and classification

    Functional data are difficult to manage for many traditional statistical techniques given their very high (or intrinsically infinite) dimensionality. The reason is that functional data are essentially functions, while most algorithms are designed to work with (low) finite-dimensional vectors. Within this context we propose techniques to obtain finite-dimensional representations of functional data. The key idea is to consider each functional curve as a point in a general function space and then project these points onto a Reproducing Kernel Hilbert Space with the aid of regularization theory. In this work we describe the projection method, analyze its theoretical properties, and propose a model selection procedure to select appropriate Reproducing Kernel Hilbert Spaces onto which to project the functional data.
    Keywords: functional data, Reproducing Kernel Hilbert Spaces, regularization theory
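
    As a hedged illustration of the projection idea (not the paper's exact procedure), the sketch below represents each sampled curve by a regularized kernel fit and evaluates that fit on a common grid, yielding an ordinary finite-dimensional vector; the Gaussian kernel, the regularization weight lam, and the grid size are illustrative assumptions.

    import numpy as np

    def rbf_kernel(s, t, gamma=10.0):
        # Gaussian kernel matrix between two sets of time points.
        d = s[:, None] - t[None, :]
        return np.exp(-gamma * d ** 2)

    def rkhs_representation(t_obs, y_obs, t_grid, lam=1e-2, gamma=10.0):
        # Regularized fit of one curve in the RKHS of the kernel,
        # evaluated on a shared grid to give a finite-dimensional summary.
        n = len(t_obs)
        K = rbf_kernel(t_obs, t_obs, gamma)
        alpha = np.linalg.solve(K + lam * n * np.eye(n), y_obs)
        return rbf_kernel(t_grid, t_obs, gamma) @ alpha

    # Toy example: noisy sine curves observed at irregular points.
    rng = np.random.default_rng(0)
    t_grid = np.linspace(0.0, 1.0, 50)
    curves = []
    for _ in range(5):
        t_obs = np.sort(rng.uniform(0.0, 1.0, 30))
        y_obs = np.sin(2 * np.pi * t_obs) + 0.1 * rng.standard_normal(30)
        curves.append(rkhs_representation(t_obs, y_obs, t_grid))
    X = np.vstack(curves)  # rows can now be fed to standard clustering or classifiers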

    Time-Domain Joint Parameter Estimation of Chirp Signal Based on SVR

    Parameter estimation of chirp signals, such as the instantaneous frequency (IF), instantaneous frequency rate (IFR), and initial phase (IP), arises in many signal processing applications. In phase-based parameter estimation, a phase unwrapping step is needed to recover the phase information correctly, and it affects the estimation performance markedly. We therefore introduce support vector regression (SVR) to predict the variation trend of the instantaneous phase and unwrap phases efficiently. Even so, errors remain in the phase unwrapping process because of the ambiguity of the wrapped phase. We further propose an SVR-based joint estimation algorithm and make it immune to these erroneous phases by setting the SVR parameters properly. Our results show that, compared with three other chirp-parameter estimation algorithms, the proposed one not only maintains its quality at low frequencies but also improves accuracy at high frequencies and reduces the influence of the initial phase.
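
    A minimal sketch of the overall estimation idea, under stated assumptions: a synthetic chirp is generated, its unwrapped instantaneous phase is smoothed with an SVR, and IF, IFR and IP are read off a quadratic phase fit. The SVR settings are arbitrary, and numpy's unwrap stands in for the paper's SVR-guided unwrapping step.

    import numpy as np
    from sklearn.svm import SVR

    # Synthetic noisy chirp with phase(t) = phi0 + 2*pi*(f0*t + 0.5*k*t**2).
    fs, N = 1000.0, 512
    t = np.arange(N) / fs
    phi0, f0, k = 0.5, 50.0, 200.0            # IP (rad), IF (Hz), IFR (Hz/s)
    rng = np.random.default_rng(0)
    x = np.exp(1j * (phi0 + 2 * np.pi * (f0 * t + 0.5 * k * t ** 2)))
    x += 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

    # Unwrap the noisy instantaneous phase, then smooth the (standardised)
    # phase-versus-time trend with an SVR.
    phase = np.unwrap(np.angle(x))
    mu, sd = phase.mean(), phase.std()
    svr = SVR(kernel="rbf", C=1e3, epsilon=1e-3)
    phase_smooth = svr.fit(t[:, None], (phase - mu) / sd).predict(t[:, None]) * sd + mu

    # Joint estimation: a quadratic phase fit gives IFR, IF and IP together.
    a2, a1, a0 = np.polyfit(t, phase_smooth, 2)
    print("IF  (Hz)  :", a1 / (2 * np.pi))   # generated with f0 = 50
    print("IFR (Hz/s):", a2 / np.pi)         # generated with k = 200
    print("IP  (rad) :", a0)                 # generated with phi0 = 0.5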

    Practical selection of SVM parameters and noise estimation for SVM regression

    Abstract: We investigate practical selection of hyper-parameters for support vector machine (SVM) regression (that is, the ε-insensitive zone and the regularization parameter C). The proposed methodology advocates analytic parameter selection directly from the training data, rather than the re-sampling approaches commonly used in SVM applications. In particular, we describe a new analytical prescription for setting the value of the insensitive zone ε as a function of the training sample size. Good generalization performance of the proposed parameter selection is demonstrated empirically using several low- and high-dimensional regression problems. Further, we point out the importance of Vapnik's ε-insensitive loss for regression problems with finite samples. To this end, we compare the generalization performance of SVM regression (using the proposed selection of ε) with regression using the 'least-modulus' loss (ε = 0) and the standard squared loss. These comparisons indicate superior generalization performance of SVM regression under sparse sample settings, for various types of additive noise.
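
    The sketch below implements one commonly quoted form of this analytic prescription: C is taken from the spread of the training responses, and ε from a k-nearest-neighbour noise estimate combined with the sample size. The exact constants follow the form usually cited for this prescription and should be treated as assumptions rather than a restatement of the paper.

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    def analytic_svr_params(X, y, k=3):
        y = np.asarray(y, dtype=float)
        n = len(y)
        # Regularization parameter from the spread of the responses.
        C = max(abs(y.mean() + 3 * y.std()), abs(y.mean() - 3 * y.std()))
        # Noise standard deviation from k-nearest-neighbour residuals.
        resid = y - KNeighborsRegressor(n_neighbors=k).fit(X, y).predict(X)
        noise_var = (n ** 0.2 * k) / (n ** 0.2 * k - 1.0) * np.mean(resid ** 2)
        # Insensitive zone grows with the noise level, shrinks with n.
        eps = 3.0 * np.sqrt(noise_var) * np.sqrt(np.log(n) / n)
        return C, eps

    # Example usage on a toy one-dimensional regression sample.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, (100, 1))
    y = np.sinc(2 * X[:, 0]) + 0.1 * rng.standard_normal(100)
    C, eps = analytic_svr_params(X, y)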

    A Full Characterization of Excess Risk via Empirical Risk Landscape

    In this paper, we provide a unified analysis of the excess risk of a model trained by a proper algorithm, for both smooth convex and non-convex loss functions. In contrast to existing bounds in the literature that depend on the number of iteration steps, our bounds on the excess risk do not diverge with the number of iterations. This underscores that, at least for smooth loss functions, the excess risk can be guaranteed after training. To obtain the bounds on the excess risk, we develop a technique based on algorithmic stability and a non-asymptotic characterization of the empirical risk landscape. With this technique, the model obtained by a proper algorithm is proved to generalize. Specifically, for non-convex losses, the conclusion is obtained via this technique together with an analysis of the stability of a constructed auxiliary algorithm. Combining this with some properties of the empirical risk landscape, we derive convergent upper bounds on the excess risk in both the convex and non-convex regimes with the help of classical optimization results.
    Comment: 38 pages
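
    For orientation, a standard decomposition of the excess risk into a stability-controlled generalization gap and an empirical optimization gap is written out below; this is the generic splitting, not necessarily the exact one used in the paper. Here R is the population risk, R_S the empirical risk on sample S, A(S) the model returned by the algorithm, and w* a population risk minimizer.

    \mathbb{E}_S\big[R(A(S))\big] - R(w^*)
      = \underbrace{\mathbb{E}_S\big[R(A(S)) - R_S(A(S))\big]}_{\text{generalization gap (algorithmic stability)}}
      + \underbrace{\mathbb{E}_S\big[R_S(A(S)) - R_S(w^*)\big]}_{\text{empirical optimization gap}},
    \qquad \text{using } \mathbb{E}_S\big[R_S(w^*)\big] = R(w^*).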

    Evaluating the quality of remote sensing-based agricultural water productivity data


    Model Selection and l1 Penalization for Individualized Treatment Rules.

    Because many illnesses show heterogeneous responses to treatment, there is increasing interest in individualizing treatment to patients. An individualized treatment rule is a decision rule that recommends treatment according to patient characteristics. Assuming high clinical outcomes are favorable, we consider the use of clinical trial data in the construction of an individualized treatment rule leading to the highest mean outcome. This is a difficult computational problem because the objective function is the expectation of a weighted indicator function that is non-concave in the parameters. To deal with the computational difficulty, we consider estimation based on minimization of a quadratic loss. This dissertation investigates model selection and L1 penalization techniques aimed at improving the quality of the quadratic loss minimization method. There are frequently many pretreatment variables that may or may not be useful in constructing an optimal individualized treatment rule, yet cost and interpretability considerations imply that only a few variables should be used by the treatment rule. In the first approach we consider the use of an L1 penalty in addition to the quadratic loss. Furthermore, although the quadratic minimization approach reduces the computational difficulty, it may deviate from the goal of estimating the best individualized treatment rule since a different loss function is used. In the second approach, we therefore consider the use of model selection techniques, where a treatment rule is obtained by minimizing the quadratic loss within each model and then a model is selected by maximizing the original objective function. To justify these two approaches, we provide finite sample upper bounds on the difference between the mean outcome due to the estimated individualized treatment rule and the mean outcome due to the optimal individualized treatment rule.
    Ph.D. dissertation in Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies.
    http://deepblue.lib.umich.edu/bitstream/2027.42/77919/1/minqian_1.pd
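
    A hedged sketch of the quadratic-loss-with-L1-penalty idea on simulated trial data: main effects and treatment-by-covariate interactions are fit with a Lasso, and the estimated rule recommends whichever treatment has the larger predicted outcome. The design matrix, penalty weight, and variable names are illustrative assumptions, not the dissertation's exact estimator.

    import numpy as np
    from sklearn.linear_model import Lasso

    # Toy trial: X covariates, A in {0,1} randomized treatment, Y outcome.
    rng = np.random.default_rng(1)
    n, p = 300, 10
    X = rng.standard_normal((n, p))
    A = rng.integers(0, 2, n).astype(float)
    Y = X[:, 0] + A * (1.0 - 2.0 * X[:, 1]) + rng.standard_normal(n)  # only X[:, 1] moderates treatment

    # Quadratic loss with an L1 penalty on main effects and interactions.
    design = np.hstack([X, A[:, None], A[:, None] * X])
    model = Lasso(alpha=0.05).fit(design, Y)

    def recommend(x):
        # Recommend the treatment with the larger predicted outcome.
        x = np.atleast_2d(x)
        a0 = np.hstack([x, np.zeros((len(x), 1)), np.zeros_like(x)])
        a1 = np.hstack([x, np.ones((len(x), 1)), x])
        return (model.predict(a1) > model.predict(a0)).astype(int)

    print(recommend(X[:5]))  # the rule should favour treatment when X[:, 1] < 0.5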

    Principal component based system identification and its application to the study of cardiovascular regulation

    System identification is an effective approach for the quantitative study of physiologic systems. It deals with the problem of building mathematical models based on observed data and enables a dynamical characterization of the underlying physiologic mechanisms specific to the individual being studied. In this thesis, we develop and validate a new linear time-invariant system identification approach which is based on a weighted-principal component regression (WPCR) method. An important feature of this approach is its asymptotic frequency-selective property in solving time-domain parametric system identification problems. Owing to this property, data-specific candidate models can be built by considering the dominant frequency components inherent in the input (and output) signals, which is advantageous when the signals are colored, as are most physiologic signals. The efficacy of this method in modeling open-loop and closed-loop systems is demonstrated with respect to simulated and experimental data. In conjunction with the WPCR-based system identification approach, we propose new methods to noninvasively quantify cardiac autonomic control. Such quantification is important in understanding basic pathophysiological mechanisms and in patient monitoring, treatment design, and follow-up. Our methods analyze the coupling between instantaneous lung volume and heart rate and, subsequently, derive representative indices of parasympathetic and sympathetic control based on physiological and experimental findings. The validity of each method is evaluated via experimental data collected following interventions with known effect on parasympathetic or sympathetic control. With the above techniques, this thesis explores an important topic in the field of space medicine: the effects of simulated microgravity on cardiac autonomic control and orthostatic intolerance (OI). Experimental data from a prolonged bed rest study (a simulation of microgravity conditions) are analyzed, and the conclusions are: 1) prolonged bed rest may impair autonomic control of heart rate; 2) orthostatic intolerance after bed rest is associated with impaired sympathetic responsiveness; 3) there may be a pre-bed rest predisposition to the development of OI after bed rest. These findings may have significance for studying Earth-bound orthostatic hypotension as well as for designing effective countermeasures to post-flight OI. In addition, they also indicate the efficacy of our proposed methods for autonomic function quantification.
    Thesis (Ph.D.), Harvard-MIT Division of Health Sciences and Technology, 2004. Includes bibliographical references (p. 197-212). By Xinshu Xiao.
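
    A hedged sketch of the principal-component regression step on an FIR model, under stated assumptions: the thesis' weighting of components is omitted (plain PCR on the dominant directions is used instead), and the memory length, component count, and simulated signals are arbitrary illustrations. With a colored input, the kept components concentrate on its dominant frequency content, which is the property the abstract highlights.

    import numpy as np

    def pcr_fir_identify(u, y, memory=30, n_keep=8):
        # Estimate h in y[n] = sum_k h[k] * u[n-k] + noise via principal-
        # component regression on the lagged-input matrix.
        rows = np.arange(memory - 1, len(u))
        U = np.column_stack([u[rows - k] for k in range(memory)])
        Uc = U - U.mean(axis=0)
        _, _, Vt = np.linalg.svd(Uc, full_matrices=False)
        V = Vt[:n_keep].T                                   # dominant directions
        beta = np.linalg.lstsq(Uc @ V, y[rows] - y[rows].mean(), rcond=None)[0]
        return V @ beta                                     # impulse-response estimate

    # Example: recover a known impulse response driven by a colored input.
    rng = np.random.default_rng(2)
    u = np.convolve(rng.standard_normal(2000), np.ones(5) / 5, mode="same")
    h_true = np.exp(-np.arange(30) / 5.0)
    y = np.convolve(u, h_true)[: len(u)] + 0.1 * rng.standard_normal(len(u))
    h_hat = pcr_fir_identify(u, y)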

    Comments on: “Model Complexity Control for Regression Using VC Generalization Bounds”

    In [1], various model selection approaches were experimentally compared; one of the criteria considered was the Schwarz Information Criterion (SIC), but it was implemented incorrectly. The same mistake has been repeated in other, more recent papers. Here, we show why the SIC formula originally employed was wrong and report the correct approach, which is well known in the statistics literature. We then show, by repeating several experiments of the original paper, that SIC performs far better than reported in [1]. Nevertheless, we confirm that VC-based model selection is more powerful than SIC, especially for small samples.
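
    For reference, a short worked example of the criterion in its standard statistics-textbook form for Gaussian regression, n * log(RSS / n) + d * log(n) for a model with d free parameters; whether this matches the comment's exact expression is an assumption of this sketch.

    import numpy as np

    def sic(y, y_hat, d):
        # Schwarz (Bayesian) information criterion, textbook Gaussian form.
        n = len(y)
        rss = np.sum((y - y_hat) ** 2)
        return n * np.log(rss / n) + d * np.log(n)

    # Example: choose a polynomial degree by minimizing SIC.
    rng = np.random.default_rng(3)
    x = np.linspace(-1.0, 1.0, 60)
    y = np.sin(np.pi * x) + 0.2 * rng.standard_normal(60)
    scores = {deg: sic(y, np.polyval(np.polyfit(x, y, deg), x), deg + 1)
              for deg in range(1, 10)}
    best_degree = min(scores, key=scores.get)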