171 research outputs found

    Découpage de courbes de densité : Application au dépistage du cancer [Splitting density curves: application to cancer screening]

    Screening for bronchopulmonary cancer is currently performed with a chest X-ray, a chest CT scan, and a cytological examination of sputum. Automated sputum cytology is a method that enables computer analysis of the cells of a sputum sample on a microscope slide. Since a person is represented by the set of cells on their slide, we found it worthwhile to use the probability density as the statistical unit. Functional data modelling, in which the statistical unit takes values in an infinite-dimensional space, suits this statistical problem well since, by definition, a probability density is a function. In this talk we present the supervised classification method for density curves that we developed to discriminate between people with cancer and healthy people, and we give some results obtained from real data.

    Weak pointwise consistency of the cross validatory window estimate in non parametric regression estimation

    Advances on nonparametric regression for functional variables

    We consider the problem of predicting a real random variable from a functional explanatory variable. The problem is attacked by means of a nonparametric kernel approach that has recently been adapted to this functional context. We derive theoretical results through a detailed asymptotic study of the behaviour of the estimate, including mean squared convergence (with rates and precise evaluation of the constant terms) as well as the asymptotic distribution. Practical use of these results relies on the ability to estimate these constants. Some perspectives in this direction are discussed, including the presentation of a functional version of bootstrapping ideas.
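
The kernel approach mentioned in this abstract can be illustrated with a Nadaraya-Watson type estimator for a functional predictor. This is a minimal sketch and not the authors' implementation. The following are all assumptions made for illustration: the L2 distance (the listing does not say which semi-metric is used), the quadratic kernel, the bandwidth `h`, the synthetic curves, and the function names.

```python
import numpy as np

def l2_distance(x, curves, dt):
    """Approximate L2 distance between curve x and each row of curves (Riemann sum)."""
    return np.sqrt(np.sum((curves - x) ** 2, axis=1) * dt)

def functional_kernel_regression(x_new, curves, y, h, dt):
    """Nadaraya-Watson type estimate of E[Y | X = x_new] for a functional X."""
    u = l2_distance(x_new, curves, dt) / h
    # Quadratic kernel supported on [0, 1] (an illustrative choice)
    w = np.where(u <= 1.0, 1.0 - u ** 2, 0.0)
    if w.sum() == 0.0:
        return float("nan")  # no training curve lies within distance h
    return float(np.dot(w, y) / w.sum())

# Synthetic example (hypothetical data): curves a * sin(2*pi*t), response a**2 + noise
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
dt = t[1] - t[0]
a = rng.uniform(0.0, 2.0, size=200)
curves = a[:, None] * np.sin(2 * np.pi * t)[None, :]
y = a ** 2 + rng.normal(0.0, 0.05, size=200)

x_new = np.sin(2 * np.pi * t)  # corresponds to a = 1, so the target is close to 1
pred = functional_kernel_regression(x_new, curves, y, h=0.2, dt=dt)
print(pred)
```

The estimate is a weighted average of the responses whose curves lie within distance `h` of the new curve. In practice the bandwidth and the semi-metric would both need to be chosen from the data.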

    A kNN procedure in semiparametric functional data analysis

    A fast and flexible kNN procedure is developed for dealing with a semiparametric functional regression model involving both partial-linear and single-index components. Rates of uniform consistency are presented. Simulated experiments highlight the advantages of the kNN procedure. A real data analysis is also shown. Comment: 14 pages, 1 figure, 6 tables

    Utilisation de tests de structure en régression sur variable fonctionnelle [Use of structural tests in regression on a functional variable]

    This work is concerned with the construction and use of structural tests in regression on a functional variable. We propose, in general terms, to build our test statistic from an estimator specific to the particular model whose validity we want to test and from functional kernel estimation methods. A theoretical result shows, under general assumptions, the asymptotic normality of our test statistic under the null hypothesis (that is, when the hypothesis on the model structure holds) and its divergence under local alternatives. This result makes it possible to construct structural tests of many kinds, for example to test whether the explanatory variable has no effect, whether this effect is linear, or whether the effect of the functional explanatory variable reduces to the effect of a few real-valued characteristics associated with it. Several resampling methods are proposed to compute the critical value of the test. The most suitable method (as judged from simulations) is then used in a study of spectrometric data. Using different tests built from the proposed approach provides partial answers to concrete questions related to these data. Finally, we discuss the points that can be improved and briefly present the perspectives opened by structural tests in procedures that aim to extract the features of the explanatory curve that matter for prediction, and also to choose the semi-metric.

    Choosing the most relevant level sets for depicting a sample of densities

    The final publication is available at link.springer.com. When exploring a sample composed of a set of bivariate density functions, visualising the data raises the question of which level set(s) are relevant. The approach proposed in this paper defines the optimal level set(s) as the one(s) allowing the best reconstruction of the whole density. A fully data-driven procedure is developed to estimate the link between the level set(s) and their corresponding density, to construct optimal level set(s), and to choose automatically the number of relevant level set(s). The method is based on recent advances in functional data analysis when both response and predictors are functional. After a detailed description of the methodology, finite sample studies are presented (including both real and simulated data), while theoretical studies are deferred to a final appendix. Peer Reviewed. Postprint (author's final draft)

    Sparse semiparametric regression when predictors are mixture of functional and high-dimensional variables

    This paper addresses dimensionality reduction in a regression setting when the predictors are a mixture of a functional variable and a high-dimensional vector. A flexible model, combining sparse linear ideas with semiparametrics, is proposed. A wide scope of asymptotic results is provided, covering both the rates of convergence of the estimators and the asymptotic behaviour of the variable selection procedure. Practical issues are analysed through finite sample simulated experiments, while an application to Tecator's data illustrates the usefulness of our methodology. Comment: 40 pages, 7 figures, 5 tables

    Fast and efficient algorithms for sparse semiparametric bi-functional regression

    A new sparse semiparametric model is proposed, which incorporates the influence of two functional random variables on a scalar response in a flexible and interpretable manner. One of the functional covariates is included through a single-index structure, while the other is included linearly through the high-dimensional vector formed by its discretised observations. For this model, two new algorithms are presented for selecting relevant variables in the linear part and estimating the model. Both procedures utilise the functional origin of the linear covariates. Finite sample experiments demonstrated the scope of application of both algorithms: the first is a fast algorithm that addresses (without loss in predictive ability) the significant computational time required by standard variable selection methods for estimating this model, and the second completes the set of relevant linear covariates provided by the first, thus improving its predictive efficiency. Some asymptotic results theoretically support both procedures. A real data application demonstrated the applicability of the presented methodology from a predictive perspective, in terms of the interpretability of outputs and low computational cost. Comment: 33 pages, 6 figures, 10 tables

    Variable selection in functional regression models: a review

    Despite various similar features, Functional Data Analysis and High-Dimensional Data Analysis are two major fields in Statistics that have recently grown almost independently of each other. The aim of this paper is to survey methodological advances for variable selection in functional regression, a question on which functional and multivariate ideas typically meet. More than a simple survey, this paper aims to promote further links between both areas. Comment: 22 pages

    Optimal level sets for bivariate density representation

    In bivariate density representation there is an extensive literature on level set estimation when the level is fixed, but much less on choosing which level is (or which levels are) of most interest. This is an important practical question which depends on the kind of problem one has to deal with as well as the kind of feature one wishes to highlight in the density. Answering it requires both a definition of what the optimal level is and the construction of a method for finding it. We consider two scenarios for this problem. The first corresponds to situations in which one has just a single density function to be represented. However, as a result of technical progress in data collection, problems are emerging in which one has to deal with a sample of densities. In these situations, the need arises to develop a joint representation for all these densities, and this is the second scenario considered in this paper. For each case, we provide consistency results for the estimated levels and present extensive Monte Carlo simulation experiments illustrating the interest and feasibility of the proposed method. (C) 2015 Elsevier Inc. All rights reserved. Peer Reviewed. Postprint (author's final draft)