
    Testing the order of a model

    This paper deals with order identification for nested models in the i.i.d. framework. We study the asymptotic efficiency of two generalized likelihood ratio tests of the order. They are based on two estimators which are proved to be strongly consistent. A version of Stein's lemma yields an optimal underestimation error exponent. The lemma also implies that the overestimation error exponent is necessarily trivial. Our tests admit nontrivial underestimation error exponents. The optimal underestimation error exponent is achieved in some situations. The overestimation error can decay exponentially with respect to a positive power of the number of observations. These results are proved under mild assumptions by relating the underestimation (resp. overestimation) error to large (resp. moderate) deviations of the log-likelihood process. In particular, it is not necessary that the classical Cramér condition be satisfied; namely, the log-densities are not required to admit every exponential moment. Three benchmark examples with specific difficulties (location mixture of normal distributions, abrupt changes and various regressions) are detailed so as to illustrate the generality of our results. Comment: Published at http://dx.doi.org/10.1214/009053606000000344 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
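    For orientation, here is a minimal sketch of the standard definitions behind these statements; the notation (an order estimator \hat{k}_n, true order k^*) is ours and not taken from the paper.

```latex
% Hedged sketch of standard error-exponent definitions; \hat{k}_n and k^* are
% illustrative notation, not the paper's.
\[
  \underline{E} \;=\; \liminf_{n\to\infty} \; -\tfrac{1}{n}\,
      \log \mathbb{P}\bigl(\hat{k}_n < k^\ast\bigr)
  \qquad \text{(underestimation error exponent, large-deviation scale)},
\]
\[
  \overline{E}(\alpha) \;=\; \liminf_{n\to\infty} \; -\tfrac{1}{n^{\alpha}}\,
      \log \mathbb{P}\bigl(\hat{k}_n > k^\ast\bigr), \quad \alpha \in (0,1)
  \qquad \text{(overestimation error, moderate-deviation scale)}.
\]
```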

    Faster Rates for Policy Learning

    This article improves the existing proven rates of regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also give a result, drawn from the classification literature, showing that faster regret decay is possible via plug-in estimation provided a margin condition holds. Four examples are considered. In these examples, the regret is expressed in terms of either the mean value or the median value; the number of possible actions is either two or finitely many; and the sampling scheme is either independent and identically distributed or sequential, where the latter represents a contextual bandit sampling scheme.
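    To fix ideas, here is a hedged sketch of the objects involved, in notation of our own choosing: the value V(d) of a policy d, the regret of an estimated policy \hat{d}_n, and a classification-style margin condition for the two-action case.

```latex
% Illustrative notation only: V, d^*, \hat{d}_n and the margin exponent \alpha
% are not taken from the article.
\[
  \mathrm{Regret}(\hat{d}_n) \;=\; V(d^\ast) - V(\hat{d}_n),
  \qquad d^\ast \in \operatorname*{arg\,max}_{d \in \mathcal{D}} V(d),
\]
\[
  \mathbb{P}\Bigl( 0 < \bigl|\mathbb{E}[Y \mid A=1, W] - \mathbb{E}[Y \mid A=0, W]\bigr| \le t \Bigr)
  \;\lesssim\; t^{\alpha}, \qquad t \downarrow 0.
\]
% Under such a margin condition, plug-in policy estimators can achieve regret decaying
% faster than the n^{-1/2} standard error of an efficient estimator of V(d^*).
```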

    Classification in postural style

    This article contributes to the search for a notion of postural style, focusing on the issue of classifying subjects in terms of how they maintain posture. Longer term, the hope is to make it possible to determine on a case by case basis which sensorial information is prevalent in postural control, and to improve/adapt protocols for functional rehabilitation among those who show deficits in maintaining posture, typically seniors. Here, we specifically tackle the statistical problem of classifying subjects sampled from a two-class population. Each subject (enrolled in a cohort of 54 participants) undergoes four experimental protocols which are designed to evaluate potential deficits in maintaining posture. These protocols result in four complex trajectories, from which we can extract four small-dimensional summary measures. Because undergoing several protocols can be unpleasant, and sometimes painful, we try to limit the number of protocols needed for the classification. Therefore, we first rank the protocols in decreasing order of relevance, then we derive four plug-in classifiers which involve the best (i.e., most informative) protocol, the two best, the three best, and all four protocols. This two-step procedure relies on the cutting-edge methodologies of targeted maximum likelihood learning (a methodology for robust and efficient inference) and super-learning (a machine learning procedure for aggregating various estimation procedures into a single better estimation procedure). A simulation study is carried out. The performances of the procedure applied to the real data set (and evaluated by the leave-one-out rule) go as high as an 87% rate of correct classification (47 out of 54 subjects correctly classified), using only the best protocol. Comment: Published at http://dx.doi.org/10.1214/12-AOAS542 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
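    As a loose illustration of the aggregation idea only (not the authors' code), a cross-validated stacking classifier can stand in for the super learner; the candidate library, simulated features and labels below are invented for the example.

```python
# Hedged sketch: a scikit-learn stacking ensemble as a stand-in for super learning,
# evaluated by the leave-one-out rule. Candidates, features and labels are illustrative.
import numpy as np
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(54, 4))      # four summary measures, one per protocol
y = rng.integers(0, 2, size=54)   # two-class population label

candidates = [
    ("logit", LogisticRegression(max_iter=1000)),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
]
# The meta-learner aggregates cross-validated candidate predictions.
super_learner = StackingClassifier(
    estimators=candidates,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Leave-one-out estimate of the rate of correct classification.
accuracy = cross_val_score(super_learner, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {accuracy:.2f}")
```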

    Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits

    We study a generalization of the multi-armed bandit problem with multiple plays where there is a cost associated with pulling each arm and the agent has a budget at each time that dictates how much she can expect to spend. We derive an asymptotic regret lower bound for any uniformly efficient algorithm in our setting. We then study a variant of Thompson sampling for Bernoulli rewards and a variant of KL-UCB for both single-parameter exponential families and bounded, finitely supported rewards. We show these algorithms are asymptotically optimal, both in rate and in the leading problem-dependent constants, including in the thick margin setting where multiple arms fall on the decision boundary.
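    To make the setting concrete, here is a much-simplified, hedged sketch of Thompson sampling with Beta posteriors for Bernoulli rewards, where a per-round budget limits the total known pulling costs; the arm means, costs and budget are invented, and the greedy selection below is not the paper's algorithm.

```python
# Hedged sketch: Thompson sampling for Bernoulli rewards with a per-round budget
# on known pulling costs. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(1)
means = np.array([0.2, 0.5, 0.7, 0.4])   # unknown Bernoulli reward means
costs = np.array([1.0, 1.0, 2.0, 1.5])   # known cost of pulling each arm
budget_per_round = 3.0

alpha = np.ones(len(means))               # Beta(1, 1) priors
beta = np.ones(len(means))

for t in range(10_000):
    theta = rng.beta(alpha, beta)         # posterior samples for each arm
    # Greedily pick arms by sampled reward per unit cost until the budget is spent.
    spent, chosen = 0.0, []
    for a in np.argsort(-theta / costs):
        if spent + costs[a] <= budget_per_round:
            chosen.append(a)
            spent += costs[a]
    for a in chosen:                       # pull the chosen arms, update posteriors
        reward = float(rng.random() < means[a])
        alpha[a] += reward
        beta[a] += 1.0 - reward

print("posterior means:", (alpha / (alpha + beta)).round(2))
```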

    Practical targeted learning from large data sets by survey sampling

    We address the practical construction of asymptotic confidence intervals for smooth (i.e., path-wise differentiable), real-valued statistical parameters by targeted learning from independent and identically distributed data in contexts where the sample size is so large that it poses computational challenges. We observe some summary measure of all data and select a sub-sample from the complete data set by Poisson rejective sampling with unequal inclusion probabilities based on the summary measures. Targeted learning is carried out from the easier-to-handle sub-sample. We derive a central limit theorem for the targeted minimum loss estimator (TMLE) which enables the construction of the confidence intervals. The inclusion probabilities can be optimized to reduce the asymptotic variance of the TMLE. We illustrate the procedure with two examples where the parameters of interest are variable importance measures of an exposure (binary or continuous) on an outcome. We also conduct a simulation study and comment on its results. Keywords: semiparametric inference; survey sampling; targeted minimum loss estimation (TMLE).
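    For intuition about the sub-sampling step, here is a hedged toy sketch of our own (plain Poisson sampling rather than the rejective variant, and an inverse-probability-weighted mean rather than a TMLE); the population, summary measure and outcome are simulated.

```python
# Hedged toy sketch: Poisson sampling with unequal inclusion probabilities based on a
# cheap summary measure, then a Horvitz-Thompson-style weighted mean on the sub-sample.
import numpy as np

rng = np.random.default_rng(2)
N = 1_000_000
summary = rng.exponential(size=N)              # cheap summary measure of each unit
outcome = 2.0 * summary + rng.normal(size=N)   # full data, expensive to process

target_size = 10_000
# Inclusion probabilities proportional to the summary measure, capped at 1.
pi = np.minimum(1.0, target_size * summary / summary.sum())
selected = rng.random(N) < pi                  # independent Bernoulli draws: Poisson sampling

# Inverse-probability weighting corrects for the unequal inclusion probabilities.
ht_mean = np.sum(outcome[selected] / pi[selected]) / N
print(f"sub-sample size: {selected.sum()}")
print(f"weighted mean: {ht_mean:.3f}  vs. full-data mean: {outcome.mean():.3f}")
```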

    Microcirculation in septic shock patients at two mean arterial pressure targets of 65 and 85 mmHg

    Septic shock is characterized by both macro- and microcirculatory abnormalities. Near-infrared spectroscopy (NIRS) provides an assessment of the microcirculation at the thenar eminence. The technique measures muscle oxygen saturation (StO2) and the resaturation slope during a vascular occlusion test, which is of physiological (as an indicator of microcirculatory reserve) and prognostic interest. The main objective of this study was to compare the microcirculatory state, and in particular the StO2 resaturation slope, of septic shock patients at two target mean arterial pressure (MAP) levels: 65 mmHg and 85 mmHg. Twenty-two patients were included in the early phase of septic shock, after restoration of a MAP of 65 mmHg. Overall, we found an improvement of the StO2 resaturation slope at a MAP of 85 mmHg vs. 65 mmHg (2.6 [1.5] vs. 3 [1.4]; p=0.021). There was no significant difference in the other microcirculatory variables (StO2, occlusion slope, hyperemia area) between the two MAP levels. However, in some patients the resaturation slope was clearly better (higher) at a MAP of 65 mmHg vs. 85 mmHg, and we could not identify the characteristics of this subpopulation. Altogether, targeting a MAP of 85 mmHg is globally associated with a better microcirculatory state than a MAP of 65 mmHg, especially in terms of the StO2 resaturation slope. The strong inter-individual variability argues for a personalized assessment of the microcirculation in order to better define the MAP level to be targeted when resuscitating septic shock with fluid loading and vasopressors.

    Flood forecasting for the Oued Dis (Sebaou) watershed by the FDTF (DPFT) method

    La modĂ©lisation pluie-dĂ©bit dans le cas de la prĂ©vision des crues peut ĂȘtre Ă©tudiĂ©e par la mĂ©thode DPFT (diffĂ©rence premiĂšre de la fonction de transfert) qui est une extension de la mĂ©thode de l'hydrogramme unitaire. Contrairement aux autres mĂ©thodes, la mĂ©thode DPFT permet d'obtenir Ă  la fois la fonction de transfert Ă  travers sa diffĂ©rence premiĂšre (DPFT) et une sĂ©rie des pluies efficaces. L'avantage principal de la formulation en diffĂ©rences est la diminution de l'auto corrĂ©lation des dĂ©bits successifs et des coefficients de la fonction de transfert.L'algorithme de calcul procĂšde par itĂ©rations en rĂ©solvant alternativement un systĂšme multi-Ă©vĂ©nements qui identifie la fonction de transfert et un systĂšme de dĂ©convolution qui estime une sĂ©rie de pluies efficaces, cette fois-ci crue par crue. L'initialisation se fait Ă  l'aide des pluies brutes comme premiĂšre approximation des pluies efficaces puisque les rĂ©sultats ne dĂ©pendent que des variations des dĂ©bits. La convergence de l'algorithme est Ă©tablie aisĂ©ment lorsqu'on applique les diffĂ©rentes contraintes (positivitĂ© des ordonnĂ©es de la fonction de transfert et des pluies efficaces; normalisation de la fonction de transfert).La mĂ©thode DPFT ne nĂ©cessite donc que les mesures des pluies brutes et les dĂ©bits pour effectuer les identifications de la fonction de transfert et des pluies efficaces. Elle n'impose pas de prĂ©ciser la fonction de production. Une fois la fonction de transfert calĂ©e et les pluies efficaces estimĂ©es par la DPFT, l'ajustement de la fonction de production se fait par la suite en rĂ©solvant un problĂšme du type entrĂ©e-sortie.Une application de la mĂ©thode DPFT est faite sur le bassin versant de Oued Dis (Sebaou) dans le but de tester les performances de cette mĂ©thode sur des donnĂ©es rĂ©elles, sachant qu'on a obtenu une confirmation assez rigoureuse de ces propriĂ©tĂ©s sur des donnĂ©es synthĂ©tiques gĂ©nĂ©rĂ©es. Les rĂ©sultats de l'identification de la fonction de transfert sont satisfaisants tandis que ceux de l'ajustement de la fonction de production sont moins satisfaisants, ce qui a influencĂ© directement la qualitĂ© des rĂ©sultats de validation.Rainfall-runoff modelling in the case of flood forecasting may be studied by the first difference of the transfer function (FDTF) method, which is an extension of the unit hydrograph approach. In contrast to other methods, the FDFT method simultaneously provides both the transfer model by its first difference and an excess precipitation series. The principal advantage of the difference formulation is the diminution of the autocorrelation between successive flow data and the transfer function coefficients.The compilation algorithm proceeds iteratively by resolving alternately a multievent system which identifies the transfer function, and a deconvolution system which assesses the excess precipitation series event by event The initiatization is done with the total precipitation as a first approximation of the excess precipitations, since the results are only dependent on flood variations. The algorithm convergence is easily established if the various constraints are applied (positive values for the transfer function coefficients and the excess precipitations; normalization of the transfer function).Thus, the FDFT method only requires total precipitation and flood data in order to generate the transfer function and quantify the excess precipitation. It doesn't require that the production function be specified. 
once the Transfer fonction is calibrated and the excess precipitation estimated, the production function adjustment is carried out by resolving an input-output model type.The FDTF method has previously been applied successfully to simulated data. In the present study, the method has been applied to the Oued Dis watershed (Sebaou, Algeria) in order to test its performance using real data. The transfer function identification results proved satisfactory, but those related to the producton function adjustment were less satisfactory and degraded the overall quality ofthe validation results
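    As a rough illustration of the alternating identification idea on synthetic data (our own toy version, not the DPFT/FDTF implementation, which works on first differences of the flows and on several events at once):

```python
# Hedged toy sketch: alternately identify a unit-hydrograph-type transfer function
# (non-negative, normalized) and deconvolve excess precipitation, on synthetic data.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
T, K = 60, 8                                   # time steps, transfer-function length
true_h = np.exp(-np.arange(K) / 2.0)
true_h /= true_h.sum()
rain = rng.gamma(shape=0.5, scale=2.0, size=T)       # total precipitation
excess_true = 0.6 * rain                             # unobserved excess precipitation
flow = np.convolve(excess_true, true_h)[:T] + 0.01 * rng.normal(size=T)

def conv_matrix(x, ncols, nrows):
    """Matrix M with M[t, j] = x[t - j], so that M @ v approximates convolve(x, v)[:nrows]."""
    M = np.zeros((nrows, ncols))
    for j in range(ncols):
        seg = x[: nrows - j]
        M[j : j + len(seg), j] = seg
    return M

excess = rain.copy()                           # initialize with total precipitation
for _ in range(20):
    h, _ = nnls(conv_matrix(excess, K, T), flow)     # identify the transfer function
    h /= h.sum() if h.sum() > 0 else 1.0             # normalization constraint
    excess, _ = nnls(conv_matrix(h, T, T), flow)     # deconvolve excess precipitation

print("estimated transfer function:", h.round(3))
print("true transfer function:     ", true_h.round(3))
```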

    Quantile Super Learning for independent and online settings with application to solar power forecasting

    Estimating quantiles of an outcome conditional on covariates is of fundamental interest in statistics, with broad application in probabilistic prediction and forecasting. We propose an ensemble method for conditional quantile estimation, Quantile Super Learning, that combines predictions from multiple candidate algorithms based on their empirical performance, measured by cross-validated empirical risk under the quantile loss function. We present theoretical guarantees for both iid and online data scenarios. The performance of our approach for quantile estimation and in forming prediction intervals is tested in simulation studies. Two case studies related to solar energy are used to illustrate Quantile Super Learning: in an iid setting, we predict the physical properties of perovskite materials for photovoltaic cells, and in an online setting we forecast ground solar irradiance based on output from dynamic weather ensemble models.
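    To fix notation, here is a hedged toy sketch of the building blocks (our own, not the authors' implementation): the quantile (pinball) loss and a cross-validated search for convex combination weights over two illustrative candidate learners.

```python
# Hedged toy sketch: combine two conditional-quantile predictors by minimizing
# cross-validated pinball (quantile) loss over a grid of convex weights.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import QuantileRegressor
from sklearn.model_selection import KFold

def pinball_loss(y, q, tau):
    """Average quantile loss at level tau."""
    diff = y - q
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=500)
tau = 0.9

candidates = [
    QuantileRegressor(quantile=tau, alpha=0.0),
    GradientBoostingRegressor(loss="quantile", alpha=tau, random_state=0),
]
weights = np.linspace(0.0, 1.0, 21)
cv_risk = np.zeros_like(weights)

for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    preds = np.column_stack(
        [m.fit(X[train], y[train]).predict(X[test]) for m in candidates]
    )
    for i, w in enumerate(weights):
        cv_risk[i] += pinball_loss(y[test], w * preds[:, 0] + (1 - w) * preds[:, 1], tau)

best_w = weights[np.argmin(cv_risk)]
print(f"weight on the linear quantile regressor: {best_w:.2f}")
```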

    Estimation and Testing in Targeted Group Sequential Covariate-adjusted Randomized Clinical Trials

    This article is devoted to the construction and asymptotic study of adaptive group sequential covariate-adjusted randomized clinical trials analyzed through the prism of the semiparametric methodology of targeted maximum likelihood estimation (TMLE). We show how to build, as the data accrue group-sequentially, a sampling design which targets a user-supplied optimal design. We also show how to carry out a sound TMLE statistical inference based on such an adaptive sampling scheme (thereby extending results so far known only in the i.i.d. setting), and how group-sequential testing applies on top of it. The procedure is robust (i.e., consistent even if the working model is misspecified). A simulation study confirms the theoretical results, and validates the conjecture that the procedure may also be efficient.
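    As background on the TMLE machinery invoked here, a hedged, generic sketch of the targeting step for a treatment-specific mean (standard in the TMLE literature, not this article's adaptive group sequential extension):

```latex
% Generic TMLE targeting step (sketch); notation is the usual one for a
% treatment-specific mean, not this article's covariate-adjusted adaptive design.
\[
  \Psi(P) \;=\; \mathbb{E}_{P}\bigl[\, \mathbb{E}_{P}[\, Y \mid A = 1, W \,] \,\bigr],
\]
\[
  \operatorname{logit} \bar{Q}_n^{\ast}(A, W) \;=\;
  \operatorname{logit} \bar{Q}_n^{0}(A, W) \;+\; \varepsilon_n \, H_n(A, W),
  \qquad
  H_n(A, W) \;=\; \frac{\mathbf{1}\{A = 1\}}{g_n(1 \mid W)},
\]
% \varepsilon_n is fitted by maximum likelihood, the parameter estimate is the
% plug-in \Psi evaluated at the updated \bar{Q}_n^\ast, and inference relies on
% the efficient influence curve of \Psi.
```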
    • 
