Testing the order of a model
This paper deals with order identification for nested models in the i.i.d.
framework. We study the asymptotic efficiency of two generalized likelihood
ratio tests of the order. They are based on two estimators which are proved to
be strongly consistent. A version of Stein's lemma yields an optimal
underestimation error exponent. The lemma also implies that the overestimation
error exponent is necessarily trivial. Our tests admit nontrivial
underestimation error exponents. The optimal underestimation error exponent is
achieved in some situations. The overestimation error can decay exponentially
with respect to a positive power of the number of observations. These results
are proved under mild assumptions by relating the underestimation (resp.
overestimation) error to large (resp. moderate) deviations of the
log-likelihood process. In particular, it is not necessary that the classical
Cram\'{e}r condition be satisfied; namely, the log-densities are not
required to admit every exponential moment. Three benchmark examples with
specific difficulties (location mixture of normal distributions, abrupt changes
and various regressions) are detailed so as to illustrate the generality of our
results.
Comment: Published at http://dx.doi.org/10.1214/009053606000000344 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
Faster Rates for Policy Learning
This article improves the existing proven rates of regret decay in optimal
policy estimation. We give a margin-free result showing that the regret decay
for estimating a within-class optimal policy is second-order for empirical risk
minimizers over Donsker classes, with regret decaying at a faster rate than the
standard error of an efficient estimator of the value of an optimal policy. We
also give a result from the classification literature that shows that faster
regret decay is possible via plug-in estimation provided a margin condition
holds. Four examples are considered. In these examples, the regret is expressed
in terms of either the mean value or the median value; the number of possible
actions is either two or finitely many; and the sampling scheme is either
independent and identically distributed or sequential, where the latter
represents a contextual bandit sampling scheme.
Classification in postural style
This article contributes to the search for a notion of postural style,
focusing on the issue of classifying subjects in terms of how they maintain
posture. Longer term, the hope is to make it possible to determine on a
case-by-case basis which sensory information is prevalent in postural control, and to
improve/adapt protocols for functional rehabilitation among those who show
deficits in maintaining posture, typically seniors. Here, we specifically
tackle the statistical problem of classifying subjects sampled from a two-class
population. Each subject (enrolled in a cohort of 54 participants) undergoes
four experimental protocols which are designed to evaluate potential deficits
in maintaining posture. These protocols result in four complex trajectories,
from which we can extract four small-dimensional summary measures. Because
undergoing several protocols can be unpleasant, and sometimes painful, we try
to limit the number of protocols needed for the classification. Therefore, we
first rank the protocols by decreasing order of relevance, then we derive four
plug-in classifiers which involve the best (i.e., most informative), the two
best, the three best and all four protocols. This two-step procedure relies on
the cutting-edge methodologies of targeted maximum likelihood learning (a
methodology for robust and efficient inference) and super-learning (a machine
learning procedure for aggregating various estimation procedures into a single
better estimation procedure). A simulation study is carried out. The
performances of the procedure applied to the real data set (and evaluated by
the leave-one-out rule) go as high as an 87% rate of correct classification (47
out of 54 subjects correctly classified), using only the best protocol.
Comment: Published at http://dx.doi.org/10.1214/12-AOAS542 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
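The 87% figure above is evaluated by the leave-one-out rule. As a minimal sketch of that evaluation scheme (function names are illustrative, not from the paper): each subject is classified by a procedure trained on all remaining subjects, and accuracy is the fraction classified correctly.

```python
def leave_one_out_accuracy(data, labels, fit_and_predict):
    """Leave-one-out evaluation: classify each observation with a model
    trained on all the others; return the fraction classified correctly.

    fit_and_predict(train_data, train_labels, x) -> predicted label.
    """
    n = len(data)
    correct = 0
    for i in range(n):
        # Hold out observation i, train on the rest.
        train_x = data[:i] + data[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if fit_and_predict(train_x, train_y, data[i]) == labels[i]:
            correct += 1
    return correct / n
```

Any classifier (here, anything matching the `fit_and_predict` signature) can be plugged in; the paper's classifiers are built by targeted maximum likelihood learning and super-learning.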
Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits
We study a generalization of the multi-armed bandit problem with multiple
plays where there is a cost associated with pulling each arm and the agent has
a budget at each time that dictates how much she can expect to spend. We derive
an asymptotic regret lower bound for any uniformly efficient algorithm in our
setting. We then study a variant of Thompson sampling for Bernoulli rewards and
a variant of KL-UCB for both single-parameter exponential families and bounded,
finitely supported rewards. We show these algorithms are asymptotically
optimal, both in rate and leading problem-dependent constants, including in the
thick margin setting where multiple arms fall on the decision boundary.
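The Thompson-sampling variant mentioned above can be sketched for Bernoulli rewards as follows. This is an illustrative assumption, not the paper's exact algorithm: the function name and the greedy reward-per-cost selection rule are ours, and the paper's budget constraint is in expectation rather than the hard per-round cap used here.

```python
import random

def run_budgeted_ts(probs, costs, budget, horizon, seed=0):
    """Thompson sampling with per-arm pull costs and a per-round budget.

    Each round, sample a mean from each arm's Beta posterior, then greedily
    pull the arms with the highest sampled reward-to-cost ratio that still
    fit within the budget. Returns the total reward collected.
    """
    rng = random.Random(seed)
    k = len(probs)
    succ = [1] * k  # Beta(1, 1) priors on each arm's mean
    fail = [1] * k
    total_reward = 0
    for _ in range(horizon):
        theta = [rng.betavariate(succ[i], fail[i]) for i in range(k)]
        # Greedy knapsack heuristic: best sampled reward per unit cost first.
        order = sorted(range(k), key=lambda i: theta[i] / costs[i], reverse=True)
        spent = 0.0
        for i in order:
            if spent + costs[i] <= budget:
                spent += costs[i]
                r = 1 if rng.random() < probs[i] else 0
                total_reward += r
                succ[i] += r
                fail[i] += 1 - r
    return total_reward
```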
Practical targeted learning from large data sets by survey sampling
We address the practical construction of asymptotic confidence intervals for
smooth (i.e., path-wise differentiable), real-valued statistical parameters by
targeted learning from independent and identically distributed data in contexts
where sample size is so large that it poses computational challenges. We
observe some summary measure of all data and select a sub-sample from the
complete data set by Poisson rejective sampling with unequal inclusion
probabilities based on the summary measures. Targeted learning is carried out
from the easier to handle sub-sample. We derive a central limit theorem for the
targeted minimum loss estimator (TMLE) which enables the construction of the
confidence intervals. The inclusion probabilities can be optimized to reduce
the asymptotic variance of the TMLE. We illustrate the procedure with two
examples where the parameters of interest are variable importance measures of
an exposure (binary or continuous) on an outcome. We also conduct a simulation
study and comment on its results.
Keywords: semiparametric inference; survey sampling; targeted minimum loss
estimation (TMLE).
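As a minimal sketch of the sampling-and-weighting idea above (assumed names; the paper uses the rejective variant of Poisson sampling, which conditions on a fixed sample size, whereas this sketch is plain Poisson sampling): units are included independently with unequal probabilities, and estimation from the sub-sample reweights each unit by its inverse inclusion probability, Horvitz-Thompson style.

```python
import random

def poisson_sample(incl_probs, seed=0):
    """Poisson sampling: include unit i independently with
    probability incl_probs[i]; return the sampled indices."""
    rng = random.Random(seed)
    return [i for i, p in enumerate(incl_probs) if rng.random() < p]

def horvitz_thompson_mean(values, incl_probs, sample):
    """Estimate the population mean from the sampled indices, weighting
    each sampled value by the inverse of its inclusion probability
    (unbiased over the sampling design)."""
    n = len(values)
    return sum(values[i] / incl_probs[i] for i in sample) / n
```

The TMLE analysis in the paper replaces this simple mean with a targeted estimator, but the inverse-probability weighting of the sub-sample plays the same role.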
Microcirculation in patients in septic shock at two mean arterial pressure targets of 65 and 85 mmHg
Septic shock is characterized by both macro- and microcirculatory abnormalities. Near-infrared spectroscopy (NIRS) makes it possible to assess the microcirculation at the thenar eminence. This technique measures muscle oxygen saturation (StO2) and the resaturation slope during a vascular occlusion test, which is of both physiological interest (as an indicator of microcirculatory reserve) and prognostic value. The main objective of this study was to compare the microcirculatory state, and in particular the resaturation slope, of patients in septic shock at two target MAP levels: 65 mmHg and 85 mmHg. Twenty-two patients were included in the early phase of septic shock, after a MAP of 65 mmHg had been restored. Overall, we found an improvement in the StO2 resaturation slope at a MAP of 85 mmHg vs. 65 mmHg (2.6 [1.5] vs. 3 [1.4]; p=0.021). There was no significant difference in the other microcirculatory variables (StO2, occlusion slope, hyperemia area) between the two MAP levels. However, in some patients the resaturation slope was clearly better (higher) at a MAP of 65 mmHg vs. 85 mmHg. We could not identify the characteristics of this subpopulation. In all, targeting a MAP of 85 mmHg is overall associated with a better microcirculatory state than a MAP of 65 mmHg, especially in terms of the StO2 resaturation slope. There is strong inter-individual variability, arguing for an individualized assessment of the microcirculation in order to better define the MAP level that resuscitation of septic shock with fluid loading and vasopressors should target.
Flood forecasting for the Oued Dis (Sebaou) watershed by the FDTF method
Rainfall-runoff modelling in the case of flood forecasting may be studied by the first difference of the transfer function (FDTF) method, which is an extension of the unit hydrograph approach. In contrast to other methods, the FDTF method simultaneously provides both the transfer function, through its first difference, and an excess precipitation series. The principal advantage of the difference formulation is the reduction of the autocorrelation between successive flow data and between the transfer function coefficients. The algorithm proceeds iteratively by alternately solving a multi-event system, which identifies the transfer function, and a deconvolution system, which estimates the excess precipitation series, this time event by event. Initialization uses the total precipitation as a first approximation of the excess precipitation, since the results depend only on flow variations. The convergence of the algorithm is easily established when the various constraints are applied (positive values for the transfer function coefficients and the excess precipitation; normalization of the transfer function). Thus, the FDTF method only requires total precipitation and flow data in order to identify the transfer function and quantify the excess precipitation. It does not require that the production function be specified. Once the transfer function is calibrated and the excess precipitation estimated, the production function is fitted afterwards by solving an input-output problem. The method's properties had previously been confirmed fairly rigorously on synthetic data; in the present study, it is applied to the Oued Dis (Sebaou) watershed in order to test its performance on real data.
The transfer function identification results proved satisfactory, but those related to the production function adjustment were less so, which directly degraded the quality of the validation results.
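The input-output relation underlying unit-hydrograph methods such as the one above can be sketched in a few lines (an illustrative assumption on our part, not the FDTF identification algorithm itself): runoff is the discrete convolution of the excess precipitation series with a nonnegative, normalized transfer function.

```python
def convolve_runoff(excess_rain, transfer):
    """Discrete convolution q[t + k] += transfer[k] * excess_rain[t]:
    each time step of excess rainfall is spread over the following
    steps according to the transfer function (unit hydrograph)."""
    n, m = len(excess_rain), len(transfer)
    q = [0.0] * (n + m - 1)
    for t in range(n):
        for k in range(m):
            q[t + k] += transfer[k] * excess_rain[t]
    return q
```

When the transfer function sums to one (the normalization constraint mentioned in the abstract), total runoff volume equals total excess rainfall; identifying `transfer` and `excess_rain` from observed `q` is the deconvolution problem the FDTF iterations solve.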
Quantile Super Learning for independent and online settings with application to solar power forecasting
Estimating quantiles of an outcome conditional on covariates is of
fundamental interest in statistics with broad application in probabilistic
prediction and forecasting. We propose an ensemble method for conditional
quantile estimation, Quantile Super Learning, that combines predictions from
multiple candidate algorithms based on their empirical performance measured
with respect to a cross-validated empirical risk of the quantile loss function.
We present theoretical guarantees for both iid and online data scenarios. The
performance of our approach for quantile estimation and in forming prediction
intervals is tested in simulation studies. Two case studies related to solar
energy are used to illustrate Quantile Super Learning: in an iid setting, we
predict the physical properties of perovskite materials for photovoltaic cells,
and in an online setting we forecast ground solar irradiance based on output
from dynamic weather ensemble models.
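The empirical risk referred to above is built from the quantile ("pinball") loss. As a minimal sketch (names are ours, and this is the discrete step of super-learning, which selects the single best candidate; the paper's ensemble can combine candidates more generally): score each candidate quantile predictor on held-out data and keep the one with the smallest empirical quantile risk.

```python
def pinball_loss(y, pred, tau):
    """Quantile loss at level tau: tau * (y - pred) when the prediction
    is below y, (1 - tau) * (pred - y) otherwise."""
    diff = y - pred
    return tau * diff if diff >= 0 else (tau - 1) * diff

def discrete_super_learner(candidates, valid_pairs, tau):
    """Return the name of the candidate predictor with the smallest
    empirical quantile risk on held-out (x, y) pairs."""
    risks = {
        name: sum(pinball_loss(y, f(x), tau) for x, y in valid_pairs)
        / len(valid_pairs)
        for name, f in candidates.items()
    }
    return min(risks, key=risks.get)
```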
AdaptiveConformal: An R Package for Adaptive Conformal Inference
Conformal Inference (CI) is a popular approach for generating finite sample
prediction intervals based on the output of any point prediction method when
data are exchangeable. Adaptive Conformal Inference (ACI) algorithms extend CI
to the case of sequentially observed data, such as time series, and exhibit
strong theoretical guarantees without having to assume exchangeability of the
observed data. The common thread that unites algorithms in the ACI family is
that they adaptively adjust the width of the generated prediction intervals in
response to the observed data. We provide a detailed description of five ACI
algorithms and their theoretical guarantees, and test their performance in
simulation studies. We then present a case study of producing prediction
intervals for influenza incidence in the United States based on black-box point
forecasts. Implementations of all the algorithms are released as an open-source
R package, AdaptiveConformal, which also includes tools for visualizing and
summarizing conformal prediction intervals.
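The adaptive width adjustment that unites the ACI family can be sketched with the basic recursion of the original ACI algorithm (one of the five in the package; the function name is ours): the working miscoverage level is nudged down after a miss and up after a covered step, so empirical coverage tracks the 1 - alpha target.

```python
def aci_update(alpha_t, covered, target_alpha, gamma):
    """One step of the basic ACI recursion:
    err_t = 0 if the last interval covered y_t, else 1;
    alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t).

    A smaller working alpha widens the next interval, so misses
    (err_t = 1) push toward wider intervals and vice versa.
    """
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err)
```

Over a long run, the average of `target_alpha - err_t` is driven toward zero, which is the source of the coverage guarantees the abstract mentions.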