Iterative bias reduction for multivariate smoothers
The IBR (iterative bias reduction) method estimates an unknown regression function when the explanatory variables take values in \mathbb{R}^d. Classical nonparametric methods for estimating the regression function suffer from the curse of dimensionality, so in practice structural assumptions are imposed: additive models, projection pursuit models, etc. IBR, by contrast, estimates the regression function directly. On real and simulated examples it competes with MARS, projection pursuit regression and additive models, and yields significant gains in prediction error. In practice the method uses a pilot smoother, either thin-plate splines or a Gaussian kernel. This pilot estimator is applied repeatedly to estimate the bias and remove it progressively. As with boosting, the method therefore requires estimating the optimal number of iterations. Minimax rates of convergence for the mean squared error of the estimator (at the optimal iteration) have been obtained, and the optimality of the iteration-selection criterion (GCV) has also been proved. A simple simulated example and a real example are treated and compared with existing methods: GAM, MARS, PPR and boosting. An R package available on the CRAN makes the method very easy to use
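As a hedged illustration of the iteration described above (a minimal sketch, not the CRAN package's implementation), a one-dimensional Gaussian-kernel pilot smoother can be corrected repeatedly by smoothing its own residuals; the function names and the bandwidth choice are ours:

```python
import numpy as np

def gaussian_kernel_smoother(X, bandwidth):
    # Row-stochastic smoothing matrix of a Nadaraya-Watson smoother
    # with a Gaussian kernel on a one-dimensional design X (assumption:
    # the real method also handles multivariate designs).
    D2 = (X[:, None] - X[None, :]) ** 2
    W = np.exp(-D2 / (2.0 * bandwidth ** 2))
    return W / W.sum(axis=1, keepdims=True)

def iterated_bias_reduction(X, Y, bandwidth, n_iter):
    # Pilot fit, then repeatedly estimate the bias by smoothing the
    # current residuals and subtract it: m_{k+1} = m_k + S (Y - m_k).
    # In practice n_iter would be chosen by GCV, as the abstract notes.
    S = gaussian_kernel_smoother(X, bandwidth)
    m = S @ Y                      # pilot estimate
    for _ in range(n_iter):
        m = m + S @ (Y - m)        # remove the estimated bias
    return m
```

Each pass reduces the smoothing bias of the pilot fit, at the price of added variance, which is why stopping the iteration correctly matters.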
Coarse-graining stochastic biochemical networks: adiabaticity and fast simulations
We propose a universal approach for analysis and fast simulation of stiff stochastic biochemical kinetics networks, which rests on elimination of fast chemical species without loss of information about mesoscopic, non-Poissonian fluctuations of the slow ones. Our approach, which is similar to the Born-Oppenheimer approximation in quantum mechanics, follows from the stochastic path integral representation of the cumulant-generating function of reaction events. In applications with a small number of chemical reactions, it produces analytical expressions for cumulants of chemical fluxes between the slow variables. This allows for a low-dimensional, interpretable representation and can be used for coarse-grained numerical simulation schemes with small computational complexity and yet high accuracy. As an example, we derive the coarse-grained description for a chain of biochemical reactions and show that the coarse-grained and microscopic simulations are in agreement, while the coarse-grained simulations are three orders of magnitude faster
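The exact microscopic simulation that such coarse-graining seeks to accelerate can be sketched with Gillespie's stochastic simulation algorithm on a toy reaction chain 0 → A → B → 0; this is only the baseline being sped up, not the authors' path-integral coarse-graining, and the rate names are ours:

```python
import numpy as np

def gillespie_chain(rates, t_max, rng):
    # Exact SSA for the toy network 0 -> A -> B -> 0 with mass-action rates.
    # rates = (k_in, k_ab, k_out); returns the copy numbers (A, B) at t_max.
    k_in, k_ab, k_out = rates
    a, b, t = 0, 0, 0.0
    while True:
        props = np.array([k_in, k_ab * a, k_out * b])  # reaction propensities
        total = props.sum()
        if total == 0.0:
            break                                      # nothing can fire
        dt = rng.exponential(1.0 / total)              # waiting time to next event
        if t + dt > t_max:
            break                                      # horizon reached
        t += dt
        r = rng.choice(3, p=props / total)             # which reaction fires
        if r == 0:
            a += 1                                     # influx:      0 -> A
        elif r == 1:
            a, b = a - 1, b + 1                        # conversion:  A -> B
        else:
            b -= 1                                     # degradation: B -> 0
    return a, b
```

Every reaction event is simulated individually here, which is exactly why stiff networks with fast species make such microscopic runs expensive.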
On the mutual nearest neighbors estimate in regression
Motivated by promising experimental results, this paper investigates the theoretical properties of a recently proposed nonparametric estimator, called the Mutual Nearest Neighbors rule, which estimates the regression function m(x) = E[Y | X = x] as follows: first identify the k nearest neighbors of x in the sample D_n, then keep only those for which x is itself one of their k nearest neighbors, and finally take the average over the corresponding response variables. We prove that this estimator is consistent and that its rate of convergence is optimal. Since the estimate achieving the optimal rate of convergence depends on the unknown distribution of the observations, we also present adaptation results obtained by data-splitting
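A minimal sketch of the rule as described above, assuming Euclidean distances and our own convention for the mutuality test (the function name and the handling of an empty mutual set are ours, not the paper's):

```python
import numpy as np

def mutual_nn_regression(x0, X, Y, k):
    # Mutual Nearest Neighbors estimate of m(x0): average the responses of
    # those k nearest neighbors of x0 that would also count x0 among their
    # own k nearest neighbors.
    d_query = np.linalg.norm(X - x0, axis=1)     # distances from x0 to the sample
    neighbors = np.argsort(d_query)[:k]          # the k nearest neighbors of x0
    mutual = []
    for i in neighbors:
        d_i = np.linalg.norm(X - X[i], axis=1)   # distances from X[i] to the sample
        d_i[i] = np.inf                          # a point is not its own neighbor
        # keep X[i] only if x0 ranks among its k nearest neighbors
        if d_query[i] <= np.sort(d_i)[k - 1]:
            mutual.append(i)
    if not mutual:
        return np.nan                            # no mutual neighbor: undefined here
    return Y[np.array(mutual)].mean()
```

The mutuality filter discards neighbors that lie in locally denser regions than the query point, which is what gives the rule its adaptivity.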
Change in global transmission rates of COVID-19 through May 6, 2020
We analyzed COVID-19 data through May 6th, 2020 using a partially observed Markov process. Our method uses a hybrid deterministic and stochastic formalism that allows for time-varying transmission rates and detection probabilities. The model was fit using iterated particle filtering to case count and death count time series from 55 countries. We found evidence for a shrinking epidemic in 30 of the 55 examined countries. Of those 30 countries, 27 have significant evidence for subcritical transmission rates, although the decline in new cases is relatively slow compared to the initial growth rates. Generally, the transmission rates in Europe were lower than in the Americas and Asia. This suggests that global-scale social distancing efforts to slow the spread of COVID-19 are effective, although they need to be strengthened in many regions and maintained in others to avoid further resurgence of COVID-19. The slow decline also suggests alternative strategies to control the virus are needed before social distancing efforts are partially relaxed
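The iterated particle filtering used above operates on a much richer epidemic model; as a hedged toy analogue, a generic bootstrap particle filter with a random-walk log-intensity and Poisson-observed counts looks like this (all names, the latent-state model, and the parameters are ours):

```python
import numpy as np

def bootstrap_particle_filter(obs, n_particles, sigma, rng):
    # Latent state: log-intensity following a Gaussian random walk with
    # step size sigma; observations: Poisson counts with mean exp(state).
    # Returns the filtered mean intensity per time step and the
    # log-likelihood estimate (up to the constant -log(y!) terms).
    particles = rng.normal(0.0, 1.0, n_particles)   # initial log-intensities
    means, loglik = [], 0.0
    for y in obs:
        particles = particles + rng.normal(0.0, sigma, n_particles)  # propagate
        logw = y * particles - np.exp(particles)    # Poisson log-weights
        shift = logw.max()
        w = np.exp(logw - shift)                    # stabilized weights
        loglik += np.log(w.mean()) + shift          # incremental likelihood
        w /= w.sum()
        idx = rng.choice(n_particles, n_particles, p=w)  # multinomial resampling
        particles = particles[idx]
        means.append(np.exp(particles).mean())
    return np.array(means), loglik
```

Iterated filtering would wrap such a filter in an outer loop that perturbs and re-estimates the model parameters; only the inner filtering step is sketched here.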
Iterative isotonic regression
This article explores some theoretical aspects of a recent nonparametric method for
estimating a univariate regression function of bounded variation. The method exploits the
Jordan decomposition which states that a function of bounded variation can be decomposed
as the sum of a non-decreasing function and a non-increasing function. This suggests
combining the backfitting algorithm for estimating additive functions with isotonic
regression for estimating monotone functions. The resulting iterative algorithm is called
Iterative Isotonic Regression (I.I.R.). The main result in this paper states that the
estimator is consistent if the number of iterations k_n grows appropriately
with the sample size n. The proof requires two auxiliary results that are
of interest in their own right: firstly, we generalize the well-known consistency
property of isotonic regression to the framework of a non-monotone regression function,
and secondly, we relate the backfitting algorithm to von Neumann’s algorithm in convex
analysis. We also analyse how the algorithm can be stopped in practice using a
data-splitting procedure
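A sketch of the alternating scheme described above, assuming the responses are already sorted by their design points; the pool-adjacent-violators routine plays the role of the isotonic fit, and the stopping rule is left as a plain iteration count rather than the paper's data-splitting procedure:

```python
import numpy as np

def pava(y):
    # Pool Adjacent Violators: least-squares non-decreasing fit to y,
    # maintained as a stack of (block mean, block size) pairs.
    means, sizes = [], []
    for v in y:
        m, s = float(v), 1
        while means and means[-1] > m:   # merge blocks violating monotonicity
            m = (means[-1] * sizes[-1] + m * s) / (sizes[-1] + s)
            s += sizes[-1]
            means.pop()
            sizes.pop()
        means.append(m)
        sizes.append(s)
    return np.repeat(means, sizes)

def iterative_isotonic_regression(y, n_iter):
    # Backfitting between the two halves of the Jordan decomposition:
    # alternately fit a non-decreasing and a non-increasing component to
    # the current residuals; their sum is the estimate.
    y = np.asarray(y, dtype=float)
    up = np.zeros(len(y))
    down = np.zeros(len(y))
    for _ in range(n_iter):
        up = pava(y - down)              # non-decreasing part
        down = -pava(-(y - up))          # non-increasing part
    return up + down
```

As the number of iterations grows the fit tracks the data ever more closely, which is why the paper's choice of k_n (and its data-splitting stopping rule) is the crux of the consistency result.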