2,504 research outputs found
Prediction of time series by statistical learning: general losses and fast rates
We establish rates of convergences in time series forecasting using the
statistical learning approach based on oracle inequalities. A series of papers
extends the oracle inequalities obtained for iid observations to time series
under weak dependence conditions. Given a family of predictors and
observations, oracle inequalities state that a predictor forecasts the series
as well as the best predictor in the family up to a remainder term.
Using the PAC-Bayesian approach, we establish under weak dependence conditions
oracle inequalities with optimal rates of convergence. We extend previous
results for the absolute loss function to any Lipschitz loss function, with
rates involving a term that measures the complexity of the model. We apply
the method with quantile loss functions to
forecast the French GDP. Under additional conditions on the loss functions
(satisfied by the quadratic loss function) and on the time series, we refine
the rates of convergence. We achieve these fast rates for the first time for
uniformly mixing processes. These rates are
known to be optimal in the iid case and for individual sequences. In
particular, we generalize the results of Dalalyan and Tsybakov on sparse
regression estimation to the case of autoregression.
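Schematically, the oracle inequalities discussed above take the following form (a generic sketch with placeholder notation, not the paper's exact statement):

```latex
R(\hat{f}) \;\le\; \inf_{f \in \mathcal{F}} R(f) \;+\; \Delta_n(\mathcal{F})
```

where $R$ is the prediction risk, $\mathcal{F}$ the family of predictors, $\hat{f}$ the learned predictor, and $\Delta_n(\mathcal{F})$ a remainder term that shrinks with the number of observations $n$ and grows with the complexity of $\mathcal{F}$.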
Characterizing and Understanding the Generalization Error of Transfer Learning with Gibbs Algorithm
We provide an information-theoretic analysis of the generalization ability of
Gibbs-based transfer learning algorithms by focusing on two popular transfer
learning approaches, -weighted-ERM and two-stage-ERM. Our key result is
an exact characterization of the generalization behaviour using the conditional
symmetrized KL information between the output hypothesis and the target
training samples given the source samples. Our results can also be applied to
provide novel distribution-free generalization error upper bounds on these two
aforementioned Gibbs algorithms. Our approach is versatile, as it also
characterizes the generalization errors and excess risks of these two Gibbs
algorithms in the asymptotic regime, where they converge to the
-weighted-ERM and two-stage-ERM, respectively. Based on our theoretical
results, we show that the benefits of transfer learning can be viewed as a
bias-variance trade-off, with the bias induced by the source distribution and
the variance induced by the lack of target samples. We believe this viewpoint
can guide the choice of transfer learning algorithms in practice.
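As a hedged sketch of the first approach, a weighted-ERM objective interpolates between the empirical risks on the target and source samples (generic notation; $\gamma$, $\hat{L}_T$, and $\hat{L}_S$ are placeholder symbols, not necessarily the paper's):

```latex
\hat{L}_\gamma(w) \;=\; \gamma\, \hat{L}_T(w) \;+\; (1-\gamma)\, \hat{L}_S(w), \qquad \gamma \in [0, 1]
```

so that $\gamma = 1$ recovers target-only ERM, while smaller $\gamma$ leans on the source data, trading the bias induced by the source distribution against the variance from scarce target samples.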
An Exact Characterization of the Generalization Error for the Gibbs Algorithm
Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and lack guarantees. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contribution is an exact characterization of the expected generalization error of the well-known Gibbs algorithm (a.k.a. Gibbs posterior) using the symmetrized KL information between the input training samples and the output hypothesis. Our result can be applied to tighten existing expected generalization error and PAC-Bayesian bounds. Our approach is versatile, as it also characterizes the generalization error of the Gibbs algorithm with a data-dependent regularizer and that of the Gibbs algorithm in the asymptotic regime, where it converges to the empirical risk minimization algorithm. Of particular relevance, our results highlight the role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
Investigating time-variation in the marginal predictive power of the yield spread
We use Bayesian time-varying parameters VARs with stochastic volatility to investigate changes in the marginal predictive content of the yield spread for output growth in the United States and the United Kingdom, since the Gold Standard era, and in the Eurozone, Canada, and Australia over the post-WWII period. Overall, our evidence does not provide much support for either of the two dominant explanations of why the yield spread may contain predictive power for output growth: the monetary policy-based one, and Harvey's (1988) 'real yield curve' one. Instead, we offer a new conjecture. JEL Classification: E42, E43, E47. Keywords: Bayesian VARs, median-unbiased, stochastic volatility, time-varying parameters.
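For reference, a time-varying parameters VAR with stochastic volatility of the kind used here has the schematic state-space form (generic notation, not the paper's exact specification):

```latex
y_t = c_t + \sum_{j=1}^{p} B_{j,t}\, y_{t-j} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Sigma_t)
```

with the coefficients $c_t$, $B_{j,t}$ evolving as random walks and the log-volatilities in $\Sigma_t$ drifting as well, so that both the predictive content of the regressors and the shock variances can change over time.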
Yersinia ruckeri isolates recovered from diseased Atlantic Salmon (Salmo salar) in Scotland are more diverse than those from Rainbow Trout (Oncorhynchus mykiss) and represent distinct subpopulations
Yersinia ruckeri is the etiological agent of enteric redmouth (ERM) disease of farmed salmonids. Enteric redmouth disease is traditionally associated with rainbow trout (Oncorhynchus mykiss, Walbaum), but its incidence in Atlantic salmon (Salmo salar) is increasing. Yersinia ruckeri isolates recovered from diseased Atlantic salmon have been poorly characterized, and very little is known about the relationship of the isolates associated with these two species. Phenotypic approaches were used to characterize 109 Y. ruckeri isolates recovered over a 14-year period from infected Atlantic salmon in Scotland; 26 isolates from infected rainbow trout were also characterized. Biotyping, serotyping, and comparison of outer membrane protein profiles identified 19 Y. ruckeri clones associated with Atlantic salmon but only five associated with rainbow trout; none of the Atlantic salmon clones occurred in rainbow trout and vice versa. These findings suggest that distinct subpopulations of Y. ruckeri are associated with each species. A new O serotype (designated O8) was identified in 56 biotype 1 Atlantic salmon isolates and was the most common serotype identified from 2006 to 2011 and in 2014, suggesting an increased prevalence during the time period sampled. Rainbow trout isolates were represented almost exclusively by the same biotype 2, serotype O1 clone that has been responsible for the majority of ERM outbreaks in this species within the United Kingdom since the 1980s. However, the identification of two biotype 2, serotype O8 isolates in rainbow trout suggests that vaccines containing serotypes O1 and O8 should be evaluated in both rainbow trout and Atlantic salmon for application in Scotland
Information-Theoretic Characterizations of Generalization Error for the Gibbs Algorithm
Various approaches have been developed to upper
bound the generalization error of a supervised learning algorithm.
However, existing bounds are often loose and even vacuous when
evaluated in practice. As a result, they may fail to characterize
the exact generalization ability of a learning algorithm. Our
main contributions are exact characterizations of the expected
generalization error of the well-known Gibbs algorithm (a.k.a.
Gibbs posterior) using different information measures, in particular, the symmetrized KL information between the input training
samples and the output hypothesis. Our result can be applied to
tighten existing expected generalization error and PAC-Bayesian
bounds. Our information-theoretic approach is versatile, as it also
characterizes the generalization error of the Gibbs algorithm with
a data-dependent regularizer and that of the Gibbs algorithm in
the asymptotic regime, where it converges to the standard empirical risk minimization algorithm. Of particular relevance, our
results highlight the role the symmetrized KL information plays
in controlling the generalization error of the Gibbs algorithm.
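As a minimal, hypothetical illustration of the object being analyzed: over a finite hypothesis class, the Gibbs algorithm (Gibbs posterior) weights each hypothesis by its exponentiated negative empirical risk. The sketch below (the toy data distribution, inverse temperature, and all names are assumptions for illustration, not from the paper) computes the Gibbs posterior on simulated data and its generalization gap, the quantity the paper characterizes exactly.

```python
import math
import random

random.seed(0)

# Finite hypothesis class: K candidate probabilities for a Bernoulli label.
K, n, alpha = 8, 50, 5.0          # alpha is the inverse temperature (assumed)
hypotheses = [k / (K - 1) for k in range(K)]
p_true = 0.6                      # hypothetical data-generating distribution

def emp_risk(h, labels):
    # empirical squared-error risk of predicting probability h
    return sum((y - h) ** 2 for y in labels) / len(labels)

def pop_risk(h):
    # population squared-error risk under Bernoulli(p_true)
    return p_true * (1 - h) ** 2 + (1 - p_true) * h ** 2

labels = [1 if random.random() < p_true else 0 for _ in range(n)]

# Gibbs posterior: weight each hypothesis by exp(-alpha * empirical risk),
# a smoothed version of empirical risk minimization.
raw = [math.exp(-alpha * emp_risk(h, labels)) for h in hypotheses]
z = sum(raw)
weights = [w / z for w in raw]

# generalization gap of the Gibbs output: population risk minus empirical risk,
# averaged under the posterior weights
gen_gap = sum(w * (pop_risk(h) - emp_risk(h, labels))
              for w, h in zip(weights, hypotheses))
print(weights)
print(gen_gap)
```

As alpha grows, the posterior concentrates on the empirical risk minimizer, recovering the ERM limit mentioned in the abstract.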