Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary β-Mixing Processes
Pac-Bayes bounds are among the most accurate generalization bounds for
classifiers learned from independently and identically distributed (IID) data,
and this is particularly so for margin classifiers: there have been recent
contributions showing how practical these bounds can be either to perform model
selection (Ambroladze et al., 2007) or even to directly guide the learning of
linear classifiers (Germain et al., 2009). However, there are many practical
situations where the training data show some dependencies and where the
traditional IID assumption does not hold. Stating generalization bounds for
such frameworks is therefore of the utmost interest, both from theoretical and
practical standpoints. In this work, we propose the first - to the best of our
knowledge - Pac-Bayes generalization bounds for classifiers trained on data
exhibiting interdependencies. The approach undertaken to establish our results
is based on decomposing a so-called dependency graph, which encodes the
dependencies within the data, into sets of independent data by means of
fractional covers of the graph. Our bounds are very general, since being able to find an
upper bound on the fractional chromatic number of the dependency graph is
sufficient to get new Pac-Bayes bounds for specific settings. We show how our
results can be used to derive bounds for ranking statistics (such as AUC) and
classifiers trained on data distributed according to a stationary β-mixing
process. Along the way, we show how our approach seamlessly allows us to deal with
U-processes. As a side note, we also provide a Pac-Bayes generalization bound
for classifiers learned on data from stationary φ-mixing distributions.
Comment: Long version of the AISTATS 09 paper:
http://jmlr.csail.mit.edu/proceedings/papers/v5/ralaivola09a/ralaivola09a.pd
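To make the dependency-graph idea in the abstract above concrete, here is a minimal sketch for the AUC-style setting, assuming each training item is a (positive, negative) pair of example indices and that two pairs are dependent whenever they share an example. The function names `cover_pairs` and `is_independent` are hypothetical and are not taken from the paper. The sketch splits all pairs into max(n_pos, n_neg) sets whose members share no example; since a proper coloring is a special case of a fractional cover, this number upper-bounds the fractional chromatic number of the assumed graph.

```python
def cover_pairs(n_pos, n_neg):
    """Partition all (positive, negative) index pairs into independent sets.

    Assumption: two pairs are dependent iff they share a positive or a
    negative example. Cyclic shifts give max(n_pos, n_neg) sets in which
    no example appears twice.
    """
    n_colors = max(n_pos, n_neg)
    classes = []
    for k in range(n_colors):
        if n_pos <= n_neg:
            # Each positive i is matched with a distinct negative (i + k) mod n_neg.
            cls = [(i, (i + k) % n_neg) for i in range(n_pos)]
        else:
            # Each negative j is matched with a distinct positive (j + k) mod n_pos.
            cls = [((j + k) % n_pos, j) for j in range(n_neg)]
        classes.append(cls)
    return classes


def is_independent(cls):
    """Check that no positive and no negative index is used twice in a set."""
    pos = [i for i, _ in cls]
    neg = [j for _, j in cls]
    return len(set(pos)) == len(pos) and len(set(neg)) == len(neg)


classes = cover_pairs(3, 5)
print(len(classes), all(is_independent(c) for c in classes))
```

Each set can then be treated as an IID-like sample of pairs, which is the role independent sets play in a cover-based argument; how the per-set bounds are combined is specific to the paper and is not reproduced here.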
Asynchronous Gossip for Averaging and Spectral Ranking
We consider two variants of the classical gossip algorithm. The first variant
is a version of asynchronous stochastic approximation. We highlight a
fundamental difficulty associated with the classical asynchronous gossip
scheme, viz., that it may not converge to a desired average, and suggest an
alternative scheme based on reinforcement learning that has guaranteed
convergence to the desired average. We then discuss a potential application to
a wireless network setting with simultaneous link activation constraints. The
second variant is a gossip algorithm for distributed computation of the
Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant
draws upon a reinforcement learning algorithm for an average cost controlled
Markov decision problem, the second variant draws upon a reinforcement learning
algorithm for risk-sensitive control. We then discuss potential applications of
the second variant to ranking schemes, reputation networks, and principal
component analysis.
Comment: 14 pages, 7 figures. Minor revisio
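For orientation, the following is a minimal sketch of textbook symmetric pairwise gossip averaging. It is not the reinforcement-learning-based scheme the abstract proposes, nor the asynchronous stochastic approximation variant it analyses. The function name `pairwise_gossip`, the ring topology, and the starting values are illustrative assumptions. Because both endpoints of the activated link move to their common average, the sum of all values is preserved at every step.

```python
import random


def pairwise_gossip(values, edges, n_steps, seed=0):
    """Run symmetric pairwise gossip: one random link is activated per step."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(n_steps):
        i, j = rng.choice(edges)
        # Both endpoints replace their value by the pair average,
        # so the network-wide sum does not change.
        avg = 0.5 * (x[i] + x[j])
        x[i] = x[j] = avg
    return x


# Hypothetical 5-node ring; the true average of the start values is 4.0.
ring_edges = [(k, (k + 1) % 5) for k in range(5)]
start = [10.0, 0.0, 4.0, -2.0, 8.0]
final = pairwise_gossip(start, ring_edges, n_steps=2000)
print(final)
```

On a connected graph this symmetric update drives every node toward the initial average; the difficulty highlighted in the abstract concerns the asynchronous scheme it studies, which this sketch does not model.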
Agent Behavior Prediction and Its Generalization Analysis
Machine learning algorithms have been applied to predict agent behaviors in
real-world dynamic systems, such as advertiser behaviors in sponsored search
and worker behaviors in crowdsourcing. The behavior data in these systems are
generated by live agents: once the systems change due to the adoption of the
prediction models learnt from the behavior data, agents will observe and
respond to these changes by changing their own behaviors accordingly. As a
result, the behavior data will evolve and will not be identically and
independently distributed, posing great challenges to the theoretical analysis
of the machine learning algorithms for behavior prediction. To tackle this
challenge, in this paper, we propose to use Markov Chain in Random Environments
(MCRE) to describe the behavior data, and perform generalization analysis of
the machine learning algorithms on its basis. Since the one-step transition
probability matrix of MCRE depends on both previous states and the random
environment, conventional techniques for generalization analysis cannot be
directly applied. To address this issue, we propose a novel technique that
transforms the original MCRE into a higher-dimensional time-homogeneous Markov
chain. The new Markov chain involves more variables but is more regular, and
thus easier to deal with. We prove the convergence of the new Markov chain when
time approaches infinity. Then we prove a generalization bound for the machine
learning algorithms on the behavior data generated by the new Markov chain,
which depends on both the Markovian parameters and the covering number of the
function class obtained by composing the loss function for behavior prediction with the
behavior prediction model. To the best of our knowledge, this is the first work
that performs the generalization analysis on data generated by complex
processes in real-world dynamic systems
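The idea of trading a chain whose transitions depend on a random environment for a larger but time-homogeneous chain can be illustrated with a standard state-augmentation construction. This is a sketch under strong assumptions and not necessarily the transformation used in the paper: the environment is assumed to be a finite Markov chain itself, the state transition is assumed to depend only on the current environment, and the name `joint_transition_matrix` and all numbers are hypothetical.

```python
def joint_transition_matrix(env_matrix, state_matrices):
    """Build the transition matrix of the augmented chain on (state, env) pairs.

    env_matrix[e][e2]        : probability that the environment moves e -> e2
    state_matrices[e][s][s2] : probability that the state moves s -> s2
                               while the environment is e
    The pair (s, e) is stored at index s * n_env + e.
    """
    n_env = len(env_matrix)
    n_state = len(state_matrices[0])
    size = n_state * n_env
    joint = [[0.0] * size for _ in range(size)]
    for s in range(n_state):
        for e in range(n_env):
            row = s * n_env + e
            for s2 in range(n_state):
                for e2 in range(n_env):
                    col = s2 * n_env + e2
                    # State and environment move independently given (s, e).
                    joint[row][col] = state_matrices[e][s][s2] * env_matrix[e][e2]
    return joint


# Hypothetical two-state chain in a two-valued environment.
env_matrix = [[0.9, 0.1], [0.2, 0.8]]
state_matrices = [
    [[0.7, 0.3], [0.4, 0.6]],  # state transitions under environment 0
    [[0.1, 0.9], [0.5, 0.5]],  # state transitions under environment 1
]
joint = joint_transition_matrix(env_matrix, state_matrices)
print(len(joint), [round(sum(row), 12) for row in joint])
```

The augmented chain has more states than the original one, but its transition matrix no longer changes over time, which is the kind of regularity the abstract refers to.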
Gaussian process hyper-parameter estimation using parallel asymptotically independent Markov sampling
Gaussian process emulators of computationally expensive computer codes
provide fast statistical approximations to model physical processes. The
training of these surrogates depends on the set of design points chosen to run
the simulator. Due to computational cost, such a training set is bound to be
limited and quantifying the resulting uncertainty in the hyper-parameters of
the emulator by uni-modal distributions is likely to induce bias. In order to
quantify this uncertainty, this paper proposes a computationally efficient
sampler based on an extension of Asymptotically Independent Markov Sampling, a
recently developed algorithm for Bayesian inference. Structural uncertainty of
the emulator is obtained as a by-product of the Bayesian treatment of the
hyper-parameters. Additionally, the user can choose to perform stochastic
optimisation to sample from a neighbourhood of the Maximum a Posteriori
estimate, even in the presence of multimodality. Model uncertainty is also
acknowledged through numerical stabilisation measures by including a nugget
term in the formulation of the probability model. The efficiency of the
proposed sampler is illustrated in examples where multi-modal distributions are
encountered. For the purpose of reproducibility, further development, and use
in other applications, the code used to generate the examples is freely
available for download at https://github.com/agarbuno/paims_codes
Comment: Computational Statistics & Data Analysis, Volume 103, November 201
Mixed normal conditional heteroskedasticity
Both unconditional mixed-normal distributions and GARCH models with fat-tailed conditional distributions have been employed for modeling financial return data. We consider a mixed-normal distribution coupled with a GARCH-type structure which allows for conditional variance in each of the components as well as dynamic feedback between the components. Special cases and relationships with previously proposed specifications are discussed and stationarity conditions are derived. An empirical application to NASDAQ-index data indicates the appropriateness of the model class and illustrates that the approach can generate a plausible disaggregation of the conditional variance process, in which the components' volatility dynamics have a clearly distinct behavior that is, for example, compatible with the well-known leverage effect.
Classification: C22, C51, G1
Elicitability and backtesting: Perspectives for banking regulation
Conditional forecasts of risk measures play an important role in internal
risk management of financial institutions as well as in regulatory capital
calculations. In order to assess forecasting performance of a risk measurement
procedure, risk measure forecasts are compared to the realized financial losses
over a period of time and a statistical test of correctness of the procedure is
conducted. This process is known as backtesting. Such traditional backtests are
concerned with assessing some optimality property of a set of risk measure
estimates. However, they are not suited to compare different risk estimation
procedures. We investigate the proposal of comparative backtests, which are
better suited for method comparisons on the basis of forecasting accuracy, but
necessitate an elicitable risk measure. We argue that supplementing traditional
backtests with comparative backtests will enhance the existing trading book
regulatory framework for banks by providing the correct incentive for accuracy
of risk measure forecasts. In addition, the comparative backtesting framework
could be used by banks internally as well as by researchers to guide selection
of forecasting methods. The discussion focuses on three risk measures,
Value-at-Risk, expected shortfall and expectiles, and is supported by a
simulation study and data analysis
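A comparative backtest of the kind described above ranks forecasting procedures by their average score under a scoring function that is consistent for the risk measure. As a minimal sketch, the code below uses the standard quantile (pinball) score, which is consistent for Value-at-Risk viewed as a quantile of the loss distribution. The function names, the synthetic losses, and the two constant forecasts are illustrative assumptions; the paper's own test statistics and decision rules are not reproduced.

```python
def quantile_score(forecast, realized, alpha):
    """Pinball score for an alpha-quantile forecast; lower is better."""
    indicator = 1.0 if realized <= forecast else 0.0
    return (indicator - alpha) * (forecast - realized)


def mean_score(forecasts, losses, alpha):
    """Average quantile score of a forecast series against realized losses."""
    scores = [quantile_score(f, y, alpha) for f, y in zip(forecasts, losses)]
    return sum(scores) / len(scores)


# Synthetic realized losses 1, 2, ..., 100 (hypothetical data).
losses = [float(y) for y in range(1, 101)]
alpha = 0.95

# Two hypothetical procedures that each issue a constant forecast.
method_a = [95.0] * len(losses)  # close to the empirical 95% quantile
method_b = [50.0] * len(losses)  # far too low for a 95% quantile

score_a = mean_score(method_a, losses, alpha)
score_b = mean_score(method_b, losses, alpha)
print(score_a, score_b, "A preferred" if score_a < score_b else "B preferred")
```

The comparison only makes sense because the quantile is elicitable, which is why the abstract notes that comparative backtests necessitate an elicitable risk measure.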