The Sample Complexity of Multi-Distribution Learning for VC Classes
Multi-distribution learning is a natural generalization of PAC learning to
settings with multiple data distributions. For PAC-learnable classes, a
significant gap remains between the known upper and lower bounds on sample
complexity: although upper bounds are known for learning a class of VC
dimension d over multiple distributions, the best known lower bound does not
match them. We discuss recent progress on this problem and some hurdles that
are fundamental to the use of game dynamics in
statistical learning.
Comment: 11 pages. Authors are ordered alphabetically. Open problem presented
at the 36th Annual Conference on Learning Theory.
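For context, the multi-distribution objective can be sketched as follows (a hedged sketch; the notation $D_1,\dots,D_k$, $\mathcal{H}$, and $\mathrm{err}$ is ours, not the abstract's): given $k$ distributions and a hypothesis class $\mathcal{H}$ of VC dimension $d$, the learner must control the worst-case excess error simultaneously over all distributions.

```latex
% Multi-distribution (agnostic) PAC objective: the learned hypothesis
% \hat h must be near-optimal against the hardest of the k distributions.
\max_{i \in [k]} \mathrm{err}_{D_i}(\hat h)
  \;\le\; \min_{h \in \mathcal{H}} \max_{i \in [k]} \mathrm{err}_{D_i}(h)
          + \varepsilon,
\qquad \text{where } \mathrm{err}_{D}(h) = \Pr_{(x,y)\sim D}\,[\,h(x) \neq y\,].
```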
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
In this work, we study low-rank MDPs with adversarially changing losses in
the full-information feedback setting. In particular, the unknown transition
probability kernel admits a low-rank matrix decomposition \citep{REPUCB22}, and
the loss functions may change adversarially but are revealed to the learner at
the end of each episode. We propose a policy optimization-based algorithm POLO,
and we prove that it attains the
regret
guarantee, where is rank of the transition kernel (and hence the dimension
of the unknown representations), is the cardinality of the action space,
is the cardinality of the model class, and is the discounted
factor. Notably, our algorithm is oracle-efficient and has a regret guarantee
with no dependence on the size of potentially arbitrarily large state space.
Furthermore, we also prove a regret lower bound for this problem, showing
that low-rank MDPs are
statistically more difficult to learn than linear MDPs in the regret
minimization setting. To the best of our knowledge, we present the first
algorithm that interleaves representation learning, exploration, and
exploitation to achieve a sublinear regret guarantee for RL with nonlinear
function approximation and adversarial losses.
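The low-rank transition structure the abstract refers to can be sketched as the standard factorization assumption (a hedged sketch; the symbols $\phi$, $\mu$, and $d$ are conventional names, not taken from the abstract itself):

```latex
% Low-rank MDP: the transition kernel factorizes through two unknown
% d-dimensional feature maps, where d is the rank of the kernel.
P(s' \mid s, a) \;=\; \big\langle \phi(s, a),\, \mu(s') \big\rangle,
\qquad \phi(s, a),\ \mu(s') \in \mathbb{R}^{d}.
```

Both $\phi$ and $\mu$ are unknown to the learner, which is why the algorithm must interleave representation learning with exploration and exploitation.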