Efficient improper learning for online logistic regression
We consider the setting of online logistic regression and study the regret with respect to the ℓ2-ball of radius B. It is known (see [Hazan et al., 2014]) that any proper algorithm with logarithmic regret in the number of samples (denoted n) necessarily suffers an exponential multiplicative constant in B. In this work, we design an efficient improper algorithm that avoids this exponential constant while preserving logarithmic regret. Indeed, [Foster et al., 2018] showed that the lower bound does not apply to improper algorithms and proposed a strategy based on exponential weights, but with prohibitive computational complexity. Our new algorithm, based on regularized empirical risk minimization with surrogate losses, achieves a regret scaling as O(B log(Bn)) with a per-round time complexity of order O(d^2).
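The paper's improper predictor itself is more involved; as a minimal hedged sketch of where the O(d^2) per-round cost comes from, the Python below implements a classical proper Online Newton Step baseline for logistic regression, in which a Sherman-Morrison rank-one update of the inverse second-order matrix replaces any O(d^3) solve. The class name and the parameters lam and eta are illustrative choices, not the paper's.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OnlineNewtonLogistic:
    """Sketch: Online Newton Step for logistic regression.

    NOT the improper algorithm of the paper; a standard proper baseline
    shown only to illustrate how rank-one (Sherman-Morrison) updates
    keep the per-round cost at O(d^2). Projection steps are omitted.
    """

    def __init__(self, d, lam=1.0, eta=0.5):
        self.w = np.zeros(d)           # current weight vector
        self.A_inv = np.eye(d) / lam   # inverse of the regularized matrix
        self.eta = eta                 # step size (illustrative)

    def predict_proba(self, x):
        return sigmoid(x @ self.w)

    def update(self, x, y):
        """Observe (x, y) with y in {0, 1}; take one Newton-style step."""
        g = (self.predict_proba(x) - y) * x   # logistic-loss gradient
        Ag = self.A_inv @ g
        # Sherman-Morrison rank-one update of A_inv: O(d^2) time and space
        self.A_inv -= np.outer(Ag, Ag) / (1.0 + g @ Ag)
        self.w -= self.eta * (self.A_inv @ g)  # O(d^2) step

# Usage on synthetic data (hypothetical setup):
rng = np.random.default_rng(0)
learner = OnlineNewtonLogistic(d=5)
w_star = rng.normal(size=5)
for _ in range(1000):
    x = rng.normal(size=5)
    y = float(rng.random() < sigmoid(x @ w_star))
    p = learner.predict_proba(x)   # predict before seeing the label
    learner.update(x, y)
```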
Efficient online learning with kernels for adversarial large scale problems
We are interested in a framework of online learning with kernels for low-dimensional but large-scale and potentially adversarial datasets. We study the computational and theoretical performance of online variations of kernel Ridge regression. Despite its simplicity, the algorithm we study is the first to achieve the optimal regret for a wide range of kernels with a per-round complexity of order O(n^α) with α < 2. The algorithm we consider is based on approximating the kernel with the linear span of basis functions. Our contribution is two-fold: 1) For the Gaussian kernel, we propose to build the basis beforehand (independently of the data) through Taylor expansion. For d-dimensional inputs, we provide a (close to) optimal regret of order O((log n)^(d+1)) with per-round time complexity and space complexity O((log n)^(2d)). This makes the algorithm a suitable choice as soon as n >> e^d, which is likely to happen in a scenario with a low-dimensional and large-scale dataset; 2) For general kernels with low effective dimension, the basis functions are updated sequentially in a data-adaptive fashion by sampling Nyström points. In this case, our algorithm improves the computational trade-off known for online kernel regression.
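As a hedged sketch of the first contribution's core idea, the Python below builds a truncated Taylor feature map for a one-dimensional Gaussian kernel and runs plain online ridge regression on those features; this is a simplified stand-in for the paper's forecaster, under the assumptions of 1-d inputs, bandwidth sigma, and truncation level M, all illustrative. With M basis functions, the Sherman-Morrison update keeps each round at O(M^2) time and space.

```python
import numpy as np
from math import factorial

def gaussian_taylor_features(x, M=10, sigma=1.0):
    """Truncated Taylor feature map for the 1-d Gaussian kernel.

    From k(x, x') = exp(-(x - x')^2 / (2 s^2))
                  = e^{-x^2/(2s^2)} e^{-x'^2/(2s^2)} sum_j (x x'/s^2)^j / j!,
    the features phi_j(x) = e^{-x^2/(2 s^2)} x^j / (s^j sqrt(j!))
    satisfy phi(x) . phi(x') -> k(x, x') as M grows.
    """
    j = np.arange(M)
    scale = np.sqrt(np.array([factorial(i) for i in j], dtype=float))
    return np.exp(-x**2 / (2 * sigma**2)) * x**j / (sigma**j * scale)

class OnlineRidgeOnFeatures:
    """Online ridge regression on a fixed finite feature map.

    A simplified stand-in for the paper's algorithm: a Sherman-Morrison
    rank-one update keeps each round at O(M^2) for M basis functions.
    """

    def __init__(self, M, lam=1.0):
        self.A_inv = np.eye(M) / lam   # inverse regularized Gram matrix
        self.b = np.zeros(M)           # running sum of y_t * phi(x_t)
        self.w = np.zeros(M)

    def predict(self, phi):
        return phi @ self.w

    def update(self, phi, y):
        Ap = self.A_inv @ phi
        self.A_inv -= np.outer(Ap, Ap) / (1.0 + phi @ Ap)  # rank-one update
        self.b += y * phi
        self.w = self.A_inv @ self.b                       # ridge solution

# Usage (hypothetical data):
learner = OnlineRidgeOnFeatures(M=10)
for x, y in [(0.3, 1.2), (-0.7, 0.1), (1.1, 2.0)]:
    phi = gaussian_taylor_features(x)
    y_hat = learner.predict(phi)   # predict before seeing the label
    learner.update(phi, y)
```

For d-dimensional inputs the analogous basis tensorizes the 1-d features, giving roughly (log n)^d of them, which is where the stated complexities come from; the second contribution replaces this fixed Taylor basis with Nyström points sampled adaptively from the data stream.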