113 research outputs found
What Makes a Good Plan? An Efficient Planning Approach to Control Diffusion Processes in Networks
In this paper, we analyze the quality of a large class of simple dynamic
resource allocation (DRA) strategies which we name priority planning. Their aim
is to control an undesired diffusion process by distributing resources to the
contagious nodes of the network according to a predefined priority order. In
our analysis, we reduce the DRA problem to the linear arrangement of the nodes
of the network. Under this perspective, we shed light on the role of a
fundamental characteristic of this arrangement, the maximum cutwidth, for
assessing the quality of any priority planning strategy. Our theoretical
analysis validates the role of the maximum cutwidth by deriving bounds for the
extinction time of the diffusion process. Finally, using the results of our
analysis, we propose a novel and efficient DRA strategy, called Maximum
Cutwidth Minimization, that outperforms other competing strategies in our
simulations.
Comment: 18 pages, 3 figures
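The maximum cutwidth that the analysis above relies on can be sketched directly: fix a linear arrangement (ordering) of the nodes, count the edges crossing each prefix cut, and take the maximum over all cuts. The function name and the toy graph below are illustrative assumptions, not the paper's code.

```python
def max_cutwidth(edges, order):
    """Maximum number of edges crossing any prefix cut of the arrangement."""
    # Position of each node in the linear arrangement.
    pos = {node: i for i, node in enumerate(order)}
    best = 0
    # Cut k separates order[:k+1] from order[k+1:].
    for k in range(len(order) - 1):
        crossing = sum(1 for u, v in edges
                       if min(pos[u], pos[v]) <= k < max(pos[u], pos[v]))
        best = max(best, crossing)
    return best

# Path graph 0-1-2-3: the natural order has maximum cutwidth 1,
# while an interleaved order is worse.
path = [(0, 1), (1, 2), (2, 3)]
print(max_cutwidth(path, [0, 1, 2, 3]))  # 1
print(max_cutwidth(path, [0, 2, 1, 3]))  # 3
```

Under this view, a good priority order is precisely one whose maximum cutwidth is small, which is what the proposed Maximum Cutwidth Minimization strategy targets.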
Multivariate Hawkes Processes for Large-scale Inference
In this paper, we present a framework for fitting multivariate Hawkes
processes for large-scale problems both in the number of events in the observed
history and the number of event types (i.e. dimensions). The proposed
Low-Rank Hawkes Process (LRHP) framework introduces a low-rank approximation of
the kernel matrix that makes it possible to perform the nonparametric learning
of the triggering kernels in a number of operations governed by the rank r of
the approximation, with r much smaller than the number of dimensions. This
comes as a major improvement over the existing state-of-the-art inference
algorithms, whose cost grows with the full dimension.
Furthermore, the low-rank approximation allows LRHP to learn representative
patterns of interaction between event types, which may be valuable for the
analysis of such complex processes in real-world datasets. The efficiency and
scalability of our approach are illustrated with numerical experiments on
simulated as well as real datasets.
Comment: 16 pages, 5 figures
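The low-rank mechanism the abstract appeals to can be illustrated on a generic kernel matrix: a truncated SVD keeps only the top-r factors, so storing and applying the matrix costs O(dr) rather than O(d^2). This is a minimal sketch of the generic idea, with illustrative sizes and a synthetic matrix; it is not the LRHP estimator itself.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 50, 3
# Build a d x d kernel matrix that is exactly rank r.
U = rng.standard_normal((d, r))
V = rng.standard_normal((r, d))
K = U @ V

# Truncated SVD: keep only the top-r singular triplets, so the matrix
# is represented by O(d * r) numbers instead of O(d^2).
u, s, vt = np.linalg.svd(K)
K_r = u[:, :r] * s[:r] @ vt[:r]

print(np.allclose(K, K_r))  # True: an exactly rank-r matrix is recovered
```

In the Hawkes setting, the rows and columns index event types, so the retained factors double as interpretable patterns of interaction between types, as the abstract notes.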
Breaking the Log Barrier: a Novel Universal Restart Strategy for Faster Las Vegas Algorithms
Let A be a Las Vegas algorithm, i.e. an algorithm whose running
time is a random variable T drawn according to a certain probability
distribution D. In 1993, Luby, Sinclair and Zuckerman [LSZ93] proved that a
simple universal restart strategy can, for any probability distribution D,
provide an algorithm executing A whose expected running time is
O(ℓ* log ℓ*), where ℓ* is the minimum expected running time achievable with
full prior knowledge of the probability distribution D, and Q(p) denotes the
p-quantile of D. Moreover, the authors showed that the logarithmic term
could not be removed for universal restart strategies and was, in a certain
sense, optimal. In this work, we show that, quite surprisingly, the logarithmic
term can be replaced by a smaller quantity, thus reducing the expected running
time in practical settings of interest. More precisely, we propose a novel
restart strategy that executes the algorithm and whose expected running time
replaces the logarithmic overhead by a smaller, distribution-dependent
quantity. This quantity is, up to a constant multiplicative factor, better
than: 1) the bound of the universal restart strategy of [LSZ93], 2) any fixed
quantile of the running-time distribution, 3) the expected running time of the
original algorithm, and 4) any quantity of the form f^{-1}(E[f(T)]), where T
denotes the running time, for a large class of concave functions f.
The latter extends the recent restart strategy of [Zam22], which achieves such
a bound for a particular concave function, and these results can be thought of
as algorithmic reverse Jensen's inequalities. Finally, we show that the
behavior of the concave function at infinity controls the existence of reverse
Jensen's inequalities, by providing a necessary and a sufficient condition for
these inequalities to hold.
Comment: 13 pages, 0 figures
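The universal strategy of [LSZ93] runs the algorithm with successive time budgets taken from the Luby sequence 1, 1, 2, 1, 1, 2, 4, ..., restarting whenever a budget is exhausted. A minimal sketch of that schedule follows; the recursive implementation is illustrative, not taken from the paper.

```python
def luby(i):
    """i-th term (1-indexed) of the Luby restart sequence."""
    k = 1
    # Find the smallest k with i <= 2^k - 1.
    while (1 << k) - 1 < i:
        k += 1
    if i == (1 << k) - 1:
        # End of a block: budget 2^(k-1).
        return 1 << (k - 1)
    # Otherwise the sequence repeats its own prefix.
    return luby(i - (1 << (k - 1)) + 1)

print([luby(i) for i in range(1, 16)])
# [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8]
```

Each budget appears with geometrically decreasing frequency, which is what yields the O(ℓ* log ℓ*) guarantee; the strategies discussed in this abstract refine the schedule to shave the logarithmic factor.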
Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
In this work, we consider the distributed optimization of non-smooth convex
functions using a network of computing units. We investigate this problem under
two regularity assumptions: (1) the Lipschitz continuity of the global
objective function, and (2) the Lipschitz continuity of local individual
functions. Under the local regularity assumption, we provide the first optimal
first-order decentralized algorithm called multi-step primal-dual (MSPD) and
its corresponding optimal convergence rate. A notable aspect of this result is
that, for non-smooth functions, while the dominant term of the error is in
O(1/√t), the structure of the communication network only impacts a
second-order term in O(1/t), where t is time. In other words, the error due
to limits in communication resources decreases at a fast rate even in the case
of non-strongly-convex objective functions. Under the global regularity
assumption, we provide a simple yet efficient algorithm called distributed
randomized smoothing (DRS) based on a local smoothing of the objective
function, and show that DRS is within a d^{1/4} multiplicative factor of the
optimal convergence rate, where d is the underlying dimension.
Comment: 17 pages
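The local smoothing behind DRS can be sketched in one dimension: a non-smooth function is replaced by its Gaussian-smoothed version, whose gradient can be estimated from function evaluations alone. Everything below (the test function |x|, the smoothing width, the sample size) is an illustrative assumption, not the paper's distributed algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def smoothed_grad(f, x, gamma=0.1, n_samples=100_000):
    """Monte-Carlo estimate of d/dx E_u[f(x + gamma * u)], u ~ N(0, 1)."""
    u = rng.standard_normal(n_samples)
    # Gaussian integration by parts: E[(f(x + gamma*u) - f(x)) * u] / gamma
    # equals the gradient of the smoothed function; subtracting f(x)
    # leaves the mean unchanged but reduces the variance.
    return np.mean((f(x + gamma * u) - f(x)) * u) / gamma

g = smoothed_grad(np.abs, 1.0)
print(f"estimated gradient at x=1: {g:.3f}")  # close to 1, the slope of |x| there
```

The smoothed objective is differentiable even where the original is not, which is what lets a smooth-optimization method be applied to a non-smooth global objective at the cost of the d^{1/4} factor quoted above.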
Generalization Error of First-Order Methods for Statistical Learning with Generic Oracles
In this paper, we provide a novel framework for the analysis of
generalization error of first-order optimization algorithms for statistical
learning when the gradient can only be accessed through partial observations
given by an oracle. Our analysis relies on the regularity of the gradient
w.r.t. the data samples, and allows us to derive near-matching upper and lower
bounds for the generalization error of multiple learning problems, including
supervised learning, transfer learning, robust learning, distributed learning
and communication efficient learning using gradient quantization. These results
hold for smooth and strongly-convex optimization problems, as well as smooth
non-convex optimization problems satisfying a Polyak-Lojasiewicz assumption. In
particular, our upper and lower bounds depend on a novel quantity that extends
the notion of conditional standard deviation, and is a measure of the extent to
which the gradient can be approximated by having access to the oracle. As a
consequence, our analysis provides a precise meaning to the intuition that
optimization of the statistical learning objective is as hard as the estimation
of its gradient. Finally, we show that, in the case of standard supervised
learning, mini-batch gradient descent with increasing batch sizes and a warm
start can reach a generalization error that is optimal up to a multiplicative
factor, thus motivating the use of this optimization scheme in practical
applications.
Comment: 18 pages, 0 figures
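The increasing-batch-size scheme mentioned at the end can be sketched on a least-squares problem: run mini-batch gradient descent while geometrically growing the batch, so that later, lower-variance steps refine the early progress. The problem instance, learning rate, and doubling schedule are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(d)    # cold start; a warm start would begin closer to the optimum
batch, lr = 8, 0.1
used = 0
while used + batch <= n:
    idx = rng.integers(0, n, size=batch)
    # Mini-batch gradient of the least-squares objective.
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
    w -= lr * grad
    used += batch
    batch *= 2     # geometrically increasing batch sizes

print(np.linalg.norm(w - w_true))  # well below the initial distance ||w_true||
```

Doubling the batch halves the gradient noise at each step while keeping the total number of gradient evaluations linear in n, which matches the intuition that learning is only as hard as estimating the gradient.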