AÇAI: Ascent Similarity Caching with Approximate Indexes
Similarity search is a key operation in multimedia retrieval systems and
recommender systems, and it will play an important role also for future machine
learning and augmented reality applications. When these systems need to serve
large objects with tight delay constraints, edge servers close to the end-user
can operate as similarity caches to speed up the retrieval. In this paper we
present AÇAI, a new similarity caching policy which improves on the state
of the art by using (i) an (approximate) index for the whole catalog to decide
which objects to serve locally and which to retrieve from the remote server,
and (ii) a mirror ascent algorithm to update the set of local objects with
strong guarantees even when the request process does not exhibit any
statistical regularity.
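The ascent update on the set of local objects can be illustrated with a minimal sketch. It assumes a fractional cache state x in [0,1]^n with sum(x) = k and uses the Euclidean mirror map, so the step reduces to projected gradient ascent; the actual mirror map, gain function, and approximate index used by AÇAI are not given in the abstract. The names `project_capped_simplex`, `ascent_step`, and the one-hot gradient are hypothetical.

```python
import numpy as np

def project_capped_simplex(y, k, iters=60):
    """Euclidean projection of y onto {x in [0,1]^n : sum(x) = k}, by bisection."""
    lo, hi = y.min() - 1.0, y.max()
    for _ in range(iters):
        tau = (lo + hi) / 2.0
        # The clipped sum is non-increasing in tau, so bisection applies.
        if np.clip(y - tau, 0.0, 1.0).sum() > k:
            lo = tau
        else:
            hi = tau
    return np.clip(y - (lo + hi) / 2.0, 0.0, 1.0)

def ascent_step(x, gain_gradient, k, eta=0.1):
    """One ascent step on the fractional cache state (Euclidean special case)."""
    return project_capped_simplex(x + eta * gain_gradient, k)

# Hypothetical example: 5 objects, cache capacity 2, repeated requests for object 0.
x = np.full(5, 0.4)
grad = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # assumed gradient of the caching gain
for _ in range(20):
    x = ascent_step(x, grad, k=2)
```

The fractional state keeps its capacity constraint after every step while the mass shifts towards the requested object; a rounding scheme would be needed to obtain an integral cache.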
Online Submodular Maximization via Online Convex Optimization
We study monotone submodular maximization under general matroid constraints
in the online setting. We prove that online optimization of a large class of
submodular functions, namely, weighted threshold potential functions, reduces
to online convex optimization (OCO). This is precisely because functions in
this class admit a concave relaxation; as a result, OCO policies, coupled with
an appropriate rounding scheme, can be used to achieve sublinear regret in the
combinatorial setting. We show that our reduction extends to many different
versions of the online learning problem, including the dynamic regret, bandit,
and optimistic-learning settings.
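A small sketch may help to see why this class admits a concave relaxation. It assumes a weighted threshold potential of the form f(x) = sum_l c_l * min(b_l, <w_l, x>) with non-negative coefficients and weights, which is one common way to write such functions; the exact definition used in the paper is not stated in the abstract. Evaluated on 0/1 indicator vectors it is a set function, and the same formula on fractional x is concave because it is a non-negative sum of minima of affine functions. The function name `wtp` and all numbers are hypothetical.

```python
import numpy as np

def wtp(x, weights, thresholds, coeffs):
    """Weighted threshold potential: sum_l c_l * min(b_l, <w_l, x>)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(coeffs * np.minimum(thresholds, weights @ x)))

# Hypothetical instance with 3 items and 2 threshold terms.
weights = np.array([[1.0, 1.0, 0.0],
                    [0.0, 1.0, 1.0]])
thresholds = np.array([1.0, 1.0])
coeffs = np.array([2.0, 3.0])

f_a = wtp([1, 0, 0], weights, thresholds, coeffs)        # set {0}
f_b = wtp([0, 1, 0], weights, thresholds, coeffs)        # set {1}
f_ab = wtp([1, 1, 0], weights, thresholds, coeffs)       # set {0, 1}
f_mid = wtp([0.5, 0.5, 0], weights, thresholds, coeffs)  # fractional point
```

On this instance the marginal value of item 1 shrinks once item 0 is already selected, and the value at the fractional midpoint is at least the average of the two endpoint values, as concavity requires.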
Global age-sex-specific mortality, life expectancy, and population estimates in 204 countries and territories and 811 subnational locations, 1950–2021, and the impact of the COVID-19 pandemic: a comprehensive demographic analysis for the Global Burden of Disease Study 2021
Background: Estimates of demographic metrics are crucial to assess levels and trends of population health outcomes. The profound impact of the COVID-19 pandemic on populations worldwide has underscored the need for timely estimates to understand this unprecedented event within the context of long-term population health trends. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 provides new demographic estimates for 204 countries and territories and 811 additional subnational locations from 1950 to 2021, with a particular emphasis on changes in mortality and life expectancy that occurred during the 2020–21 COVID-19 pandemic period. Methods: 22 223 data sources from vital registration, sample registration, surveys, censuses, and other sources were used to estimate mortality, with a subset of these sources used exclusively to estimate excess mortality due to the COVID-19 pandemic. 2026 data sources were used for population estimation. Additional sources were used to estimate migration; the effects of the HIV epidemic; and demographic discontinuities due to conflicts, famines, natural disasters, and pandemics, which are used as inputs for estimating mortality and population. Spatiotemporal Gaussian process regression (ST-GPR) was used to generate under-5 mortality rates, which synthesised 30 763 location-years of vital registration and sample registration data, 1365 surveys and censuses, and 80 other sources. ST-GPR was also used to estimate adult mortality (between ages 15 and 59 years) based on information from 31 642 location-years of vital registration and sample registration data, 355 surveys and censuses, and 24 other sources. Estimates of child and adult mortality rates were then used to generate life tables with a relational model life table system. 
For countries with large HIV epidemics, life tables were adjusted using independent estimates of HIV-specific mortality generated via an epidemiological analysis of HIV prevalence surveys, antenatal clinic serosurveillance, and other data sources. Excess mortality due to the COVID-19 pandemic in 2020 and 2021 was determined by subtracting observed all-cause mortality (adjusted for late registration and mortality anomalies) from the mortality expected in the absence of the pandemic. Expected mortality was calculated based on historical trends using an ensemble of models. In location-years where all-cause mortality data were unavailable, we estimated excess mortality rates using a regression model with covariates pertaining to the pandemic. Population size was computed using a Bayesian hierarchical cohort component model. Life expectancy was calculated using age-specific mortality rates and standard demographic methods. Uncertainty intervals (UIs) were calculated for every metric using the 25th and 975th ordered values from a 1000-draw posterior distribution. Findings: Global all-cause mortality followed two distinct patterns over the study period: age-standardised mortality rates declined between 1950 and 2019 (a 62·8% [95% UI 60·5–65·1] decline), and increased during the COVID-19 pandemic period (2020–21; 5·1% [0·9–9·6] increase). In contrast with the overall reverse in mortality trends during the pandemic period, child mortality continued to decline, with 4·66 million (3·98–5·50) global deaths in children younger than 5 years in 2021 compared with 5·21 million (4·50–6·01) in 2019. An estimated 131 million (126–137) people died globally from all causes in 2020 and 2021 combined, of which 15·9 million (14·7–17·2) were due to the COVID-19 pandemic (measured by excess mortality, which includes deaths directly due to SARS-CoV-2 infection and those indirectly due to other social, economic, or behavioural changes associated with the pandemic). 
Excess mortality rates exceeded 150 deaths per 100 000 population during at least one year of the pandemic in 80 countries and territories, whereas 20 nations had a negative excess mortality rate in 2020 or 2021, indicating that all-cause mortality in these countries was lower during the pandemic than expected based on historical trends. Between 1950 and 2021, global life expectancy at birth increased by 22·7 years (20·8–24·8), from 49·0 years (46·7–51·3) to 71·7 years (70·9–72·5). Global life expectancy at birth declined by 1·6 years (1·0–2·2) between 2019 and 2021, reversing historical trends. An increase in life expectancy was only observed in 32 (15·7%) of 204 countries and territories between 2019 and 2021. The global population reached 7·89 billion (7·67–8·13) people in 2021, by which time 56 of 204 countries and territories had peaked and subsequently populations have declined. The largest proportion of population growth between 2020 and 2021 was in sub-Saharan Africa (39·5% [28·4–52·7]) and south Asia (26·3% [9·0–44·7]). From 2000 to 2021, the ratio of the population aged 65 years and older to the population aged younger than 15 years increased in 188 (92·2%) of 204 nations. Interpretation: Global adult mortality rates markedly increased during the COVID-19 pandemic in 2020 and 2021, reversing past decreasing trends, while child mortality rates continued to decline, albeit more slowly than in earlier years. Although COVID-19 had a substantial impact on many demographic indicators during the first 2 years of the pandemic, overall global health progress over the 72 years evaluated has been profound, with considerable improvements in mortality and life expectancy. Additionally, we observed a deceleration of global population growth since 2017, despite steady or increasing growth in lower-income countries, combined with a continued global shift of population age structures towards older ages. 
These demographic changes will likely present future challenges to health systems, economies, and societies. The comprehensive demographic estimates reported here will enable researchers, policy makers, health practitioners, and other key stakeholders to better understand and address the profound changes that have occurred in the global health landscape following the first 2 years of the COVID-19 pandemic, and longer-term trends beyond the pandemic.
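The uncertainty-interval rule stated in the Methods (the 25th and 975th ordered values of a 1000-draw posterior distribution) can be sketched as follows. This is only an illustration of that ordering rule; the function name `uncertainty_interval` and the synthetic draws are hypothetical and are not GBD data.

```python
import numpy as np

def uncertainty_interval(draws):
    """Return the 25th and 975th ordered values (1-indexed) of 1000 draws."""
    ordered = np.sort(np.asarray(draws, dtype=float))
    if ordered.size != 1000:
        raise ValueError("expected exactly 1000 draws")
    # 1-indexed positions 25 and 975 are 0-indexed positions 24 and 974.
    return float(ordered[24]), float(ordered[974])

# Synthetic draws: the values 1..1000 in shuffled order.
rng = np.random.default_rng(0)
draws = rng.permutation(np.arange(1, 1001))
lower, upper = uncertainty_interval(draws)
```

With these synthetic draws the interval bounds are simply the values 25 and 975, which makes the indexing easy to check by hand.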
Global burden and strength of evidence for 88 risk factors in 204 countries and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
Background: Understanding the health consequences associated with exposure to risk factors is necessary to inform public health policy and practice. To systematically quantify the contributions of risk factor exposures to specific health outcomes, the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 aims to provide comprehensive estimates of exposure levels, relative health risks, and attributable burden of disease for 88 risk factors in 204 countries and territories and 811 subnational locations, from 1990 to 2021. Methods: The GBD 2021 risk factor analysis used data from 54 561 total distinct sources to produce epidemiological estimates for 88 risk factors and their associated health outcomes for a total of 631 risk–outcome pairs. Pairs were included on the basis of data-driven determination of a risk–outcome association. Age-sex-location-year-specific estimates were generated at global, regional, and national levels. Our approach followed the comparative risk assessment framework predicated on a causal web of hierarchically organised, potentially combinative, modifiable risks. Relative risks (RRs) of a given outcome occurring as a function of risk factor exposure were estimated separately for each risk–outcome pair, and summary exposure values (SEVs), representing risk-weighted exposure prevalence, and theoretical minimum risk exposure levels (TMRELs) were estimated for each risk factor. These estimates were used to calculate the population attributable fraction (PAF; ie, the proportional change in health risk that would occur if exposure to a risk factor were reduced to the TMREL). The product of PAFs and disease burden associated with a given outcome, measured in disability-adjusted life-years (DALYs), yielded measures of attributable burden (ie, the proportion of total disease burden attributable to a particular risk factor or combination of risk factors). 
Adjustments for mediation were applied to account for relationships involving risk factors that act indirectly on outcomes via intermediate risks. Attributable burden estimates were stratified by Socio-demographic Index (SDI) quintile and presented as counts, age-standardised rates, and rankings. To complement estimates of RR and attributable burden, newly developed burden of proof risk function (BPRF) methods were applied to yield supplementary, conservative interpretations of risk–outcome associations based on the consistency of underlying evidence, accounting for unexplained heterogeneity between input data from different studies. Estimates reported represent the mean value across 500 draws from the estimate's distribution, with 95% uncertainty intervals (UIs) calculated as the 2·5th and 97·5th percentile values across the draws. Findings: Among the specific risk factors analysed for this study, particulate matter air pollution was the leading contributor to the global disease burden in 2021, contributing 8·0% (95% UI 6·7–9·4) of total DALYs, followed by high systolic blood pressure (SBP; 7·8% [6·4–9·2]), smoking (5·7% [4·7–6·8]), low birthweight and short gestation (5·6% [4·8–6·3]), and high fasting plasma glucose (FPG; 5·4% [4·8–6·0]). For younger demographics (ie, those aged 0–4 years and 5–14 years), risks such as low birthweight and short gestation and unsafe water, sanitation, and handwashing (WaSH) were among the leading risk factors, while for older age groups, metabolic risks such as high SBP, high body-mass index (BMI), high FPG, and high LDL cholesterol had a greater impact. 
From 2000 to 2021, there was an observable shift in global health challenges, marked by a decline in the number of all-age DALYs broadly attributable to behavioural risks (decrease of 20·7% [13·9–27·7]) and environmental and occupational risks (decrease of 22·0% [15·5–28·8]), coupled with a 49·4% (42·3–56·9) increase in DALYs attributable to metabolic risks, all reflecting ageing populations and changing lifestyles on a global scale. Age-standardised global DALY rates attributable to high BMI and high FPG rose considerably (15·7% [9·9–21·7] for high BMI and 7·9% [3·3–12·9] for high FPG) over this period, with exposure to these risks increasing annually at rates of 1·8% (1·6–1·9) for high BMI and 1·3% (1·1–1·5) for high FPG. By contrast, the global risk-attributable burden and exposure to many other risk factors declined, notably for risks such as child growth failure and unsafe water source, with age-standardised attributable DALYs decreasing by 71·5% (64·4–78·8) for child growth failure and 66·3% (60·2–72·0) for unsafe water source. We separated risk factors into three groups according to trajectory over time: those with a decreasing attributable burden, due largely to declining risk exposure (eg, diet high in trans-fat and household air pollution) but also to proportionally smaller child and youth populations (eg, child and maternal malnutrition); those for which the burden increased moderately in spite of declining risk exposure, due largely to population ageing (eg, smoking); and those for which the burden increased considerably due to both increasing risk exposure and population ageing (eg, ambient particulate matter air pollution, high BMI, high FPG, and high SBP). Interpretation: Substantial progress has been made in reducing the global disease burden attributable to a range of risk factors, particularly those related to maternal and child health, WaSH, and household air pollution. 
Maintaining efforts to minimise the impact of these risk factors, especially in low SDI locations, is necessary to sustain progress. Successes in moderating the smoking-related burden by reducing risk exposure highlight the need to advance policies that reduce exposure to other leading risk factors such as ambient particulate matter air pollution and high SBP. Troubling increases in high FPG, high BMI, and other risk factors related to obesity and metabolic syndrome indicate an urgent need to identify and implement interventions
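The step from relative risks to attributable burden described in the Methods can be illustrated with a simplified sketch. It assumes the textbook categorical form of the population attributable fraction, PAF = (sum_i p_i RR_i - 1) / (sum_i p_i RR_i), with the TMREL category as the reference (RR = 1); the GBD analysis itself works with richer exposure distributions, mediation adjustments, and draws, none of which are modelled here. The function names and all numbers are hypothetical.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Categorical PAF, with the TMREL category carrying relative risk 1."""
    if abs(sum(prevalence) - 1.0) > 1e-9:
        raise ValueError("exposure prevalences must sum to 1")
    mean_rr = sum(p * rr for p, rr in zip(prevalence, relative_risk))
    return (mean_rr - 1.0) / mean_rr

def attributable_burden(paf, dalys):
    """Attributable burden as the product of the PAF and the outcome's DALYs."""
    return paf * dalys

# Hypothetical exposure categories: TMREL, moderate, and high exposure.
paf = population_attributable_fraction([0.7, 0.2, 0.1], [1.0, 1.5, 2.0])
burden = attributable_burden(paf, dalys=1200.0)
```

Here the exposure-weighted mean relative risk is 1.2, so one sixth of the outcome's DALYs would be attributed to the risk factor under these assumptions.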
Apprentissage séquentiel pour l'allocation de ressources dans les réseaux
Network resource allocation is a complex and fundamental problem in computer science. It is a process in which components of a networked system aim to provide a faster service to demands, or to reduce the computation or communication load on the system. The main factors that contribute to the complexity of this problem are that the demands arrive at the system in an unpredictable and sequential fashion and may compete for the different network resources. The ubiquity of network resource allocation problems has motivated extensive research to design new policies with provable guarantees. This thesis investigates several instances of the network resource allocation problem and proposes online policies with strong performance guarantees leveraging the online learning framework. First, we study the online caching problem in which demands for files can be served by a local cache to avoid retrieval costs from a remote server. We study no-regret algorithms based on online mirror descent (OMD) strategies. We show that the optimal OMD strategy depends on the request diversity present in a batch of demands. We also prove that, when the cache must store the entire file, rather than a fraction, OMD strategies can be coupled with a randomized rounding scheme that preserves regret guarantees. We also present an extension to cache networks, and we propose a no-regret distributed online policy. Second, we investigate similarity caches that can reply to a demand for an object with similar objects stored locally. We propose a new online similarity caching policy that employs gradient descent to navigate the continuous representation space of objects and find appropriate objects to store in the cache. We provide theoretical convergence guarantees under stationary demands and show the proposed policy reduces service costs incurred by the system for 360-video delivery systems and recommendation systems.
Subsequently, we show that the similarity caching problem can be formulated in the online learning framework by utilizing an OMD policy paired with randomized rounding to achieve a no-regret guarantee. Third, we present the novel idea of inference delivery networks (IDNs), networks of computing nodes that coordinate to satisfy machine learning (ML) inference demands achieving the best trade-off between latency and accuracy. IDNs bridge the dichotomy between device and cloud execution by integrating inference delivery at the various tiers of the infrastructure continuum (access, edge, regional data center, cloud). We propose a no-regret distributed dynamic policy for ML model allocation in an IDN: each node dynamically updates its local set of inference models based on demands observed during the recent past plus limited information exchange with its neighboring nodes. Finally, we study fairness in the network resource allocation problem under the alpha-fairness criterion. We recognize two different fairness objectives that naturally arise in this problem: the well-understood slot-fairness objective that aims to ensure fairness at every timeslot, and the less explored horizon-fairness objective that aims to ensure fairness across utilities accumulated over a time horizon. We argue that horizon-fairness comes at a lower price in terms of social welfare. We study horizon-fairness with the regret as a performance metric and show that vanishing regret cannot be achieved in the presence of an unrestricted adversary. We propose restrictions on the adversary's capabilities corresponding to realistic scenarios and an online policy that indeed guarantees vanishing regret under these restrictions.
We demonstrate the applicability of the proposed fairness framework to a representative resource management problem considering a virtualized caching system where different caches cooperate to serve content requests.
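The difference between the slot-fairness and horizon-fairness objectives can be made concrete with a short sketch. It assumes the standard alpha-fair utility (log u for alpha = 1, u^(1-alpha)/(1-alpha) otherwise) and assumes that horizon-fairness applies it to each agent's time-averaged utility while slot-fairness averages its per-slot values; the thesis may define the objectives differently in detail. All function names and numbers are hypothetical.

```python
import math

def alpha_fair(u, alpha):
    """Standard alpha-fair utility of a positive scalar u."""
    if u <= 0:
        raise ValueError("utility must be positive")
    if alpha == 1:
        return math.log(u)
    return u ** (1 - alpha) / (1 - alpha)

def slot_fairness(utilities, alpha):
    """Average over timeslots of the alpha-fair welfare computed in each slot."""
    total = sum(alpha_fair(u, alpha) for slot in utilities for u in slot)
    return total / len(utilities)

def horizon_fairness(utilities, alpha):
    """Alpha-fair welfare of each agent's utility averaged over the horizon."""
    horizon = len(utilities)
    agents = len(utilities[0])
    averages = [sum(slot[i] for slot in utilities) / horizon for i in range(agents)]
    return sum(alpha_fair(a, alpha) for a in averages)

# Two agents that alternate between a low and a high utility over two slots.
utilities = [[1.0, 4.0], [4.0, 1.0]]
slot_value = slot_fairness(utilities, alpha=1)
horizon_value = horizon_fairness(utilities, alpha=1)
```

In this example each slot is unequal but the accumulated utilities are equal, so the horizon objective is larger than the slot objective; by Jensen's inequality it is never smaller under these definitions.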
Online learning for network resource allocation
Network resource allocation is a complex and fundamental problem in computer science. It is a process in which components of a networked system aim to provide a faster service to demands, or to reduce the computation or communication load on the system. The main factors that contribute to the complexity of this problem are that the demands arrive at the system in an unpredictable and sequential fashion and may compete for the different network resources. The ubiquity of network resource allocation problems has motivated extensive research to design new policies with provable guarantees. This thesis investigates several instances of the network resource allocation problem and proposes online policies with strong performance guarantees leveraging the online learning framework. First, we study the online caching problem in which demands for files can be served by a local cache to avoid retrieval costs from a remote server. We study no-regret algorithms based on online mirror descent (OMD) strategies. We show that the optimal OMD strategy depends on the request diversity present in a batch of demands. We also prove that, when the cache must store the entire file, rather than a fraction, OMD strategies can be coupled with a randomized rounding scheme that preserves regret guarantees. We also present an extension to cache networks, and we propose a no-regret distributed online policy. Second, we investigate similarity caches that can reply to a demand for an object with similar objects stored locally. We propose a new online similarity caching policy that employs gradient descent to navigate the continuous representation space of objects and find appropriate objects to store in the cache.
We provide theoretical convergence guarantees under stationary demands and show that the proposed policy reduces the service costs incurred by 360-degree video delivery systems and recommendation systems. Subsequently, we show that the similarity caching problem can be formulated in the online learning framework, where an OMD policy paired with randomized rounding achieves a no-regret guarantee. Third, we present the novel idea of inference delivery networks (IDNs), networks of computing nodes that coordinate to satisfy machine learning (ML) inference demands while achieving the best trade-off between latency and accuracy. IDNs bridge the dichotomy between device and cloud execution by integrating inference delivery at the various tiers of the infrastructure continuum (access, edge, regional data center, cloud). We propose a no-regret distributed dynamic policy for ML model allocation in an IDN: each node dynamically updates its local set of inference models based on demands observed during the recent past plus limited information exchange with its neighboring nodes. Finally, we study fairness in the network resource allocation problem under the alpha-fairness criterion. We recognize two different fairness objectives that naturally arise in this problem: the well-understood slot-fairness objective, which aims to ensure fairness at every timeslot, and the less explored horizon-fairness objective, which aims to ensure fairness across utilities accumulated over a time horizon. We argue that horizon-fairness comes at a lower price in terms of social welfare. We study horizon-fairness with regret as the performance metric and show that vanishing regret cannot be achieved in the presence of an unrestricted adversary. We propose restrictions on the adversary's capabilities that correspond to realistic scenarios, and an online policy that guarantees vanishing regret under these restrictions. 
We demonstrate the applicability of the proposed fairness framework to a representative resource management problem: a virtualized caching system in which different caches cooperate to serve content requests.
No-Regret Caching via Online Mirror Descent
We study an online caching problem in which requests can be served by a local
cache to avoid retrieval costs from a remote server. The cache can update its
state after a batch of requests and store an arbitrarily small fraction of each
content. We study no-regret algorithms based on Online Mirror Descent (OMD)
strategies. We show that the optimal OMD strategy depends on the request
diversity present in a batch. We also prove that, when the cache must store the
entire content, rather than a fraction, OMD strategies can be coupled with a
randomized rounding scheme that preserves regret guarantees.
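The fractional update described in this abstract can be illustrated with a minimal sketch. It assumes the Euclidean mirror map, for which OMD reduces to projected online gradient descent; this is only one instance of the OMD family and not necessarily the configuration the paper finds optimal. The function names, the linear cost model (each request for file i costs `costs[i]` times the fraction of i that is not cached), and the step size `eta` are illustrative assumptions, and the randomized rounding step is not shown.

```python
import numpy as np

def project_capped_simplex(y, k, iters=60):
    """Euclidean projection of y onto {x in [0,1]^n : sum(x) = k}.

    The projection has the form clip(y - tau, 0, 1); tau is found by bisection.
    Assumes 0 < k <= len(y).
    """
    lo, hi = y.min() - 1.0, y.max()  # sum is n at lo and 0 at hi
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(y - tau, 0.0, 1.0).sum() > k:
            lo = tau  # too much mass kept: shift further down
        else:
            hi = tau
    return np.clip(y - 0.5 * (lo + hi), 0.0, 1.0)

def ogd_cache_step(x, request_counts, costs, k, eta):
    """One batch update of the fractional cache state x (hypothetical helper).

    The batch cost is sum_i costs[i] * request_counts[i] * (1 - x[i]), so its
    gradient with respect to x is -costs * request_counts.
    """
    grad = -costs * request_counts
    return project_capped_simplex(x - eta * grad, k)

# Usage: 4 files, cache capacity 2, one batch with 3 requests for file 0
x = np.full(4, 0.5)
x = ogd_cache_step(x, np.array([3.0, 0.0, 0.0, 1.0]), np.ones(4), k=2, eta=0.1)
```

After the step the cached fractions still sum to the capacity, and the fraction of the most requested file has grown at the expense of the files that were not requested.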
Enabling Long-term Fairness in Dynamic Resource Allocation
We study fairness in the dynamic resource allocation problem under the α-fairness criterion. We recognize two different fairness objectives that naturally arise in this problem: the well-understood slot-fairness objective, which aims to ensure fairness at every timeslot, and the less explored horizon-fairness objective, which aims to ensure fairness across utilities accumulated over a time horizon. We argue that horizon-fairness comes at a lower price in terms of social welfare. We study horizon-fairness with regret as the performance metric and show that vanishing regret cannot be achieved in the presence of an unrestricted adversary. We propose restrictions on the adversary's capabilities that correspond to realistic scenarios, and an online policy that guarantees vanishing regret under these restrictions. We demonstrate the applicability of the proposed fairness framework to a representative resource management problem: a virtualized caching system in which different caches cooperate to serve content requests.
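The difference between the two objectives can be made concrete with a small sketch. It assumes the standard α-fair utility (u^(1-α)/(1-α) for α ≠ 1, log u for α = 1) and takes slot-fairness as the time average of the per-slot α-fair sums and horizon-fairness as the α-fair sum of the time-averaged utilities. The function names and the utility matrix are illustrative assumptions; the abstract does not specify the exact normalization.

```python
import numpy as np

def alpha_fair(u, alpha):
    """Standard alpha-fair utility, applied elementwise to positive utilities."""
    u = np.asarray(u, dtype=float)
    if alpha == 1.0:
        return np.log(u)
    return u ** (1.0 - alpha) / (1.0 - alpha)

def slot_fairness(U, alpha):
    """Average over timeslots of the alpha-fair sum; U[t, i] is agent i's utility at slot t."""
    return alpha_fair(U, alpha).sum(axis=1).mean()

def horizon_fairness(U, alpha):
    """Alpha-fair sum of each agent's utility averaged over the whole horizon."""
    return alpha_fair(np.asarray(U, dtype=float).mean(axis=0), alpha).sum()

# Two agents that take turns receiving the larger share
U = np.array([[1.0, 0.1],
              [0.1, 1.0]])
print(slot_fairness(U, alpha=1.0), horizon_fairness(U, alpha=1.0))
```

In this example every individual slot is unbalanced, but the accumulated utilities are equal, so the horizon objective scores the allocation higher than the slot objective does. Because the α-fair utility is concave, Jensen's inequality gives the same ordering for any positive utility matrix under these definitions.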
AÇAI: Ascent Similarity Caching with Approximate Indexes
Similarity search is a key operation in multimedia retrieval systems and recommender systems, and it will also play an important role in future machine learning and augmented reality applications. When these systems need to serve large objects under tight delay constraints, edge servers close to the end user can operate as similarity caches to speed up retrieval. In this paper we present AÇAI, a new similarity caching policy that improves on the state of the art by using (i) an (approximate) index for the whole catalog to decide which objects to serve locally and which to retrieve from the remote server, and (ii) a mirror ascent algorithm to update the set of local objects with strong guarantees even when the request process does not exhibit any statistical regularity.
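The local-versus-remote decision in point (i) can be sketched for a request that asks for a single object. This is a simplified illustration, not the AÇAI implementation: it uses an exact brute-force search over the catalog instead of an approximate index, measures dissimilarity as Euclidean distance between embeddings, and charges a fixed `fetch_cost` for any object retrieved from the remote server. All names and the cost model are assumptions.

```python
import numpy as np

def serve(request, catalog, cached_ids, fetch_cost):
    """Return (object_id, cost, source) for a single-object similarity request.

    catalog: (n, d) array of object embeddings; cached_ids: non-empty list of
    indices stored locally; fetch_cost: extra cost of a remote retrieval.
    """
    dist = np.linalg.norm(catalog - request, axis=1)  # dissimilarity to every object
    cached_ids = np.asarray(cached_ids)
    best_local = int(cached_ids[np.argmin(dist[cached_ids])])
    best_remote = int(np.argmin(dist))
    local_cost = float(dist[best_local])
    remote_cost = float(dist[best_remote]) + fetch_cost
    if local_cost <= remote_cost:
        return best_local, local_cost, "local"
    return best_remote, remote_cost, "remote"

# Usage: object 1 is cached; the request is closest to object 0
catalog = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
request = np.array([0.1, 0.0])
print(serve(request, catalog, [1], fetch_cost=2.0))  # an expensive fetch favors the cached object
print(serve(request, catalog, [1], fetch_cost=0.5))  # a cheap fetch favors the closest object
```

The index over the whole catalog is what makes the comparison possible: without it, the cache could not tell how much better the best remote object would be than the best local one.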