130 research outputs found
Exploring the Geography of Tags in Youtube Views
Although tags play a critical role in many social media, their link to the geographic distribution of user generated videos has been little investigated. In this paper, we analyze the correlation between the geographic distribution of a video's views and the tags attached to this video in a Youtube dataset. We show that tags can be interpreted as markers of a video's geographic diffusion, with some tags strongly linked to well identified geographic areas. Based on our findings, we explore whether the distribution of a video's views can be predicted from its tags. We demonstrate how this predictive power could help improve on-line video services by preferentially storing videos close to where they are likely to be viewed. Our results show that even with a simplistic approach we are able to predict a minimum of 65.9% of a video's views for a majority of videos, and that a tag-based placement strategy can improve the hit rate of a distributed on-line video service by up to 6.8% globally, with an improvement of up to 34% in the USA.
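The abstract does not spell out the prediction procedure, but the idea of treating tags as geographic markers can be sketched as follows: learn a per-tag geographic view distribution from a training set, then predict a new video's distribution as the average of its tags' distributions. This is a minimal illustrative sketch, not the paper's actual method; the function names, the region codes, and the uniform averaging are all assumptions.

```python
from collections import defaultdict

def tag_distributions(videos):
    """Aggregate a per-tag geographic view distribution from a training set.

    `videos` is a list of (tags, views_by_region) pairs, where
    views_by_region maps a region name to a view count.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for tags, views in videos:
        total = sum(views.values())
        if total == 0:
            continue
        for tag in tags:
            for region, count in views.items():
                totals[tag][region] += count / total
    # Normalize each tag's distribution so it sums to 1.
    dists = {}
    for tag, regions in totals.items():
        s = sum(regions.values())
        dists[tag] = {r: v / s for r, v in regions.items()}
    return dists

def predict(tags, dists):
    """Predict a video's view distribution as the mean of its known tags'
    distributions (unknown tags are simply ignored)."""
    known = [dists[t] for t in tags if t in dists]
    if not known:
        return {}
    pred = defaultdict(float)
    for d in known:
        for region, p in d.items():
            pred[region] += p / len(known)
    return dict(pred)
```

A placement strategy could then store a video preferentially in the regions where this predicted distribution puts the most mass.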
Simple, Efficient and Convenient Decentralized Multi-Task Learning for Neural Networks
Artificial intelligence relying on machine learning is increasingly used on small, personal, network-connected devices such as smartphones and vocal assistants, and these applications will likely evolve with the development of the Internet of Things. The learning process requires a lot of data, often real users' data, and computing power. Decentralized machine learning can help to protect users' privacy by keeping sensitive training data on users' devices, and has the potential to alleviate the cost borne by service providers by off-loading some of the learning effort to user devices. Unfortunately, most approaches proposed so far for distributed learning with neural networks are mono-task and do not transfer easily to multi-task problems, in which users seek to solve related but distinct learning tasks; the few existing multi-task approaches have serious limitations. In this paper, we propose a novel learning method for neural networks that is decentralized, multi-task, and keeps users' data local. Our approach works with different learning algorithms, on various types of neural networks. We formally analyze the convergence of our method, and we evaluate its efficiency in different situations on various kinds of neural networks, with different learning algorithms, thus demonstrating its benefits in terms of learning quality and convergence.
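The abstract does not describe the mechanism, but a common pattern for decentralized multi-task learning (a hedged sketch, not necessarily this paper's method) is to average only the parameters shared across tasks while each peer keeps its task-specific layers local. Everything below (the `shared`/`head` split, the parameter representation as plain lists) is an illustrative assumption.

```python
def average_shared_layers(models, shared_keys):
    """Average the parameters named in `shared_keys` across peer models,
    leaving every other (task-specific) parameter untouched.

    `models` is a list of dicts mapping parameter names to lists of floats.
    The averaged shared parameters are written back into every model.
    """
    averaged = {}
    for key in shared_keys:
        n = len(models)
        size = len(models[0][key])
        averaged[key] = [sum(m[key][i] for m in models) / n
                         for i in range(size)]
    for m in models:
        # Copy so peers do not alias the same parameter list.
        m.update({k: list(v) for k, v in averaged.items()})
    return models
```

In a decentralized setting this averaging step would run over a peer's neighbors rather than over all models at once.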
GOSSIPKIT: A Unified Component Framework for Gossip
Although the principles of gossip protocols are relatively easy to grasp, their variety can make their design and evaluation highly time consuming. This problem is compounded by the lack of a unified programming framework for gossip, which means developers cannot easily reuse, compose, or adapt existing solutions to fit their needs, and have limited opportunities to share knowledge and ideas. In this paper, we consider how component frameworks, which have been widely applied to implement middleware solutions, can facilitate the development of gossip-based systems in a way that is both generic and simple. We show how such an approach can maximise code reuse, simplify the implementation of gossip protocols, and facilitate dynamic evolution and re-deployment.
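The "easy to grasp" core of a gossip protocol can be illustrated with a minimal push-style rumor spreading loop; this is a generic textbook sketch, not GOSSIPKIT code, and the function names and the synchronous-round model are assumptions.

```python
import random

def push_gossip_round(peers, infected, fanout=2):
    """One synchronous round of push gossip: every node that already holds
    the rumor forwards it to `fanout` peers chosen uniformly at random."""
    newly = set()
    for node in infected:
        for target in random.sample(peers, min(fanout, len(peers))):
            newly.add(target)
    return infected | newly

def spread(peers, source, fanout=2, max_rounds=50):
    """Run push gossip from `source` until all peers are infected or the
    round budget is exhausted; return the infected set and rounds used."""
    infected = {source}
    rounds = 0
    while len(infected) < len(peers) and rounds < max_rounds:
        infected = push_gossip_round(peers, infected, fanout)
        rounds += 1
    return infected, rounds
```

The many variants (push vs. pull, peer-sampling strategies, payload aggregation) differ mainly in how `push_gossip_round` selects targets and merges state, which is exactly the kind of variation point a component framework can factor out.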
Good-case Early-Stopping Latency of Synchronous Byzantine Reliable Broadcast: The Deterministic Case (Extended Version)
This paper considers the good-case latency of Byzantine Reliable Broadcast
(BRB), i.e., the time taken by correct processes to deliver a message when the
initial sender is correct. This time plays a crucial role in the performance of
practical distributed systems. Although significant strides have been made in
recent years on this question, progress has mainly focused on either
asynchronous or randomized algorithms. By contrast, the good-case latency of
deterministic synchronous BRB under a majority of Byzantine faults has been
little studied. In particular, it was not known whether a good-case latency
below the worst-case bound of t + 1 rounds could be obtained. This work answers
this open question positively and proposes a deterministic synchronous
Byzantine reliable broadcast that achieves a good-case latency of max(2, t + 3
- c) rounds, where t is the upper bound on the number of Byzantine processes
and c the number of effectively correct processes.
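The stated bound is a one-line formula, which can be checked directly; the helper below simply evaluates max(2, t + 3 - c) as given in the abstract.

```python
def good_case_latency(t, c):
    """Good-case latency (in synchronous rounds) of the proposed BRB,
    where t bounds the number of Byzantine processes and c is the number
    of effectively correct processes."""
    return max(2, t + 3 - c)
```

Note that for c = 2 the formula yields t + 1 rounds, matching the classical worst-case bound, while larger c drives the latency down toward the floor of 2 rounds.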
Geo-distribution of Tags and Views in Youtube
In this article, we analyze the correlation between the geographic distribution of a video's views and the tags of this video within a YouTube dataset. We show that tags can serve as markers of a video's geographic diffusion, with some tags very strongly linked to well-defined geographic areas. This correlation can be exploited to correctly predict a minimum of 68% of the views for a majority of videos.
D.1.2 – Modular quasi-causal data structures
In large-scale systems such as the Internet, replicating data is an essential feature in order to provide availability and fault-tolerance. Attiya and Welch proved that using strong consistency criteria such as atomicity is costly, as each operation may need an execution time linear with the latency of the communication network. Weaker consistency criteria like causal consistency and PRAM consistency do not ensure convergence: the different replicas are not guaranteed to converge towards a unique state. Eventual consistency guarantees that all replicas eventually converge when the participants stop updating. However, it fails to fully specify the semantics of the operations on shared objects and requires additional non-intuitive and error-prone distributed specification techniques. In addition, existing consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. In this deliverable, we address these issues with two novel contributions. The first contribution proposes a notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. We use this graph to provide a generic approach to the hybridization of data consistency conditions in the same system. Based on this, we design a distributed algorithm built on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). The second contribution of this deliverable focuses on improving the limitations of eventual consistency.
To this end, we formalize a new consistency criterion, called update consistency, that requires the state of a replicated object to be consistent with a linearization of all the updates. In other words, whereas atomicity imposes a linearization of all of the operations, this criterion imposes it only on updates. Consequently, some read operations may return outdated values. Update consistency is stronger than eventual consistency, so we can replace eventually consistent objects with update-consistent ones in any program. Finally, we prove that update consistency is universal, in the sense that any object can be implemented under this criterion in a distributed system where any number of nodes may crash.
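One standard way to realize a linearization of all updates (a hedged sketch; the deliverable's actual constructions are not given here) is to totally order updates by a (logical timestamp, replica id) pair, as in last-writer-wins registers: replicas converge to the value of the largest update, while reads taken in between may legitimately return stale values, which update consistency permits.

```python
class UpdateConsistentRegister:
    """Sketch of a replicated register that converges to the outcome of one
    linearization of all updates, ordered by (logical time, replica id)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.clock = 0
        self.stamp = (0, "")   # (logical time, origin) of the current value
        self.value = None

    def write(self, value):
        """Stamp a local update, apply it, and return it for broadcast."""
        self.clock += 1
        update = ((self.clock, self.replica_id), value)
        self.apply(update)
        return update

    def apply(self, update):
        """Apply a (possibly remote) update; ties break on replica id."""
        stamp, value = update
        self.clock = max(self.clock, stamp[0])
        if stamp > self.stamp:   # total order over all updates
            self.stamp, self.value = stamp, value

    def read(self):
        return self.value
```

Because `apply` is commutative over the total order on stamps, replicas that receive the same set of updates, in any order, end up in the same state.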
Context Adaptive Cooperation
Reliable broadcast and consensus are the two pillars that support a lot of
non-trivial fault-tolerant distributed middleware and fault-tolerant
distributed systems. While they have close definitions, they strongly differ in
the underlying assumptions needed to implement each of them. Reliable broadcast
can be implemented in asynchronous systems in the presence of crash or
Byzantine failures while Consensus cannot. This key difference stems from the
fact that consensus involves synchronization between multiple processes that
concurrently propose values, while reliable broadcast simply involves
delivering a message from a predefined sender. This paper strikes a balance
between these two agreement abstractions in the presence of Byzantine failures.
It proposes CAC, a novel agreement abstraction that enables multiple processes
to broadcast messages simultaneously, while guaranteeing that (despite
potential conflicts, asynchrony, and Byzantine behaviors) the non-faulty
processes will agree on message deliveries. We show that this novel
abstraction can enable more efficient algorithms for a variety of applications
(such as money transfer, where several people can share the same account). This
is obtained by focusing the need for synchronization only on the processes that
actually need to synchronize.
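The money-transfer example can be made concrete with a simple conflict rule (an illustrative assumption, not the CAC algorithm itself): two transfers conflict, and hence force their issuers to synchronize, only when they debit the same account; transfers from distinct accounts commute and need no coordination.

```python
from collections import defaultdict

def conflicting_groups(transfers):
    """Group transfers by debited account and keep only the accounts with
    more than one concurrent spend; only those groups require the issuing
    processes to agree on an order."""
    groups = defaultdict(list)
    for t in transfers:
        groups[t["from"]].append(t)
    return {acct: ts for acct, ts in groups.items() if len(ts) > 1}
```

In this reading, CAC lets all non-conflicting transfers be delivered without synchronization, and concentrates agreement on the few groups this function would flag.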
Fisheye consistency: maintaining data synchronization in a geo-replicated world
Over the last thirty years, numerous consistency conditions for replicated data have been proposed and implemented. Popular examples of such conditions include linearizability (or atomicity), sequential consistency, causal consistency, and eventual consistency. These consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. To address this lack, as a first contribution, this paper introduces the notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. The second contribution is the use of such a graph to provide a generic approach to the hybridization of data consistency conditions in the same system. We illustrate this approach on sequential consistency and causal consistency, and present a model in which all data operations are causally consistent, while operations by neighboring processes in the proximity graph are sequentially consistent. The third contribution of the paper is the design and the proof of a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). In doing so, the paper not only extends the domain of consistency conditions, but provides a generic, provably correct solution of direct relevance to modern geo-replicated systems.
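The hybridization rule of fisheye consistency reduces to a per-pair decision that is easy to state in code; this is an illustrative sketch of the rule only (the representation of the proximity graph as an adjacency dict and the function name are assumptions, not the paper's algorithm).

```python
def required_consistency(graph, p, q):
    """Fisheye consistency's hybridization rule: operations by nodes that
    are neighbors in the proximity graph must appear sequentially
    consistent to each other; any other pair of nodes only needs causal
    consistency. `graph` maps each node to its set of neighbors."""
    if p == q or q in graph.get(p, set()):
        return "sequential"
    return "causal"
```

Intuitively, nearby (e.g. same-datacenter) replicas pay for strong synchronization only among themselves, while remote replicas observe the cheaper causal order, which is what makes the condition attractive for geo-replicated systems.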
- …