130 research outputs found
Exploring the Geography of Tags in Youtube Views
Although tags play a critical role in many social media, their link to the geographic distribution of user generated videos has been little investigated. In this paper, we analyze the correlation between the geographic distribution of a video's views and the tags attached to this video in a Youtube dataset. We show that tags can be interpreted as markers of a video's geographic diffusion, with some tags strongly linked to well identified geographic areas. Based on our findings, we explore whether the distribution of a video's views can be predicted from its tags. We demonstrate how this predictive power could help improve on-line video services by preferentially storing videos close to where they are likely to be viewed. Our results show that even with a simplistic approach we are able to predict a minimum of 65.9% of a video's views for a majority of videos, and that a tag-based placement strategy can improve the hit rate of a distributed on-line video service by up to 6.8% globally, with an improvement of up to 34% in the USA.
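The abstract does not spell out the prediction procedure, but the idea of treating tags as geographic markers can be sketched as follows: learn a per-tag geographic view distribution from a training set, then predict a new video's distribution as the average of its tags' distributions. This is a minimal illustrative sketch, not the paper's actual method; the function names, the region codes, and the uniform averaging are all assumptions.

```python
from collections import defaultdict

def tag_distributions(videos):
    """Aggregate a per-tag geographic view distribution from a training set.

    `videos` is a list of (tags, views_by_region) pairs, where
    views_by_region maps a region name to a view count.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for tags, views in videos:
        total = sum(views.values())
        if total == 0:
            continue
        for tag in tags:
            for region, count in views.items():
                totals[tag][region] += count / total
    # Normalize each tag's distribution so it sums to 1.
    dists = {}
    for tag, regions in totals.items():
        s = sum(regions.values())
        dists[tag] = {r: v / s for r, v in regions.items()}
    return dists

def predict(tags, dists):
    """Predict a video's view distribution as the mean of its known tags'
    distributions (unknown tags are simply ignored)."""
    known = [dists[t] for t in tags if t in dists]
    if not known:
        return {}
    pred = defaultdict(float)
    for d in known:
        for region, p in d.items():
            pred[region] += p / len(known)
    return dict(pred)
```

A placement strategy could then store a video preferentially in the regions where this predicted distribution puts the most mass.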
Simple, Efficient and Convenient Decentralized Multi-Task Learning for Neural Networks
Artificial intelligence relying on machine learning is increasingly used on small, personal, network-connected devices such as smartphones and vocal assistants, and these applications will likely evolve with the development of the Internet of Things. The learning process requires a lot of data, often real users' data, and computing power. Decentralized machine learning can help to protect users' privacy by keeping sensitive training data on users' devices, and has the potential to alleviate the cost borne by service providers by off-loading some of the learning effort to user devices. Unfortunately, most approaches proposed so far for distributed learning with neural networks are mono-task and do not transfer easily to multi-task problems, in which users seek to solve related but distinct learning tasks; the few existing multi-task approaches have serious limitations. In this paper, we propose a novel learning method for neural networks that is decentralized, multi-task, and keeps users' data local. Our approach works with different learning algorithms, on various types of neural networks. We formally analyze the convergence of our method, and we evaluate its efficiency in different situations on various kinds of neural networks, with different learning algorithms, thus demonstrating its benefits in terms of learning quality and convergence.
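The abstract does not describe the mechanism, but a common pattern for decentralized multi-task learning (a hedged sketch, not necessarily this paper's method) is to average only the parameters shared across tasks while each peer keeps its task-specific layers local. Everything below (the `shared`/`head` split, the parameter representation as plain lists) is an illustrative assumption.

```python
def average_shared_layers(models, shared_keys):
    """Average the parameters named in `shared_keys` across peer models,
    leaving every other (task-specific) parameter untouched.

    `models` is a list of dicts mapping parameter names to lists of floats.
    The averaged shared parameters are written back into every model.
    """
    averaged = {}
    for key in shared_keys:
        n = len(models)
        size = len(models[0][key])
        averaged[key] = [sum(m[key][i] for m in models) / n
                         for i in range(size)]
    for m in models:
        # Copy so peers do not alias the same parameter list.
        m.update({k: list(v) for k, v in averaged.items()})
    return models
```

In a decentralized setting this averaging step would run over a peer's neighbors rather than over all models at once.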
GOSSIPKIT: A Unified Component Framework for Gossip
Although the principles of gossip protocols are relatively easy to grasp, their variety can make their design and evaluation highly time consuming. This problem is compounded by the lack of a unified programming framework for gossip, which means developers cannot easily reuse, compose, or adapt existing solutions to fit their needs, and have limited opportunities to share knowledge and ideas. In this paper, we consider how component frameworks, which have been widely applied to implement middleware solutions, can facilitate the development of gossip-based systems in a way that is both generic and simple. We show how such an approach can maximise code reuse, simplify the implementation of gossip protocols, and facilitate dynamic evolution and re-deployment.
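The "easy to grasp" core of a gossip protocol can be illustrated with a minimal push-style rumor spreading loop; this is a generic textbook sketch, not GOSSIPKIT code, and the function names and the synchronous-round model are assumptions.

```python
import random

def push_gossip_round(peers, infected, fanout=2):
    """One synchronous round of push gossip: every node that already holds
    the rumor forwards it to `fanout` peers chosen uniformly at random."""
    newly = set()
    for node in infected:
        for target in random.sample(peers, min(fanout, len(peers))):
            newly.add(target)
    return infected | newly

def spread(peers, source, fanout=2, max_rounds=50):
    """Run push gossip from `source` until all peers are infected or the
    round budget is exhausted; return the infected set and rounds used."""
    infected = {source}
    rounds = 0
    while len(infected) < len(peers) and rounds < max_rounds:
        infected = push_gossip_round(peers, infected, fanout)
        rounds += 1
    return infected, rounds
```

The many variants (push vs. pull, peer-sampling strategies, payload aggregation) differ mainly in how `push_gossip_round` selects targets and merges state, which is exactly the kind of variation point a component framework can factor out.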
Good-case Early-Stopping Latency of Synchronous Byzantine Reliable Broadcast: The Deterministic Case (Extended Version)
This paper considers the good-case latency of Byzantine Reliable Broadcast
(BRB), i.e., the time taken by correct processes to deliver a message when the
initial sender is correct. This time plays a crucial role in the performance of
practical distributed systems. Although significant strides have been made in
recent years on this question, progress has mainly focused on either
asynchronous or randomized algorithms. By contrast, the good-case latency of
deterministic synchronous BRB under a majority of Byzantine faults has been
little studied. In particular, it was not known whether a good-case latency
below the worst-case bound of t + 1 rounds could be obtained. This work answers
this open question positively and proposes a deterministic synchronous
Byzantine reliable broadcast that achieves a good-case latency of max(2, t + 3
- c) rounds, where t is the upper bound on the number of Byzantine processes
and c the number of effectively correct processes.
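The stated bound is a one-line formula, which can be checked directly; the helper below simply evaluates max(2, t + 3 - c) as given in the abstract.

```python
def good_case_latency(t, c):
    """Good-case latency (in synchronous rounds) of the proposed BRB,
    where t bounds the number of Byzantine processes and c is the number
    of effectively correct processes."""
    return max(2, t + 3 - c)
```

Note that for c = 2 the formula yields t + 1 rounds, matching the classical worst-case bound, while larger c drives the latency down toward the floor of 2 rounds.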
Geo-distribution of Tags and Views in Youtube
In this article, we analyze the correlation between the geographic distribution of a video's views and the tags of this video within a YouTube dataset. We show that tags can serve as markers of a video's geographic diffusion, with some tags very strongly linked to well-defined geographic areas. This correlation can be exploited to correctly predict a minimum of 68% of the views for a majority of videos.
D.1.2 – Modular quasi-causal data structures
In large-scale systems such as the Internet, replicating data is an essential feature in order to provide availability and fault-tolerance. Attiya and Welch proved that using strong consistency criteria such as atomicity is costly, as each operation may need an execution time linear with the latency of the communication network. Weaker consistency criteria like causal consistency and PRAM consistency do not ensure convergence: the different replicas are not guaranteed to converge towards a unique state. Eventual consistency guarantees that all replicas eventually converge when the participants stop updating. However, it fails to fully specify the semantics of the operations on shared objects and requires additional non-intuitive and error-prone distributed specification techniques. In addition, existing consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. In this deliverable, we address these issues with two novel contributions. The first contribution proposes a notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. We use this graph to provide a generic approach to the hybridization of data consistency conditions in the same system. Based on this, we design a distributed algorithm built on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). The second contribution of this deliverable focuses on improving the limitations of eventual consistency.
To this end, we formalize a new consistency criterion, called update consistency, that requires the state of a replicated object to be consistent with a linearization of all the updates. In other words, whereas atomicity imposes a linearization of all of the operations, this criterion imposes it only on updates. Consequently, some read operations may return outdated values. Update consistency is stronger than eventual consistency, so we can replace eventually consistent objects with update-consistent ones in any program. Finally, we prove that update consistency is universal, in the sense that any object can be implemented under this criterion in a distributed system where any number of nodes may crash.
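One standard way to realize a linearization of all updates (a hedged sketch; the deliverable's actual constructions are not given here) is to totally order updates by a (logical timestamp, replica id) pair, as in last-writer-wins registers: replicas converge to the value of the largest update, while reads taken in between may legitimately return stale values, which update consistency permits.

```python
class UpdateConsistentRegister:
    """Sketch of a replicated register that converges to the outcome of one
    linearization of all updates, ordered by (logical time, replica id)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.clock = 0
        self.stamp = (0, "")   # (logical time, origin) of the current value
        self.value = None

    def write(self, value):
        """Stamp a local update, apply it, and return it for broadcast."""
        self.clock += 1
        update = ((self.clock, self.replica_id), value)
        self.apply(update)
        return update

    def apply(self, update):
        """Apply a (possibly remote) update; ties break on replica id."""
        stamp, value = update
        self.clock = max(self.clock, stamp[0])
        if stamp > self.stamp:   # total order over all updates
            self.stamp, self.value = stamp, value

    def read(self):
        return self.value
```

Because `apply` is commutative over the total order on stamps, replicas that receive the same set of updates, in any order, end up in the same state.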
Context Adaptive Cooperation
Reliable broadcast and consensus are the two pillars that support a lot of
non-trivial fault-tolerant distributed middleware and fault-tolerant
distributed systems. While they have close definitions, they strongly differ in
the underlying assumptions needed to implement each of them. Reliable broadcast
can be implemented in asynchronous systems in the presence of crash or
Byzantine failures while Consensus cannot. This key difference stems from the
fact that consensus involves synchronization between multiple processes that
concurrently propose values, while reliable broadcast simply involves
delivering a message from a predefined sender. This paper strikes a balance
between these two agreement abstractions in the presence of Byzantine failures.
It proposes CAC, a novel agreement abstraction that enables multiple processes
to broadcast messages simultaneously, while guaranteeing that (despite
potential conflicts, asynchrony, and Byzantine behaviors) the non-faulty
processes will agree on message deliveries. We show that this novel
abstraction can enable more efficient algorithms for a variety of applications
(such as money transfer, where several people can share the same account). This
is obtained by focusing the need for synchronization only on the processes that
actually need to synchronize.
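The money-transfer example can be made concrete with a simple conflict rule (an illustrative assumption, not the CAC algorithm itself): two transfers conflict, and hence force their issuers to synchronize, only when they debit the same account; transfers from distinct accounts commute and need no coordination.

```python
from collections import defaultdict

def conflicting_groups(transfers):
    """Group transfers by debited account and keep only the accounts with
    more than one concurrent spend; only those groups require the issuing
    processes to agree on an order."""
    groups = defaultdict(list)
    for t in transfers:
        groups[t["from"]].append(t)
    return {acct: ts for acct, ts in groups.items() if len(ts) > 1}
```

In this reading, CAC lets all non-conflicting transfers be delivered without synchronization, and concentrates agreement on the few groups this function would flag.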
Fisheye consistency: maintaining data synchronization in a geo-replicated world
Over the last thirty years, numerous consistency conditions for replicated data have been proposed and implemented. Popular examples of such conditions include linearizability (or atomicity), sequential consistency, causal consistency, and eventual consistency. These consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. To address this lack, as a first contribution, this paper introduces the notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. The second contribution is the use of such a graph to provide a generic approach to the hybridization of data consistency conditions in the same system. We illustrate this approach on sequential consistency and causal consistency, and present a model in which all data operations are causally consistent, while operations by neighboring processes in the proximity graph are sequentially consistent. The third contribution of the paper is the design and the proof of a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). In doing so, the paper not only extends the domain of consistency conditions, but provides a generic, provably correct solution of direct relevance to modern geo-replicated systems.
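The hybridization rule of fisheye consistency reduces to a per-pair decision that is easy to state in code; this is an illustrative sketch of the rule only (the representation of the proximity graph as an adjacency dict and the function name are assumptions, not the paper's algorithm).

```python
def required_consistency(graph, p, q):
    """Fisheye consistency's hybridization rule: operations by nodes that
    are neighbors in the proximity graph must appear sequentially
    consistent to each other; any other pair of nodes only needs causal
    consistency. `graph` maps each node to its set of neighbors."""
    if p == q or q in graph.get(p, set()):
        return "sequential"
    return "causal"
```

Intuitively, nearby (e.g. same-datacenter) replicas pay for strong synchronization only among themselves, while remote replicas observe the cheaper causal order, which is what makes the condition attractive for geo-replicated systems.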
- …