Model of random packings of different size balls
We develop a model to describe the properties of random assemblies of
polydisperse hard spheres. We show that the key features to describe the system
are (i) the dependence between the free volume of a sphere and the various
coordination numbers between the species, and (ii) the dependence of the
coordination numbers on the concentration of species, both of which are
calculated analytically. The model predicts the density of random close packing
and random loose packing of polydisperse systems for a given distribution of
ball size and describes packings for any interparticle friction coefficient.
The formalism allows one to determine the optimal packing over different
distributions and may help to treat packing problems of non-spherical particles,
which are notoriously difficult to solve. Comment: 6 pages, 6 figures
Mining bipartite graphs to improve semantic pedophile activity detection
Peer-to-peer (P2P) networks are popular for exchanging large volumes of data over the Internet. Paedophile activity is a very important topic for our society, and some works have recently attempted to gauge the extent of paedophile exchanges on P2P networks. A key issue is to obtain an efficient detection tool that can decide whether a sequence of keywords is related to the topic or not. We propose to use social network analysis on a large dataset from a P2P network to improve a state-of-the-art filter for paedophile queries. We obtain queries, and thus combinations of words, that are not tagged by the filter but should be. We also perform some experiments to explore whether the original four categories of paedophile queries can be found by topological measures alone.
Mesures de proximité appliquées à la détection de communautés dans les grands graphes de terrain
Many kinds of data can be represented as a graph (a set of nodes linked by edges). In this thesis, I show that two major problems, community detection and measuring the proximity between two nodes, are intricately connected. In particular, I present a framework that, using a proximity measure, can isolate a set of nodes. Its general principle is rather straightforward and can be described as follows. Given a node of interest in a graph, the proximity of all nodes in the network to that node is computed. Then, if a small set of nodes has a high proximity to the node of interest while all others have a low proximity, we can directly conclude that the small set of nodes is the community of the node of interest. I then show how to adapt this idea to (i) find all communities of a given node, (ii) complete a set of nodes into a community and (iii) find all overlapping communities in a network. I validate these methods on real and synthetic network datasets.
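The principle described in this abstract (score every node by its proximity to a node of interest, then look for a sharp drop in the ranked scores) can be illustrated with a minimal sketch. The abstract does not say which proximity measure is used, so the random walk with restart below is a stand-in chosen purely for illustration, and the function names `build_graph`, `rwr_proximity` and `community_by_largest_drop` are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def rwr_proximity(adj, source, restart=0.15, iterations=100):
    """Random walk with restart from `source`.

    Assumption: this is only a stand-in proximity measure, not the one
    used in the thesis.
    """
    score = {n: 0.0 for n in adj}
    score[source] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in adj}
        for n, s in score.items():
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1.0 - restart) * s / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        nxt[source] += restart  # restarting mass returns to the source
        score = nxt
    return score


def community_by_largest_drop(score):
    """Rank nodes by proximity and cut at the sharpest relative drop."""
    ranked = sorted(score, key=score.get, reverse=True)
    best_cut, best_ratio = len(ranked), 1.0
    for i in range(1, len(ranked)):
        hi, lo = score[ranked[i - 1]], score[ranked[i]]
        ratio = hi / lo if lo > 0 else float("inf")
        if ratio > best_ratio:
            best_cut, best_ratio = i, ratio
    return set(ranked[:best_cut])


# Toy graph: two 5-cliques joined by the single edge (4, 5).
clique_a = [(i, j) for i in range(5) for j in range(i + 1, 5)]
clique_b = [(i, j) for i in range(5, 10) for j in range(i + 1, 10)]
graph = build_graph(clique_a + clique_b + [(4, 5)])
community = community_by_largest_drop(rwr_proximity(graph, source=0))
```

On this toy graph the sharpest drop falls between the two cliques, so the sketch returns the clique containing node 0. Real networks rarely show such a clean gap, which is presumably why the thesis studies how to adapt the idea.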
Listing k-cliques in Sparse Real-World Graphs
Motivated by recent studies in the data mining community that require efficiently listing all k-cliques, we revisit the iconic algorithm of Chiba and Nishizeki and develop the most efficient parallel algorithm for this problem. Our theoretical analysis provides the best asymptotic upper bound on the running time of our algorithm for the case when the input graph is sparse. Our experimental evaluation on large real-world graphs shows that our parallel algorithm is faster than state-of-the-art algorithms, while boasting an excellent degree of parallelism. In particular, we are able to list all k-cliques (for any k) in graphs containing up to tens of millions of edges, as well as all 10-cliques in graphs containing billions of edges, within a few minutes and a few hours respectively. Finally, we show how our algorithm can be employed as an effective subroutine for finding the k-clique core decomposition and an approximate k-clique densest subgraph in very large real-world graphs.
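To make the k-clique listing task concrete, here is a small sequential sketch. It is not the parallel algorithm of the paper: it assumes a common approach for sparse graphs, namely orienting each edge along a degeneracy (minimum-degree removal) ordering and recursing on out-neighbourhoods so that every clique is reported exactly once. The names `degeneracy_order` and `list_k_cliques` are hypothetical.

```python
def degeneracy_order(adj):
    """Repeatedly remove a minimum-degree node; return node -> removal rank."""
    degree = {n: len(nb) for n, nb in adj.items()}
    remaining = set(adj)
    rank = {}
    while remaining:
        # Quadratic-time selection, kept simple for the sake of the sketch.
        n = min(remaining, key=degree.get)
        rank[n] = len(rank)
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return rank


def list_k_cliques(edges, k):
    """List every k-clique of an undirected graph exactly once."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    rank = degeneracy_order(adj)
    # Orient each edge from lower to higher rank, giving a DAG.
    out = {n: {m for m in nb if rank[m] > rank[n]} for n, nb in adj.items()}

    cliques = []

    def extend(clique, candidates):
        if len(clique) == k:
            cliques.append(tuple(clique))
            return
        for n in candidates:
            # Keep only candidates that are out-neighbours of n as well.
            extend(clique + [n], candidates & out[n])

    extend([], set(adj))
    return cliques


# Toy graph: a 4-clique on nodes 0..3 plus a pendant edge (3, 4).
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
triangles = list_k_cliques(toy_edges, 3)
```

Because each node has few out-neighbours under a degeneracy ordering of a sparse graph, the candidate sets shrink quickly, which is the intuition behind this kind of orientation.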
Déplier la structure communautaire d'un réseau en mesurant la proximité aux représentants de communauté
How can all overlapping communities in a complex network be found? That is, how can all relevant groups of nodes in a linked dataset be found? No entirely satisfying solution to this important problem exists: having a criterion to decide which groups are relevant and finding these groups quickly in large networks are both bottlenecks. We found that in many networks the number of these groups is limited and that there exists, for each group, at least one node that can characterize it by itself: a node belonging only to that group and important within it. We call such a node a community representative. We develop an algorithm to find these overlapping communities. Community detection is done by measuring the proximity of all nodes to the representatives and then finding irregularities in the decrease of these values, which reflect the presence of relevant groups. We show that our approach handles very large real-world networks and performs comparably to, or even better than, state-of-the-art methods.
Une approche à base de proximité pour la détection de communautés egocentrées
We propose an efficient approach to unfold the ego-centered community structure around a vertex of a graph. We show that, although each vertex of a network generally belongs to several communities, it is often possible to identify a single community when two well-chosen vertices are considered. The methodology we propose relies on this notion of multi-ego-centered community and on a proximity measure derived from opinion-dynamics techniques, the carryover opinion. This approach overcomes the limitations of the quality functions traditionally used for ego-centered community detection, and consists in studying irregularities in the decrease of this proximity measure.
Multi-ego-centered communities in practice
We propose here a framework to unfold the ego-centered community structure of a given node in a network. The framework is not based on the optimization of a quality function, but on the study of irregularities in the decrease of a proximity measure. It is a practical use of the notion of multi-ego-centered community, and we validate the pertinence of the approach on benchmarks and on a real-world network of Wikipedia pages.
Learning a proximity measure to complete a community
In large-scale online complex networks (Wikipedia, Facebook, Twitter, etc.), finding nodes related to a specific topic is a strategic research subject. This article focuses on two central notions in this context: communities (groups of highly connected nodes) and proximity measures (indicating whether nodes are topologically close). We propose a parametrized proximity measure which, given a set of nodes belonging to a community, learns the optimal parameters and identifies the other nodes of this community, called a multi-ego-centered community as it is centered on a set of nodes. We validate our results on a large dataset of categorized Wikipedia pages and on benchmarks. We also show that our approach performs better than existing ones. Our main contributions are (i) a new ergonomic parametrized proximity measure, (ii) the automatic tuning of the proximity's parameters and (iii) the unsupervised detection of community boundaries.
Calculation of the Voronoi boundary for lens-shaped particles and spherocylinders
We have recently developed a mean-field theory to estimate the packing
fraction of non-spherical particles [A. Baule et al., Nature Commun. (2013)].
The central quantity in this framework is the Voronoi excluded volume, which
generalizes the standard hard-core excluded volume appearing in Onsager's
theory. The Voronoi excluded volume is defined from an exclusion condition for
the Voronoi boundary between two particles, which is usually not tractable
analytically. Here, we show how the technical difficulties in calculating the
Voronoi boundary can be overcome for lens-shaped particles and spherocylinders,
two standard prolate and oblate shapes with rotational symmetry. By decomposing
these shapes into unions and intersections of spheres, analytical expressions
can be obtained. Comment: 19 pages, 8 figures
Proximity measure applied to community detection in complex networks
Many kinds of data can be represented as a graph (a set of nodes linked by edges). In this thesis, I show that two major problems, community detection and measuring the proximity between two nodes, are intricately connected. In particular, I present a framework that, using a proximity measure, can isolate a set of nodes. Its general principle is rather straightforward and can be described as follows. Given a node of interest in a graph, the proximity of all nodes in the network to that node is computed. Then, if a small set of nodes has a high proximity to the node of interest while all others have a low proximity, we can directly conclude that the small set of nodes is the community of the node of interest. I then show how to adapt this idea to (i) find all communities of a given node, (ii) complete a set of nodes into a community and (iii) find all overlapping communities in a network. I validate these methods on real and synthetic network datasets.