17,084 research outputs found
Clustering on k-Edge-Colored Graphs
International audienceWe study the Max k-colored clustering problem, where, given an edge-colored graph with k colors, we seek to color the vertices of the graph so as to find a clustering of the vertices maximizing the number (or the weight) of matched edges, i.e. the edges having the same color as their extremities. We show that the cardinality problem is NP-hard even for edge-colored bipartite graphs with a chromatic degree equal to two and k ≥ 3. Our main result is a constant approximation algorithm for the weighted version of the Max k-colored clustering problem which is based on a rounding of a natural linear programming relaxation. For graphs with chromatic degree equal to two, we improve this ratio by exploiting the relation of our problem with the Max 2-and problem. We also present a reduction to the maximum-weight independent set (IS) problem in bipartite graphs which leads to a polynomial time algorithm for the case of two colors
Multilayer Networks
In most natural and engineered systems, a set of entities interact with each
other in complicated patterns that can encompass multiple types of
relationships, change in time, and include other types of complications. Such
systems include multiple subsystems and layers of connectivity, and it is
important to take such "multilayer" features into account to try to improve our
understanding of complex systems. Consequently, it is necessary to generalize
"traditional" network theory by developing (and validating) a framework and
associated tools to study multilayer systems in a comprehensive fashion. The
origins of such efforts date back several decades and arose in multiple
disciplines, and now the study of multilayer networks has become one of the
most important directions in network science. In this paper, we discuss the
history of multilayer networks (and related concepts) and review the exploding
body of work on such networks. To unify the disparate terminology in the large
body of recent work, we discuss a general framework for multilayer networks,
construct a dictionary of terminology to relate the numerous existing concepts
to each other, and provide a thorough discussion that compares, contrasts, and
translates between related notions such as multilayer networks, multiplex
networks, interdependent networks, networks of networks, and many others. We
also survey and discuss existing data sets that can be represented as
multilayer networks. We review attempts to generalize single-layer-network
diagnostics to multilayer networks. We also discuss the rapidly expanding
research on multilayer-network models and notions like community structure,
connected components, tensor decompositions, and various types of dynamical
processes on multilayer networks. We conclude with a summary and an outlook.Comment: Working paper; 59 pages, 8 figure
On Coloring Resilient Graphs
We introduce a new notion of resilience for constraint satisfaction problems,
with the goal of more precisely determining the boundary between NP-hardness
and the existence of efficient algorithms for resilient instances. In
particular, we study -resiliently -colorable graphs, which are those
-colorable graphs that remain -colorable even after the addition of any
new edges. We prove lower bounds on the NP-hardness of coloring resiliently
colorable graphs, and provide an algorithm that colors sufficiently resilient
graphs. We also analyze the corresponding notion of resilience for -SAT.
This notion of resilience suggests an array of open questions for graph
coloring and other combinatorial problems.Comment: Appearing in MFCS 201
An efficient counting method for the colored triad census
The triad census is an important approach to understand local structure in
network science, providing comprehensive assessments of the observed relational
configurations between triples of actors in a network. However, researchers are
often interested in combinations of relational and categorical nodal
attributes. In this case, it is desirable to account for the label, or color,
of the nodes in the triad census. In this paper, we describe an efficient
algorithm for constructing the colored triad census, based, in part, on
existing methods for the classic triad census. We evaluate the performance of
the algorithm using empirical and simulated data for both undirected and
directed graphs. The results of the simulation demonstrate that the proposed
algorithm reduces computational time many-fold over the naive approach. We also
apply the colored triad census to the Zachary karate club network dataset. We
simultaneously show the efficiency of the algorithm, and a way to conduct a
statistical test on the census by forming a null distribution from 1,000
realizations of a mixing-matrix conditioned graph and comparing the observed
colored triad counts to the expected. From this, we demonstrate the method's
utility in our discussion of results about homophily, heterophily, and
bridging, simultaneously gained via the colored triad census. In sum, the
proposed algorithm for the colored triad census brings novel utility to social
network analysis in an efficient package
A Triclustering Approach for Time Evolving Graphs
This paper introduces a novel technique to track structures in time evolving
graphs. The method is based on a parameter free approach for three-dimensional
co-clustering of the source vertices, the target vertices and the time. All
these features are simultaneously segmented in order to build time segments and
clusters of vertices whose edge distributions are similar and evolve in the
same way over the time segments. The main novelty of this approach lies in that
the time segments are directly inferred from the evolution of the edge
distribution between the vertices, thus not requiring the user to make an a
priori discretization. Experiments conducted on a synthetic dataset illustrate
the good behaviour of the technique, and a study of a real-life dataset shows
the potential of the proposed approach for exploratory data analysis
Persistent Homology Guided Force-Directed Graph Layouts
Graphs are commonly used to encode relationships among entities, yet their
abstractness makes them difficult to analyze. Node-link diagrams are popular
for drawing graphs, and force-directed layouts provide a flexible method for
node arrangements that use local relationships in an attempt to reveal the
global shape of the graph. However, clutter and overlap of unrelated structures
can lead to confusing graph visualizations. This paper leverages the persistent
homology features of an undirected graph as derived information for interactive
manipulation of force-directed layouts. We first discuss how to efficiently
extract 0-dimensional persistent homology features from both weighted and
unweighted undirected graphs. We then introduce the interactive persistence
barcode used to manipulate the force-directed graph layout. In particular, the
user adds and removes contracting and repulsing forces generated by the
persistent homology features, eventually selecting the set of persistent
homology features that most improve the layout. Finally, we demonstrate the
utility of our approach across a variety of synthetic and real datasets
Principal Patterns on Graphs: Discovering Coherent Structures in Datasets
Graphs are now ubiquitous in almost every field of research. Recently, new
research areas devoted to the analysis of graphs and data associated to their
vertices have emerged. Focusing on dynamical processes, we propose a fast,
robust and scalable framework for retrieving and analyzing recurring patterns
of activity on graphs. Our method relies on a novel type of multilayer graph
that encodes the spreading or propagation of events between successive time
steps. We demonstrate the versatility of our method by applying it on three
different real-world examples. Firstly, we study how rumor spreads on a social
network. Secondly, we reveal congestion patterns of pedestrians in a train
station. Finally, we show how patterns of audio playlists can be used in a
recommender system. In each example, relevant information previously hidden in
the data is extracted in a very efficient manner, emphasizing the scalability
of our method. With a parallel implementation scaling linearly with the size of
the dataset, our framework easily handles millions of nodes on a single
commodity server
- …