Clique versus Independent Set
Yannakakis' Clique versus Independent Set problem (CL-IS) in communication
complexity asks for the minimum number of cuts separating cliques from stable
sets in a graph, called a CS-separator. Yannakakis provides a quasi-polynomial
CS-separator, i.e. one of size n^{O(log n)}, and raises the problem of
finding a polynomial CS-separator. This question is still open even for perfect
graphs. We show that a polynomial CS-separator almost surely exists for random
graphs. Besides, if H is a split graph (i.e. has a vertex-partition into a
clique and a stable set), then there exists a constant c_H for which we find
an O(n^{c_H}) CS-separator on the class of H-free graphs. This generalizes a
result of Yannakakis on comparability graphs. We also provide an O(n^{c_k})
CS-separator on the class of graphs with no induced path on k vertices and no
induced complement of such a path. Observe that on one side the constant c_H
results from a Vapnik-Chervonenkis dimension argument, while on the other
side c_k is exponential.
One of the main reasons why Yannakakis' CL-IS problem is fascinating is that
it admits several equivalent formulations. Our main result in this respect is
to show that the existence of a polynomial CS-separator is equivalent to the
polynomial Alon-Saks-Seymour Conjecture, asserting that if a graph has an
edge-partition into k complete bipartite graphs, then its chromatic number is
polynomially bounded in terms of k. We also show that the classical approach
to the stubborn problem (arising in CSP), which consists in covering the set
of all solutions by polynomially many instances of 2-SAT, is again equivalent
to the existence of a polynomial CS-separator.
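To make the central definition concrete: a CS-separator is a family of vertex
subsets ("cuts") such that every disjoint clique/stable-set pair is separated
by some cut that contains the clique and avoids the stable set. A brute-force
verifier on a toy graph (an illustration of the definition only, not a
construction from the paper):

```python
from itertools import combinations

def is_clique(adj, vs):
    return all(v in adj[u] for u, v in combinations(vs, 2))

def is_stable(adj, vs):
    return all(v not in adj[u] for u, v in combinations(vs, 2))

def is_cs_separator(adj, cuts):
    """cuts: list of vertex sets B; convention: B separates (C, S) when
    the clique C lies inside B and the stable set S lies outside it."""
    verts = list(adj)
    subsets = [set(c) for k in range(len(verts) + 1)
               for c in combinations(verts, k)]
    cliques = [c for c in subsets if is_clique(adj, c)]
    stables = [s for s in subsets if is_stable(adj, s)]
    return all(any(c <= b and not (s & b) for b in cuts)
               for c in cliques for s in stables if not (c & s))

# Path 0-1-2: adjacency as neighbor sets.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
cuts = [{0, 1}, {1, 2}, {0, 2}, {1}]
print(is_cs_separator(adj, cuts))  # True
print(is_cs_separator(adj, [{0, 1}, {1, 2}]))  # False
```

The exponential enumeration is only meant to pin the definition down; the
whole point of the problem is achieving few cuts without such enumeration.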
Enumerating Maximal Bicliques from a Large Graph using MapReduce
We consider the enumeration of maximal bipartite cliques (bicliques) from a
large graph, a task central to many practical data mining problems in social
network analysis and bioinformatics. We present novel parallel algorithms for
the MapReduce platform, and an experimental evaluation using Hadoop MapReduce.
Our algorithm is based on clustering the input graph into smaller sized
subgraphs, followed by processing different subgraphs in parallel. Our
algorithm uses two ideas that enable it to scale to large graphs: (1) the
redundancy in work between different subgraph explorations is minimized through
a careful pruning of the search space, and (2) the load on different reducers
is balanced through the use of an appropriate total order among the vertices.
Our evaluation shows that the algorithm scales to large graphs with millions of
edges and tens of millions of maximal bicliques. To our knowledge, this is
the first work on maximal biclique enumeration for graphs of this scale.
Comment: A preliminary version of the paper was accepted at the Proceedings
of the 3rd IEEE International Congress on Big Data 201
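The paper's parallel MapReduce algorithm is not reproduced here, but the
object being enumerated is easy to pin down: a maximal biclique is a
left-set/right-set pair closed under common neighborhoods. A brute-force
sequential baseline for tiny bipartite graphs (the graph and names below are
illustrative, not from the paper):

```python
from itertools import combinations

def maximal_bicliques(left, adj):
    """adj maps each left vertex to its set of right neighbors.
    Brute force: close every non-empty left subset; tiny graphs only."""
    found = set()
    for k in range(1, len(left) + 1):
        for ls in combinations(left, k):
            right = set.intersection(*(adj[u] for u in ls))
            if not right:
                continue
            # Close the left side: every left vertex adjacent to all of `right`.
            lclosed = frozenset(u for u in left if right <= adj[u])
            found.add((lclosed, frozenset(right)))
    return found

left = ["a", "b", "c"]
adj = {"a": {1, 2}, "b": {1, 2, 3}, "c": {3}}
for l, r in sorted(maximal_bicliques(left, adj), key=lambda p: sorted(p[0])):
    print(sorted(l), sorted(r))
```

Every maximal biclique arises as the closure of some left subset, which is
why the scalable algorithms in the paper can partition the search space by
(pruned) subgraph explorations instead of materializing all subsets.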
Finding Biclique Partitions of Co-Chordal Graphs
The biclique partition number bp(G) of a graph G is the least number of
complete bipartite (biclique) subgraphs that are required to cover the edges
of the graph exactly once. In this paper, we show that the biclique partition
number of a co-chordal graph G (the complement of a chordal graph) is less
than the number of maximal cliques of its complement, a chordal graph. We
first provide a general framework for the ``divide and conquer'' heuristic of
finding minimum biclique partitions of co-chordal graphs based on clique
trees. Furthermore, a faster heuristic is proposed that applies
lexicographic breadth-first search to find structures called moplexes. Either
heuristic gives us a biclique partition of G with size at most one less than
the number of maximal cliques of the complement of G. In addition, we prove
that both of our heuristics solve the minimum biclique partition problem on G
exactly if its complement is chordal and clique vertex irreducible. We also
show that if G is a split graph
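The partition condition above (every edge covered exactly once) is simple to
check mechanically. A small verifier, with K3 as a hypothetical example (two
bicliques suffice, matching the Graham-Pollak bound of n - 1 for complete
graphs):

```python
from itertools import product

def is_biclique_partition(edges, bicliques):
    """edges: set of frozenset pairs; bicliques: list of (A, B) with A, B
    disjoint vertex sets. True iff the bicliques' edges tile `edges`."""
    covered = []
    for a, b in bicliques:
        if set(a) & set(b):
            return False  # sides of a biclique must be disjoint
        covered.extend(frozenset((u, v)) for u, v in product(a, b))
    # Exactly-once cover: no duplicates, and the union is the edge set.
    return len(covered) == len(set(covered)) and set(covered) == edges

# K3 on {0, 1, 2}: a star at 0 plus the remaining edge.
k3 = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2)]}
print(is_biclique_partition(k3, [({0}, {1, 2}), ({1}, {2})]))  # True
print(is_biclique_partition(k3, [({0}, {1, 2}), ({0}, {1})]))  # False
```

The second call fails because edge (0, 1) would be covered twice, which a
partition (as opposed to a mere cover) forbids.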
Fractional coverings, greedy coverings, and rectifier networks
A rectifier network is a directed acyclic graph with distinguished sources and sinks; it is said to compute a Boolean matrix M that has a 1 in the entry (i,j) iff there is a path from the j-th source to the i-th sink. The smallest number of edges in a rectifier network that computes M is a classic complexity measure on matrices, which has been studied for more than half a century. We explore two techniques that have hitherto found little to no application in this theory. They build upon a basic fact that depth-2 rectifier networks are essentially weighted coverings of Boolean matrices with rectangles. Using fractional and greedy coverings (defined in the standard way), we obtain new results in this area. First, we show that all fractional coverings of the so-called full triangular matrix have cost at least n log n. This provides (a fortiori) a new proof of the tight lower bound on its depth-2 complexity (the exact value has been known since 1965, but previous proofs are based on different arguments). Second, we show that the greedy heuristic is instrumental in tightening the upper bound on the depth-2 complexity of the Kneser-Sierpinski (disjointness) matrix. The previous upper bound is O(n^{1.28}), and we improve it to O(n^{1.17}), while the best known lower bound is Omega(n^{1.16}). Third, using fractional coverings, we obtain a form of direct product theorem that gives a lower bound on the unbounded-depth complexity of Kronecker (tensor) products of matrices. In this case, the greedy heuristic shows (by an argument due to Lovász) that our result is only a logarithmic factor away from the "full" direct product theorem. Our second and third results constitute progress on open problem 7.3 and resolve, up to a logarithmic factor, open problem 7.5 from a recent book by Jukna and Sergeev (Foundations and Trends in Theoretical Computer Science, 2013).
A bitwise clique detection approach for accelerating power graph computation and clustering dense graphs
Graphs are at the core of many data representations. Visual analytics over graphs is usually difficult due to their size, which makes their visual display challenging, and to their fundamental algorithms, which are often classified as NP-hard problems. Power Graph Analysis (PGA) is a method that simplifies networks using reduced representations for complete subgraphs (cliques) and complete bipartite subgraphs (bicliques), in both cases with edge reductions. The benefits of a power graph are the preservation of information and its capacity to show essential information about the original network. However, finding an optimal representation (maximum edge reduction) is also an NP-hard problem. In this work, we propose BCD, a greedy algorithm that uses a Bitwise Clique Detection approach to finding power graphs. BCD is faster than competing strategies and allows the analysis of bigger graphs. For the display of larger power graphs, we propose an orthogonal layout to prevent overlapping of edges and vertices. Finally, we describe how the structure induced by the power graph is used for clustering analysis of dense graphs. We demonstrate with several datasets the results obtained by our proposal and compare against competing strategies.
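The abstract attributes BCD's speed to bitwise operations. As a hedged
illustration of that general idea (not the paper's algorithm), Python
integers can serve as adjacency bitsets, turning a clique test into one AND
and one comparison per vertex instead of a scan over all pairs:

```python
def adjacency_bitsets(n, edges):
    """Row u is an integer whose bit v is set iff {u, v} is an edge."""
    rows = [0] * n
    for u, v in edges:
        rows[u] |= 1 << v
        rows[v] |= 1 << u
    return rows

def is_clique_bitwise(rows, vertices):
    """Each member must be adjacent to all the others."""
    mask = 0
    for v in vertices:
        mask |= 1 << v
    # rows[v] AND (mask without v) must reproduce (mask without v).
    return all(rows[v] & (mask & ~(1 << v)) == (mask & ~(1 << v))
               for v in vertices)

rows = adjacency_bitsets(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
print(is_clique_bitwise(rows, [0, 1, 2]))  # True
print(is_clique_bitwise(rows, [0, 1, 3]))  # False
```

On fixed-width machine words the same trick checks up to 64 adjacencies per
instruction, which is the kind of constant-factor win bitwise detection
exploits.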
Dynamic Scaling of Parallel Stream Joins on the Cloud
The large and varying volumes of data generated by many emerging applications and systems demand the sophisticated processing of high-speed data streams in a real-time fashion. Stream joins are the streaming counterpart of conventional database joins and compare tuples coming from different streaming relations. This operator is computationally expensive and at the same time quite important for real-time analytics. Efficient and scalable processing of stream joins may be enabled by the availability of a large number of processing nodes in a parallel and distributed environment. Furthermore, clouds have evolved as an appealing platform for large-scale data processing, mainly due to the concept of elasticity: virtual computing infrastructure can be leased on demand and used for as much time as needed in a dynamic manner. For this thesis project, we adopt the main ideas and features of Qian Lin et al. in their paper "Scalable Distributed Stream Join Processing". The basic idea presented in that paper is the join-biclique model, which organizes the processing units of a cluster as a complete bipartite graph. Based on that idea, we developed a set of algorithms designed as containerized microservices, which perform stream join processing and can be scaled horizontally on demand. We performed our experiments on Google Container Engine using the Kubernetes orchestration platform and Docker containers.
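A minimal sketch of the join-biclique idea described above (not Lin et al.'s
implementation; the round-robin placement and first-field equi-join key are
assumptions made for illustration): units storing stream R form one side,
units storing stream S the other, and the complete bipartite wiring means an
arriving tuple can probe every unit on the opposite side before being stored
on exactly one unit of its own side.

```python
import itertools

class JoinBiclique:
    """Toy model: r_units and s_units are the two sides of the biclique."""
    def __init__(self, nr, ns):
        self.r_units = [[] for _ in range(nr)]
        self.s_units = [[] for _ in range(ns)]
        self.rr = itertools.count()  # round-robin placement counter

    def insert(self, stream, tup):
        store, probe = ((self.r_units, self.s_units) if stream == 'R'
                        else (self.s_units, self.r_units))
        # Probe the opposite side: every opposite unit is reachable over
        # the complete bipartite wiring, so no match can be missed.
        matches = [(tup, other) for unit in probe for other in unit
                   if tup[0] == other[0]]
        store[next(self.rr) % len(store)].append(tup)  # store exactly once
        return matches

jb = JoinBiclique(nr=2, ns=2)
jb.insert('R', ('k1', 'r1'))
jb.insert('R', ('k2', 'r2'))
print(jb.insert('S', ('k1', 's1')))  # [(('k1', 's1'), ('k1', 'r1'))]
```

Storing each tuple on only one unit (rather than replicating a whole stream
per joiner) is what lets the two sides scale out independently.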
Cooperative Games with Overlapping Coalitions
In the usual models of cooperative game theory, the outcome of a coalition
formation process is either the grand coalition or a coalition structure that
consists of disjoint coalitions. However, in many domains where coalitions are
associated with tasks, an agent may be involved in executing more than one
task, and thus may distribute his resources among several coalitions. To tackle
such scenarios, we introduce a model for cooperative games with overlapping
coalitions--or overlapping coalition formation (OCF) games. We then explore the
issue of stability in this setting. In particular, we introduce a notion of the
core, which generalizes the corresponding notion in the traditional
(non-overlapping) scenario. Then, under some quite general conditions, we
characterize the elements of the core, and show that any element of the core
maximizes the social welfare. We also introduce a concept of balancedness for
overlapping coalitional games, and use it to characterize coalition structures
that can be extended to elements of the core. Finally, we generalize the notion
of convexity to our setting, and show that under some natural assumptions
convex games have a non-empty core. Moreover, we introduce two alternative
notions of stability in OCF games that allow a wider range of deviations, and
explore the relationships among the corresponding definitions of the core, as
well as the classic (non-overlapping) core and the Aubin core. We illustrate
the general properties of the three cores, and also study them from a
computational perspective, thus obtaining additional insights into their
fundamental structure.
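A minimal sketch of the model's basic objects, assuming each agent holds one
unit of a single divisible resource and a partial coalition is a vector of
fractional contributions (the characteristic function below is hypothetical,
chosen only to make the numbers concrete):

```python
def feasible(coalitions, n_agents):
    """Each coalition is a vector r with r[i] in [0, 1], the fraction of
    agent i's resources committed. Feasible iff no agent over-commits."""
    totals = [sum(c[i] for c in coalitions) for i in range(n_agents)]
    return all(t <= 1 + 1e-9 for t in totals)

def social_welfare(coalitions, v):
    """v: characteristic function mapping a contribution vector to a value."""
    return sum(v(tuple(c)) for c in coalitions)

# Hypothetical 2-agent game: a task pays the product of the contributions.
v = lambda c: c[0] * c[1]
cs = [[0.5, 1.0], [0.5, 0.0]]  # agent 0 splits its resources across two tasks
print(feasible(cs, 2), social_welfare(cs, v))  # True 0.5
```

The overlap is visible in `cs`: agent 0 contributes to both coalitions, which
a disjoint coalition structure could not express.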