176 research outputs found
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Mining Butterflies in Streaming Graphs
This thesis introduces two main-memory systems, sGrapp and sGradd, for performing the fundamental analytic tasks of biclique counting and concept drift detection over a streaming graph. A data-driven heuristic is used to architect the systems. To this end, the growth patterns of bipartite streaming graphs are first mined and the emergence principles of streaming motifs are discovered. Next, the discovered principles are (a) explained by a graph generator called sGrow; and (b) utilized to establish the requirements for efficient, effective, explainable, and interpretable management and processing of streams. sGrow is used to benchmark stream analytics, particularly in the case of concept drift detection.
sGrow displays robust realization of streaming growth patterns independent of initial conditions, scale and temporal characteristics, and model configurations. Extensive evaluations confirm the simultaneous effectiveness and efficiency of sGrapp and sGradd. sGrapp achieves a mean absolute percentage error of up to 0.05/0.14 for the cumulative butterfly count in streaming graphs with uniform/non-uniform temporal distribution and a processing throughput of 1.5 million data records per second. Relative to baselines, the throughput of sGrapp is 160x higher and its estimation error is 0.02x lower. sGradd demonstrates improving performance over time, achieves zero false detection rates both when there is no drift and when drift has already been detected, and detects sequential drifts within zero to a few seconds of their occurrence, regardless of drift intervals.
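As a point of reference for the butterfly count and the error metric mentioned in this abstract, here is a minimal sketch of exact butterfly counting (a butterfly being a 2x2 biclique) on a static bipartite edge list, together with mean absolute percentage error. It is not sGrapp, which is an approximate streaming system; the function names `count_butterflies` and `mape` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Exactly count butterflies (2x2 bicliques) in a bipartite graph.

    edges: iterable of (left, right) pairs; duplicate edges are ignored.
    Illustrative baseline only, not the streaming estimator from the thesis.
    """
    # Map each right-side vertex to the set of its left-side neighbours.
    left_of = defaultdict(set)
    for u, v in edges:
        left_of[v].add(u)

    # wedges[(a, b)] = number of right vertices adjacent to both a and b.
    wedges = defaultdict(int)
    for nbrs in left_of.values():
        for a, b in combinations(sorted(nbrs), 2):
            wedges[(a, b)] += 1

    # Any two common neighbours of the pair (a, b) close one butterfly.
    return sum(w * (w - 1) // 2 for w in wedges.values())


def mape(estimates, truths):
    """Mean absolute percentage error as a fraction (truths must be non-zero)."""
    return sum(abs(e - t) / t for e, t in zip(estimates, truths)) / len(truths)
```

For example, the complete bipartite graph with two vertices on each side contains exactly one butterfly, and an estimator that is off by 5% at every checkpoint has a MAPE of 0.05.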
Sum-of-squares representations for copositive matrices and independent sets in graphs
A polynomial optimization problem asks for minimizing a polynomial function (cost) given a set of constraints (rules) represented by polynomial inequalities and equations. Many hard problems in combinatorial optimization and applications in operations research can be naturally encoded as polynomial optimization problems. A common approach for addressing such computationally hard problems is to consider variations of the original problem that give an approximate solution and that can be solved efficiently. One such approach for attacking hard combinatorial problems and, more generally, polynomial optimization problems, is given by the so-called sum-of-squares approximations. This thesis focuses on studying whether these approximations find the optimal solution of the original problem. We investigate this question in two main settings: 1) copositive programs and 2) parameters dealing with independent sets in graphs. Among our main new results, we characterize the matrix sizes for which sum-of-squares approximations are able to capture all copositive matrices. In addition, we show finite convergence of the sum-of-squares approximations for maximum independent sets in graphs based on their continuous copositive reformulations. We also study sum-of-squares approximations for parameters asking for maximum balanced independent sets in bipartite graphs. In particular, we find connections with the Lovász theta number and we design eigenvalue bounds for several related parameters when the graphs satisfy some symmetry properties.
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Max s,t-Flow Oracles and Negative Cycle Detection in Planar Digraphs
We study the maximum s,t-flow oracle problem on planar directed graphs
where the goal is to design a data structure answering max s,t-flow value (or
equivalently, min s,t-cut value) queries for arbitrary source-target pairs
(s,t). For the case of polynomially bounded integer edge capacities, we
describe an exact max s,t-flow oracle with truly subquadratic space and
preprocessing, and sublinear query time. Moreover, if
-approximate answers are acceptable, we obtain a static oracle
with near-linear preprocessing and query time and a
dynamic oracle supporting edge capacity updates and queries in
worst-case time.
To the best of our knowledge, for directed planar graphs, no (approximate)
max s,t-flow oracles have been described even in the unweighted case, and
only trivial tradeoffs involving either no preprocessing or precomputing all
the possible answers have been known.
One key technical tool we develop on the way is a sublinear (in the number of
edges) algorithm for finding a negative cycle in so-called dense distance
graphs. By plugging it in earlier frameworks, we obtain improved bounds for
other fundamental problems on planar digraphs. In particular, we show: (1) a
deterministic time algorithm for negatively-weighted SSSP in
planar digraphs with integer edge weights at least . This improves upon the
previously known bounds in the important case of weights polynomial in , and
(2) an improved bound on finding a perfect matching in a
bipartite planar graph. Comment: Extended abstract to appear in SODA 202
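For readers unfamiliar with the negative cycle detection problem named in this abstract, the following is a minimal sketch of the textbook Bellman-Ford check on a general digraph. It is not the sublinear algorithm for dense distance graphs that the abstract describes; the function name `has_negative_cycle` and the edge-list input format are assumptions made for illustration.

```python
def has_negative_cycle(n, edges):
    """Return True if the digraph on vertices 0..n-1 has a negative-weight cycle.

    edges: list of (u, v, w) directed edges with weight w.
    Textbook Bellman-Ford in O(n * m) time, shown only as a baseline.
    """
    # Starting every vertex at distance 0 behaves like a virtual source
    # joined to all vertices by zero-weight edges, so a negative cycle
    # anywhere in the graph is reachable and therefore detected.
    dist = [0] * n
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            # Distances stabilised, so no negative cycle exists.
            return False
    # Distances were still improving on the n-th pass: negative cycle.
    return True
```

A directed triangle with weights 1, -3, 1 has total weight -1 and is reported as a negative cycle, while the same triangle with weights 1, -3, 3 is not.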
On minimum -claw deletion in split graphs
For , is called -claw. In minimum -claw deletion
problem (Min--Claw-Del), given a graph , it is required
to find a vertex set of minimum size such that is
-claw free. In a split graph, the vertex set is partitioned into two sets
such that one forms a clique and the other forms an independent set. Every
-claw in a split graph has a center vertex in the clique partition. This
observation motivates us to consider the minimum one-sided bipartite -claw
deletion problem (Min--OSBCD). Given a bipartite graph , in Min--OSBCD it is asked to find a vertex set of
minimum size such that has no -claw with the center
vertex in . A primal-dual algorithm approximates Min--OSBCD
within a factor of . We prove that it is UGC-hard to approximate with a
factor better than . We also prove it is approximable within a factor of 2
for dense bipartite graphs. By using these results on Min--OSBCD,
we prove that Min--Claw-Del is UGC-hard to approximate within a
factor better than , for split graphs. We also consider their complementary
maximization problems and prove that they are APX-complete. Comment: 11 pages and 1 figure
Network models of T cell receptor repertoires, cross-reactivity, and viral infection
The mathematical models of this thesis represent T cell population dynamics in homeostasis and during infection. In particular, T cell cross-reactivity is studied with a bipartite recognition network encoding the epitope recognition profiles of T cell receptors. The behaviour of extinction events is studied using stochastic models. Stochastic and deterministic techniques are used to study the late-time behaviour of the system. Statistical methods are used to study immune responses in the context of influenza A virus infection in mice, providing insight into the effects of immunological history and cross-reactivity. Finally, network theoretical tools are used to study the dynamics of cross-reactive immune responses under different hypotheses for the structure of the bipartite recognition network.
Cohesive subgraph identification in large graphs
Graph data is ubiquitous in real-world applications, as the relationships among entities in the applications can be naturally captured by the graph model. Finding cohesive subgraphs is a fundamental problem in graph mining with diverse applications. Given the important roles of cohesive subgraphs, this thesis focuses on cohesive subgraph identification in large graphs.
Firstly, we study the size-bounded community search problem that aims to find a subgraph with the largest min-degree among all connected subgraphs that contain the query vertex q and have at least l and at most h vertices, where q, l, h are specified by the query. As the problem is NP-hard, we propose a branch-reduce-and-bound algorithm SC-BRB by developing nontrivial reducing techniques, upper bounding techniques, and branching techniques.
Secondly, we formulate the notion of similar-biclique in bipartite graphs which is a special kind of biclique where all vertices from a designated side are similar to each other, and aim to enumerate all maximal similar-bicliques. We propose a backtracking algorithm MSBE to directly enumerate maximal similar-bicliques, and power it by vertex reduction and optimization techniques. In addition, we design a novel index structure to speed up a time-critical operation of MSBE, as well as to speed up vertex reduction. Efficient index construction algorithms are developed.
Thirdly, we consider balanced cliques in signed graphs (a clique is balanced if its vertex set can be partitioned into CL and CR such that all negative edges are between CL and CR) and study the problem of maximum balanced clique computation. We propose techniques to transform the maximum balanced clique problem over G to a series of maximum dichromatic clique problems over small subgraphs of G. The transformation not only removes edge signs but also sparsifies the edge set.
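To make the balanced-clique definition concrete, here is a minimal sketch that checks whether a given vertex set is a balanced clique. It assumes the usual structural-balance reading in which edges inside CL or inside CR are positive and edges between the two sides are negative; the abstract states only the condition on negative edges, so the positive-edge condition is an assumption. The function name `is_balanced_clique` and the `sign` dictionary format are hypothetical, and this is a verification routine, not the thesis's maximum balanced clique algorithm.

```python
from itertools import combinations


def is_balanced_clique(vertices, sign):
    """Check whether `vertices` form a balanced clique in a signed graph.

    sign: dict mapping frozenset({u, v}) to +1 or -1 for every edge.
    Assumption: same-side edges must be positive, cross-side edges negative.
    """
    vertices = list(vertices)
    if not vertices:
        return True

    # Every pair must be adjacent for the set to be a clique at all.
    for u, v in combinations(vertices, 2):
        if frozenset((u, v)) not in sign:
            return False

    # Put the first vertex in C_L (side 0); the sign of its edge to each
    # other vertex then forces that vertex's side.
    anchor = vertices[0]
    side = {anchor: 0}
    for v in vertices[1:]:
        side[v] = 0 if sign[frozenset((anchor, v))] > 0 else 1

    # Verify the forced partition against every edge of the clique.
    for u, v in combinations(vertices, 2):
        expected = 1 if side[u] == side[v] else -1
        if sign[frozenset((u, v))] != expected:
            return False
    return True
```

Because the partition is forced once one vertex is placed, the check needs only a single pass over the pairs; a triangle with one positive and two negative edges is balanced, whereas an all-negative triangle is not.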
- …