
    Learning and Testing Variable Partitions

    Let F be a multivariate function from a product set Σ^n to an Abelian group G. A k-partition of F with cost δ is a partition of the set of variables V into k non-empty subsets (X_1, …, X_k) such that F(V) is δ-close to F_1(X_1) + ⋯ + F_k(X_k) for some F_1, …, F_k with respect to a given error metric. We study algorithms for agnostically learning k-partitions and testing k-partitionability over various groups and error metrics, given query access to F. In particular, we show that:
    1. Given a function that has a k-partition of cost δ, a partition of cost O(kn^2)·(δ + ε) can be learned in time Õ(n^2 · poly(1/ε)) for any ε > 0. In contrast, for k = 2 and n = 3, learning a partition of cost δ + ε is NP-hard.
    2. When F is real-valued and the error metric is the 2-norm, a 2-partition of cost √(δ^2 + ε) can be learned in time Õ(n^5/ε^2).
    3. When F is Z_q-valued and the error metric is Hamming weight, k-partitionability is testable with one-sided error and O(kn^3/ε) non-adaptive queries. We also show that even two-sided testers require Ω(n) queries when k = 2.
    This work was motivated by reinforcement learning control tasks in which the set of control variables can be partitioned. The partitioning reduces the task into multiple lower-dimensional ones that are relatively easier to learn. Our second algorithm empirically increases the scores attained over previous heuristic partitioning methods applied in this context.
    Comment: Innovations in Theoretical Computer Science (ITCS) 202
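
    A quick way to see what "cost" measures here: F has an exact 2-partition (X, Y) if and only if every 2×2 "rectangle" F(x, y) − F(x, y′) − F(x′, y) + F(x′, y′) vanishes, since substituting F = F_1(X) + F_2(Y) cancels all four terms. The sketch below estimates this additivity defect for a candidate 2-partition by sampling; it assumes F is real-valued (the setting of result 2) and exposed as a Python callable, and the function name and sampling budget are illustrative, not taken from the paper.

```python
import random

def partition_defect(F, alphabet, X, Y, trials=1000, seed=0):
    """Estimate how far F is from additive across a candidate 2-partition.

    F        : callable on tuples of length n = len(X) + len(Y)
    alphabet : the symbol set Sigma
    X, Y     : disjoint index lists covering range(n)

    Samples the "rectangle defect"
        F(x, y) - F(x, y') - F(x', y) + F(x', y'),
    which is identically zero iff F(V) = F1(X) + F2(Y) exactly.
    """
    n = len(X) + len(Y)
    rng = random.Random(seed)

    def point(xvals, yvals):
        # Interleave the block assignments back into one n-tuple.
        p = [None] * n
        for i, v in zip(X, xvals):
            p[i] = v
        for j, v in zip(Y, yvals):
            p[j] = v
        return tuple(p)

    total = 0.0
    for _ in range(trials):
        x  = [rng.choice(alphabet) for _ in X]
        x2 = [rng.choice(alphabet) for _ in X]
        y  = [rng.choice(alphabet) for _ in Y]
        y2 = [rng.choice(alphabet) for _ in Y]
        total += abs(F(point(x, y)) - F(point(x, y2))
                     - F(point(x2, y)) + F(point(x2, y2)))
    return total / trials
```

    For instance, F(v) = v[0]·v[1] + v[2] over {0,1}^3 yields a nonzero defect for the partition ({0}, {1, 2}) but a zero defect for ({2}, {0, 1}), where it decomposes exactly.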

    Faster graph algorithms via switching classes

    The runtime of an algorithm is intimately related to how an instance is represented. Recall that the runtimes of the first generation of graph algorithms were expressed as functions of n := |V|. This analysis was natural since at that time graphs were represented in n^2 space via their adjacency matrix. It was soon noticed that if m := |E| = o(n^2), then a variety of graph algorithms could be sped up by computing the adjacency list from the adjacency matrix, then running the algorithm on the more efficient adjacency-list representation. This motivated the introduction of m into the runtime of graph algorithms, and it is now customary in algorithm design to assume that a graph instance is given in the form of its adjacency list. For instance, a graph algorithm is not considered to run in linear time unless it runs in O(n + m) time; an O(n^2) bound is not considered linear, even though the two bounds are the same in the worst case. Let m͂ be the size of the minimum representative of a graph G's switching class (w.r.t. some switching operation). It is shown that better bounds for several classical graph algorithms can be obtained by modifying them so that their running time is a function of n + m͂ rather than of n + m. This is significant because m͂ is O(m) but m is not O(m͂). This is accomplished by first computing the so-called partially complemented adjacency list (pc-list) from an adjacency list, then designing an algorithm that is amenable to the more efficient pc-list representation. The pc-list data structure is a generalization of the adjacency list that has a natural correspondence to switching classes. Using this approach, better bounds are obtained for bipartite maximum matching, graph diameter, and vertex-weighted all-pairs shortest paths.
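
    A minimal sketch of the pc-list idea follows. It is simplified: the complement flag is chosen greedily per vertex here, whereas the thesis ties the choice to a switching of the whole graph; class and method names are illustrative.

```python
class PCList:
    """Partially complemented adjacency list (illustrative sketch).

    Each vertex stores either its neighbor list or its non-neighbor
    list, whichever is smaller, plus a flag recording which one it is.
    A vertex of degree d thus costs min(d, n - 1 - d) entries, so a
    graph that is dense only because some vertices are "switched" is
    stored far more compactly than by its plain adjacency list.
    """

    def __init__(self, n, edges):
        self.n = n
        adj = [set() for _ in range(n)]
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
        self.complemented = [False] * n
        self.lists = [None] * n
        for v in range(n):
            if 2 * len(adj[v]) <= n - 1:
                self.lists[v] = sorted(adj[v])            # store neighbors
            else:
                self.complemented[v] = True               # store non-neighbors
                self.lists[v] = sorted(set(range(n)) - adj[v] - {v})

    def size(self):
        """Total stored entries: this representation's analogue of m."""
        return sum(len(lst) for lst in self.lists)

    def neighbors(self, v):
        if not self.complemented[v]:
            return iter(self.lists[v])
        absent = set(self.lists[v])
        return (u for u in range(self.n) if u != v and u not in absent)
```

    Note that neighbors() expands a complemented list in O(n) time, which would forfeit the speed-up; algorithms designed for the pc-list avoid this, e.g. by keeping unvisited vertices in a linked structure so a complemented vertex can sweep them directly, as in BFS over complement graphs.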

    Online Correlation Clustering

    We study the online clustering problem where data items arrive in an online fashion. The algorithm maintains a clustering of data items into similarity classes. Upon the arrival of an item v, its relation to the previously arrived items is revealed: for each such item u, we are told whether v is similar to u. The algorithm can create a new cluster for v and merge existing clusters. When the objective is to minimize disagreements between the clustering and the input, we prove that a natural greedy algorithm is O(n)-competitive, and that this is optimal. When the objective is to maximize agreements between the clustering and the input, we prove that the greedy algorithm is 0.5-competitive, that no online algorithm can be better than 0.834-competitive, and that it is possible to do better than 0.5: we exhibit a randomized algorithm with competitive ratio 0.5 + c for a small fixed constant c > 0.
    Comment: 12 pages, 1 figure
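
    The "natural greedy algorithm" can be instantiated roughly as follows; this is a sketch under one plausible merge rule (merge the newcomer's cluster with every existing cluster that is majority-similar to it), and the paper's exact rule may differ.

```python
def greedy_online_clustering(n, similar):
    """Greedy online correlation clustering (illustrative sketch).

    similar(u, v) -> bool is the revealed relation between an earlier
    item u and the arriving item v.  Each new item starts a singleton
    cluster, which is then merged with every existing cluster whose
    pairs with the current cluster are majority-similar.
    """
    clusters = []                                # list of lists of item ids
    for v in range(n):
        current = [v]
        survivors = []
        for c in clusters:
            agree = sum(1 for u in c for w in current if similar(u, w))
            if 2 * agree > len(c) * len(current):    # majority similar
                current = current + c                # merge into newcomer's cluster
            else:
                survivors.append(c)
        clusters = survivors + [current]
    return clusters
```

    For example, greedy_online_clustering(6, lambda u, v: u // 3 == v // 3) recovers the two planted triples, returning [[2, 1, 0], [5, 4, 3]].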

    On Parsimonious Explanations For 2-D Tree- and Linearly-Ordered Data

    This paper studies the "explanation problem" for tree- and linearly-ordered array data, a problem motivated by database applications and recently solved for the one-dimensional tree-ordered case. One is given a matrix A = (a_{ij}) whose rows and columns have semantics: special subsets of the rows and special subsets of the columns are meaningful, others are not. A submatrix of A is said to be meaningful if and only if it is the cross product of a meaningful row subset and a meaningful column subset, in which case we call it an "allowed rectangle." The goal is to "explain" A as a sparse sum of weighted allowed rectangles. Specifically, we wish to find as few weighted allowed rectangles as possible such that, for all i, j, a_{ij} equals the sum of the weights of all rectangles that include cell (i, j). We consider the natural cases in which the matrix dimensions are tree-ordered or linearly-ordered. In the tree-ordered case, we are given a rooted tree T_1 whose leaves are the rows of A and another, T_2, whose leaves are the columns. Nodes of the trees correspond in an obvious way to the sets of their leaf descendants. In the linearly-ordered case, a set of rows or columns is meaningful if and only if it is contiguous. For tree-ordered data, we prove the explanation problem NP-hard and give a randomized 2-approximation algorithm for it. For linearly-ordered data, we prove the explanation problem NP-hard and give a 2.56-approximation algorithm. To our knowledge, these are the first results for the problem of sparsely and exactly representing matrices by weighted rectangles.
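
    Whatever the ordering structure, checking a proposed explanation is straightforward: every cell must be covered by rectangles whose weights sum exactly to its entry. The sketch below verifies this; the function name and the rectangle encoding are mine, not the paper's.

```python
import numpy as np

def explains(A, rectangles):
    """Check that weighted allowed rectangles explain the matrix A.

    rectangles: list of (rows, cols, weight), where rows and cols are
    iterables of row / column indices whose cross product is the
    rectangle.  Returns True iff, for every cell (i, j), the weights
    of the rectangles covering it sum to a_ij.
    """
    S = np.zeros_like(A, dtype=float)
    for rows, cols, w in rectangles:
        S[np.ix_(list(rows), list(cols))] += w   # add w on the cross product
    return np.allclose(S, A)

# Example: a 2x2 matrix explained by two overlapping contiguous rectangles.
A = np.array([[3., 1.],
              [2., 0.]])
rects = [(range(0, 2), range(0, 1), 2.0),   # first column, weight 2
         (range(0, 1), range(0, 2), 1.0)]   # first row, weight 1
assert explains(A, rects)
```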

    Exploiting Dense Structures in Parameterized Complexity

    Over the past few decades, the study of dense structures from the perspective of approximation algorithms has become a wide area of research. From the viewpoint of parameterized algorithms, however, this area is largely unexplored. In particular, properties of random samples have been successfully deployed to design approximation schemes for a number of fundamental problems on dense structures [Arora et al. FOCS 1995, Goldreich et al. FOCS 1996, Giotis and Guruswami SODA 2006, Karpinski and Schudy STOC 2009]. In this paper, we fill this gap, harnessing the power of random samples as well as structure theory to design kernelization and parameterized algorithms on dense structures. In particular, we obtain linear vertex kernels for Edge-Disjoint Paths, Edge Odd Cycle Transversal, Minimum Bisection, d-Way Cut, Multiway Cut and Multicut on everywhere dense graphs. In fact, these kernels are obtained by designing a polynomial-time algorithm when the corresponding parameter is at most Ω(n). Additionally, we obtain a cubic kernel for Vertex-Disjoint Paths on everywhere dense graphs. Beyond kernelization, we obtain randomized subexponential-time parameterized algorithms for Edge Odd Cycle Transversal, Minimum Bisection, and d-Way Cut. Finally, we show how all of our results (as well as EPASes for these problems) can be de-randomized.
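
    To make the cited sampling paradigm concrete, here is a toy sketch in the spirit of the schemes of Arora et al. and Goldreich et al., applied to Max-Cut on dense graphs; it is not this paper's kernelization, and the parameter names are illustrative. The idea: brute-force a small random sample, then place each remaining vertex greedily against the sample.

```python
import itertools
import random

def dense_maxcut_sample(adj, t=8, seed=0):
    """Sampling-based Max-Cut sketch for dense graphs.

    adj : list of neighbor sets, adj[u] = set of neighbors of u.
    Samples t vertices, tries every bipartition of the sample, and
    places each remaining vertex on the side that cuts more of its
    edges into the sample.  On dense graphs, a constant-size sample
    (depending only on the accuracy eps) suffices for an additive
    eps * n^2 error.
    """
    n = len(adj)
    rng = random.Random(seed)
    sample = rng.sample(range(n), min(t, n))
    rest = [v for v in range(n) if v not in sample]

    def cut_value(side):
        # side maps every vertex to 0 or 1; count cut edges once.
        return sum(1 for u in range(n) for v in adj[u]
                   if u < v and side[u] != side[v])

    best, best_side = -1, None
    for bits in itertools.product((0, 1), repeat=len(sample)):
        side = dict(zip(sample, bits))
        for v in rest:                      # greedy placement vs. the sample
            to0 = sum(1 for u in sample if u in adj[v] and side[u] == 0)
            to1 = sum(1 for u in sample if u in adj[v] and side[u] == 1)
            side[v] = 0 if to1 >= to0 else 1   # join the side cutting more
        val = cut_value(side)
        if val > best:
            best, best_side = val, side
    return best, best_side
```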

    Multi Layer Peeling for Linear Arrangement and Hierarchical Clustering

    We present a new multi-layer peeling technique to cluster points in a metric space. A well-known non-parametric objective is to embed the metric space into a simpler structured metric space such as a line (i.e., Linear Arrangement) or a binary tree (i.e., Hierarchical Clustering). Points which are close in the metric space should be mapped to close points/leaves in the line/tree; similarly, points which are far in the metric space should be far apart in the line or on the tree. In particular, we consider the Maximum Linear Arrangement problem [Refael Hassin and Shlomi Rubinstein, 2001] and the Maximum Hierarchical Clustering problem [Vincent Cohen-Addad et al., 2018] applied to metrics. We design approximation schemes ((1-ε)-approximation for any constant ε > 0) for these objectives. In particular, this shows that by considering metrics one may significantly improve the former approximations (0.5 for Max Linear Arrangement and 0.74 for Max Hierarchical Clustering). Our main technique, called multi-layer peeling, consists of recursively peeling off points which are far from the "core" of the metric space. The recursion ends once the core becomes a sufficiently densely weighted metric space (i.e., the average distance is at least a constant times the diameter) or once it becomes negligible with respect to its inner contribution to the objective. Interestingly, the algorithm in the Linear Arrangement case is much more involved than that in the Hierarchical Clustering case, and uses a significantly more delicate peeling.
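
    A greatly simplified version of the peeling loop looks as follows; it peels one point at a time and omits the paper's recursion and its negligible-contribution stopping rule, and the density threshold c is illustrative.

```python
def peel_core(points, dist, c=0.25):
    """Sketch of a single peeling pass (a simplification of the paper's
    recursive multi-layer peeling).

    Repeatedly peel off the point farthest on average from the rest
    until the core is densely weighted, i.e. its average pairwise
    distance is at least c times its diameter.
    """
    layers, core = [], list(points)
    while len(core) > 2:
        pairs = [(p, q) for i, p in enumerate(core) for q in core[i + 1:]]
        avg = sum(dist(p, q) for p, q in pairs) / len(pairs)
        diam = max(dist(p, q) for p, q in pairs)
        if avg >= c * diam:          # core is densely weighted: stop
            break
        # peel the point with the largest average distance to the rest
        far = max(core, key=lambda p: sum(dist(p, q) for q in core if q != p))
        core.remove(far)
        layers.append(far)
    return layers, core
```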