    A Local Algorithm for the Sparse Spanning Graph Problem

    Constructing a sparse spanning subgraph is a fundamental primitive in graph theory. In this paper, we study this problem in the Centralized Local model, where the goal is to decide whether an edge is part of the spanning subgraph by examining only a small part of the input; yet, answers must be globally consistent and independent of prior queries. Unfortunately, maximally sparse spanning subgraphs, i.e., spanning trees, cannot be constructed efficiently in this model. Therefore, we settle for a spanning subgraph containing at most $(1+\varepsilon)n$ edges (where $n$ is the number of vertices and $\varepsilon$ is a given approximation/sparsity parameter). We achieve a query complexity of $\tilde{O}(\mathrm{poly}(\Delta/\varepsilon)\, n^{2/3})$, where $\Delta$ is the maximum degree of the input graph and the $\tilde{O}$-notation hides polylogarithmic factors in $n$. Our algorithm is the first to do so on arbitrary bounded degree graphs. Moreover, we achieve the additional property that our algorithm outputs a spanner, i.e., distances are approximately preserved. With high probability, for each deleted edge there is a path of $O(\mathrm{poly}(\Delta/\varepsilon)\log^2 n)$ hops in the output that connects its endpoints.
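
    The two guarantees stated above (an edge budget of $(1+\varepsilon)n$ and a short replacement path for every deleted edge) can be checked by brute force on a small instance. The sketch below is only a global verifier, not the local algorithm of the paper; the function names and the toy graph are hypothetical and chosen for illustration.

```python
from collections import deque

def build_adj(n, edges):
    """Adjacency lists for an undirected graph on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def bfs_hops(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def check_sparse_spanner(n, edges, kept, eps):
    """Return (within_budget, max_hops).

    within_budget: whether the kept subgraph has at most (1 + eps) * n edges.
    max_hops: the longest replacement path, in hops, over all deleted edges;
    None if some deleted edge has its endpoints disconnected in the output.
    """
    kept_set = {frozenset(e) for e in kept}
    within_budget = len(kept_set) <= (1 + eps) * n
    adj = build_adj(n, kept)
    max_hops = 0
    for u, v in edges:
        if frozenset((u, v)) in kept_set:
            continue  # edge survived, nothing to check
        d = bfs_hops(adj, u).get(v)
        if d is None:
            return within_budget, None
        max_hops = max(max_hops, d)
    return within_budget, max_hops

# Toy input (hypothetical): a 6-cycle plus one chord, with edge (5, 0) deleted.
n = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
kept = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 3)]
print(check_sparse_spanner(n, edges, kept, eps=0.2))
```

    A verifier of this kind reads the whole graph, which is exactly what the Centralized Local model forbids; it is useful only for testing the output of a local algorithm on small inputs.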

    Improved Generalization Bounds for Robust Learning

    We consider a model of robust learning in an adversarial environment. The learner gets uncorrupted training data with access to possible corruptions that may be affected by the adversary during testing. The learner's goal is to build a robust classifier that would be tested on future adversarial examples. We use a zero-sum game between the learner and the adversary as our game theoretic framework. The adversary is limited to $k$ possible corruptions for each input. Our model is closely related to the adversarial examples model of Schmidt et al. (2018); Madry et al. (2017). Our main results consist of generalization bounds for binary and multi-class classification, as well as the real-valued case (regression). For the binary classification setting, we both tighten the generalization bound of Feige, Mansour, and Schapire (2015) and handle an infinite hypothesis class $H$. The sample complexity is improved from $O\left(\frac{1}{\epsilon^4}\log\frac{|H|}{\delta}\right)$ to $O\left(\frac{1}{\epsilon^2}\left(k\log(k)\,\mathrm{VC}(H)+\log\frac{1}{\delta}\right)\right)$. Additionally, we extend the algorithm and generalization bound from the binary to the multi-class and real-valued cases. Along the way, we obtain results on fat-shattering dimension and Rademacher complexity of $k$-fold maxima over function classes; these may be of independent interest. For binary classification, the algorithm of Feige et al. (2015) uses a regret minimization algorithm and an ERM oracle as a black box; we adapt it for the multi-class and regression settings. The algorithm provides us with near-optimal policies for the players on a given training sample. Comment: Appearing at the 30th International Conference on Algorithmic Learning Theory (ALT 2019)
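
    To get a feel for the improvement, the two sample-complexity expressions above can be evaluated side by side. The sketch below drops the constants hidden by the $O(\cdot)$ notation (an assumption, so only the relative growth is meaningful), and it also shows the worst-case loss over $k$ corruptions that gives rise to the $k$-fold maxima. All function names and parameter values are hypothetical.

```python
import math

def old_bound(eps, delta, h_size):
    """(1/eps^4) * log(|H| / delta), constants dropped (assumption)."""
    return math.log(h_size / delta) / eps**4

def new_bound(eps, delta, k, vc_dim):
    """(1/eps^2) * (k log(k) VC(H) + log(1/delta)), constants dropped (assumption)."""
    return (k * math.log(k) * vc_dim + math.log(1 / delta)) / eps**2

def robust_zero_one_loss(h, corruptions, label):
    """Worst-case 0-1 loss of classifier h over the allowed corruptions of one input."""
    return max(int(h(z) != label) for z in corruptions)

# Hypothetical setting: a finite class of size 2^20, so VC(H) <= 20.
eps, delta, k = 0.01, 0.05, 4
print(old_bound(eps, delta, h_size=2**20))
print(new_bound(eps, delta, k, vc_dim=20))

# A threshold classifier and k = 4 corruptions of an input with label 1:
# the adversary wins as soon as one corruption is misclassified.
h = lambda z: int(z >= 0.0)
print(robust_zero_one_loss(h, corruptions=[0.3, 0.1, -0.2, 0.5], label=1))
```

    The dependence on $\epsilon$ drops from $1/\epsilon^4$ to $1/\epsilon^2$, so the gap widens as the target accuracy tightens; the price is the factor $k\log(k)$ multiplying the VC dimension.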

    Non-Local Probes Do Not Help with Graph Problems

    This work bridges the gap between distributed and centralised models of computing in the context of sublinear-time graph algorithms. A priori, typical centralised models of computing (e.g., parallel decision trees or centralised local algorithms) seem to be much more powerful than distributed message-passing algorithms: centralised algorithms can directly probe any part of the input, while in distributed algorithms nodes can only communicate with their immediate neighbours. We show that for a large class of graph problems, this extra freedom does not help centralised algorithms at all: for example, efficient stateless deterministic centralised local algorithms can be simulated with efficient distributed message-passing algorithms. In particular, this enables us to transfer existing lower bound results from distributed algorithms to centralised local algorithms.

    Adversarially Robust Learning with Tolerance

    We initiate the study of tolerant adversarial PAC-learning with respect to metric perturbation sets. In adversarial PAC-learning, an adversary is allowed to replace a test point $x$ with an arbitrary point in a closed ball of radius $r$ centered at $x$. In the tolerant version, the error of the learner is compared with the best achievable error with respect to a slightly larger perturbation radius $(1+\gamma)r$. This simple tweak helps us bridge the gap between theory and practice and obtain the first PAC-type guarantees for algorithmic techniques that are popular in practice. Our first result concerns the widely-used ``perturb-and-smooth'' approach for adversarial learning. For perturbation sets with doubling dimension $d$, we show that a variant of this approach PAC-learns any hypothesis class $\mathcal{H}$ with VC-dimension $v$ in the $\gamma$-tolerant adversarial setting with $O\left(\frac{v(1+1/\gamma)^{O(d)}}{\varepsilon}\right)$ samples. This is in contrast to the traditional (non-tolerant) setting in which, as we show, the perturb-and-smooth approach can provably fail. Our second result shows that one can PAC-learn the same class using $\widetilde{O}\left(\frac{d \cdot v\log(1+1/\gamma)}{\varepsilon^2}\right)$ samples even in the agnostic setting. This result is based on a novel compression-based algorithm, and achieves a linear dependence on the doubling dimension as well as the VC-dimension. This is in contrast to the non-tolerant setting, where there is no known sample complexity upper bound that depends polynomially on the VC-dimension. Comment: The paper was accepted for ALT 202
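
    The two sample-complexity expressions above trade off differently: the first is exponential in the doubling dimension $d$ but scales as $1/\varepsilon$, while the second is linear in $d$ but scales as $1/\varepsilon^2$. The sketch below evaluates both with all hidden constants and polylogarithmic factors dropped, and with the $O(d)$ exponent replaced by $c \cdot d$ for a hypothetical constant $c = 1$. These are assumptions made purely for illustration, so only the qualitative trend should be read from the numbers.

```python
import math

def perturb_and_smooth_bound(v, d, gamma, eps, c=1.0):
    """v * (1 + 1/gamma)^(c*d) / eps; the exponent constant c is an assumption."""
    return v * (1 + 1 / gamma) ** (c * d) / eps

def compression_bound(v, d, gamma, eps):
    """d * v * log(1 + 1/gamma) / eps^2; constants and polylog factors dropped."""
    return d * v * math.log(1 + 1 / gamma) / eps**2

# Hypothetical parameter settings: low versus high doubling dimension.
for d in (1, 20):
    gamma = 1.0 if d == 1 else 0.5
    first = perturb_and_smooth_bound(v=10, d=d, gamma=gamma, eps=0.1)
    second = compression_bound(v=10, d=d, gamma=gamma, eps=0.1)
    print(d, first, second)
```

    Under these assumptions the first expression is smaller in very low dimension, and the second takes over quickly as $d$ grows, which matches the stated linear dependence on the doubling dimension.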