    Classical and quantum algorithms for scaling problems

    This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature. For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and for computing the sum of a list of numbers. We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
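
    For concreteness, here is a minimal sketch of the classical Sinkhorn iteration for the (commutative) matrix scaling problem mentioned above: it alternately rescales rows and columns of a positive matrix until both marginals are approximately uniform. This is only the textbook baseline, not the Riemannian IPM or the quantum algorithms of the thesis.

```python
import numpy as np

def sinkhorn_scale(A, tol=1e-9, max_iter=10_000):
    """Scale a positive matrix A to be (approximately) doubly stochastic.

    Returns diagonal scalings x, y such that diag(x) @ A @ diag(y) has
    row and column sums close to 1. Textbook Sinkhorn iteration only;
    not the IPM or quantum algorithms discussed in the abstract.
    """
    n, m = A.shape
    x, y = np.ones(n), np.ones(m)
    for _ in range(max_iter):
        x = 1.0 / (A @ y)                  # make all row sums equal to 1
        y = 1.0 / (A.T @ x)                # make all column sums equal to 1
        B = A * np.outer(x, y)
        err = max(np.abs(B.sum(1) - 1).max(), np.abs(B.sum(0) - 1).max())
        if err < tol:
            break
    return x, y

rng = np.random.default_rng(0)
A = rng.random((4, 4)) + 0.1               # strictly positive entries
x, y = sinkhorn_scale(A)
B = A * np.outer(x, y)
print(B.sum(axis=1), B.sum(axis=0))        # both close to the all-ones vector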

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Saturn: An Optimized Data System for Large Model Deep Learning Workloads

    Large language models such as GPT-3 and ChatGPT have transformed deep learning (DL), powering applications that have captured the public's imagination. These models are rapidly being adopted across domains for analytics on various modalities, often by finetuning pre-trained base models. Such models need multiple GPUs due to both their size and computational load, driving the development of a bevy of "model parallelism" techniques and tools. Navigating such parallelism choices, however, is a new burden for end users of DL such as data scientists and domain scientists, who may lack the necessary systems know-how. The need for model selection, which leads to many models to train due to hyper-parameter tuning or layer-wise finetuning, compounds the situation with two more burdens: resource apportioning and scheduling. In this work, we tackle these three burdens for DL users in a unified manner by formalizing them as a joint problem that we call SPASE: Select a Parallelism, Allocate resources, and SchedulE. We propose a new information system architecture to tackle the SPASE problem holistically, representing a key step toward enabling wider adoption of large DL models. We devise an extensible template for existing parallelism schemes and combine it with an automated empirical profiler for runtime estimation. We then formulate SPASE as an MILP. We find that direct use of an MILP solver is significantly more effective than several baseline heuristics. We optimize the system runtime further with an introspective scheduling approach. We implement all these techniques into a new data system we call Saturn. Experiments with benchmark DL workloads show that Saturn achieves 39-49% lower model selection runtimes than typical current DL practice. Comment: Under submission at VLDB. Code available: https://github.com/knagrecha/saturn. 12 pages + 3 pages references + 2 pages appendix
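
    As a rough illustration of a SPASE-style joint formulation, the toy MILP below picks one GPU allocation per job and minimises the length of a single concurrent round under a GPU budget. The runtime table, job names, and one-round scheduling model are hypothetical simplifications for the sketch, not Saturn's actual profiler output or formulation.

```python
# pip install pulp
from pulp import (LpProblem, LpMinimize, LpVariable, lpSum,
                  LpBinary, PULP_CBC_CMD, value)

# Hypothetical profiler estimates: runtime (hours) of each job under each
# candidate GPU count. Illustrative numbers only.
runtime = {
    "bert_ft": {1: 10.0, 2: 5.5, 4: 3.0},
    "gpt_ft":  {2: 12.0, 4: 6.5, 8: 4.0},
    "vit_ft":  {1: 6.0,  2: 3.5},
}
GPUS = 8  # cluster size

prob = LpProblem("toy_spase", LpMinimize)
x = {(j, g): LpVariable(f"x_{j}_{g}", cat=LpBinary)
     for j, opts in runtime.items() for g in opts}
T = LpVariable("makespan", lowBound=0)

prob += T                                          # minimise the round length
for j, opts in runtime.items():
    prob += lpSum(x[j, g] for g in opts) == 1      # pick exactly one allocation
    prob += lpSum(opts[g] * x[j, g] for g in opts) <= T
prob += lpSum(g * x[j, g]                          # stay within the GPU budget
              for j, opts in runtime.items() for g in opts) <= GPUS

prob.solve(PULP_CBC_CMD(msg=False))
for (j, g), var in x.items():
    if var.value() == 1:
        print(f"{j}: {g} GPU(s), est. {runtime[j][g]}h")
print("makespan:", value(T))
```

    Saturn's real formulation additionally models scheduling across rounds and uses an introspective scheduler on top; this sketch only conveys why an MILP is a natural fit for the allocation part.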

    Mining Butterflies in Streaming Graphs

    This thesis introduces two main-memory systems, sGrapp and sGradd, for performing the fundamental analytic tasks of biclique counting and concept drift detection over a streaming graph. A data-driven heuristic is used to architect the systems. To this end, initially, the growth patterns of bipartite streaming graphs are mined and the emergence principles of streaming motifs are discovered. Next, the discovered principles are (a) explained by a graph generator called sGrow; and (b) utilized to establish the requirements for efficient, effective, explainable, and interpretable management and processing of streams. sGrow is used to benchmark stream analytics, particularly in the case of concept drift detection. sGrow displays robust realization of streaming growth patterns independent of initial conditions, scale and temporal characteristics, and model configurations. Extensive evaluations confirm the simultaneous effectiveness and efficiency of sGrapp and sGradd. sGrapp achieves a mean absolute percentage error of up to 0.05/0.14 for the cumulative butterfly count in streaming graphs with uniform/non-uniform temporal distribution and a processing throughput of 1.5 million data records per second. The throughput and estimation error of sGrapp are 160x higher and 0.02x lower than those of baselines. sGradd demonstrates improving performance over time, achieves zero false detection rates when there is no drift and when drift has already been detected, and detects sequential drifts within zero to a few seconds of their occurrence, regardless of drift intervals.
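
    For orientation, a butterfly is a (2,2)-biclique: two left vertices sharing two right neighbours. The sketch below counts butterflies exactly in a small static bipartite graph by enumerating pairs of left vertices; it illustrates the motif that sGrapp estimates over streams, not the streaming algorithm itself.

```python
from itertools import combinations
from collections import defaultdict

def count_butterflies(edges):
    """Exact butterfly ((2,2)-biclique) count of a static bipartite graph.

    edges: iterable of (u, v) pairs with u on the left side, v on the right.
    For every pair of left vertices, count their common right neighbours c;
    each pair contributes C(c, 2) butterflies. O(|L|^2) pairs: fine for
    illustration, far too slow for real streaming workloads.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
    count = 0
    for u1, u2 in combinations(list(adj), 2):
        c = len(adj[u1] & adj[u2])
        count += c * (c - 1) // 2
    return count

# The 2x2 complete bipartite graph contains exactly one butterfly.
print(count_butterflies([("a", 1), ("a", 2), ("b", 1), ("b", 2)]))  # 1
```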

    Algorithmic and Coding-theoretic Methods for Group Testing and Private Information Retrieval

    In the first part of this dissertation, we consider the Group Testing (GT) problem and its two variants, the Quantitative GT (QGT) problem and the Coin Weighing (CW) problem. An instance of the GT problem includes a ground set of items that contains a small subset of defective items. The GT procedure consists of a number of tests, such that each test indicates whether or not a given subset of items includes one or more defective items. The goal of the GT procedure is to identify the subset of defective items with the minimum number of tests. Motivated by practical scenarios where the outcome of the tests can be affected by noise, we focus on the noisy GT setting, in which the outcome of a test can be flipped with some probability. In the noisy GT setting, the goal is to identify the set of defective items with high probability. We investigate the performance of two variants of the Belief Propagation (BP) algorithm for decoding noisy non-adaptive GT under the combinatorial model for defective items. Through extensive simulations, we show that the proposed algorithms achieve higher success probability and lower false-negative and false-positive rates when compared to the traditional BP algorithm. We also consider a variation of the probabilistic GT model in which the prior probability that each item is defective is not uniform, and a certain amount of side information on the distribution of the defective items is available to the GT algorithm. This dissertation focuses on leveraging this side information to improve the performance of decoding algorithms for noisy GT. First, we propose a probabilistic model, referred to as an interaction model, that captures the side information about the probability distribution of the defective items. Next, we present a decoding scheme, based on BP, that leverages the interaction model to improve the decoding accuracy. Our results indicate that the proposed algorithm achieves higher success probability and lower false-negative and false-positive rates when compared to the traditional BP, especially in the high-noise regime.

    In the QGT problem, the result of a test reveals the number of defective items in the tested group. This is in contrast to standard GT, where the result of each test is either 1 or 0 depending on whether the tested group contains any defective items or not. In this dissertation, we study the QGT problem for the combinatorial and probabilistic models of defective items. We propose non-adaptive QGT algorithms using sparse graph codes over bi-regular and irregular bipartite graphs, and binary t-error-correcting BCH codes. The proposed schemes provide exact recovery with a probabilistic guarantee, i.e., they recover all the defective items with high probability. The proposed schemes outperform existing non-adaptive QGT schemes in the sub-linear regime in terms of the number of tests required to identify all defective items with high probability.

    The CW problem lies at the intersection of the GT and compressed sensing problems. Given a collection of coins and the total weight of the coins, where the weight of each coin is an unknown integer, the problem is to determine the weight of each coin by weighing subsets of coins on a spring scale. The goal is to minimize the average number of weighings over all possible weight configurations. Toward this goal, we propose and analyze a simple and effective adaptive weighing strategy. This yields the first non-trivial upper bound on the minimum expected number of weighings required.

    In the second part of this dissertation, we focus on the private information retrieval problem. In many practical settings, the user needs to retrieve information messages from a server in a periodic manner, over multiple rounds of communication. The messages are retrieved one at a time, and the identity of future requests is not known to the server. We study private information retrieval protocols that ensure that the identities of all the messages retrieved from the server are protected. This scenario can occur in practical settings such as periodic content download from text and multimedia repositories. We refer to this problem of minimizing the rate of data download as the online private information retrieval problem. Following the previous line of work by Kadhe et al., we assume that the user knows a subset of messages in the database as side information. The identities of these messages are initially unknown to the server. Focusing on scalar-linear settings, we characterize the per-round capacity, i.e., the maximum achievable download rate at each round. The key idea of our achievability scheme is to combine the data downloaded during the current round and the previous rounds with the original side information messages and use the resulting data as side information for the subsequent rounds.
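
    To make the basic GT setting concrete, the following minimal simulation uses a random non-adaptive Bernoulli test design with the classical COMP decoder (clear every item that appears in a negative test) in the noiseless case. The parameters are illustrative, and COMP is only a simple baseline, much weaker than the BP-based decoders the dissertation develops.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, t = 200, 5, 60                 # items, defectives, tests (illustrative)

defective = np.zeros(n, dtype=bool)
defective[rng.choice(n, size=d, replace=False)] = True

# Random Bernoulli design: item i joins test j with probability 1/d
# (a common heuristic choice for the inclusion probability).
tests = rng.random((t, n)) < 1.0 / d
outcomes = (tests & defective).any(axis=1)   # positive iff a defective is hit

# COMP decoder: any item appearing in a negative test cannot be defective;
# declare defective exactly the items never seen in a negative test.
cleared = tests[~outcomes].any(axis=0)
estimate = ~cleared

print("false negatives:", int((defective & ~estimate).sum()))  # 0 for noiseless COMP
print("false positives:", int((~defective & estimate).sum()))
```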

    Data-driven deep-learning methods for the accelerated simulation of Eulerian fluid dynamics

    Deep-learning (DL) methods for the fast inference of the temporal evolution of fluid-dynamics systems, based on the previous recognition of features underlying large sets of fluid-dynamics data, have been studied. Specifically, models based on convolutional neural networks (CNNs) and graph neural networks (GNNs) were proposed and discussed. A U-Net, a popular fully-convolutional architecture, was trained to infer wave dynamics on liquid surfaces surrounded by walls, given as input the system state at previous time-points. A term for penalising the error of the spatial derivatives was added to the loss function, which resulted in a suppression of spurious oscillations and a more accurate location and length of the predicted wavefronts. This model proved to generalise accurately to complex wall geometries not seen during training. As opposed to the image data-structures processed by CNNs, graphs offer greater freedom in how data is organised and processed. This motivated the use of graphs to represent the state of fluid-dynamic systems discretised by unstructured sets of nodes, and GNNs to process such graphs. Graphs have enabled more accurate representations of curvilinear geometries and the placement of higher resolution exclusively in areas where the physics is more challenging to resolve. Two novel GNN architectures were designed for fluid-dynamics inference: the MuS-GNN, a multi-scale GNN, and the REMuS-GNN, a rotation-equivariant multi-scale GNN. Both architectures work by repeatedly passing messages from each node to its nearest nodes in the graph. Additionally, lower-resolution graphs, with a reduced number of nodes, are defined from the original graph, and messages are also passed from finer to coarser graphs and vice-versa. The low-resolution graphs allowed for efficiently capturing physics encompassing a range of lengthscales. Advection and fluid flow, modelled by the incompressible Navier-Stokes equations, were the two types of problems used to assess the proposed GNNs. Whereas a single-scale GNN was sufficient to achieve high generalisation accuracy in advection simulations, flow simulation benefited greatly from an increasing number of low-resolution graphs. The generalisation and long-term accuracy of these simulations were further improved by the REMuS-GNN architecture, which processes the system state independently of the orientation of the coordinate system thanks to a rotation-invariant representation and carefully designed components. To the best of the author's knowledge, the REMuS-GNN architecture was the first rotation-equivariant and multi-scale GNN. The simulations were accelerated between one (on a CPU) and three (on a GPU) orders of magnitude with respect to a CPU-based numerical solver. Additionally, the parallelisation of multi-scale GNNs resulted in a close-to-linear speedup with the number of CPU cores or GPUs.
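
    As a rough sketch of the message-passing skeleton that both MuS-GNN and REMuS-GNN build upon, the dependency-light example below applies one step of linear messages with mean aggregation on a small graph. The real architectures add learned encoders and decoders, edge features, multi-scale coarsening, and (for REMuS-GNN) rotation equivariance, none of which is modelled here.

```python
import numpy as np

def mp_step(node_feats, edges, W_msg, W_upd):
    """One generic message-passing step on a directed graph.

    node_feats: (N, F) array of node states.
    edges:      list of (src, dst) index pairs.
    Messages are a linear map of the sender state, mean-aggregated at the
    receiver, then combined with the receiver state by a second linear map
    and a ReLU. Far simpler than MuS-GNN/REMuS-GNN, but the same skeleton.
    """
    N, F = node_feats.shape
    agg = np.zeros((N, F))
    deg = np.zeros(N)
    for s, d in edges:
        agg[d] += node_feats[s] @ W_msg
        deg[d] += 1
    agg /= np.maximum(deg, 1)[:, None]            # mean aggregation per node
    return np.maximum(np.concatenate([node_feats, agg], 1) @ W_upd, 0)

rng = np.random.default_rng(0)
N, F = 5, 8
h = rng.standard_normal((N, F))
edges = [(i, (i + 1) % N) for i in range(N)]      # a small ring graph
W_msg = rng.standard_normal((F, F)) * 0.1
W_upd = rng.standard_normal((2 * F, F)) * 0.1
h = mp_step(h, edges, W_msg, W_upd)
print(h.shape)                                    # (5, 8)
```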

    Dynamic $(1+\epsilon)$-Approximate Matching Size in Truly Sublinear Update Time

    We show a fully dynamic algorithm for maintaining the $(1+\epsilon)$-approximate \emph{size} of the maximum matching of a graph with $n$ vertices and $m$ edges using $m^{0.5-\Omega_\epsilon(1)}$ update time. This is the first polynomial improvement over the long-standing $O(n)$ update time, which can be trivially obtained by periodic recomputation. Thus, we resolve the value version of a major open question of the dynamic graph algorithms literature (see, e.g., [Gupta and Peng FOCS'13], [Bernstein and Stein SODA'16], [Behnezhad and Khanna SODA'22]). Our key technical component is the first sublinear algorithm for $(1,\epsilon n)$-approximate maximum matching with sublinear running time on dense graphs. All previous algorithms suffered a multiplicative approximation factor of at least $1.499$ or assumed that the graph has a very small maximum degree.
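
    To see where the trivial $O(n)$ bound comes from: a single edge update changes the maximum matching size by at most one, so a cached size stays within an additive slack equal to the number of updates since the last recomputation, and recomputing once that slack reaches an $\epsilon$ fraction of the cached size amortises the recomputation cost over many updates. The sketch below implements this bookkeeping, with a greedy maximal matching (only a 2-approximation) standing in for an exact maximum-matching routine to keep it self-contained.

```python
class PeriodicMatchingSize:
    """Approximate matching size under edge updates by periodic recomputation.

    Each insert/delete changes the true maximum matching size by at most 1,
    so the cached value is accurate up to +/- (updates since last recompute).
    Recomputing whenever that slack exceeds eps * cached_size keeps the
    answer within an eps-fraction window. A greedy maximal matching (a
    2-approximation) stands in here for an exact solver.
    """

    def __init__(self, eps=0.1):
        self.eps, self.edges, self.stale, self.size = eps, set(), 0, 0

    def _recompute(self):
        matched, size = set(), 0
        for u, v in self.edges:               # greedy maximal matching
            if u not in matched and v not in matched:
                matched |= {u, v}
                size += 1
        self.size, self.stale = size, 0

    def update(self, u, v, insert=True):
        (self.edges.add if insert else self.edges.discard)((u, v))
        self.stale += 1
        if self.stale > self.eps * max(self.size, 1):
            self._recompute()                 # amortised over ~eps*size updates

    def query(self):
        return self.size                      # true value within +/- self.stale

m = PeriodicMatchingSize(eps=0.5)
for e in [(1, 2), (3, 4), (2, 3), (5, 6)]:
    m.update(*e)
print(m.query())  # 3 with an exact solver; the greedy stand-in may report 2
```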