171 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    $\ell_p$-Regression in the Arbitrary Partition Model of Communication

    We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model, for $p\in (0,2]$. In this problem, there is a coordinator and $s$ servers. The $i$-th server receives $A^i\in\{-M, -M+1, \ldots, M\}^{n\times d}$ and $b^i\in\{-M, -M+1, \ldots, M\}^n$, and the coordinator would like to find a $(1+\epsilon)$-approximate solution to $\min_{x\in\mathbb{R}^d} \|(\sum_i A^i)x - (\sum_i b^i)\|_p$. Here $M \leq \mathrm{poly}(nd)$ for convenience. This model, where the data is additively shared across servers, is commonly referred to as the arbitrary partition model. We obtain significantly improved bounds for this problem. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits. For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound. Notably, for $d$ sufficiently large, our leading order term only depends linearly on $1/\epsilon$ rather than quadratically. We also show communication lower bounds of $\Omega(sd^2 + sd/\epsilon^2)$ for $p\in (0,1]$ and $\Omega(sd^2 + sd/\epsilon)$ for $p\in (1,2]$. Our bounds considerably improve previous bounds due to (Woodruff et al., COLT, 2013) and (Vempala et al., SODA, 2020).

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Barriers for Faster Dimensionality Reduction

    Can You Solve Closest String Faster than Exhaustive Search?

    We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set $X \subseteq \Sigma^d$ of $n$ strings, find the string $x^*$ minimizing the radius of the smallest Hamming ball around $x^*$ that encloses all the strings in $X$. In this paper, we investigate whether the Closest String problem admits algorithms that are faster than the trivial exhaustive search algorithm. We obtain the following results for the two natural versions of the problem:
    • In the continuous Closest String problem, the goal is to find the solution string $x^*$ anywhere in $\Sigma^d$. For binary strings, the exhaustive search algorithm runs in time $O(2^d \mathrm{poly}(nd))$ and we prove that it cannot be improved to time $O(2^{(1-\epsilon) d} \mathrm{poly}(nd))$, for any $\epsilon > 0$, unless the Strong Exponential Time Hypothesis fails.
    • In the discrete Closest String problem, $x^*$ is required to be in the input set $X$. While this problem is clearly in polynomial time, its fine-grained complexity has been pinpointed to be quadratic time $n^{2 \pm o(1)}$ whenever the dimension is $\omega(\log n) < d < n^{o(1)}$. We complement this known hardness result with new algorithms, proving essentially that whenever $d$ falls out of this hard range, the discrete Closest String problem can be solved faster than exhaustive search. In the small-$d$ regime, our algorithm is based on a novel application of the inclusion-exclusion principle.
    Interestingly, all of our results apply (and some are even stronger) to the natural dual of the Closest String problem, called the Remotest String problem, where the task is to find a string maximizing the Hamming distance to all the strings in $X$.
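
For orientation, the two exhaustive-search baselines that the abstract compares against can be sketched as follows. This is only a minimal illustration of the trivial algorithms, not the paper's new algorithms; the function names and the default binary alphabet are assumptions made for the example.

```python
from itertools import product


def hamming(a, b):
    """Number of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def radius(center, strings):
    """Radius of the smallest Hamming ball around `center` enclosing all strings."""
    return max(hamming(center, s) for s in strings)


def continuous_closest_string(strings, alphabet="01"):
    """Trivial baseline: try every string in alphabet^d (|alphabet|^d candidates)."""
    d = len(strings[0])
    candidates = ("".join(c) for c in product(alphabet, repeat=d))
    return min(candidates, key=lambda c: radius(c, strings))


def discrete_closest_string(strings):
    """Trivial baseline: try every input string as the center (quadratic in n)."""
    return min(strings, key=lambda c: radius(c, strings))


X = ["00", "11"]
print(radius(continuous_closest_string(X), X))  # prints 1
print(radius(discrete_closest_string(X), X))    # prints 2
```

The small example shows that the continuous version can achieve a strictly smaller radius than the discrete one, since the best center need not be an input string.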

    Minimizing Hitting Time between Disparate Groups with Shortcut Edges

    Structural bias or segregation of networks refers to situations where two or more disparate groups are present in the network, so that the groups are highly connected internally, but loosely connected to each other. In many cases it is of interest to increase the connectivity of disparate groups so as to, e.g., minimize social friction, or expose individuals to diverse viewpoints. A commonly used mechanism for increasing the network connectivity is to add edge shortcuts between pairs of nodes. In many applications of interest, edge shortcuts typically translate to recommendations, e.g., what video to watch, or what news article to read next. The problem of reducing structural bias or segregation via edge shortcuts has recently been studied in the literature, and random walks have been an essential tool for modeling navigation and connectivity in the underlying networks. Existing methods, however, either do not offer approximation guarantees, or engineer the objective so that it satisfies certain desirable properties that simplify the optimization task. In this paper we address the problem of adding a given number of shortcut edges in the network so as to directly minimize the average hitting time and the maximum hitting time between two disparate groups. Our algorithm for minimizing average hitting time is a greedy bicriteria algorithm that relies on supermodularity. In contrast, maximum hitting time is not supermodular. Despite this, we develop an approximation algorithm for that objective as well, by leveraging connections with average hitting time and the asymmetric k-center problem. Comment: To appear in KDD 202
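
To make the objective concrete, the sketch below evaluates the average and maximum hitting time from one group to another before and after adding a single shortcut edge. It is not the paper's algorithm. It assumes a simple random walk on an undirected graph in which every node can reach the target group, and it assumes that the hitting time from group A to group B means the expected number of steps for a walk started at a node of A to first reach any node of B; the paper's exact definition may differ. All function names are hypothetical.

```python
from fractions import Fraction


def hitting_times(adj, targets):
    """Expected steps for a simple random walk to first reach `targets`.

    Solves h(u) = 1 + (1/deg(u)) * sum of h(v) over neighbors v, with h = 0
    on targets, by exact Gauss-Jordan elimination. Assumes every node can
    reach `targets`; otherwise the system is singular.
    """
    free = [u for u in adj if u not in targets]
    idx = {u: i for i, u in enumerate(free)}
    n = len(free)
    M = [[Fraction(0)] * (n + 1) for _ in range(n)]  # augmented matrix
    for u in free:
        i = idx[u]
        M[i][i] += 1
        M[i][n] = Fraction(1)
        for v in adj[u]:
            if v in idx:
                M[i][idx[v]] -= Fraction(1, len(adj[u]))
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    h = {u: Fraction(0) for u in targets}
    h.update({u: M[idx[u]][n] for u in free})
    return h


def group_hitting(adj, A, B):
    """Average and maximum hitting time from nodes of A to the set B."""
    h = hitting_times(adj, set(B))
    vals = [h[u] for u in A]
    return sum(vals) / len(vals), max(vals)


def with_shortcut(adj, u, v):
    """Copy of the graph with one extra undirected edge (u, v)."""
    new = {w: list(nbrs) for w, nbrs in adj.items()}
    new[u].append(v)
    new[v].append(u)
    return new


# Path 0 - 1 - 2 - 3 with groups A = {0, 1} and B = {2, 3}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(*group_hitting(path, [0, 1], [2, 3]))                       # prints 7/2 4
print(*group_hitting(with_shortcut(path, 0, 3), [0, 1], [2, 3]))  # prints 2 2
```

On this toy path, one shortcut between the far ends of the two groups lowers the average hitting time from 7/2 to 2 and the maximum from 4 to 2.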

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    LIPIcs, Volume 258, SoCG 2023, Complete Volume

    Energy Data Analytics for Smart Meter Data

    The principal advantage of smart electricity meters is their ability to transfer digitized electricity consumption data to remote processing systems. The data collected by these devices make the realization of many novel use cases possible, providing benefits to electricity providers and customers alike. This book includes 14 research articles that explore and exploit the information content of smart meter data, and provides insights into the realization of new digital solutions and services that support the transition towards a sustainable energy system. This volume has been edited by Andreas Reinhardt, head of the Energy Informatics research group at Technische Universität Clausthal, Germany, and Lucas Pereira, research fellow at Técnico Lisboa, Portugal.