499 research outputs found

    Algorithms approaching the threshold for semi-random planted clique

    Full text link
    We design new polynomial-time algorithms for recovering planted cliques in the semi-random graph model introduced by Feige and Kilian~\cite{FK01}. The previous best algorithms for this model succeed if the planted clique has size at least n2/3n^{2/3} in a graph with nn vertices (Mehta, Mckenzie, Trevisan, 2019 and Charikar, Steinhardt, Valiant 2017). Our algorithms work for planted-clique sizes approaching n1/2n^{1/2} -- the information-theoretic threshold in the semi-random model~\cite{steinhardt2017does} and a conjectured computational threshold even in the easier fully-random model. This result comes close to resolving open questions by Feige and Steinhardt. Our algorithms are based on higher constant degree sum-of-squares relaxation and rely on a new conceptual connection that translates certificates of upper bounds on biclique numbers in \emph{unbalanced} bipartite Erd\H{o}s--R\'enyi random graphs into algorithms for semi-random planted clique. The use of a higher-constant degree sum-of-squares is essential in our setting: we prove a lower bound on the basic SDP for certifying bicliques that shows that the basic SDP cannot succeed for planted cliques of size k=o(n2/3)k =o(n^{2/3}). We also provide some evidence that the information-computation trade-off of our current algorithms may be inherent by proving an average-case lower bound for unbalanced bicliques in the low-degree-polynomials model.Comment: 51 pages, the arxiv landing page contains a shortened abstrac

    Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy

    Get PDF
    We consider a scenario involving computations over a massive dataset stored distributedly across multiple workers, which is at the core of distributed learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework to simultaneously provide (1) resiliency against stragglers that may prolong computations; (2) security against Byzantine (or malicious) workers that deliberately modify the computation for their benefit; and (3) (information-theoretic) privacy of the dataset amidst possible collusion of workers. LCC, which leverages the well-known Lagrange polynomial to create computation redundancy in a novel coded form across workers, can be applied to any computation scenario in which the function of interest is an arbitrary multivariate polynomial of the input dataset, hence covering many computations of interest in machine learning. LCC significantly generalizes prior works to go beyond linear computations. It also enables secure and private computing in distributed settings, improving the computation and communication efficiency of the state-of-the-art. Furthermore, we prove the optimality of LCC by showing that it achieves the optimal tradeoff between resiliency, security, and privacy, i.e., in terms of tolerating the maximum number of stragglers and adversaries, and providing data privacy against the maximum number of colluding workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the conventional uncoded implementation of distributed least-squares linear regression by up to 13.43×13.43\times, and also achieves a 2.36×2.36\times-12.65×12.65\times speedup over the state-of-the-art straggler mitigation strategies
    • …
    corecore