Search CORE

22 research outputs found

Learning Graphical Models Using Multiplicative Weights

Author: Klivans Adam
Meka Raghu
Publication venue
Publication date: 20/06/2017
Field of study

We give a simple, multiplicative-weight update algorithm for learning undirected graphical models or Markov random fields (MRFs). The approach is new, and for the well-studied case of Ising models or Boltzmann machines, we obtain an algorithm that uses a nearly optimal number of samples and has quadratic running time (up to logarithmic factors), subsuming and improving on all prior work. Additionally, we give the first efficient algorithm for learning Ising models over general alphabets. Our main application is an algorithm for learning the structure of t-wise MRFs with nearly-optimal sample complexity (up to polynomial losses in necessary terms that depend on the weights) and running time that is

n^{O(t)}

. In addition, given

n^{O(t)}

samples, we can also learn the parameters of the model and generate a hypothesis that is close in statistical distance to the true MRF. All prior work runs in time

n^{\Omega(d)}

for graphs of bounded degree d and does not generate a hypothesis close in statistical distance even for t=3. We observe that our runtime has the correct dependence on n and t assuming the hardness of learning sparse parities with noise. Our algorithm--the Sparsitron-- is easy to implement (has only one parameter) and holds in the on-line setting. Its analysis applies a regret bound from Freund and Schapire's classic Hedge algorithm. It also gives the first solution to the problem of learning sparse Generalized Linear Models (GLMs)

arXiv.org e-Print Archive

Crossref

Lower Bounds on Time-Space Trade-Offs for Approximate Near Neighbors

Author: Andoni Alexandr
Laarhoven Thijs
Razenshteyn Ilya
Waingarten Erik
Publication venue
Publication date: 01/01/2016
Field of study

We show tight lower bounds for the entire trade-off between space and query time for the Approximate Near Neighbor search problem. Our lower bounds hold in a restricted model of computation, which captures all hashing-based approaches. In articular, our lower bound matches the upper bound recently shown in [Laarhoven 2015] for the random instance on a Euclidean sphere (which we show in fact extends to the entire space

\mathbb{R}^d

using the techniques from [Andoni, Razenshteyn 2015]). We also show tight, unconditional cell-probe lower bounds for one and two probes, improving upon the best known bounds from [Panigrahy, Talwar, Wieder 2010]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than for one probe. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.Comment: 47 pages, 2 figures; v2: substantially revised introduction, lots of small corrections; subsumed by arXiv:1608.03580 [cs.DS] (along with arXiv:1511.07527 [cs.DS]

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Explicit Correlation Amplifiers for Finding Outlier Correlations in Deterministic Subquadratic Time

Author: Karppa Matti
Kaski Petteri
Kohonen Jukka
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 24th Annual European Symposium on Algorithms (ESA 2016)
Publication date: 01/01/2016
Field of study

We derandomize G. Valiant\u27s [J.ACM 62(2015) Art.13] subquadratic-time algorithm for finding outlier correlations in binary data. Our derandomized algorithm gives deterministic subquadratic scaling essentially for the same parameter range as Valiant\u27s randomized algorithm, but the precise constants we save over quadratic scaling are more modest. Our main technical tool for derandomization is an explicit family of correlation amplifiers built via a family of zigzag-product expanders in Reingold, Vadhan, and Wigderson [Ann. of Math 155(2002), 157-187]. We say that a function f:{-1,1}^d ->{-1,1}^D is a correlation amplifier with threshold 0 = 1, and strength p an even positive integer if for all pairs of vectors x,y in {-1,1}^d it holds that (i) ||| | >= tau*d implies (/gamma^d})^p*D /d)^p*D

Dagstuhl Research Online Publication Server

Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

Author: Mitra Saayan
Qin Lianke
Song Zhao
Yang Yuanyuan
Zhou Tianyi
Publication venue
Publication date: 19/11/2023
Field of study

In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem~(\cite{prr89}): Given two sets

A \subset \{-1,+1\}^d

and

B \subset \{-1,+1\}^d

with

|A|=|B| = n

, if there are exact

k

pairs whose inner product passes a certain threshold, i.e.,

\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B

such that

\forall i \in [k], \langle a_i,b_i \rangle \geq \rho \cdot d

, for a threshold

\rho \in (0,1)

, the goal is to identify those

k

heavy inner products. We provide an algorithm that runs in

O(n^{2 \omega / 3+ o(1)})

time to find the

k

inner product pairs that surpass

\rho \cdot d

threshold with high probability, where

\omega

is the current matrix multiplication exponent. By solving this problem, our method speed up the training of neural networks with ReLU activation function.Comment: IEEE BigData 202

arXiv.org e-Print Archive

Fast Sketch-based Recovery of Correlation Outliers

Author: Cormode Graham
Dark Jacques
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 21st International Conference on Database Theory (ICDT 2018)
Publication date: 05/10/2017
Field of study

Many data sources can be interpreted as time-series, and a key problem is to identify which pairs out of a large collection of signals are highly correlated. We expect that there will be few, large, interesting correlations, while most signal pairs do not have any strong correlation. We abstract this as the problem of identifying the highly correlated pairs in a collection of n mostly pairwise uncorrelated random variables, where observations of the variables arrives as a stream. Dimensionality reduction can remove dependence on the number of observations, but further techniques are required to tame the quadratic (in n) cost of a search through all possible pairs. We develop a new algorithm for rapidly finding large correlations based on sketch techniques with an added twist: we quickly generate sketches of random combinations of signals, and use these in concert with ideas from coding theory to decode the identity of correlated pairs. We prove correctness and compare performance and effectiveness with the best LSH (locality sensitive hashing) based approach

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

A Faster Algorithm for Finding Closest Pairs in Hamming Metric

Author: Esser Andre
Zweydinger Floyd
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)
Publication date: 01/01/2021
Field of study

We study the Closest Pair Problem in Hamming metric, which asks to find the pair with the smallest Hamming distance in a collection of binary vectors. We give a new randomized algorithm for the problem on uniformly random input outperforming previous approaches whenever the dimension of input points is small compared to the dataset size. For moderate to large dimensions, our algorithm matches the time complexity of the previously best-known locality sensitive hashing based algorithms. Technically our algorithm follows similar design principles as Dubiner (IEEE Trans. Inf. Theory 2010) and May-Ozerov (Eurocrypt 2015). Besides improving the time complexity in the aforementioned areas, we significantly simplify the analysis of these previous works. We give a modular analysis, which allows us to investigate the performance of the algorithm also on non-uniform input distributions. Furthermore, we give a proof of concept implementation of our algorithm which performs well in comparison to a quadratic search baseline. This is the first step towards answering an open question raised by May and Ozerov regarding the practicability of algorithms following these design principles

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server