
    Learning Graphical Models Using Multiplicative Weights

    We give a simple, multiplicative-weight update algorithm for learning undirected graphical models or Markov random fields (MRFs). The approach is new, and for the well-studied case of Ising models or Boltzmann machines, we obtain an algorithm that uses a nearly optimal number of samples and has quadratic running time (up to logarithmic factors), subsuming and improving on all prior work. Additionally, we give the first efficient algorithm for learning Ising models over general alphabets. Our main application is an algorithm for learning the structure of t-wise MRFs with nearly optimal sample complexity (up to polynomial losses in necessary terms that depend on the weights) and running time n^{O(t)}. In addition, given n^{O(t)} samples, we can also learn the parameters of the model and generate a hypothesis that is close in statistical distance to the true MRF. All prior work runs in time n^{\Omega(d)} for graphs of bounded degree d and does not generate a hypothesis close in statistical distance even for t=3. We observe that our runtime has the correct dependence on n and t assuming the hardness of learning sparse parities with noise. Our algorithm, the Sparsitron, is easy to implement (it has only one parameter) and works in the online setting. Its analysis applies a regret bound from Freund and Schapire's classic Hedge algorithm. It also gives the first solution to the problem of learning sparse Generalized Linear Models (GLMs).
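    Since the Sparsitron is described as a one-parameter, Hedge-style online learner, a minimal sketch of that update pattern may be useful. The sketch below is an illustration, not the paper's exact algorithm: the sigmoid link function, the Hedge learning rate, and the parameter `lam` (an assumed upper bound on the l1-norm of the true weights, standing in for the algorithm's single parameter) are all assumptions made for concreteness.

```python
import numpy as np

def sparsitron(samples, lam, T):
    """Hedge-style multiplicative-weights learner for a sparse GLM.

    A minimal sketch, assuming a sigmoid link and samples (x, y) with
    x in [-1,1]^n and y in [0,1]; `lam` bounds the l1-norm of the
    target weight vector. Not the paper's exact algorithm."""
    n = len(samples[0][0])
    # One Hedge expert per signed coordinate, so the learned vector
    # can carry both positive and negative entries.
    w = np.ones(2 * n)
    beta = 1.0 / (1.0 + np.sqrt(np.log(2 * n) / T))  # Hedge rate
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x, y in samples[:T]:
        p = w / w.sum()                 # normalized expert weights
        v = lam * (p[:n] - p[n:])       # current weight estimate
        pred = sigmoid(v @ x)
        feats = np.concatenate([x, -x])
        loss = 0.5 * (1.0 + (pred - y) * feats)  # per-expert loss in [0,1]
        w = w * beta ** loss            # multiplicative update
    p = w / w.sum()
    return lam * (p[:n] - p[n:])
```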

    Lower Bounds on Time-Space Trade-Offs for Approximate Near Neighbors

    We show tight lower bounds for the entire trade-off between space and query time for the Approximate Near Neighbor search problem. Our lower bounds hold in a restricted model of computation, which captures all hashing-based approaches. In particular, our lower bound matches the upper bound recently shown in [Laarhoven 2015] for the random instance on a Euclidean sphere (which we show in fact extends to the entire space \mathbb{R}^d using the techniques from [Andoni, Razenshteyn 2015]). We also show tight, unconditional cell-probe lower bounds for one and two probes, improving upon the best known bounds from [Panigrahy, Talwar, Wieder 2010]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than for one probe. To show the result for two probes, we establish and exploit a connection to locally-decodable codes. Comment: 47 pages, 2 figures; v2: substantially revised introduction, lots of small corrections; subsumed by arXiv:1608.03580 [cs.DS] (along with arXiv:1511.07527 [cs.DS]).
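    The lower bounds are stated for a model that captures hashing-based data structures. For concreteness only, a minimal instance of such a scheme (random-hyperplane LSH for angular distance) is sketched below; the table and bit counts are arbitrary illustrative choices, not points on the tight trade-off curve the paper characterizes.

```python
import numpy as np

class HyperplaneLSH:
    """Minimal random-hyperplane LSH, one example of the hashing-based
    schemes the lower-bound model captures. Parameters are illustrative."""

    def __init__(self, dim, num_tables=10, bits_per_table=12, seed=0):
        rng = np.random.default_rng(seed)
        # Each table hashes a point to the sign pattern of its
        # projections onto `bits_per_table` random hyperplanes.
        self.planes = rng.standard_normal((num_tables, bits_per_table, dim))
        self.tables = [dict() for _ in range(num_tables)]

    def _keys(self, p):
        return [tuple((h @ p > 0).astype(int)) for h in self.planes]

    def insert(self, idx, p):
        for table, key in zip(self.tables, self._keys(p)):
            table.setdefault(key, []).append(idx)

    def query(self, q):
        candidates = set()
        for table, key in zip(self.tables, self._keys(q)):
            candidates.update(table.get(key, ()))
        return candidates  # candidate set; caller verifies true distances
```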

    Explicit Correlation Amplifiers for Finding Outlier Correlations in Deterministic Subquadratic Time

    We derandomize G. Valiant's [J. ACM 62 (2015), Art. 13] subquadratic-time algorithm for finding outlier correlations in binary data. Our derandomized algorithm gives deterministic subquadratic scaling essentially for the same parameter range as Valiant's randomized algorithm, but the precise constants we save over quadratic scaling are more modest. Our main technical tool for derandomization is an explicit family of correlation amplifiers built via the zigzag-product expanders of Reingold, Vadhan, and Wigderson [Ann. of Math. 155 (2002), 157-187]. We say that a function f : \{-1,1\}^d \to \{-1,1\}^D is a correlation amplifier with threshold 0 \le \tau \le 1, error \gamma \ge 1, and strength p an even positive integer if for all pairs of vectors x, y \in \{-1,1\}^d it holds that (i) |\langle x,y \rangle| < \tau d implies |\langle f(x),f(y) \rangle| \le (\tau\gamma)^p D; and (ii) |\langle x,y \rangle| \ge \tau d implies (\langle x,y \rangle / (\gamma d))^p D \le \langle f(x),f(y) \rangle \le (\gamma \langle x,y \rangle / d)^p D.
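    The amplification property is easiest to see in the non-explicit baseline construction: the p-fold tensor power, for which the inner product of the images is exactly (\langle x,y \rangle / d)^p D with D = d^p, at the cost of an exponentially large output dimension. The snippet below demonstrates only this property; the paper's contribution is achieving comparable amplification explicitly with far smaller D.

```python
import numpy as np
from functools import reduce

def tensor_power(x, p):
    """p-fold tensor power of a {-1,1}^d vector, flattened to length d**p.

    Satisfies <f(x), f(y)> = <x, y>**p, so normalized correlations below
    a threshold are driven toward zero while those above it survive.
    Illustrates amplification only; the output dimension is exponential,
    unlike the paper's explicit expander-based amplifiers."""
    return reduce(np.kron, [x] * p)

rng = np.random.default_rng(1)
d, p = 32, 4                                    # p even, as in the definition
x = rng.choice([-1, 1], d)
y = np.where(rng.random(d) < 0.8, x, rng.choice([-1, 1], d))  # correlated

fx, fy = tensor_power(x, p), tensor_power(y, p)
print((x @ y) / d)                  # raw correlation
print((fx @ fy) / d ** p)           # amplified: ((x @ y) / d) ** p
```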

    Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

    In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem [PRR'89]: given two sets A \subset \{-1,+1\}^d and B \subset \{-1,+1\}^d with |A| = |B| = n, if there are exactly k pairs whose inner product passes a certain threshold, i.e., \{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B such that \forall i \in [k], \langle a_i, b_i \rangle \ge \rho \cdot d for a threshold \rho \in (0,1), the goal is to identify those k heavy inner products. We provide an algorithm that runs in O(n^{2\omega/3 + o(1)}) time to find the k inner product pairs that surpass the \rho \cdot d threshold with high probability, where \omega is the current matrix multiplication exponent. By solving this problem, our method speeds up the training of neural networks with the ReLU activation function. Comment: IEEE BigData 202
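    The following sketch illustrates the grouping idea that makes matrix multiplication useful here: sum small groups of vectors with random signs, locate blocks whose aggregate inner product is an outlier using one matrix product per trial, and brute-force only inside flagged blocks. The group size, trial count, and flagging threshold are heuristic assumptions, and a plain numpy product stands in for fast matrix multiplication, so the sketch does not attain the stated O(n^{2\omega/3 + o(1)}) exponent.

```python
import numpy as np

def heavy_pairs(A, B, rho, group=8, trials=20, seed=0):
    """Sketch of group-and-filter heavy-pair search.

    Assumes A, B are (n, d) arrays over {-1,+1} with `group` dividing n.
    Heuristic parameters; illustrative, not the paper's algorithm."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    g = group
    candidates = set()
    for _ in range(trials):
        sa = rng.choice([-1, 1], size=(n, 1))
        sb = rng.choice([-1, 1], size=(n, 1))
        GA = (sa * A).reshape(n // g, g, d).sum(axis=1)  # signed group sums
        GB = (sb * B).reshape(n // g, g, d).sum(axis=1)
        S = GA @ GB.T            # one (n/g) x (n/g) matrix product
        # A heavy pair contributes ~rho*d to its block, while the g*g
        # random pairs contribute ~g*sqrt(d) noise, so flag outliers.
        for i, j in zip(*np.where(np.abs(S) > 0.5 * rho * d)):
            candidates.add((int(i), int(j)))
    found = set()
    for i, j in candidates:      # verify flagged blocks by brute force
        P = A[i * g:(i + 1) * g] @ B[j * g:(j + 1) * g].T
        for a, b in zip(*np.where(P >= rho * d)):
            found.add((i * g + int(a), j * g + int(b)))
    return sorted(found)
```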

    Fast Sketch-based Recovery of Correlation Outliers

    Many data sources can be interpreted as time-series, and a key problem is to identify which pairs out of a large collection of signals are highly correlated. We expect that there will be few, large, interesting correlations, while most signal pairs do not have any strong correlation. We abstract this as the problem of identifying the highly correlated pairs in a collection of n mostly pairwise uncorrelated random variables, where observations of the variables arrive as a stream. Dimensionality reduction can remove the dependence on the number of observations, but further techniques are required to tame the quadratic (in n) cost of a search through all possible pairs. We develop a new algorithm for rapidly finding large correlations based on sketch techniques with an added twist: we quickly generate sketches of random combinations of signals, and use these in concert with ideas from coding theory to decode the identity of correlated pairs. We prove correctness and compare performance and effectiveness with the best LSH (locality-sensitive hashing) based approach.
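    The dimensionality-reduction ingredient can be shown in a few lines: a short random-projection sketch per signal preserves pairwise correlations well enough to surface the outlier pair. This illustrates only that building block; the random-combination sketches and the coding-theory decoding that avoid an explicit scan over all pairs are the paper's contribution and are not reproduced here. All sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_signals, n_obs, sketch_len = 100, 10_000, 256   # illustrative sizes

# Mostly uncorrelated signals with one planted correlated pair (3, 71).
X = rng.standard_normal((n_signals, n_obs))
X[71] = 0.9 * X[3] + np.sqrt(1 - 0.9 ** 2) * rng.standard_normal(n_obs)

# One short random-projection sketch per signal.
S = rng.choice([-1.0, 1.0], (sketch_len, n_obs)) / np.sqrt(sketch_len)
sk = X @ S.T
sk /= np.linalg.norm(sk, axis=1, keepdims=True)

est = sk @ sk.T                   # correlation estimates from sketches
np.fill_diagonal(est, 0.0)
print(np.unravel_index(np.abs(est).argmax(), est.shape))  # -> (3, 71)
```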

    A Faster Algorithm for Finding Closest Pairs in Hamming Metric

    We study the Closest Pair Problem in Hamming metric, which asks us to find the pair with the smallest Hamming distance in a collection of binary vectors. We give a new randomized algorithm for the problem on uniformly random input that outperforms previous approaches whenever the dimension of the input points is small compared to the dataset size. For moderate to large dimensions, our algorithm matches the time complexity of the previously best-known locality-sensitive-hashing-based algorithms. Technically, our algorithm follows similar design principles as Dubiner (IEEE Trans. Inf. Theory 2010) and May-Ozerov (Eurocrypt 2015). Besides improving the time complexity in the aforementioned regimes, we significantly simplify the analysis of these previous works. We give a modular analysis, which also allows us to investigate the performance of the algorithm on non-uniform input distributions. Furthermore, we give a proof-of-concept implementation of our algorithm which performs well in comparison to a quadratic-search baseline. This is the first step towards answering an open question raised by May and Ozerov regarding the practicability of algorithms following these design principles.
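    A simplified bucketing scheme in this spirit fits in a few lines: repeatedly project onto a random subset of coordinates, group vectors that agree on the projection, and compare pairs only within groups, so that on random input the closest pair collides in some repetition with good probability. The repetition and projection sizes below are illustrative, and the sketch omits the recursive structure of the Dubiner and May-Ozerov style algorithms.

```python
import numpy as np

def closest_pair_hamming(points, reps=30, bits=12, seed=0):
    """Proof-of-concept closest-pair search for binary vectors.

    `points` is an (n, d) 0/1 array; `reps` and `bits` are illustrative
    parameters, not the tuned values an optimized algorithm would use."""
    rng = np.random.default_rng(seed)
    n, d = points.shape
    best = (d + 1, None)
    for _ in range(reps):
        coords = rng.choice(d, size=bits, replace=False)
        buckets = {}
        for i, p in enumerate(points):   # group by random projection
            buckets.setdefault(p[coords].tobytes(), []).append(i)
        for idxs in buckets.values():    # brute force inside buckets
            for a in range(len(idxs)):
                for b in range(a + 1, len(idxs)):
                    i, j = idxs[a], idxs[b]
                    dist = int(np.count_nonzero(points[i] != points[j]))
                    if dist < best[0]:
                        best = (dist, (i, j))
    return best
```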