
    Learning Models over Relational Data using Sparse Tensors and Functional Dependencies

    Integrated solutions for analytics over relational databases are of great practical importance as they avoid the costly repeated loop data scientists have to deal with on a daily basis: select features from data residing in relational databases using feature extraction queries involving joins, projections, and aggregations; export the training dataset defined by such queries; convert this dataset into the format of an external learning tool; and train the desired model using this tool. These integrated solutions are also a fertile ground of theoretically fundamental and challenging problems at the intersection of relational and statistical data models. This article introduces a unified framework for training and evaluating a class of statistical learning models over relational databases. This class includes ridge linear regression, polynomial regression, factorization machines, and principal component analysis. We show that, by synergizing key tools from database theory such as schema information, query structure, functional dependencies, recent advances in query evaluation algorithms, and from linear algebra such as tensor and matrix operations, one can formulate relational analytics problems and design efficient (query and data) structure-aware algorithms to solve them. This theoretical development informed the design and implementation of the AC/DC system for structure-aware learning. We benchmark the performance of AC/DC against R, MADlib, libFM, and TensorFlow. For typical retail forecasting and advertisement planning applications, AC/DC can learn polynomial regression models and factorization machines with at least the same accuracy as its competitors and up to three orders of magnitude faster, whenever the competitors do not run out of memory, exceed the 24-hour timeout, or encounter internal design limitations. Comment: 61 pages, 9 figures, 2 tables
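
The abstract above contrasts exporting a joined training dataset with computing directly over the relations. As a minimal sketch of that general idea (not the AC/DC implementation), the Python below computes one sum-of-products aggregate, the kind of entry that appears in the matrix of feature products used by least-squares regression, in two ways: over a materialized join, and by aggregating each relation per join key first. The relations `SALES` and `STORES`, their columns, and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy relations: Sales(store, units) and Stores(store, area).
SALES = [("s1", 3.0), ("s1", 5.0), ("s2", 2.0)]
STORES = [("s1", 10.0), ("s2", 20.0), ("s2", 30.0)]


def sum_product_materialized(r, s):
    """Enumerate every tuple of the join on the key and sum units * area."""
    return sum(x * y for k1, x in r for k2, y in s if k1 == k2)


def sum_product_pushed_down(r, s):
    """Aggregate each relation per join key first, then combine partial sums."""
    sum_r, sum_s = defaultdict(float), defaultdict(float)
    for key, x in r:
        sum_r[key] += x
    for key, y in s:
        sum_s[key] += y
    # Only keys present in both relations contribute to the join.
    return sum(sum_r[key] * sum_s[key] for key in sum_r if key in sum_s)


print(sum_product_materialized(SALES, STORES))
print(sum_product_pushed_down(SALES, STORES))
```

Both functions return the same value; the second never enumerates the join result, which is the property that matters when the join is much larger than its input relations.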

    A composition theorem for parity kill number

    In this work, we study the parity complexity measures $\mathsf{C}^{\oplus}_{\min}[f]$ and $\mathsf{DT}^{\oplus}[f]$. $\mathsf{C}^{\oplus}_{\min}[f]$ is the \emph{parity kill number} of $f$, the fewest number of parities on the input variables one has to fix in order to "kill" $f$, i.e. to make it constant. $\mathsf{DT}^{\oplus}[f]$ is the depth of the shortest \emph{parity decision tree} which computes $f$. These complexity measures have in recent years become increasingly important in the fields of communication complexity \cite{ZS09, MO09, ZS10, TWXZ13} and pseudorandomness \cite{BK12, Sha11, CT13}. Our main result is a composition theorem for $\mathsf{C}^{\oplus}_{\min}$. The $k$-th power of $f$, denoted $f^{\circ k}$, is the function which results from composing $f$ with itself $k$ times. We prove that if $f$ is not a parity function, then $\mathsf{C}^{\oplus}_{\min}[f^{\circ k}] \geq \Omega(\mathsf{C}_{\min}[f]^{k})$. In other words, the parity kill number of $f$ is essentially supermultiplicative in the \emph{normal} kill number of $f$ (also known as the minimum certificate complexity). As an application of our composition theorem, we show lower bounds on the parity complexity measures of $\mathsf{Sort}^{\circ k}$ and $\mathsf{HI}^{\circ k}$. Here $\mathsf{Sort}$ is the sort function due to Ambainis \cite{Amb06}, and $\mathsf{HI}$ is Kushilevitz's hemi-icosahedron function \cite{NW95}. In doing so, we disprove a conjecture of Montanaro and Osborne \cite{MO09} which had applications to communication complexity and computational learning theory. In addition, we give new lower bounds for conjectures of \cite{MO09,ZS10} and \cite{TWXZ13}.
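
To make the two kill numbers defined above concrete, here is a brute-force sketch in Python that follows the definitions literally for tiny functions. It is an illustration only, not a method from the paper; the function names, the encoding of inputs as integer bitmasks, and the example functions `maj3` and `parity` are assumptions made for this sketch. The search is exponential in the number of variables.

```python
from itertools import combinations, product


def parity(x):
    """Parity (XOR) of the bits of the integer x."""
    return bin(x).count("1") & 1


def min_certificate(f, n):
    """Fewest input bits that must be fixed to make f constant."""
    for k in range(n + 1):
        for coords in combinations(range(n), k):
            for values in product((0, 1), repeat=k):
                cube = [x for x in range(1 << n)
                        if all((x >> i) & 1 == v for i, v in zip(coords, values))]
                if len({f(x) for x in cube}) == 1:
                    return k


def parity_kill_number(f, n):
    """Fewest parities of input bits that must be fixed to make f constant."""
    for k in range(n + 1):
        # Each nonzero mask selects the variables of one parity constraint.
        for masks in combinations(range(1, 1 << n), k):
            for rhs in product((0, 1), repeat=k):
                sub = [x for x in range(1 << n)
                       if all(parity(x & m) == b for m, b in zip(masks, rhs))]
                # Skip inconsistent constraint sets (empty solution set).
                if sub and len({f(x) for x in sub}) == 1:
                    return k


def maj3(x):
    """Majority of three bits."""
    return int(bin(x).count("1") >= 2)


print(min_certificate(maj3, 3), parity_kill_number(maj3, 3))
print(min_certificate(parity, 3), parity_kill_number(parity, 3))
```

On three bits, majority needs two fixed bits and also two fixed parities, while the parity function needs all three bits fixed but only a single parity. That gap is why the theorem above excludes parity functions.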

    Learning probability distributions generated by finite-state machines

    We review methods for inference of probability distributions generated by probabilistic automata and related models for sequence generation. We focus on methods that can be proved to learn in the inference-in-the-limit and PAC formal models. The methods we review are state merging and state splitting methods for probabilistic deterministic automata and the recently developed spectral method for nondeterministic probabilistic automata. In both cases, we derive them from a high-level algorithm described in terms of the Hankel matrix of the distribution to be learned, given as an oracle, and then describe how to adapt that algorithm to account for the error introduced by a finite sample. Peer Reviewed. Postprint (author's final draft)
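
The Hankel matrix mentioned above has one row per prefix $u$ and one column per suffix $v$, with entry $p(uv)$. The sketch below builds a finite block of it for a small, made-up two-state probabilistic automaton and checks its rank with exact arithmetic. The automaton's weights and all names are assumptions chosen for illustration, and the code plays the role of the exact oracle, not of any learning algorithm from the review.

```python
from fractions import Fraction as F

# Hypothetical two-state automaton over {a, b}. In each state the outgoing
# transition weights plus the stopping probability sum to 1.
ALPHA = [F(1), F(0)]                                   # initial weights
TRANS = {"a": [[F(0), F(1, 2)], [F(0), F(1, 3)]],
         "b": [[F(1, 4), F(0)], [F(1, 3), F(0)]]}
BETA = [F(1, 4), F(1, 3)]                              # stopping weights


def prob(word):
    """Probability that the automaton generates exactly `word`."""
    v = ALPHA[:]
    for ch in word:
        m = TRANS[ch]
        v = [sum(v[i] * m[i][j] for i in range(2)) for j in range(2)]
    return sum(v[i] * BETA[i] for i in range(2))


def hankel(prefixes, suffixes):
    """Finite Hankel block: entry (u, v) is p(uv)."""
    return [[prob(u + v) for v in suffixes] for u in prefixes]


def rank(matrix):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [r[:] for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


H = hankel(["", "a", "b"], ["", "a", "b"])
print(rank(H))
```

The 3-by-3 block has rank 2, matching the number of states. With a finite sample the entries would be empirical frequencies, and the rank would have to be estimated up to noise, which is the adaptation the abstract refers to.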

    A Nearly Optimal Lower Bound on the Approximate Degree of AC$^0$

    The approximate degree of a Boolean function $f \colon \{-1, 1\}^n \rightarrow \{-1, 1\}$ is the least degree of a real polynomial that approximates $f$ pointwise to error at most $1/3$. We introduce a generic method for increasing the approximate degree of a given function, while preserving its computability by constant-depth circuits. Specifically, we show how to transform any Boolean function $f$ with approximate degree $d$ into a function $F$ on $O(n \cdot \operatorname{polylog}(n))$ variables with approximate degree at least $D = \Omega(n^{1/3} \cdot d^{2/3})$. In particular, if $d = n^{1-\Omega(1)}$, then $D$ is polynomially larger than $d$. Moreover, if $f$ is computed by a polynomial-size Boolean circuit of constant depth, then so is $F$. By recursively applying our transformation, for any constant $\delta > 0$ we exhibit an AC$^0$ function of approximate degree $\Omega(n^{1-\delta})$. This improves over the best previous lower bound of $\Omega(n^{2/3})$ due to Aaronson and Shi (J. ACM 2004), and nearly matches the trivial upper bound of $n$ that holds for any function. Our lower bounds also apply to (quasipolynomial-size) DNFs of polylogarithmic width. We describe several applications of these results. We give:
    * For any constant $\delta > 0$, an $\Omega(n^{1-\delta})$ lower bound on the quantum communication complexity of a function in AC$^0$.
    * A Boolean function $f$ with approximate degree at least $C(f)^{2-o(1)}$, where $C(f)$ is the certificate complexity of $f$. This separation is optimal up to the $o(1)$ term in the exponent.
    * Improved secret sharing schemes with reconstruction procedures in AC$^0$.
    Comment: 40 pages, 1 figure
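
The step from the single bound $D = \Omega(n^{1/3} \cdot d^{2/3})$ to the exponent $1-\delta$ can be followed with a little arithmetic on exponents. The Python sketch below is a back-of-the-envelope illustration, not the paper's proof. It assumes $d = n^{a}$, drops constants, and ignores the polylogarithmic growth in the number of variables, so each application maps the exponent $a$ to $1/3 + 2a/3$. The function names and the starting exponent $0$ are choices made for this sketch.

```python
from fractions import Fraction


def amplify_exponent(a):
    """Map d = n^a to D = n^(1/3 + 2a/3), read off D = n^(1/3) * d^(2/3)."""
    return Fraction(1, 3) + Fraction(2, 3) * a


def rounds_needed(delta, start=Fraction(0)):
    """Count applications until the exponent is at least 1 - delta (delta > 0)."""
    a, rounds = start, 0
    while a < 1 - delta:
        a = amplify_exponent(a)
        rounds += 1
    return rounds, a


print(rounds_needed(Fraction(1, 10)))
```

Under these simplifications the gap $1 - a$ shrinks by a factor of $2/3$ per application, so any constant $\delta > 0$ is reached after a constant number of rounds.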