Learning Models over Relational Data using Sparse Tensors and Functional Dependencies
Integrated solutions for analytics over relational databases are of great
practical importance as they avoid the costly repeated loop data scientists
have to deal with on a daily basis: select features from data residing in
relational databases using feature extraction queries involving joins,
projections, and aggregations; export the training dataset defined by such
queries; convert this dataset into the format of an external learning tool; and
train the desired model using this tool. These integrated solutions are also
fertile ground for theoretically fundamental and challenging problems at the
intersection of relational and statistical data models.
This article introduces a unified framework for training and evaluating a
class of statistical learning models over relational databases. This class
includes ridge linear regression, polynomial regression, factorization
machines, and principal component analysis. We show that, by synergizing key
tools from database theory (schema information, query structure, functional
dependencies, and recent advances in query evaluation algorithms) and from
linear algebra (tensor and matrix operations), one can formulate relational
analytics problems and design efficient (query and data) structure-aware
algorithms to solve them.
This theoretical development informed the design and implementation of the
AC/DC system for structure-aware learning. We benchmark the performance of
AC/DC against R, MADlib, libFM, and TensorFlow. For typical retail forecasting
and advertisement planning applications, AC/DC can learn polynomial regression
models and factorization machines with at least the same accuracy as its
competitors and up to three orders of magnitude faster, whenever the
competitors do not run out of memory, exceed a 24-hour timeout, or encounter
internal design limitations.
Comment: 61 pages, 9 figures, 2 tables
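A useful way to see why such speedups are possible: models like ridge regression touch the training data only through aggregates such as $\Sigma = X^\top X$ and $c = X^\top y$, and over a join these aggregates decompose into per-relation sums, so the join never has to be materialized. The Python sketch below illustrates this for a single two-table key join; the toy relations and the decomposition spelled out here are illustrative assumptions, not the AC/DC implementation, which generalizes the idea to arbitrary queries, models, and functional dependencies.

```python
import numpy as np
from collections import defaultdict

# Toy schema (hypothetical): R(key, a) and S(key, b, y); the training set
# is the join of R and S with features (a, b) and label y.
R = [(1, 2.0), (1, 3.0), (2, 1.0)]
S = [(1, 5.0, 10.0), (2, 4.0, 6.0), (2, 7.0, 9.0)]

# Naive pipeline: materialize the join, export X and y, then aggregate.
join = [(a, b, y) for (kr, a) in R for (ks, b, y) in S if kr == ks]
X = np.array([[a, b] for (a, b, _) in join])
y = np.array([yy for (_, _, yy) in join])
Sigma_naive, c_naive = X.T @ X, X.T @ y

# Structure-aware alternative: per-key sums from each relation suffice.
agg_R = defaultdict(lambda: np.zeros(3))  # per key: [count, sum a, sum a^2]
for k, a in R:
    agg_R[k] += [1.0, a, a * a]
agg_S = defaultdict(lambda: np.zeros(5))  # [count, sum b, sum b^2, sum y, sum b*y]
for k, b, yy in S:
    agg_S[k] += [1.0, b, b * b, yy, b * yy]

Sigma, c = np.zeros((2, 2)), np.zeros(2)
for k in agg_R.keys() & agg_S.keys():
    (nR, sa, saa), (nS, sb, sbb, sy, sby) = agg_R[k], agg_S[k]
    Sigma += [[saa * nS, sa * sb], [sa * sb, nR * sbb]]
    c += [sa * sy, nR * sby]

assert np.allclose(Sigma, Sigma_naive) and np.allclose(c, c_naive)
theta = np.linalg.solve(Sigma + 0.1 * np.eye(2), c)  # ridge with lambda = 0.1
```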
A composition theorem for parity kill number
In this work, we study the parity complexity measures $C_{\min}^{\oplus}[f]$
and $DT^{\oplus}[f]$. $C_{\min}^{\oplus}[f]$ is the \emph{parity kill number} of $f$, the
fewest number of parities on the input variables one has to fix in order to
"kill" $f$, i.e. to make it constant. $DT^{\oplus}[f]$ is the depth
of the shortest \emph{parity decision tree} which computes $f$. These
complexity measures have in recent years become increasingly important in the
fields of communication complexity \cite{ZS09, MO09, ZS10, TWXZ13} and
pseudorandomness \cite{BK12, Sha11, CT13}.
Our main result is a composition theorem for $C_{\min}^{\oplus}$.
The $k$-th power of $f$, denoted $f^{\circ k}$, is the function which results
from composing $f$ with itself $k$ times. We prove that if $f$ is not a parity
function, then $C_{\min}^{\oplus}[f^{\circ k}] \geq \Omega(C_{\min}[f])^{k}$. In other words, the parity kill number of $f$
is essentially supermultiplicative in the \emph{normal} kill number of $f$
(also known as the minimum certificate complexity).
As an application of our composition theorem, we show lower bounds on the
parity complexity measures of $\mathsf{Sort}^{\circ k}$ and $\mathsf{HI}^{\circ k}$. Here $\mathsf{Sort}$ is the sort function due to Ambainis \cite{Amb06},
and $\mathsf{HI}$ is Kushilevitz's hemi-icosahedron function \cite{NW95}. In
doing so, we disprove a conjecture of Montanaro and Osborne \cite{MO09} which
had applications to communication complexity and computational learning theory.
In addition, we give new lower bounds for conjectures of \cite{MO09, ZS10} and
\cite{TWXZ13}.
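For intuition on why the composition theorem must exclude parity functions: a single parity constraint kills the $n$-bit parity, even though its normal kill number is $n$. A brute-force sketch (an illustration, not a construction from the paper) that computes $C_{\min}^{\oplus}[f]$ for tiny functions by enumerating affine subspaces of $\mathbb{F}_2^n$:

```python
from itertools import combinations, product

def parity_kill_number(f, n):
    """Brute-force C_min^(+)[f]: the fewest parity constraints a.x = b over
    GF(2) whose (nonempty) joint solution set makes f constant."""
    dot = lambda a, x: bin(a & x).count("1") % 2  # GF(2) inner product of masks
    for k in range(n + 1):
        for masks in combinations(range(1, 2 ** n), k):  # which parities to fix
            for bits in product((0, 1), repeat=k):       # values to fix them to
                sub = [x for x in range(2 ** n)
                       if all(dot(a, x) == b for a, b in zip(masks, bits))]
                if sub and len({f(x) for x in sub}) == 1:
                    return k

# One constraint (x1 + x2 + x3 = 0) kills the 3-bit parity, although
# fixing plain variables would take all three:
print(parity_kill_number(lambda x: bin(x).count("1") % 2, 3))  # -> 1
# AND_3 is also killed by one constraint (fix x1 = 0):
print(parity_kill_number(lambda x: int(x == 0b111), 3))        # -> 1
```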
Learning probability distributions generated by finite-state machines
We review methods for inference of probability distributions generated by probabilistic automata and related models for sequence generation. We focus on methods that can be proved to learn in the inference-in-the-limit
and PAC formal models. The methods we review are state merging and state splitting methods for probabilistic deterministic automata and the recently developed spectral method for nondeterministic probabilistic automata. In both cases, we derive them from a high-level algorithm described in terms of the Hankel matrix of the distribution to be learned, given as an oracle, and then describe how to adapt that algorithm to account for the error introduced by a finite sample.
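The high-level Hankel algorithm is short enough to state in code: take a rank factorization of a finite Hankel block via truncated SVD and read the automaton's operators off it. The sketch below uses one standard set of spectral-learning formulas, $\alpha_0^\top = h_S^\top V$, $\alpha_\infty = (HV)^{+} h_P$, and $A_\sigma = (HV)^{+} H_\sigma V$, with a made-up rank-1 distribution playing the role of the oracle; handling the error introduced by a finite sample, the survey's other concern, is omitted.

```python
import numpy as np

# Toy oracle: a rank-1 stochastic language over {a, b} in which each step
# emits 'a' w.p. 0.3, 'b' w.p. 0.2, or stops w.p. 0.5, so
# p(x) = 0.3^{#a(x)} * 0.2^{#b(x)} * 0.5.
def p(x):
    return 0.3 ** x.count("a") * 0.2 ** x.count("b") * 0.5

prefixes = suffixes = ["", "a", "b"]
alphabet = ["a", "b"]

H = np.array([[p(u + v) for v in suffixes] for u in prefixes])
H_sig = {s: np.array([[p(u + s + v) for v in suffixes] for u in prefixes])
         for s in alphabet}
h_S, h_P = H[0, :], H[:, 0]      # empty-prefix row, empty-suffix column

r = 1                            # model rank (known here; estimated in practice)
_, _, Vt = np.linalg.svd(H)
V = Vt[:r, :].T                  # |S| x r
F_pinv = np.linalg.pinv(H @ V)   # (HV)^+ from the rank factorization H = (HV)V^T

alpha0 = h_S @ V                 # initial weights
alpha_inf = F_pinv @ h_P         # terminal weights
A = {s: F_pinv @ H_sig[s] @ V for s in alphabet}

def f_hat(x):                    # recovered automaton's value on string x
    v = alpha0
    for s in x:
        v = v @ A[s]
    return float(v @ alpha_inf)

print(f_hat("ab"), p("ab"))      # both ~ 0.03: exact recovery at exact rank
```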
A Nearly Optimal Lower Bound on the Approximate Degree of AC$^0$
The approximate degree of a Boolean function $f$ is the least degree of a real polynomial that
approximates $f$ pointwise to error at most $1/3$. We introduce a generic
method for increasing the approximate degree of a given function, while
preserving its computability by constant-depth circuits.
Specifically, we show how to transform any Boolean function $f$ on $n$ variables with
approximate degree $d$ into a function $F$ on $O(n \cdot \operatorname{polylog}(n))$ variables with approximate degree at least $D = \Omega(n^{1/3} \cdot d^{2/3})$. In particular, if $d = n^{1-\Omega(1)}$, then
$D$ is polynomially larger than $d$. Moreover, if $f$ is computed by a
polynomial-size Boolean circuit of constant depth, then so is $F$.
By recursively applying our transformation, for any constant $\delta > 0$ we
exhibit an AC$^0$ function of approximate degree $\Omega(n^{1-\delta})$. This
improves over the best previous lower bound of $\Omega(n^{2/3})$ due to
Aaronson and Shi (J. ACM 2004), and nearly matches the trivial upper bound of
$n$ that holds for any function. Our lower bounds also apply to
(quasipolynomial-size) DNFs of polylogarithmic width.
We describe several applications of these results. We give:
* For any constant $\delta > 0$, an $\Omega(n^{1-\delta})$ lower bound on the
quantum communication complexity of a function in AC$^0$.
* A Boolean function $f$ with approximate degree at least $C(f)^{2-o(1)}$,
where $C(f)$ is the certificate complexity of $f$. This separation is optimal
up to the $o(1)$ term in the exponent.
* Improved secret sharing schemes with reconstruction procedures in AC$^0$.
Comment: 40 pages, 1 figure
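For small functions, approximate degree can be computed outright: for each candidate degree $d$, minimizing the pointwise error of a degree-$d$ multilinear polynomial is a linear program. The brute-force sketch below (illustrative, not from the paper) does this with scipy; it confirms, for instance, that 2-bit OR has approximate degree 1 while 2-bit parity needs full degree 2.

```python
import numpy as np
from itertools import chain, combinations
from scipy.optimize import linprog

def approx_degree(f, n, err=1 / 3):
    """Least d such that some degree-d real polynomial approximates the
    Boolean function f: {0,1}^n -> {0,1} pointwise to error <= err."""
    points = [tuple((x >> i) & 1 for i in range(n)) for x in range(2 ** n)]
    fvals = np.array([f(x) for x in points], dtype=float)
    for d in range(n + 1):
        monos = list(chain.from_iterable(
            combinations(range(n), k) for k in range(d + 1)))
        # M[x][S] = product of the bits of x indexed by monomial S
        M = np.array([[float(np.prod([x[i] for i in S])) for S in monos]
                      for x in points])
        m = len(monos)
        cost = np.zeros(m + 1)   # variables: m coefficients, then eps
        cost[-1] = 1.0           # minimize eps
        ones = np.ones((len(points), 1))
        A_ub = np.block([[M, -ones], [-M, -ones]])  # |p(x) - f(x)| <= eps
        b_ub = np.concatenate([fvals, -fvals])
        res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(None, None)] * m + [(0, None)])
        if res.status == 0 and res.fun <= err + 1e-9:
            return d

print(approx_degree(lambda x: int(any(x)), 2))  # OR_2  -> 1
print(approx_degree(lambda x: sum(x) % 2, 2))   # XOR_2 -> 2
```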