Fast arithmetic computing with neural networks
The authors introduce a restricted model of a neuron which is more practical as a model of computation than the classical model of a neuron. The authors define a model of neural networks as a feedforward network of such neurons. Whereas any logic circuit of polynomial size (in n) that computes the product of two n-bit numbers requires unbounded delay, such computations can be done in a neural network with constant delay. The authors improve some known results by showing that the product of two n-bit numbers and the sorting of n n-bit numbers can both be computed by a polynomial-size neural network using only four unit delays, independent of n. Moreover, the weights of each threshold element in the neural networks require only O(log n)-bit (instead of n-bit) accuracy.
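As a rough illustration of the kind of computing element involved, a linear threshold gate fires when a weighted sum of its binary inputs reaches a threshold. The weights, threshold, and the majority example below are illustrative choices, not taken from the paper:

```python
# Illustrative sketch of a linear threshold element; the restricted neuron model
# discussed above is of this general kind, but the specific weights are examples only.

def threshold_gate(inputs, weights, threshold):
    """Output 1 iff the weighted sum of binary inputs meets the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

# Example: a 3-input majority gate, a classic threshold function.
majority = lambda bits: threshold_gate(bits, weights=[1, 1, 1], threshold=2)
print(majority([1, 0, 1]))  # -> 1
print(majority([0, 0, 1]))  # -> 0
```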
Boolean Operations, Joins, and the Extended Low Hierarchy
We prove that the join of two sets may actually fall into a lower level of
the extended low hierarchy than either of the sets. In particular, there exist
sets that are not in the second level of the extended low hierarchy, EL_2, yet
their join is in EL_2. That is, in terms of extended lowness, the join operator
can lower complexity. Since in a strong intuitive sense the join does not lower
complexity, our result suggests that the extended low hierarchy is unnatural as
a complexity measure. We also study the closure properties of EL_2 and prove
that EL_2 is not closed under certain Boolean operations. To this end, we
establish the first known (and optimal) EL_2 lower bounds for certain notions
generalizing Selman's P-selectivity, which may be regarded as an interesting
result in its own right.
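For orientation, the join referred to here is the standard marked union of two languages. The abstract does not restate the definition, so the convention below (tagging strings from A with a leading 0 and strings from B with a leading 1) is the usual one but should be read as an assumption:

```python
# Minimal sketch of the join (marked union) of two languages over {0,1}*,
# assuming the standard convention A join B = {0x : x in A} | {1x : x in B}.

def join(a: set, b: set) -> set:
    return {"0" + x for x in a} | {"1" + y for y in b}

A = {"", "1", "01"}
B = {"0", "11"}
print(join(A, B))  # {'0', '01', '001', '10', '111'}
```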
Subspace-Invariant AC^0 Formulas
We consider the action of a linear subspace U of {0,1}^n on the set of AC^0
formulas with inputs labeled by literals in the set {x_1, \bar{x}_1, ..., x_n, \bar{x}_n},
where an element u \in U acts on formulas by transposing the i-th pair of literals
for all i such that u_i = 1. A formula is {\em U-invariant} if it is fixed by this
action. For example, there is a well-known recursive construction of depth d+1
formulas of size n \cdot 2^{O(d n^{1/d})} computing the n-variable PARITY function;
these formulas are easily seen to be P-invariant, where P is the subspace of
even-weight elements of {0,1}^n. In this paper we establish a nearly matching
lower bound on the P-invariant depth d+1 formula size of PARITY. Quantitatively
this improves the best known lower bound for {\em unrestricted} depth d+1
formulas, while avoiding the use of the switching lemma. More generally, for any
linear subspaces U \subseteq V, we show that if a Boolean function is U-invariant
and non-constant over V, then its U-invariant depth d+1 formula size is at least
2^{d(m^{1/d}-1)}, where m is the minimum Hamming weight of a vector in
U^\perp \setminus V^\perp.
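The paper's notion is a syntactic condition on formulas. As a purely illustrative sanity check of its semantic counterpart, that PARITY as a function is unchanged when the input is shifted by any even-weight vector, consider the following sketch:

```python
# Semantic counterpart of the invariance above: PARITY(x XOR u) = PARITY(x)
# whenever u has even Hamming weight. This checks function-level invariance only;
# the paper's syntactic invariance of formulas is a stronger condition.
from itertools import product

def parity(bits):
    return sum(bits) % 2

n = 4
even_weight = [u for u in product((0, 1), repeat=n) if sum(u) % 2 == 0]
assert all(
    parity([x ^ ui for x, ui in zip(xs, u)]) == parity(xs)
    for xs in product((0, 1), repeat=n)
    for u in even_weight
)
print("PARITY is invariant under every even-weight shift on", n, "bits")
```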
Lower Bounds for (Non-Monotone) Comparator Circuits
Comparator circuits are a natural circuit model for studying the concept of bounded fan-out computations, which intuitively corresponds to whether or not a computational model can make "copies" of intermediate computational steps. Comparator circuits are believed to be weaker than general Boolean circuits, but they can simulate Branching Programs and Boolean formulas. In this paper we prove the first superlinear lower bounds in the general (non-monotone) version of this model for an explicitly defined function. More precisely, we prove that the n-bit Element Distinctness function requires Ω((n / log n)^{3/2})-size comparator circuits.
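For orientation, a comparator gate reads two wires and outputs their AND on one wire and their OR on the other (the min and max of Boolean values), and wires are never copied, which is the bounded fan-out restriction mentioned above. The tiny circuit below, a 3-wire sorting network, is illustrative and not taken from the paper:

```python
# Minimal sketch of the comparator-circuit model: each gate replaces the values
# on two wires with (x AND y, x OR y) for Boolean inputs.

def apply_comparator(wires, i, j):
    """Comparator gate on wires i and j: wire i gets the AND, wire j the OR."""
    wires[i], wires[j] = wires[i] & wires[j], wires[i] | wires[j]

# Example: three comparators sort three Boolean wires into nondecreasing order.
wires = [1, 0, 1]
for i, j in [(0, 1), (1, 2), (0, 1)]:
    apply_comparator(wires, i, j)
print(wires)  # [0, 1, 1]
```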
On the Depth of Deep Neural Networks: A Theoretical View
People believe that depth plays an important role in the success of deep neural
networks (DNNs). However, as far as we know, this belief lacks solid theoretical
justification. We investigate the role of depth from the perspective of the margin
bound. In the margin bound, the expected error is upper bounded by the empirical
margin error plus a Rademacher Average (RA) based capacity term. First, we derive
an upper bound on the RA of DNNs and show that it increases with depth. This
indicates a negative impact of depth on test performance. Second, we show that
deeper networks tend to have larger representation power (measured by a Betti
numbers based complexity) than shallower networks in the multi-class setting,
and thus can lead to smaller empirical margin error. This implies a positive
impact of depth. The combination of these two results shows that, for DNNs with a
restricted number of hidden units, increasing depth is not always good, since
there is a tradeoff between its positive and negative impacts. These results
inspire us to seek alternative ways to achieve the positive impact of depth,
e.g., imposing margin-based penalty terms on the cross-entropy loss so as to
reduce the empirical margin error without increasing depth. Our experiments show
that in this way we achieve significantly better test performance.
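As a sketch of the kind of modification suggested in the last sentence, here is a minimal NumPy version of a cross-entropy loss with an added hinge-style multiclass margin penalty. The penalty form and the parameters gamma and lam are illustrative assumptions; the abstract does not specify the exact penalty used:

```python
import numpy as np

def margin_penalized_loss(logits, labels, gamma=1.0, lam=0.1):
    """Cross-entropy plus a hinge-style penalty on examples whose multiclass
    margin (correct logit minus best wrong logit) falls below gamma.
    Illustrative only; gamma, lam, and the penalty form are assumptions."""
    n = logits.shape[0]
    # softmax cross-entropy
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(n), labels].mean()
    # multiclass margin per example
    correct = logits[np.arange(n), labels]
    wrong = logits.copy()
    wrong[np.arange(n), labels] = -np.inf
    margin = correct - wrong.max(axis=1)
    penalty = np.maximum(0.0, gamma - margin).mean()
    return ce + lam * penalty

logits = np.array([[2.0, 0.5, -1.0], [0.2, 0.1, 0.3]])
labels = np.array([0, 2])
print(margin_penalized_loss(logits, labels))
```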
Learning pseudo-Boolean k-DNF and Submodular Functions
We prove that any submodular function f: {0,1}^n -> {0,1,...,k} can be
represented as a pseudo-Boolean 2k-DNF formula. Pseudo-Boolean DNFs are a
natural generalization of DNF representation for functions with integer range.
Each term in such a formula has an associated integral constant. We show that
an analog of Håstad's switching lemma holds for pseudo-Boolean k-DNFs if all
constants associated with the terms of the formula are bounded.
This allows us to generalize Mansour's PAC-learning algorithm for k-DNFs to
pseudo-Boolean k-DNFs, and hence gives a PAC-learning algorithm with membership
queries under the uniform distribution for submodular functions of the form
f:{0,1}^n -> {0,1,...,k}. Our algorithm runs in time polynomial in n, k^{O(k
\log k / \epsilon)}, 1/\epsilon and log(1/\delta) and works even in the
agnostic setting. The line of previous work on learning submodular functions
[Balcan, Harvey (STOC '11), Gupta, Hardt, Roth, Ullman (STOC '11), Cheraghchi,
Klivans, Kothari, Lee (SODA '12)] implies only n^{O(k)} query complexity for
learning submodular functions in this setting, for fixed epsilon and delta.
Our learning algorithm implies a property tester for submodularity of
functions f:{0,1}^n -> {0, ..., k} with query complexity polynomial in n for
k = O((\log n / \log\log n)^{1/2}) and constant proximity parameter \epsilon.
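For concreteness, a pseudo-Boolean DNF can be evaluated as in the sketch below, under the usual convention that each term contributes its integer constant when its literals are satisfied and the formula takes the maximum contribution (0 if no term is satisfied). The abstract does not spell this convention out, so treat it as an assumption:

```python
# Minimal sketch of evaluating a pseudo-Boolean DNF: each term is a set of
# positive literals, a set of negative literals, and an integer constant;
# the value on x is assumed to be the maximum constant over satisfied terms.

def eval_pb_dnf(terms, x):
    """terms: list of (positives, negatives, constant); x: dict var -> 0/1."""
    best = 0
    for pos, neg, c in terms:
        if all(x[v] == 1 for v in pos) and all(x[v] == 0 for v in neg):
            best = max(best, c)
    return best

# f(x) = max(2 * [x1 and x2], 3 * [not x3])  -- a pseudo-Boolean 2-DNF
terms = [({"x1", "x2"}, set(), 2), (set(), {"x3"}, 3)]
print(eval_pb_dnf(terms, {"x1": 1, "x2": 1, "x3": 1}))  # -> 2
print(eval_pb_dnf(terms, {"x1": 0, "x2": 1, "x3": 0}))  # -> 3
```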