A Global Approach for Solving Edge-Matching Puzzles
We consider apictorial edge-matching puzzles, in which the goal is to arrange
a collection of puzzle pieces with colored edges so that the colors match along
the edges of adjacent pieces. We devise an algebraic representation for this
problem and provide conditions under which it exactly characterizes a puzzle.
Using the new representation, we recast the combinatorial, discrete problem of
solving puzzles as a global, polynomial system of equations with continuous
variables. We further propose new algorithms for generating approximate
solutions to the continuous problem by solving a sequence of convex
relaxations.
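To make the encoding idea concrete, here is a toy sketch of a 1x2 edge-matching instance written as a polynomial system. The pieces, colors, and one-variable parameterization are invented for illustration and are not the authors' actual representation: binarity of the assignment becomes the polynomial equation x(x-1) = 0, and a color match along the shared edge becomes the vanishing of a polynomial difference.

```python
# Toy sketch (hypothetical pieces/colors, not the paper's formulation).
# Pieces: A = (left edge 0, right edge 1), B = (left edge 1, right edge 2).
# x = 1 places A in the left slot and B in the right slot; x = 0 swaps them.
import sympy as sp

x = sp.symbols('x')

# Discreteness as a polynomial: x(x - 1) = 0 forces x in {0, 1}.
binary = x * (x - 1)

# Colors meeting at the shared boundary, interpolated over the two assignments:
left_color = 1 * x + 2 * (1 - x)   # right edge of whichever piece sits in slot 1
right_color = 1 * x + 0 * (1 - x)  # left edge of whichever piece sits in slot 2
match = left_color - right_color   # must vanish for the colors to agree

print(sp.solve([binary, match], x, dict=True))  # [{x: 1}] -> A left, B right
```

On real puzzles such systems are far too large for symbolic solving, which is why the approach above instead generates approximate solutions through a sequence of convex relaxations.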
Empirical approximation of the gaussian distribution in $\mathbb{R}^d$
Let $X_1, \dots, X_N$ be independent copies of the standard gaussian random vector in $\mathbb{R}^d$. We show that there is an absolute constant $c$ such that for any $T \subset S^{d-1}$, with probability at least $1 - 2\exp(-cN\Delta)$, for every $x \in T$ and every $t \in \mathbb{R}$,
$$\left| \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}_{\{\langle X_i, x \rangle \le t\}} - \Phi(t) \right| \le \Delta + \sigma(t)\sqrt{\Delta}.$$
Here $\sigma^2(t)$ is the variance of $\mathbb{1}_{\{\langle X, x \rangle \le t\}}$ and $\Delta \ge \Delta_0$, where $\Delta_0$ is determined by an unexpected complexity parameter of $T$ that captures the set's geometry (Talagrand's $\gamma$ functional). The bound, the probability estimate, and the value of $\Delta_0$ are all (almost) optimal.
We use this fact to show that if $\Gamma$ is the random matrix that has $X_1, \dots, X_N$ as its rows, then the structure of $\Gamma T = \{\Gamma x : x \in T\}$ is far more rigid and well-prescribed than was previously expected.
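A quick numerical illustration of the statement, using the notation above (the dimension, sample size, and evaluation grid are arbitrary choices for the demo): for a fixed direction $x$ on the sphere, the marginals $\langle X_i, x \rangle$ are i.i.d. standard gaussians, and their empirical CDF tracks $\Phi$ uniformly in $t$.

```python
# Minimal demo: uniform closeness of the empirical CDF of <X_i, x> to Phi.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d, N = 50, 10_000
X = rng.standard_normal((N, d))   # N independent standard gaussian vectors in R^d
x = rng.standard_normal(d)
x /= np.linalg.norm(x)            # a fixed direction on the unit sphere S^{d-1}

marginals = X @ x                 # <X_i, x> are i.i.d. N(0, 1)
ts = np.linspace(-3, 3, 61)
emp_cdf = (marginals[:, None] <= ts).mean(axis=0)
print(f"max |empirical CDF - Phi|: {np.abs(emp_cdf - norm.cdf(ts)).max():.4f}")
```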
A Deep Hierarchical Approach to Lifelong Learning in Minecraft
We propose a lifelong learning system that has the ability to reuse and
transfer knowledge from one task to another while efficiently retaining the
previously learned knowledge base. Knowledge is transferred by learning
reusable skills to solve tasks in Minecraft, a popular video game that poses an
unsolved, high-dimensional lifelong learning problem. These reusable skills,
which we refer to as Deep Skill Networks, are then incorporated into our novel
Hierarchical Deep Reinforcement Learning Network (H-DRLN) architecture using
two techniques: (1) a deep skill array and (2) skill distillation, our novel
variation of policy distillation (Rusu et al., 2015) for learning skills. Skill
distillation enables the H-DRLN to efficiently retain knowledge and therefore
scale in lifelong learning, by accumulating knowledge and encapsulating
multiple reusable skills into a single distilled network. The H-DRLN exhibits
superior performance and lower learning sample complexity compared to the
regular Deep Q-Network (Mnih et al., 2015) in sub-domains of Minecraft.
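For context, the following is a minimal PyTorch sketch of a policy-distillation-style loss of the kind skill distillation builds on. The temperature value and tensor shapes are illustrative assumptions, and this is a generic Rusu-et-al.-style objective rather than the paper's training code.

```python
# Sketch of a distillation loss: the student matches the teacher's softened
# action distribution (temperature and shapes are illustrative assumptions).
import torch
import torch.nn.functional as F

def distillation_loss(student_q: torch.Tensor,
                      teacher_q: torch.Tensor,
                      temperature: float = 0.01) -> torch.Tensor:
    """KL divergence from the softened teacher distribution to the student's."""
    teacher_probs = F.softmax(teacher_q / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_q / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Example: a batch of 32 states with 6 discrete actions.
student_q, teacher_q = torch.randn(32, 6), torch.randn(32, 6)
print(distillation_loss(student_q, teacher_q).item())
```

A low temperature sharpens the teacher's targets, the regime policy distillation uses for Q-value outputs; training one student this way against several skill teachers is what lets a single distilled network encapsulate multiple skills.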
DropCompute: simple and more robust distributed synchronous training via compute variance reduction
Background: Distributed training is essential for large-scale training of
deep neural networks (DNNs). The dominant methods for large-scale DNN training
are synchronous (e.g., All-Reduce), but these require waiting for all workers in
each step. Thus, these methods are limited by the delays caused by straggling
workers. Results: We study a typical scenario in which workers are straggling
due to variability in compute time. We find an analytical relation between
compute time properties and scalability limitations, caused by such straggling
workers. With these findings, we propose a simple yet effective decentralized
method to reduce the variation among workers and thus improve the robustness of
synchronous training. This method can be integrated with the widely used
All-Reduce. Our findings are validated on large-scale training tasks using 200
Gaudi accelerators. Code: https://github.com/paper-submissions/dropcomput
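A self-contained simulation of the underlying idea follows; the budget rule, the log-normal compute times, and all parameter names are our illustrative assumptions rather than the paper's exact method. Each worker stops launching micro-batches once a per-step compute budget is exhausted and joins the synchronous reduction with what it has, so the step is no longer gated by the slowest worker.

```python
# Simulated synchronous step with a compute-time budget (all numbers illustrative).
import random

def worker_step(micro_batch_times, budget):
    """Run micro-batches until the budget would be exceeded; drop the rest."""
    elapsed, done = 0.0, 0
    for t in micro_batch_times:
        if elapsed + t > budget:
            break                 # drop remaining micro-batches, join the reduce
        elapsed += t
        done += 1
    return done, elapsed

random.seed(0)
n_workers, n_micro = 16, 8
workers = [[random.lognormvariate(0.0, 0.5) for _ in range(n_micro)]
           for _ in range(n_workers)]

baseline = max(sum(times) for times in workers)    # slowest worker gates the step
budget = 1.2 * sum(map(sum, workers)) / n_workers  # ~1.2x the mean step time
results = [worker_step(times, budget) for times in workers]
print(f"baseline step: {baseline:.2f}, with drop: "
      f"{max(e for _, e in results):.2f}, micro-batches kept: "
      f"{sum(d for d, _ in results)}/{n_workers * n_micro}")
```

Because every worker still participates in the same All-Reduce, the step stays synchronous; a small, bounded amount of dropped compute is traded for a much tighter step-time distribution.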
MTJ-Based Hardware Synapse Design for Quantized Deep Neural Networks
Quantized neural networks (QNNs) are being actively researched as a solution
for the computational complexity and memory intensity of deep neural networks.
This has sparked efforts to develop algorithms that support both inference and
training with quantized weight and activation values without sacrificing
accuracy. A recent example is the GXNOR framework for stochastic training of
ternary and binary neural networks. In this paper, we introduce a novel
hardware synapse circuit that uses magnetic tunnel junction (MTJ) devices to
support GXNOR training. Our solution enables processing near memory (PNM) of
QNNs and can therefore further reduce data movement to and from memory. We
simulated MTJ-based stochastic training of a ternary neural network (TNN) on
the MNIST and SVHN datasets and achieved accuracies of 98.61% and 93.99%,
respectively.
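To give a flavor of what the synapse must implement, here is a simplified numpy sketch of a stochastic ternary weight update in the spirit of the GXNOR framework. The transition rule and learning rate are our assumptions, and in the paper's design the randomness comes from stochastic MTJ switching rather than a software RNG.

```python
# Simplified stochastic ternary update (illustrative rule, not the paper's circuit).
import numpy as np

rng = np.random.default_rng(0)

def ternary_update(weights, grads, lr=0.1):
    """Move ternary weights {-1, 0, +1} one state against the gradient,
    with probability proportional to the scaled gradient magnitude."""
    prob = np.clip(np.abs(lr * grads), 0.0, 1.0)   # transition probability
    jump = rng.random(weights.shape) < prob        # stochastic switching event
    step = -np.sign(grads) * jump                  # descend where a jump fires
    return np.clip(weights + step, -1, 1)          # stay within {-1, 0, +1}

w = rng.integers(-1, 2, size=10)   # random ternary weights
g = rng.standard_normal(10)        # example gradients
print(w, ternary_update(w, g), sep="\n")
```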
The Drosophila Gene CheB42a Is a Novel Modifier of Deg/ENaC Channel Function
Degenerin/epithelial Na+ channels (DEG/ENaC) represent a diverse family of voltage-insensitive cation channels whose functions include Na+ transport across epithelia, mechanosensation, nociception, salt sensing, modification of neurotransmission, and detection of the neurotransmitter FMRFamide. We previously showed that the Drosophila melanogaster DEG/ENaC gene lounge lizard (llz) is co-transcribed in an operon-like locus with another gene of unknown function, CheB42a. Because operons often encode proteins in the same biochemical or physiological pathway, we hypothesized that CHEB42A and LLZ might function together. Consistent with this hypothesis, we found both genes expressed in cells previously implicated in sensory functions during male courtship. Furthermore, when coexpressed, LLZ coprecipitated with CHEB42A, suggesting that the two proteins form a complex. Although LLZ expressed either alone or with CHEB42A did not generate ion channel currents, CHEB42A increased the current amplitude of another DEG/ENaC protein whose ligand (protons) is known, acid-sensing ion channel 1a (ASIC1a). We also found that CHEB42A was cleaved to generate a secreted protein, suggesting that CHEB42A may play an important role in the extracellular space. These data suggest that CHEB42A is a modulatory subunit for sensory-related DEG/ENaC signaling. These results are consistent with operon-like transcription of CheB42a and llz and explain the similar contributions of these genes to courtship behavior.