32,833 research outputs found
Partial information decomposition as a unified approach to the specification of neural goal functions
In many neural systems anatomical motifs are present repeatedly, but despite their structural similarity they can serve very different tasks. A prime example for such a motif is the canonical microcircuit of six-layered neo-cortex, which is repeated across cortical areas, and is involved in a number of different tasks (e.g. sensory, cognitive, or motor tasks). This observation has spawned interest in finding a common underlying principle, a âgoal functionâ, of information processing implemented in this structure. By definition such a goal function, if universal, cannot be cast in processing-domain specific language (e.g. âedge filteringâ, âworking memoryâ). Thus, to formulate such a principle, we have to use a domain-independent framework. Information theory offers such a framework. However, while the classical framework of information theory focuses on the relation between one input and one output (Shannonâs mutual information), we argue that neural information processing crucially depends on the combination of multiple inputs to create the output of a processor. To account for this, we use a very recent extension of Shannon Information theory, called partial information decomposition (PID). PID allows to quantify the information that several inputs provide individually (unique information), redundantly (shared information) or only jointly (synergistic information) about the output. First, we review the framework of PID. Then we apply it to reevaluate and analyze several earlier proposals of information theoretic neural goal functions (predictive coding, infomax and coherent infomax, efficient coding). We find that PID allows to compare these goal functions in a common framework, and also provides a versatile approach to design new goal functions from first principles. Building on this, we design and analyze a novel goal function, called âcoding with synergyâ, which builds on combining external input and prior knowledge in a synergistic manner. We suggest that this novel goal function may be highly useful in neural information processing
XNOR Neural Engine: a Hardware Accelerator IP for 21.6 fJ/op Binary Neural Network Inference
Binary Neural Networks (BNNs) are promising to deliver accuracy comparable to
conventional deep neural networks at a fraction of the cost in terms of memory
and energy. In this paper, we introduce the XNOR Neural Engine (XNE), a fully
digital configurable hardware accelerator IP for BNNs, integrated within a
microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid
SRAM / standard cell memory. The XNE is able to fully compute convolutional and
dense layers in autonomy or in cooperation with the core in the MCU to realize
more complex behaviors. We show post-synthesis results in 65nm and 22nm
technology for the XNE IP and post-layout results in 22nm for the full MCU
indicating that this system can drop the energy cost per binary operation to
21.6fJ per operation at 0.4V, and at the same time is flexible and performant
enough to execute state-of-the-art BNN topologies such as ResNet-34 in less
than 2.2mJ per frame at 8.9 fps.Comment: 11 pages, 8 figures, 2 tables, 3 listings. Accepted for presentation
at CODES'18 and for publication in IEEE Transactions on Computer-Aided Design
of Circuits and Systems (TCAD) as part of the ESWEEK-TCAD special issu
Neural Simulations on Multi-Core Architectures
Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing
Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL
Recent technological advances have proliferated the available computing
power, memory, and speed of modern Central Processing Units (CPUs), Graphics
Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs).
Consequently, the performance and complexity of Artificial Neural Networks
(ANNs) is burgeoning. While GPU accelerated Deep Neural Networks (DNNs)
currently offer state-of-the-art performance, they consume large amounts of
power. Training such networks on CPUs is inefficient, as data throughput and
parallel computation is limited. FPGAs are considered a suitable candidate for
performance critical, low power systems, e.g. the Internet of Things (IOT) edge
devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development
environment, networks described using the high-level OpenCL framework can be
accelerated on heterogeneous platforms. Moreover, the resource utilization and
power consumption of DNNs can be further enhanced by utilizing regularization
techniques that binarize network weights. In this paper, we introduce, to the
best of our knowledge, the first FPGA-accelerated stochastically binarized DNN
implementations, and compare them to implementations accelerated using both
GPUs and FPGAs. Our developed networks are trained and benchmarked using the
popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art
performance, while offering a >16-fold improvement in power consumption,
compared to conventional GPU-accelerated networks. Both our FPGA-accelerated
determinsitic and stochastic BNNs reduce inference times on MNIST and CIFAR-10
by >9.89x and >9.91x, respectively.Comment: 4 pages, 3 figures, 1 tabl
One-Shot Relational Learning for Knowledge Graphs
Knowledge graphs (KGs) are the key components of various natural language
processing applications. To further expand KGs' coverage, previous studies on
knowledge graph completion usually require a large number of training instances
for each relation. However, we observe that long-tail relations are actually
more common in KGs and those newly added relations often do not have many known
triples for training. In this work, we aim at predicting new facts under a
challenging setting where only one training instance is available. We propose a
one-shot relational learning framework, which utilizes the knowledge extracted
by embedding models and learns a matching metric by considering both the
learned embeddings and one-hop graph structures. Empirically, our model yields
considerable performance improvements over existing embedding models, and also
eliminates the need of re-training the embedding models when dealing with newly
added relations.Comment: EMNLP 201
- âŠ