An Analysis of the Connections Between Layers of Deep Neural Networks
We present an analysis of different techniques for selecting the connections between layers of deep neural networks. Traditional deep neural networks use random connection tables between layers to keep the number of connections small and to tune to different image features. This kind of connection performs adequately in supervised deep networks because its values are refined during training. In unsupervised learning, on the other hand, one cannot rely on back-propagation techniques to learn the connections between layers. In this work, we tested four different techniques for connecting the first layer of the network to the second layer on the CIFAR and SVHN datasets and showed that accuracy can be improved by up to 3% depending on the technique used. We also showed that learning the connections based on the co-occurrences of the features does not confer an advantage over a random connection table in small networks. This work helps improve the efficiency of the connections between the layers of unsupervised deep neural networks.
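As a hedged illustration of what a "connection table" is (not the paper's code): each second-layer feature map is wired to a small subset of first-layer maps, chosen either at random or from feature co-occurrence statistics. The function names and the `fan_in` parameter below are hypothetical, and co-occurrence is approximated here by activation correlation.

```python
import numpy as np

def random_connection_table(n_in, n_out, fan_in, seed=None):
    """Connect each second-layer map to `fan_in` randomly chosen
    first-layer maps (the traditional approach)."""
    rng = np.random.default_rng(seed)
    return np.stack([rng.choice(n_in, size=fan_in, replace=False)
                     for _ in range(n_out)])

def cooccurrence_connection_table(activations, n_out, fan_in):
    """Connect each second-layer map to a seed first-layer map plus the
    maps whose activations co-occur most strongly with it.

    activations: array of shape (n_samples, n_in), e.g. the mean
    activation of each first-layer map per image."""
    co = np.corrcoef(activations.T)   # (n_in, n_in) co-occurrence proxy
    np.fill_diagonal(co, -np.inf)     # ignore self-pairs
    n_in = co.shape[0]
    seeds = np.arange(n_out) % n_in   # one seed map per output map
    table = [np.concatenate(([s], np.argsort(co[s])[::-1][:fan_in - 1]))
             for s in seeds]
    return np.stack(table)
```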
Temporal Data Analysis Using Reservoir Computing and Dynamic Memristors
Temporal data analysis, including classification and forecasting, is essential in a range of fields from finance to engineering. While samples of static data are largely independent of each other, temporal data carry considerable correlation between samples, and exploiting this correlation is central to temporal data analysis. Neural networks offer a general and flexible approach to capturing it, since they do not depend on the parameters of specific tasks but are driven only by the data. In particular, recurrent neural networks have gathered much attention, since the temporal information captured by the recurrent connections improves prediction performance. Recently, reservoir computing (RC), which evolved from recurrent neural networks, has been studied extensively for temporal data analysis, as it offers the efficient temporal processing of recurrent neural networks at a low training cost.
This dissertation presents a hardware implementation of the RC system using an emerging device, the memristor, followed by a theoretical study of hierarchical architectures for the RC system.
An RC hardware system based on dynamic tungsten oxide (WOx) memristors is first demonstrated. The internal short-term memory effects of the WOx memristors allow the memristor-based reservoir to nonlinearly map temporal inputs into reservoir states, where the projected features can be readily processed by a simple linear readout function. We use the system to experimentally demonstrate two standard benchmark tasks: isolated spoken digit recognition with partial inputs, and chaotic system forecasting. A high classification accuracy of 99.2% is obtained for spoken digit recognition, and autonomous forecasting of a chaotic time series is demonstrated over the long term.
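For readers unfamiliar with RC, the following is a minimal software sketch of the principle the memristor reservoir implements physically: a fixed, randomly connected recurrent network nonlinearly maps a temporal input into high-dimensional states, and only a linear readout is trained (here by ridge regression). This is an illustrative analogue, not the hardware system described above; all sizes and parameters are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 1 input channel, 100 reservoir nodes.
n_in, n_res = 1, 100
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.normal(0.0, 1.0, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1: fading (short-term) memory

def run_reservoir(u, leak=0.3):
    """Nonlinearly map a temporal input u of shape (T, n_in) to states (T, n_res)."""
    x, states = np.zeros(n_res), []
    for u_t in u:
        x = (1 - leak) * x + leak * np.tanh(W_in @ u_t + W @ x)
        states.append(x)
    return np.array(states)

def train_readout(states, targets, ridge=1e-6):
    """Fit the linear readout with ridge regression (the only trained part)."""
    X = states
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ targets)

# Toy usage: one-step-ahead forecasting of a sine wave.
u = np.sin(np.linspace(0, 20 * np.pi, 2000))[:, None]
S = run_reservoir(u[:-1])
W_out = train_readout(S, u[1:])
prediction = S @ W_out  # estimates of u at the next time step
```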
We then investigate the influence of the hierarchical reservoir structure on the properties of the reservoir and the performance of the RC system. Analogous to deep neural networks, stacking sub-reservoirs in series is an efficient way to enhance the nonlinearity of data transformation to high-dimensional space and expand the diversity of temporal information captured by the reservoir. These deep reservoir systems offer better performance when compared to simply increasing the size of the reservoir or the number of sub-reservoirs. Low-frequency components are mainly captured by the sub-reservoirs in the later stages of the deep reservoir structure, similar to observations that more abstract information can be extracted by layers in the late stage of deep neural networks. When the total size of the reservoir is fixed, the tradeoff between the number of sub-reservoirs and the size of each sub-reservoir needs to be carefully considered, due to the degraded ability of the individual sub-reservoirs at small sizes. Improved performance of the deep reservoir structure alleviates the difficulty of implementing the RC system on hardware systems.
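A hedged sketch of the series-stacked ("deep") reservoir idea, reusing NumPy and the same update rule as the sketch above: each sub-reservoir is driven by the states of the previous one, and the readout sees the concatenated states of all stages, so later stages can capture slower, more abstract components. All names and sizes are hypothetical.

```python
import numpy as np

def make_reservoir(n_in, n_res, rho=0.9, leak=0.3, seed=0):
    """Build one sub-reservoir and return its state-update function."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.normal(0.0, 1.0, (n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))
    def run(u):
        x, states = np.zeros(n_res), []
        for u_t in u:
            x = (1 - leak) * x + leak * np.tanh(W_in @ u_t + W @ x)
            states.append(x)
        return np.array(states)
    return run

def deep_reservoir(u, sizes=(50, 50)):
    """Stack sub-reservoirs in series: stage i is driven by stage i-1's states.
    With a fixed total size, more stages mean smaller (weaker) sub-reservoirs,
    which is the tradeoff noted above."""
    all_states, signal = [], u
    for i, n in enumerate(sizes):
        run = make_reservoir(signal.shape[1], n, seed=i)
        signal = run(signal)
        all_states.append(signal)
    return np.concatenate(all_states, axis=1)  # features for the linear readout
```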
Beyond temporal data classification and prediction, one interesting application of temporal data analysis is inferring neural connectivity patterns from high-dimensional neural activity recordings. By computing the temporal correlation between neural spikes, connections between neurons can be inferred using statistics-based techniques, but this becomes increasingly computationally expensive for large-scale neural systems. We propose a second-order memristor-based hardware system that uses the natively implemented spike-timing-dependent plasticity (STDP) learning rule for neural connectivity inference. By incorporating biological features such as transmission delay into the neural networks, the proposed concept not only correctly infers direct connections but also distinguishes direct connections from indirect ones. Effects of additional biophysical properties not considered in the simulation and challenges of experimental memristor implementation are also discussed.
PhD dissertation, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/167995/1/moonjohn_1.pd
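The connectivity-inference idea in the dissertation's final contribution can be illustrated with a software caricature of what the proposed hardware would compute natively: an STDP-like kernel that potentiates a candidate connection i→j when neuron i tends to fire shortly before neuron j, and depresses it for the reverse ordering. This is a hedged sketch with hypothetical parameters, not the second-order memristor implementation.

```python
import numpy as np

def infer_connectivity(spikes, a_plus=0.01, a_minus=0.012, tau=20.0, window=50):
    """Infer pairwise connectivity from spike trains with an STDP-like rule.

    spikes: (n_neurons, T) binary spike raster.
    w[i, j] accumulates evidence for a connection from neuron i to neuron j:
    pre-before-post spike pairs potentiate, post-before-pre pairs depress."""
    n, _ = spikes.shape
    w = np.zeros((n, n))
    times = [np.flatnonzero(s) for s in spikes]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for t_pre in times[i]:
                for t_post in times[j]:
                    dt = t_post - t_pre
                    if 0 < dt <= window:
                        w[i, j] += a_plus * np.exp(-dt / tau)   # potentiation
                    elif -window <= dt < 0:
                        w[i, j] -= a_minus * np.exp(dt / tau)   # depression
    # Large positive entries suggest direct connections.
    return w
```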
Efficient Deep Feature Learning and Extraction via StochasticNets
Deep neural networks are a powerful tool for feature learning and extraction
given their ability to model high-level abstractions in highly complex data.
One area worth exploring in feature learning and extraction using deep neural
networks is efficient neural connectivity formation for faster feature learning
and extraction. Motivated by findings of stochastic synaptic connectivity
formation in the brain as well as the brain's uncanny ability to efficiently
represent information, we propose the efficient learning and extraction of
features via StochasticNets, where sparsely-connected deep neural networks can
be formed via stochastic connectivity between neurons. To evaluate the
feasibility of such a deep neural network architecture for feature learning and
extraction, we train deep convolutional StochasticNets to learn abstract
features using the CIFAR-10 dataset, and extract the learned features from
images to perform classification on the SVHN and STL-10 datasets. Experimental
results show that features learned using deep convolutional StochasticNets,
with fewer neural connections than conventional deep convolutional neural
networks, can provide classification accuracy better than or comparable to
conventional deep neural networks: a relative test error decrease of ~4.5% for
classification on the STL-10 dataset and ~1% for classification on the SVHN
dataset. Furthermore, it was shown that the deep features extracted using deep
convolutional StochasticNets can provide comparable classification accuracy
even when only 10% of the training data is used for feature learning. Finally,
it was also shown that significant gains in feature extraction speed can be
achieved in embedded applications using StochasticNets. As such, StochasticNets allow for faster feature learning and extraction while providing better or comparable accuracy.
Comment: 10 pages. arXiv admin note: substantial text overlap with arXiv:1508.0546
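As a rough sketch of the core StochasticNet idea (not the authors' implementation): connectivity is sampled from a Bernoulli distribution once, at network formation time, and stays fixed thereafter, unlike dropout, which resamples every step. The class name and the `p_connect` parameter below are hypothetical.

```python
import numpy as np

class StochasticDense:
    """Dense layer whose sparse connectivity is sampled once and then fixed."""
    def __init__(self, n_in, n_out, p_connect=0.5, seed=None):
        rng = np.random.default_rng(seed)
        # Each potential connection exists with probability p_connect.
        self.mask = rng.random((n_in, n_out)) < p_connect
        self.W = rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)) * self.mask
        self.b = np.zeros(n_out)

    def forward(self, x):
        # Only the stochastically formed connections contribute (ReLU activation).
        return np.maximum(x @ self.W + self.b, 0.0)

    def sgd_step(self, dW, db, lr=0.01):
        # Masking the gradient keeps absent connections absent.
        self.W -= lr * (dW * self.mask)
        self.b -= lr * db
```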