
    An Analysis of the Connections Between Layers of Deep Neural Networks

    We present an analysis of different techniques for selecting the connections between layers of deep neural networks. Traditional deep neural networks use random connection tables between layers to keep the number of connections small and to tune to different image features. This kind of connection performs adequately in supervised deep networks because the connection values are refined during training. In unsupervised learning, on the other hand, one cannot rely on back-propagation to learn the connections between layers. In this work, we tested four different techniques for connecting the first layer of the network to the second layer on the CIFAR and SVHN datasets and showed that accuracy can be improved by up to 3% depending on the technique used. We also showed that learning the connections based on the co-occurrences of the features does not confer an advantage over a random connection table in small networks. This work helps improve the efficiency of connections between the layers of unsupervised deep neural networks.
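    The paper does not include code; below is a minimal NumPy sketch of the two strategies the abstract contrasts: a random connection table, and one built from feature co-occurrence (approximated here by correlation of per-image map activations). Function and parameter names such as `fan_in` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def random_connection_table(n_in, n_out, fan_in, seed=None):
    """Connect each output map to `fan_in` randomly chosen input maps.

    Returns a binary (n_out, n_in) matrix: table[j, i] == 1 means
    output map j reads from input map i.
    """
    rng = np.random.default_rng(seed)
    table = np.zeros((n_out, n_in), dtype=np.uint8)
    for j in range(n_out):
        table[j, rng.choice(n_in, size=fan_in, replace=False)] = 1
    return table

def cooccurrence_connection_table(activations, n_out, fan_in):
    """Group input maps whose activations co-occur most strongly.

    `activations` has shape (n_samples, n_in), e.g. the mean response of
    each first-layer map per image. Output map j is seeded with input map
    j (mod n_in) and connected to the fan_in - 1 maps whose activations
    correlate most with the seed's.
    """
    corr = np.corrcoef(activations, rowvar=False)   # (n_in, n_in)
    np.fill_diagonal(corr, -np.inf)                 # exclude self-pairing
    n_in = corr.shape[0]
    table = np.zeros((n_out, n_in), dtype=np.uint8)
    for j in range(n_out):
        seed = j % n_in
        partners = np.argsort(corr[seed])[::-1][:fan_in - 1]
        table[j, seed] = 1
        table[j, partners] = 1
    return table

# Example: 32 first-layer maps feeding 64 second-layer maps, 4 inputs each.
table = random_connection_table(n_in=32, n_out=64, fan_in=4, seed=0)
print(table.sum(axis=1))  # every row sums to 4
```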

    Temporal Data Analysis Using Reservoir Computing and Dynamic Memristors

    Temporal data analysis, including classification and forecasting, is essential in fields ranging from finance to engineering. While static data samples are largely independent of each other, temporal data exhibit considerable correlation between samples, which is important for temporal data analysis. Conventional statistical models must be tailored to the parameters of each specific task; neural networks offer a more general and flexible approach since they are driven only by the data. In particular, recurrent neural networks have attracted much attention, since the temporal information captured by recurrent connections improves prediction performance. Recently, reservoir computing (RC), which evolved from recurrent neural networks, has been extensively studied for temporal data analysis, as it offers the efficient temporal processing of recurrent neural networks at a low training cost. This dissertation presents a hardware implementation of an RC system using an emerging device, the memristor, followed by a theoretical study of hierarchical RC architectures.

    An RC hardware system based on dynamic tungsten oxide (WOx) memristors is first demonstrated. The internal short-term memory of the WOx memristors allows the memristor-based reservoir to nonlinearly map temporal inputs into reservoir states, where the projected features can be readily processed by a simple linear readout function. We use the system to experimentally demonstrate two standard benchmark tasks: isolated spoken-digit recognition with partial inputs and chaotic-system forecasting. A high classification accuracy of 99.2% is obtained for spoken-digit recognition, and autonomous long-term forecasting of a chaotic time series is demonstrated.

    We then investigate the influence of hierarchical reservoir structure on the properties of the reservoir and the performance of the RC system. Analogous to deep neural networks, stacking sub-reservoirs in series is an efficient way to enhance the nonlinearity of the transformation into high-dimensional space and to expand the diversity of temporal information captured by the reservoir. These deep reservoir systems offer better performance than simply increasing the size of the reservoir or the number of sub-reservoirs. Low-frequency components are mainly captured by sub-reservoirs in the later stages of the deep reservoir structure, similar to the observation that more abstract information is extracted by the later layers of deep neural networks. When the total size of the reservoir is fixed, the tradeoff between the number of sub-reservoirs and the size of each sub-reservoir must be considered carefully, because the ability of individual sub-reservoirs degrades at small sizes. The improved performance of the deep reservoir structure eases the implementation of RC systems in hardware.

    Beyond temporal data classification and prediction, one interesting application of temporal data analysis is inferring neural connectivity patterns from high-dimensional neural activity recordings. By computing the temporal correlation between neural spikes, connections between neurons can be inferred using statistics-based techniques, but this becomes increasingly computationally expensive for large-scale neural systems. We propose a second-order memristor-based hardware system that uses a natively implemented spike-timing-dependent plasticity (STDP) learning rule for neural connectivity inference. By incorporating biological features such as transmission delay into the neural networks, the proposed concept not only correctly infers direct connections but also distinguishes direct connections from indirect ones. Effects of additional biophysical properties not considered in the simulation and the challenges of an experimental memristor implementation are also discussed.

    PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/167995/1/moonjohn_1.pd
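    The dissertation's reservoir is physical (WOx memristors), but the surrounding pipeline, a fixed nonlinear mapping into reservoir states, sub-reservoirs stacked in series, and a linear readout as the only trained component, can be illustrated in software. The echo-state-style sketch below is an analogue under those assumptions, not the hardware system; the class name `DeepESN` and hyperparameters such as `spectral_radius` and `leak` are illustrative.

```python
import numpy as np

class DeepESN:
    """Minimal echo-state reservoir with sub-reservoirs stacked in series.

    Each stage nonlinearly maps its input into a high-dimensional state;
    later stages tend to capture slower features. A single linear readout,
    fit in closed form, is the only trained component, which is the low
    training cost RC is known for.
    """

    def __init__(self, n_in, n_sub, sub_size, spectral_radius=0.9,
                 leak=0.3, seed=0):
        rng = np.random.default_rng(seed)
        self.leak = leak
        self.W_in, self.W_res = [], []
        dim = n_in
        for _ in range(n_sub):
            self.W_in.append(rng.uniform(-0.5, 0.5, (sub_size, dim)))
            W = rng.uniform(-0.5, 0.5, (sub_size, sub_size))
            W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
            self.W_res.append(W)
            dim = sub_size          # next stage reads this stage's state

    def run(self, u):
        """u: (T, n_in) input sequence -> (T, n_sub * sub_size) states."""
        xs = [np.zeros(W.shape[0]) for W in self.W_res]
        states = np.zeros((u.shape[0], sum(x.size for x in xs)))
        for t in range(u.shape[0]):
            inp = u[t]
            for k, (Wi, Wr) in enumerate(zip(self.W_in, self.W_res)):
                pre = np.tanh(Wi @ inp + Wr @ xs[k])
                xs[k] = (1 - self.leak) * xs[k] + self.leak * pre
                inp = xs[k]         # feed state into the next sub-reservoir
            states[t] = np.concatenate(xs)
        return states

    def fit_readout(self, states, targets, ridge=1e-6):
        """Ridge-regression readout; returns in-sample predictions."""
        S = states
        self.W_out = np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]),
                                     S.T @ targets)
        return S @ self.W_out

# Usage: one-step-ahead forecasting of a sine wave (initial washout dropped).
t = np.linspace(0, 60, 3000)
u = np.sin(t)[:, None]
esn = DeepESN(n_in=1, n_sub=3, sub_size=100)
S = esn.run(u[:-1])
pred = esn.fit_readout(S[200:], u[201:], ridge=1e-4)
```

    Stacking the stages in series, rather than widening one reservoir, is what gives the later sub-reservoirs access to already-transformed, slower-varying features, mirroring the frequency separation the dissertation reports.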

    Efficient Deep Feature Learning and Extraction via StochasticNets

    Deep neural networks are a powerful tool for feature learning and extraction given their ability to model high-level abstractions in highly complex data. One area worth exploring in feature learning and extraction using deep neural networks is efficient neural connectivity formation for faster feature learning and extraction. Motivated by findings of stochastic synaptic connectivity formation in the brain, as well as the brain's uncanny ability to represent information efficiently, we propose the efficient learning and extraction of features via StochasticNets, in which sparsely-connected deep neural networks are formed via stochastic connectivity between neurons. To evaluate the feasibility of such a deep neural network architecture for feature learning and extraction, we train deep convolutional StochasticNets to learn abstract features using the CIFAR-10 dataset, and extract the learned features from images to perform classification on the SVHN and STL-10 datasets. Experimental results show that features learned using deep convolutional StochasticNets, with fewer neural connections than conventional deep convolutional neural networks, allow for better or comparable classification accuracy than conventional deep neural networks: a relative test-error decrease of ~4.5% for classification on the STL-10 dataset and ~1% on the SVHN dataset. Furthermore, the deep features extracted using deep convolutional StochasticNets provide comparable classification accuracy even when only 10% of the training data is used for feature learning. Finally, significant gains in feature-extraction speed can be achieved in embedded applications using StochasticNets. As such, StochasticNets allow for faster feature learning and extraction while providing better or comparable accuracy.
    Comment: 10 pages. arXiv admin note: substantial text overlap with arXiv:1508.0546
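    The formation step the abstract describes, realizing each possible synapse stochastically and leaving unrealized ones absent for the network's lifetime, can be sketched compactly. The NumPy example below simplifies to a fully connected layer rather than the paper's convolutional setting; names such as `p_connect` are illustrative assumptions, not the authors' API.

```python
import numpy as np

def stochastic_dense_layer(n_in, n_out, p_connect=0.5, seed=0):
    """Form a sparsely connected layer via stochastic connectivity.

    Each possible synapse between an input and an output neuron is
    realized independently with probability p_connect (a Bernoulli trial);
    absent synapses stay fixed at zero, so training and inference touch
    only roughly p_connect of a dense layer's weights.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random((n_out, n_in)) < p_connect           # Bernoulli trials
    weights = rng.normal(0.0, np.sqrt(2.0 / n_in), (n_out, n_in))
    return weights * mask, mask

def forward(x, weights):
    """ReLU layer; zeroed weights make the product effectively sparse."""
    return np.maximum(weights @ x, 0.0)

w, mask = stochastic_dense_layer(n_in=256, n_out=128, p_connect=0.4)
print(f"{mask.mean():.2%} of possible connections realized")   # ~40%
y = forward(np.random.default_rng(1).random(256), w)
```

    During training, gradient updates would also be multiplied by `mask` so that unrealized synapses never reappear, which is what keeps the connection count, and hence the extraction cost, low throughout.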