
    Parallel backpropagation neural networks for Task allocation by means of PVM

    Features such as fast response, storage efficiency, fault tolerance and graceful degradation in the face of scarce or spurious inputs make neural networks appropriate tools for Intelligent Computer Systems. A neural network is, by itself, an inherently parallel system in which many extremely simple processing units work simultaneously on the same problem, building up a computational device that possesses adaptation (learning) and generalisation (recognition) abilities. Implementing a neural network roughly involves at least three stages: design, training and testing. The second, being CPU intensive, is the one requiring most of the processing resources, and depending on the size and structural complexity of the network the learning process can be extremely long. Thus, great effort has been made to develop parallel implementations aimed at reducing learning time. Pattern partitioning is an approach to parallelising neural networks in which the whole net is replicated in different processors and the weight changes owing to diverse training patterns are computed in parallel. This approach is the most suitable for a distributed architecture such as the one considered here. Incoming task allocation, as a previous step, is a fundamental service aimed at improving distributed system performance and facilitating further dynamic load balancing. A Neural Network Device inserted into the kernel of a distributed system as an intelligent tool allows automatic allocation of execution requests under predefined performance criteria based on resource availability and incoming process requirements. This paper is a twofold proposal: it shows, firstly, some design and implementation insights for building a system in which decision support for load distribution is based on a neural network device and, secondly, a distributed implementation that provides parallel learning of neural networks using a pattern partitioning approach. In the latter case, some performance results of the parallelised approach for learning backpropagation neural networks are shown. These include a comparison of recall and generalisation abilities and of the speed-up obtained when using a socket interface or PVM.
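    As a minimal illustration of the pattern-partitioning approach described above, the sketch below replicates a small two-layer network in every worker, lets each worker compute the weight changes due to its own subset of training patterns, and accumulates the resulting deltas into a master copy of the weights. It uses Python's multiprocessing in place of PVM or a socket interface, and the layer sizes, learning rate and toy dataset are illustrative assumptions rather than details taken from the paper.

    # Sketch of pattern partitioning (data-parallel backpropagation): the full
    # network is replicated in every worker, each worker computes weight changes
    # on its own partition of the training patterns, and the master sums the
    # deltas. Python's multiprocessing stands in for PVM/sockets.
    import numpy as np
    from multiprocessing import Pool

    def local_deltas(args):
        """One worker: forward/backward pass of a one-hidden-layer net on its patterns."""
        W1, W2, X, y, lr = args
        h = np.tanh(X @ W1)                       # hidden activations
        out = 1.0 / (1.0 + np.exp(-(h @ W2)))     # sigmoid output
        err = out - y                             # output error
        dW2 = h.T @ err                           # weight change, output layer
        dW1 = X.T @ ((err @ W2.T) * (1 - h**2))   # weight change, hidden layer
        return -lr * dW1, -lr * dW2

    def train_epoch(W1, W2, X, y, n_workers=4, lr=0.01):
        # Split the pattern set among workers; each worker gets a full weight replica.
        chunks = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
        jobs = [(W1.copy(), W2.copy(), Xc, yc, lr) for Xc, yc in chunks]
        with Pool(n_workers) as pool:
            results = pool.map(local_deltas, jobs)
        # Accumulate the per-partition weight changes into the master copy.
        for dW1, dW2 in results:
            W1 += dW1
            W2 += dW2
        return W1, W2

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.standard_normal((1024, 8))
        y = (X.sum(axis=1, keepdims=True) > 0).astype(float)
        W1 = 0.1 * rng.standard_normal((8, 16))
        W2 = 0.1 * rng.standard_normal((16, 1))
        for _ in range(20):
            W1, W2 = train_epoch(W1, W2, X, y)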

    Comparative Analysis of CPU and GPU Profiling for Deep Learning Models

    Deep Learning (DL) and Machine Learning (ML) applications have been increasing rapidly in recent years. Massive amounts of data are being generated over the internet, from which ML and DL algorithms can derive meaningful results. Hardware resources and open-source libraries have made it easy to implement these algorithms. TensorFlow and PyTorch are two of the leading frameworks for implementing ML projects. Using these frameworks, we can trace the operations executed on both GPU and CPU to analyse resource allocation and consumption. This paper presents the time and memory allocation of CPU and GPU while training deep neural networks using PyTorch. Our analysis shows that the GPU has a lower running time than the CPU for deep neural networks, while for a simpler network there is no significant improvement of the GPU over the CPU. Comment: 6 pages, 11 figures.
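    As a hedged sketch of the kind of operator-level tracing described above, the snippet below profiles a few training steps of a small placeholder network with PyTorch's built-in torch.profiler, recording per-operator time and memory on the CPU and, when available, the GPU. The model, data and step count are illustrative, not the configurations evaluated in the paper.

    # Profile CPU/GPU time and memory for a few training steps with torch.profiler.
    # The tiny model and random data are placeholders.
    import torch
    import torch.nn as nn
    from torch.profiler import profile, ProfilerActivity

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    criterion = nn.CrossEntropyLoss()

    x = torch.randn(64, 784, device=device)
    y = torch.randint(0, 10, (64,), device=device)

    activities = [ProfilerActivity.CPU]
    if device == "cuda":
        activities.append(ProfilerActivity.CUDA)

    # profile_memory=True records tensor allocations per operator.
    with profile(activities=activities, profile_memory=True, record_shapes=True) as prof:
        for _ in range(10):
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

    # Operator-level summary; sort by "cuda_time_total" instead to rank GPU kernels.
    print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=15))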

    Multi-Layer Neural Networks for Quality of Service oriented Server-State Classification in Cloud Servers

    Task allocation systems in the Cloud have recently been proposed whose performance is optimised in real time based on reinforcement learning with spiking Random Neural Networks (RNN). In this paper, rather than reinforcement learning, we suggest the use of multi-layer neural network architectures to infer the state of servers in a dynamic networked Cloud environment, and propose to select, for each task, the most adequate server so as to optimise Quality of Service. First, a procedure is presented to construct datasets for state classification by collecting time-varying data from Cloud servers with different resource configurations, so that the identification of server states is carried out with supervised classification. We test four distinct multi-layer neural network architectures to this end: multi-layer dense clusters of RNNs (MLRNN), the hierarchical extreme learning machine (H-ELM), the multi-layer perceptron, and convolutional neural networks. Our experimental results indicate that server-state identification can be carried out efficiently and with the best accuracy using the MLRNN and H-ELM.
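    As a rough sketch of the supervised server-state classification step, the snippet below trains a plain multi-layer perceptron (one of the four architectures compared above) on hypothetical time-varying server measurements. The feature names, the number of state classes and the randomly generated dataset are assumptions for illustration only, not the datasets constructed in the paper.

    # Supervised server-state classification with a multi-layer perceptron.
    # Features and labels are synthetic placeholders for collected Cloud-server metrics.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(42)
    # Hypothetical measurements: [cpu_load, memory_use, disk_io, net_in, net_out]
    X = rng.random((5000, 5))
    y = rng.integers(0, 4, size=5000)   # e.g. four server states

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    scaler = StandardScaler().fit(X_train)

    clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300, random_state=0)
    clf.fit(scaler.transform(X_train), y_train)
    print("server-state accuracy:", clf.score(scaler.transform(X_test), y_test))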

    D-TIPO: Deep time-inconsistent portfolio optimization with stocks and options

    In this paper, we propose a machine learning algorithm for time-inconsistent portfolio optimization. The proposed algorithm builds upon neural network based trading schemes, in which the asset allocation at each time point is determined by a neural network. The loss function is given by an empirical version of the objective function of the portfolio optimization problem. Moreover, various trading constraints are naturally fulfilled by choosing appropriate activation functions in the output layers of the neural networks. Beyond this, our main contribution is to add options to the portfolio of risky assets and a risk-free bond, and to use additional neural networks to determine the amounts allocated to the options as well as their strike prices. We consider objective functions more in line with the rational preferences of an investor than the classical mean-variance criterion, apply realistic trading constraints and model the assets with a correlated jump-diffusion SDE. With an incomplete market and a more involved objective function, we show that it is beneficial to add options to the portfolio. Moreover, adding options is shown to lead to a more constant stock allocation with less demand for drastic re-allocations. Comment: 27 pages, 7 figures.
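    The sketch below illustrates one way output activations can encode trading constraints in a neural allocation network, in the spirit described above: a softmax head produces non-negative portfolio weights that sum to one, and a scaled sigmoid keeps an option strike inside a chosen band around the spot price. The layer sizes, the strike band and the overall architecture are illustrative assumptions, not the networks used in the paper.

    # Output activations as trading constraints: softmax for long-only weights
    # summing to one, scaled sigmoid for a bounded option strike.
    import torch
    import torch.nn as nn

    class AllocationNet(nn.Module):
        def __init__(self, n_features, n_assets, strike_low=0.8, strike_high=1.2):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                      nn.Linear(64, 64), nn.ReLU())
            self.alloc_head = nn.Linear(64, n_assets)   # portfolio weights
            self.strike_head = nn.Linear(64, 1)         # option strike (fraction of spot)
            self.strike_low, self.strike_high = strike_low, strike_high

        def forward(self, state):
            h = self.body(state)
            # Softmax: non-negative allocations that sum to one (no short selling or leverage).
            weights = torch.softmax(self.alloc_head(h), dim=-1)
            # Scaled sigmoid: strike constrained to [strike_low, strike_high] times spot.
            strike = self.strike_low + (self.strike_high - self.strike_low) * \
                     torch.sigmoid(self.strike_head(h))
            return weights, strike

    net = AllocationNet(n_features=10, n_assets=4)
    weights, strike = net(torch.randn(32, 10))
    print(weights.sum(dim=-1))   # each row sums to 1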

    Learning with Local Gradients at the Edge

    To enable learning on edge devices with fast convergence and low memory, we present a novel backpropagation-free optimization algorithm dubbed Target Projection Stochastic Gradient Descent (tpSGD). tpSGD generalizes direct random target projection to work with arbitrary loss functions and extends target projection to training recurrent neural networks (RNNs) in addition to feedforward networks. tpSGD uses layer-wise stochastic gradient descent (SGD) and local targets generated via random projections of the labels to train the network layer by layer with only forward passes. tpSGD does not require retaining gradients during optimization, greatly reducing memory allocation compared to SGD backpropagation (BP) methods, which require multiple instances of the entire neural network's weights, inputs/outputs, and intermediate results. Our method performs within 5% of the accuracy of BP gradient descent on relatively shallow networks of fully connected, convolutional, and recurrent layers. tpSGD also outperforms other state-of-the-art gradient-free algorithms on shallow models consisting of multi-layer perceptrons, convolutional neural networks (CNNs), and RNNs, with competitive accuracy and less memory and time. We evaluate the performance of tpSGD in training deep neural networks (e.g. VGG) and extend the approach to multi-layer RNNs. These experiments highlight new research directions related to optimized layer-based adaptor training for domain shift using tpSGD at the edge.
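    The sketch below is a rough reconstruction of the general idea behind this kind of local, backpropagation-free training: each hidden layer is updated by plain SGD against a local target obtained from a fixed random projection of the one-hot labels, so no gradients are propagated between layers. It is not the authors' implementation of tpSGD; the local loss, projection scaling and layer sizes are assumptions made for illustration.

    # Layer-wise training with local targets from fixed random projections of the
    # labels; gradients never cross layer boundaries. Illustrative only.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    layer_sizes = [784, 256, 128]
    n_classes = 10

    layers = [nn.Linear(layer_sizes[i], layer_sizes[i + 1]) for i in range(len(layer_sizes) - 1)]
    readout = nn.Linear(layer_sizes[-1], n_classes)
    # Fixed (non-trainable) random projections of the labels, one per hidden layer.
    projections = [torch.randn(n_classes, layer_sizes[i + 1]) / layer_sizes[i + 1] ** 0.5
                   for i in range(len(layer_sizes) - 1)]
    opts = [torch.optim.SGD(layer.parameters(), lr=0.05) for layer in layers]
    opt_out = torch.optim.SGD(readout.parameters(), lr=0.05)

    def train_step(x, y_onehot):
        h = x
        for layer, proj, opt in zip(layers, projections, opts):
            out = torch.relu(layer(h))
            local_target = y_onehot @ proj          # random projection of the labels
            loss = ((out - local_target) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
            h = out.detach()                        # no gradient flows between layers
        logits = readout(h)
        loss_out = nn.functional.cross_entropy(logits, y_onehot.argmax(dim=1))
        opt_out.zero_grad(); loss_out.backward(); opt_out.step()

    x = torch.randn(64, 784)
    y = nn.functional.one_hot(torch.randint(0, n_classes, (64,)), n_classes).float()
    train_step(x, y)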