
    A survey on modern trainable activation functions

    In the neural network literature, there is strong interest in identifying and defining activation functions which can improve neural network performance. In recent years there has been renewed interest from the scientific community in investigating activation functions which can be trained during the learning process, usually referred to as "trainable", "learnable" or "adaptable" activation functions. They appear to lead to better network performance. Diverse and heterogeneous models of trainable activation functions have been proposed in the literature. In this paper, we present a survey of these models. Starting from a discussion of the use of the term "activation function" in the literature, we propose a taxonomy of trainable activation functions, highlight common and distinctive properties of recent and past models, and discuss the main advantages and limitations of this type of approach. We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions and some simple local rule that constrains the corresponding weight layers. Comment: Published in the "Neural Networks" journal (Elsevier).
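
    As a concrete illustration of what "trainable" means here, the sketch below (not taken from the survey itself) defines a Swish-like activation f(x) = x * sigmoid(beta * x) whose shape parameter beta is learned together with the network weights; the class and parameter names are my own.

        import torch
        import torch.nn as nn

        class TrainableSwish(nn.Module):
            """Activation whose shape is learned along with the network weights."""
            def __init__(self, init_beta: float = 1.0):
                super().__init__()
                # Registered as a parameter, so the optimizer updates beta too.
                self.beta = nn.Parameter(torch.tensor(init_beta))

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return x * torch.sigmoid(self.beta * x)

        # Usage: drop it into a model in place of a fixed activation.
        layer = nn.Sequential(nn.Linear(8, 8), TrainableSwish())
        out = layer(torch.randn(4, 8))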

    ForecastNet: A Time-Variant Deep Feed-Forward Neural Network Architecture for Multi-Step-Ahead Time-Series Forecasting

    Recurrent and convolutional neural networks are the most common architectures used for time-series forecasting in the deep learning literature. These networks use parameter sharing by repeating a set of fixed architectures with fixed parameters over time or space. The result is that the overall architecture is time-invariant (shift-invariant in the spatial domain) or stationary. We argue that time-invariance can reduce the capacity to perform multi-step-ahead forecasting, where modelling the dynamics at a range of scales and resolutions is required. We propose ForecastNet, which uses a deep feed-forward architecture to provide a time-variant model. An additional novelty of ForecastNet is interleaved outputs, which we show assist in mitigating vanishing gradients. ForecastNet is demonstrated to outperform statistical and deep learning benchmark models on several datasets.
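
    A minimal sketch of the time-variant idea, under my own assumptions about layer sizes and wiring rather than the paper's exact architecture: one feed-forward cell with its own weights per forecast step, with earlier outputs interleaved into later cells' inputs.

        import torch
        import torch.nn as nn

        class TimeVariantForecaster(nn.Module):
            """One distinct cell per forecast step; no parameter sharing over time."""
            def __init__(self, input_len: int, hidden: int, horizon: int):
                super().__init__()
                self.cells = nn.ModuleList([
                    nn.Sequential(nn.Linear(input_len + t, hidden), nn.ReLU(), nn.Linear(hidden, 1))
                    for t in range(horizon)
                ])

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                outputs = []
                for cell in self.cells:
                    # Interleave previously produced outputs with the original inputs.
                    outputs.append(cell(torch.cat([x] + outputs, dim=1)))
                return torch.cat(outputs, dim=1)

        model = TimeVariantForecaster(input_len=24, hidden=32, horizon=6)
        forecast = model(torch.randn(2, 24))  # -> shape (2, 6)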

    Incremental construction of LSTM recurrent neural network

    Long Short-Term Memory (LSTM) is a recurrent neural network that uses structures called memory blocks to allow the network to remember significant events far back in the input sequence, in order to solve long-time-lag tasks where other RNN approaches fail. In this work we have performed experiments using LSTM networks extended with growing abilities, which we call GLSTM. Four methods of training growing LSTMs have been compared. These methods include cascade and fully connected hidden layers, as well as two different levels of freezing previous weights in the cascade case. GLSTM has been applied to a forecasting problem in a biomedical domain, where the input/output behavior of five controllers of the Central Nervous System has to be modelled. We have compared growing LSTM results against other neural network approaches and against our earlier work applying conventional LSTM to the task at hand. Postprint (published version).
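
    The rough sketch below illustrates the growing idea under my own assumptions (it is not the authors' GLSTM code): an LSTM stack that can append a new cascaded layer while optionally freezing the previously trained weights.

        import torch
        import torch.nn as nn

        class GrowingLSTM(nn.Module):
            def __init__(self, n_inputs: int, hidden: int, n_outputs: int):
                super().__init__()
                self.layers = nn.ModuleList([nn.LSTM(n_inputs, hidden, batch_first=True)])
                self.head = nn.Linear(hidden, n_outputs)

            def grow(self, hidden: int, freeze_previous: bool = True):
                # Optionally freeze everything trained so far, then cascade a new layer.
                if freeze_previous:
                    for p in self.layers.parameters():
                        p.requires_grad = False
                prev_hidden = self.layers[-1].hidden_size
                self.layers.append(nn.LSTM(prev_hidden, hidden, batch_first=True))
                self.head = nn.Linear(hidden, self.head.out_features)

            def forward(self, x):
                for lstm in self.layers:
                    x, _ = lstm(x)
                return self.head(x[:, -1])  # predict from the last time step

        net = GrowingLSTM(n_inputs=5, hidden=16, n_outputs=5)
        y = net(torch.randn(8, 20, 5))
        net.grow(hidden=16)  # add a cascaded layer, freeze earlier ones
        y = net(torch.randn(8, 20, 5))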

    Two-Layer Feed Forward Neural Network (TLFN) in Predicting Loan Default Probability

    The main objective of the thesis is to apply a Neural Network (NN) approach to the probability of default (PD) used to assess whether a credit operation is granted or not. That is, given an operation, the NN model should predict whether it is granted [0] or not granted [1]. Credit Risk Models and Deep Learning concepts are also explained.
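
    A minimal sketch of a two-layer feed-forward classifier for this kind of granted/not-granted decision; the feature count, layer sizes and threshold are placeholders, not values from the thesis.

        import torch
        import torch.nn as nn

        n_features = 20  # e.g. borrower and operation attributes (assumed)
        tlfn = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(),  # hidden layer
            nn.Linear(32, 1), nn.Sigmoid(),        # output in (0, 1), a PD-style score
        )

        x = torch.randn(16, n_features)
        p_not_granted = tlfn(x)
        decision = (p_not_granted > 0.5).int()  # 1 = not granted, 0 = granted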

    ATM Cash demand forecasting in an Indian Bank with chaos and deep learning

    This paper proposes to model chaos in the ATM cash withdrawal time series of a large Indian bank and forecast the withdrawals using deep learning methods. It also considers the importance of the day of the week and includes it as a dummy exogenous variable. We first modelled the chaos present in the withdrawal time series by reconstructing the state space of each series using the lag and embedding dimension found with the autocorrelation function and Cao's method. This process converts the univariate time series into a multivariate time series. The "day-of-the-week" is converted into seven features with the help of one-hot encoding, and these seven features are then appended to the multivariate time series. For forecasting the future cash withdrawals, we used the following algorithms: ARIMA, random forest (RF), support vector regression (SVR), multi-layer perceptron (MLP), group method of data handling (GMDH), general regression neural network (GRNN), long short-term memory (LSTM) neural network and 1-dimensional convolutional neural network (1D CNN). We considered a daily cash withdrawals data set from an Indian commercial bank. After modelling chaos and adding exogenous features to the data set, we observed improvements in the forecasting for all models. Even though the random forest (RF) yielded a better Symmetric Mean Absolute Percentage Error (SMAPE) value, the deep learning algorithms, namely LSTM and 1D CNN, showed similar performance compared to RF, based on a t-test. Comment: 20 pages; 6 figures and 3 tables.
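
    A hedged sketch of the preprocessing described above: delay-embedding the univariate withdrawal series (with the lag and embedding dimension assumed to have been estimated already, e.g. via the autocorrelation function and Cao's method) and appending one-hot day-of-week features. The data here is synthetic.

        import numpy as np

        def delay_embed(series: np.ndarray, lag: int, dim: int) -> np.ndarray:
            """Turn a univariate series into a (samples x dim) multivariate matrix."""
            n = len(series) - (dim - 1) * lag
            return np.stack([series[i * lag : i * lag + n] for i in range(dim)], axis=1)

        withdrawals = np.random.rand(365)          # daily cash withdrawals (synthetic)
        days = np.arange(365) % 7                  # day-of-week index
        embedded = delay_embed(withdrawals, lag=1, dim=5)
        one_hot = np.eye(7)[days][(5 - 1) * 1:]    # align with the embedded rows
        features = np.hstack([embedded, one_hot])  # input to RF / MLP / LSTM / 1D CNN
        print(features.shape)                      # (361, 12)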

    DeepCough: A Deep Convolutional Neural Network in A Wearable Cough Detection System

    In this paper, we present a system that employs a wearable acoustic sensor and a deep convolutional neural network for detecting coughs. We evaluate the performance of our system on 14 healthy volunteers and compare it to that of other cough detection systems that have been reported in the literature. Experimental results show that our system achieves a classification sensitivity of 95.1% and a specificity of 99.5%. Comment: BioCAS-201
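
    The abstract does not spell out the network, so the snippet below is only an illustrative convolutional classifier over short acoustic windows, emitting cough / no-cough logits; all sizes are assumptions.

        import torch
        import torch.nn as nn

        cough_net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Flatten(),
            nn.Linear(32 * 16, 2),  # logits for {no cough, cough}
        )

        frames = torch.randn(8, 1, 256)  # batch of 256-sample audio windows
        logits = cough_net(frames)       # shape (8, 2)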

    A general purpose intelligent surveillance system for mobile devices using deep learning

    In this paper the design, implementation, and evaluation of a general-purpose smartphone-based intelligent surveillance system is presented. It has two main elements: i) a detection module and ii) a classification module. The detection module is based on a recently introduced approach that combines the well-known background subtraction method with optical flow and recursively estimated density. The classification module is based on a neural network using Deep Learning methodology. Firstly, the architecture design of the convolutional neural network is presented and analyzed in the context of four selected architectures (two of them recent successful types) and two custom modifications made specifically for the problem at hand. The results are carefully evaluated, and the best architecture is selected to be used within the proposed system. The system is implemented on both a PC (using a Linux-type OS) and a smartphone (using Android). In addition to compatibility with all modern Android-based devices, most GPU-powered platforms such as the Raspberry Pi, Nvidia Tegra X1 and Jetson run on Linux, so the proposed system can easily be installed on any such device, benefiting from the advantage of parallelisation for faster execution. The proposed system achieved a performance which surpasses that of a human (top-1 classification accuracy >95.9%) for automatic recognition of a detected object into one of seven selected categories. For the top-2 classes, the accuracy is even higher (99.85%), meaning that at least one of the two top classes suggested by the system is correct. Finally, a number of visual examples of the system in use on both PC and Android devices are showcased.
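
    A simplified pipeline sketch in the spirit of the detect-then-classify design above: OpenCV background subtraction stands in for the detection module (the paper additionally uses optical flow and recursively estimated density), and classify() is a placeholder for the trained CNN.

        import cv2  # OpenCV 4
        import numpy as np

        subtractor = cv2.createBackgroundSubtractorMOG2()

        def classify(patch: np.ndarray) -> str:
            return "person"  # placeholder for the CNN classification module

        def process(frame: np.ndarray):
            mask = subtractor.apply(frame)  # moving-object mask (shadows = 127)
            mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            detections = []
            for c in contours:
                x, y, w, h = cv2.boundingRect(c)
                if w * h > 500:  # ignore tiny blobs
                    detections.append((x, y, w, h, classify(frame[y:y + h, x:x + w])))
            return detections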

    Objective Assessment of Machine Learning Algorithms for Speech Enhancement in Hearing Aids

    Speech enhancement in assistive hearing devices has been an area of research for many decades. Noise reduction is particularly challenging because of the wide variety of noise sources and the non-stationarity of speech and noise. Digital signal processing (DSP) algorithms deployed in modern hearing aids for noise reduction rely on certain assumptions about the statistical properties of undesired signals. This can be disadvantageous for accurate estimation of different noise types, which subsequently leads to suboptimal noise reduction. In this research, a relatively unexplored technique based on deep learning, i.e. a Recurrent Neural Network (RNN), is used to perform noise reduction and dereverberation for assisting hearing-impaired listeners. For noise reduction, the performance of the deep learning model was evaluated objectively and compared with that of open Master Hearing Aid (openMHA), a conventional signal-processing-based framework, and a Deep Neural Network (DNN) based model. It was found that the RNN model can suppress noise and improve speech understanding better than the conventional hearing aid noise reduction algorithm and the DNN model. The same RNN model was shown to reduce reverberation components with proper training. A real-time implementation of the deep learning model is also discussed.
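
    A hedged sketch in the spirit of the RNN model discussed above (not the thesis implementation): a GRU predicts a per-frequency gain that is applied to noisy magnitude spectra; all sizes are illustrative.

        import torch
        import torch.nn as nn

        class RNNDenoiser(nn.Module):
            def __init__(self, n_bins: int = 257, hidden: int = 128):
                super().__init__()
                self.rnn = nn.GRU(n_bins, hidden, num_layers=2, batch_first=True)
                self.mask = nn.Sequential(nn.Linear(hidden, n_bins), nn.Sigmoid())

            def forward(self, noisy_mag: torch.Tensor) -> torch.Tensor:
                h, _ = self.rnn(noisy_mag)       # (batch, frames, hidden)
                return noisy_mag * self.mask(h)  # per-bin gain in [0, 1]

        model = RNNDenoiser()
        enhanced = model(torch.rand(1, 100, 257))  # 100 noisy magnitude frames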

    Transform Diabetes - Harnessing Transformer-Based Machine Learning and Layered Ensemble with Enhanced Training for Improved Glucose Prediction.

    Type 1 diabetes is a common chronic disease characterized by the body’s inability to regulate the blood glucose level, leading to severe health consequences if it is not managed manually. Accurate blood glucose level predictions can enable better disease management and inform subsequent treatment decisions. However, predicting future blood glucose levels is a complex problem due to the inherent complexity and variability of the human body. This thesis investigates whether a Transformer model can outperform a state-of-the-art Convolutional Recurrent Neural Network model at forecasting blood glucose levels on the same dataset. The problem is structured, and the data preprocessed, as a multivariate multi-step time series. A unique Layered Ensemble technique that Enhances the Training of the final model is introduced. This technique manages missing data and counters potential issues from other techniques by employing both a Long Short-Term Memory model and a Transformer model together. The experimental results show that this novel ensemble technique reduces the root mean squared error by approximately 14.28% when predicting the blood glucose level 30 minutes into the future, compared to the state-of-the-art model. This improvement highlights the potential of the approach to assist patients with diabetes in effective disease management.
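
    An illustrative sketch (not the thesis code) of pairing an LSTM and a Transformer encoder to forecast glucose 30 minutes ahead from a multivariate history window; the simple averaging at the end merely stands in for the layered ensemble step, and all sizes are assumptions.

        import torch
        import torch.nn as nn

        class LSTMForecaster(nn.Module):
            def __init__(self, n_feat: int, hidden: int = 64):
                super().__init__()
                self.lstm = nn.LSTM(n_feat, hidden, batch_first=True)
                self.head = nn.Linear(hidden, 1)

            def forward(self, x):
                out, _ = self.lstm(x)
                return self.head(out[:, -1])

        class TransformerForecaster(nn.Module):
            def __init__(self, n_feat: int, d_model: int = 64):
                super().__init__()
                self.proj = nn.Linear(n_feat, d_model)
                layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers=2)
                self.head = nn.Linear(d_model, 1)

            def forward(self, x):
                return self.head(self.encoder(self.proj(x))[:, -1])

        history = torch.randn(8, 24, 4)  # 24 time steps, 4 glucose-related features
        lstm_pred = LSTMForecaster(4)(history)
        trans_pred = TransformerForecaster(4)(history)
        ensemble_pred = (lstm_pred + trans_pred) / 2  # glucose at t + 30 min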