
    Semi-tied Units for Efficient Gating in LSTM and Highway Networks

    Gating is a key technique used for integrating information from multiple sources by long short-term memory (LSTM) models and has recently also been applied to other models such as the highway network. Although gating is powerful, it is rather expensive in terms of both computation and storage, as each gating unit uses a separate full weight matrix. This issue can be severe since several gates can be used together in, e.g., an LSTM cell. This paper proposes a semi-tied unit (STU) approach to solve this efficiency issue, which uses one shared weight matrix to replace those in all the units in the same layer. The approach is termed "semi-tied" since extra parameters are used to separately scale each of the shared output values. These extra scaling factors are associated with the network activation functions and result in the use of parameterised sigmoid, hyperbolic tangent, and rectified linear unit functions. Speech recognition experiments using British English multi-genre broadcast data showed that using STUs can reduce the calculation and storage cost by a factor of three for highway networks and four for LSTMs, while giving similar word error rates to the original models. Comment: To appear in Proc. INTERSPEECH 2018, September 2-6, 2018, Hyderabad, India.
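The parameter-sharing idea behind semi-tied units can be sketched in a few lines of numpy. The parameterised-sigmoid form and the dimensions below are illustrative assumptions, not the paper's exact formulation: one shared matrix-vector product replaces the four per-unit products of a standard LSTM layer, with a per-unit scale and bias applied inside the activation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 512, 512

# Standard gating: each of the 4 LSTM units owns a full weight matrix.
standard_params = 4 * n_in * n_out

# Semi-tied: one shared matrix plus a scale and bias vector per unit.
semi_tied_params = n_in * n_out + 4 * 2 * n_out

x = rng.standard_normal(n_in)
W = rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)
shared = W @ x  # computed once, reused by all four units

def p_sigmoid(z, v, b):
    # Parameterised sigmoid: per-unit scale v and bias b on the shared output.
    return 1.0 / (1.0 + np.exp(-(v * z + b)))

v_i, b_i = rng.standard_normal(n_out), np.zeros(n_out)
input_gate = p_sigmoid(shared, v_i, b_i)

print(standard_params / semi_tied_params)  # roughly the 4x saving for LSTMs
```

The scaling vectors add only O(n) parameters per unit, which is why the overall cost approaches the factor-of-four reduction quoted for LSTMs.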

    Low latency modeling of temporal contexts for speech recognition

    This thesis focuses on the development of neural network acoustic models for large vocabulary continuous speech recognition (LVCSR) to satisfy the design goals of low latency and low computational complexity. Low latency enables online speech recognition, and low computational complexity helps reduce the computational cost both during training and inference. Long-span sequential dependencies and sequential distortions in the input vector sequence are a major challenge in acoustic modeling. Recurrent neural networks have been shown to effectively model these dependencies. Specifically, bidirectional long short-term memory (BLSTM) networks provide state-of-the-art performance across several LVCSR tasks. However, the deployment of bidirectional models for online LVCSR is non-trivial due to their large latency, and unidirectional LSTM models are typically preferred. In this thesis we explore the use of hierarchical temporal convolution to model long-span temporal dependencies. We propose a sub-sampled variant of these temporal convolution neural networks, termed time-delay neural networks (TDNNs). These sub-sampled TDNNs reduce the computational complexity by ~5x, compared to TDNNs, during frame-randomized pre-training. These models are shown to be effective in modeling long-span temporal contexts; however, there is a performance gap compared to (B)LSTMs. As recent advancements in acoustic model training have eliminated the need for frame-randomized pre-training, we modify the TDNN architecture to use higher sampling rates, as the increased computation can be amortized over the sequence. These variants of sub-sampled TDNNs provide performance superior to unidirectional LSTM networks, while also affording a lower real time factor (RTF) during inference. However, we show that the BLSTM models outperform both the TDNN and LSTM models. We propose a hybrid architecture interleaving temporal convolution and LSTM layers which is shown to outperform the BLSTM models.
Further, we improve these BLSTM models by using higher frame rates at lower layers and show that the proposed TDNN-LSTM model performs similarly to these superior BLSTM models, while reducing the overall latency to 200 ms. Finally, we describe an online system for reverberation-robust ASR, using the above models in conjunction with other data augmentation techniques such as reverberation simulation, which simulates far-field environments, and volume perturbation, which helps tackle volume variation even without gain normalization.
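The sub-sampling idea can be illustrated with a toy splicing layer. The offsets and dimensions below are made up for illustration: a full layer splices every frame in the context window {-2..2}, while the sub-sampled variant keeps only the outermost offsets {-2, 2}, cutting the per-frame computation while preserving the temporal span.

```python
import numpy as np

def tdnn_layer(x, w, offsets):
    """Splice frames at the given temporal offsets and apply a linear map.
    x: (T, d_in); w: (len(offsets) * d_in, d_out); returns (T', d_out)."""
    lo, hi = min(offsets), max(offsets)
    out = []
    for t in range(-lo, x.shape[0] - hi):
        spliced = np.concatenate([x[t + o] for o in offsets])
        out.append(spliced @ w)
    return np.array(out)

rng = np.random.default_rng(0)
T, d = 50, 8
x = rng.standard_normal((T, d))

# Full context {-2..2} splices 5 frames per output; the sub-sampled
# variant keeps only {-2, 2}, so each output needs 2 matrix-vector
# products instead of 5, while covering the same temporal span.
w_full = rng.standard_normal((5 * d, d))
w_sub = rng.standard_normal((2 * d, d))
y_full = tdnn_layer(x, w_full, [-2, -1, 0, 1, 2])
y_sub = tdnn_layer(x, w_sub, [-2, 2])
print(y_full.shape, y_sub.shape)  # same output length, less computation
```

Stacking such layers grows the receptive field multiplicatively, which is how long-span contexts are modeled with modest per-layer cost.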

    Machine Learning in Digital Signal Processing for Optical Transmission Systems

    The future demand for digital information will exceed the capabilities of current optical communication systems, which are approaching their limits due to component and fiber intrinsic non-linear effects. Machine learning methods are promising for finding new ways of leveraging the available resources and exploring new solutions. Although some of the machine learning methods, such as adaptive non-linear filtering and probabilistic modeling, are not novel in the field of telecommunication, enhanced powerful architecture designs together with increasing computing power make it possible to tackle more complex problems today. The methods presented in this work apply machine learning to optical communication systems with two main contributions. First, an unsupervised learning algorithm with an embedded additive white Gaussian noise (AWGN) channel and appropriate power constraint is trained end-to-end, learning a geometric constellation shape for lowest bit-error rates over amplified and unamplified links. Second, supervised machine learning methods, especially deep neural networks with and without internal cyclical connections, are investigated to combat linear and non-linear inter-symbol interference (ISI) as well as colored noise effects introduced by the components and the fiber. On high-bandwidth coherent optical transmission setups their performance and complexity are experimentally evaluated and benchmarked against conventional digital signal processing (DSP) approaches. This thesis shows how machine learning can be applied to optical communication systems. In particular, it is demonstrated that machine learning is a viable design and DSP tool to increase the capabilities of optical communication systems.
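The channel model at the heart of the first contribution can be sketched without any training loop. The snippet below is a minimal illustration, not the thesis's system: it uses plain QPSK as a stand-in for a learned constellation shape, enforces the unit-average-power constraint the autoencoder would impose, passes symbols through an AWGN channel, and measures the symbol error rate under minimum-distance detection.

```python
import numpy as np

rng = np.random.default_rng(42)

# A 4-point constellation (plain QPSK as a stand-in for a learned shape),
# normalised to unit average power - the constraint on the transmitter.
points = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
assert np.isclose(np.mean(np.abs(points) ** 2), 1.0)

n_sym, snr_db = 100_000, 10
labels = rng.integers(0, 4, n_sym)
tx = points[labels]

# AWGN channel: complex Gaussian noise with variance set by the SNR.
noise_var = 10 ** (-snr_db / 10)
rx = tx + np.sqrt(noise_var / 2) * (rng.standard_normal(n_sym)
                                    + 1j * rng.standard_normal(n_sym))

# Minimum-distance (nearest-point) detection.
decisions = np.argmin(np.abs(rx[:, None] - points[None, :]), axis=1)
ser = np.mean(decisions != labels)
print(f"symbol error rate at {snr_db} dB: {ser:.4f}")
```

In the end-to-end approach, the fixed `points` array would be replaced by trainable transmitter outputs and the detector by a neural receiver, with the error rate driving gradient updates through the differentiable channel.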

    Hybrid Methods for Time Series Forecasting

    Time series forecasting is a crucial task in various fields of business and science. There are two coexisting approaches to time series forecasting: statistical methods and machine learning methods. Both come with different strengths and limitations. Statistical methods such as the Holt-Winters’ Method or ARIMA have been practiced for decades. They stand out due to their robustness and flexibility. Furthermore, these methods work well when little data is available and can exploit a priori knowledge. However, statistical methods assume linear relationships in the data, which is not necessarily the case in real-world data, inhibiting forecasting performance. On the other hand, machine learning methods such as Multilayer Perceptrons or Long Short-Term Memory Networks do not assume linearity and have the exceptional advantage of universally approximating almost any function. In addition, machine learning methods can exploit cross-series information to enhance an individual forecast. Besides these strengths, machine learning methods face several limitations in terms of data and computation requirements. Hybrid methods promise to advance time series forecasting by combining the best of statistical and machine learning methods. The fundamental idea is that the combination compensates for the limitations of one approach with the strengths of the other. This thesis shows that the combination of a Holt-Winters’ Method and a Long Short-Term Memory Network is promising when the periodicity of a time series can be precisely specified. The precise specification enables the Holt-Winters’ Method to simplify the forecasting task for the Long Short-Term Memory Network and, consequently, facilitates the hybrid method to obtain accurate forecasts. The research question to be answered is which characteristics of a time series determine the superiority of either statistical, machine learning, or hybrid approaches.
The result of the conducted experiment shows that this research question cannot be answered in general. Nevertheless, the results yield findings for specific forecasting methods. The Holt-Winters’ Method provides reliable forecasts when the periodicity can be precisely determined. ARIMA, however, handles overlying seasonalities better than the Holt-Winters’ Method due to its autoregressive approach. Furthermore, the results suggest the hypothesis that machine learning methods have difficulties extrapolating time series with a trend. Finally, the Multilayer Perceptron can produce accurate forecasts for various time series despite its simplicity, and the Long Short-Term Memory Network proves that it needs relevant datasets of adequate length to produce accurate forecasts.
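The statistical half of such a hybrid can be sketched directly from the standard additive Holt-Winters update equations (level, trend, and a seasonal component of period m). This is a textbook sketch with made-up smoothing parameters and a toy series, not the thesis's implementation; in the hybrid, the residuals left after this decomposition would be handed to the LSTM.

```python
def holt_winters_additive(y, m, alpha=0.3, beta=0.1, gamma=0.2, horizon=4):
    """Additive Holt-Winters: level + trend + seasonal component of period m."""
    # Simple initialisation from the first two seasonal periods.
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]
    for t in range(len(y)):
        prev_level = level
        # l_t = a*(y_t - s_{t-m}) + (1-a)*(l_{t-1} + b_{t-1})
        level = alpha * (y[t] - season[t % m]) + (1 - alpha) * (level + trend)
        # b_t = b*(l_t - l_{t-1}) + (1-b)*b_{t-1}
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # s_t = g*(y_t - l_t) + (1-g)*s_{t-m}
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * season[t % m]
    # Forecast: y_{t+h} = l_t + h*b_t + matching seasonal index.
    return [level + (h + 1) * trend + season[(len(y) + h) % m]
            for h in range(horizon)]

# A toy series with trend 0.5/step and an exact period-4 seasonal pattern -
# the precisely specifiable periodicity the thesis identifies as the
# favourable case for Holt-Winters.
pattern = [5, -2, -1, -2]
series = [0.5 * t + pattern[t % 4] for t in range(48)]
forecast = holt_winters_additive(series, m=4)
print([round(f, 2) for f in forecast])
```

Because the period is specified exactly, the seasonal indices converge and the forecasts track the true continuation closely; a mis-specified or overlying periodicity is exactly where this breaks down and ARIMA or the hybrid takes over.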