    Noise adaptive training for subspace Gaussian mixture models

    Noise adaptive training (NAT) is an effective approach to normalising the environmental distortions in the training data. This paper investigates a model-based NAT scheme using joint uncertainty decoding (JUD) for subspace Gaussian mixture models (SGMMs). A typical SGMM acoustic model has a much larger number of surface Gaussian components, which makes it computationally infeasible to compensate each Gaussian explicitly. JUD tackles the problem by sharing the compensation parameters among the Gaussians, and hence reduces the computational and memory demands. For noise adaptive training, JUD is reformulated as a generative model, which leads to an efficient expectation-maximisation (EM) based algorithm for updating the SGMM acoustic model parameters. We evaluated SGMMs with NAT on the Aurora 4 database and obtained higher recognition accuracy compared to systems without adaptive training. Index Terms: adaptive training, noise robustness, joint uncertainty decoding, subspace Gaussian mixture model.
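    The abstract does not reproduce the JUD compensation rule itself; as a reference point, here is a sketch of the form that is standard in the JUD literature (the notation is our assumption, not taken from the paper). Every Gaussian m belonging to regression class r shares one affine feature transform and one covariance bias:

        p(\mathbf{y} \mid m) \approx \left|\mathbf{A}^{(r)}\right| \,
            \mathcal{N}\!\left( \mathbf{A}^{(r)}\mathbf{y} + \mathbf{b}^{(r)};\;
            \boldsymbol{\mu}_m,\; \boldsymbol{\Sigma}_m + \boldsymbol{\Sigma}_b^{(r)} \right)

    Because the parameters {A^(r), b^(r), Sigma_b^(r)} are estimated once per class rather than once per Gaussian, the compensation cost is decoupled from the very large number of surface Gaussians in an SGMM.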

    Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models

    Common noise compensation techniques use vector Taylor series (VTS) to approximate the mismatch function. Recent work shows that the approximation accuracy may be improved by sampling. One such sampling technique is the unscented transform (UT), which draws samples deterministically from the clean speech and noise models to derive the parameters of the noise-corrupted speech. This paper applies UT to noise compensation of the subspace Gaussian mixture model (SGMM). Since UT requires a relatively small number of samples for accurate estimation, it has significantly lower computational cost than random sampling techniques. However, the number of surface Gaussians in an SGMM is typically very large, making the direct application of UT to compensate individual Gaussian components computationally impractical. In this paper, we avoid this computational burden by employing UT within the framework of joint uncertainty decoding (JUD), which groups all the Gaussian components into a small number of classes that share compensation parameters. We evaluate the JUD-UT technique for an SGMM system using the Aurora 4 corpus. Experimental results indicate that UT yields higher accuracy than the VTS approximation when the JUD phase factor is left untuned, and similar accuracy when the phase factor is tuned empirically.
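    The abstract does not give the sigma-point construction; below is a minimal NumPy sketch of the classic unscented transform, propagating a Gaussian through an arbitrary nonlinearity f (the function name and the spread parameter kappa are illustrative choices, not values from the paper):

        import numpy as np

        def unscented_transform(mu, Sigma, f, kappa=1.0):
            """Propagate N(mu, Sigma) through a nonlinearity f using
            2n+1 deterministically chosen sigma points."""
            n = mu.shape[0]
            # columns of L form the scaled matrix square root of Sigma
            L = np.linalg.cholesky((n + kappa) * Sigma)
            # sigma points: the mean, plus/minus each square-root column
            points = np.vstack([mu, mu + L.T, mu - L.T])        # (2n+1, n)
            weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
            weights[0] = kappa / (n + kappa)
            ys = np.array([f(x) for x in points])               # propagate
            mean = weights @ ys                                 # weighted mean
            diff = ys - mean
            cov = (weights[:, None] * diff).T @ diff            # weighted covariance
            return mean, cov

        # e.g. a 2-D Gaussian through a smooth soft-plus nonlinearity,
        # loosely in the spirit of the log-spectral mismatch function
        m, C = unscented_transform(np.zeros(2), 0.1 * np.eye(2),
                                   lambda x: np.log1p(np.exp(x)))

    Only 2n+1 points are needed for an n-dimensional Gaussian, which is the source of the cost advantage over Monte Carlo sampling noted in the abstract.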

    Metamaterials for Enhanced Polarization Conversion in Plasmonic Excitation

    Efficient excitation of surface plasmons is typically constrained to transverse magnetic (TM) polarized incidence, as demonstrated to date, because of the plasmons' intrinsic TM polarization. We report a designer plasmonic metamaterial, engineered at a deep subwavelength scale at visible optical frequencies, that overcomes this fundamental limitation and allows transverse electric (TE) polarized incidence to be strongly coupled to surface plasmons. The experimental verification, which is consistent with the analytical and numerical models, demonstrates this enhanced TE-to-plasmon coupling with an efficiency close to 100%, far beyond what is possible with naturally available materials. This discovery will help to efficiently utilize the energy carried by the TE polarization and drastically increase the overall excitation efficiency of future plasmonic devices.

    Knowledge Distillation for Small-footprint Highway Networks

    Deep learning has significantly advanced the state of the art in speech recognition over the past few years. However, compared to conventional Gaussian mixture acoustic models, neural network models are usually much larger and are therefore harder to deploy on embedded devices. Previously, we investigated a compact highway deep neural network (HDNN) for acoustic modelling, which is a type of depth-gated feedforward neural network. We have shown that HDNN-based acoustic models can achieve recognition accuracy comparable to plain deep neural network (DNN) acoustic models with a much smaller number of model parameters. In this paper, we push the boundary further by leveraging the knowledge distillation technique, also known as teacher-student training: we train the compact HDNN model under the supervision of a high-accuracy but cumbersome model. Furthermore, we also investigate sequence training and adaptation in the context of teacher-student training. Our experiments were performed on the AMI meeting speech recognition corpus. With this technique, we significantly improved the recognition accuracy of an HDNN acoustic model with fewer than 0.8 million parameters, and narrowed the gap between this model and a plain DNN with 30 million parameters. Comment: 5 pages, 2 figures, accepted to ICASSP 201
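    The abstract does not spell out the distillation criterion; a minimal PyTorch sketch of the generic frame-level teacher-student loss follows (the temperature T, the interpolation weight alpha, and the function name are illustrative assumptions, not values from the paper):

        import torch
        import torch.nn.functional as F

        def distillation_loss(student_logits, teacher_logits, labels,
                              T=2.0, alpha=0.5):
            """Interpolate the soft teacher-student term with the usual
            hard cross-entropy term."""
            # KL divergence between temperature-scaled distributions;
            # the T^2 factor keeps gradient magnitudes comparable
            soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                            F.softmax(teacher_logits / T, dim=-1),
                            reduction='batchmean') * (T * T)
            hard = F.cross_entropy(student_logits, labels)
            return alpha * soft + (1.0 - alpha) * hard

    The teacher's posteriors act as soft targets that carry more information per frame than one-hot labels, which is what lets the small student narrow the gap to the large model.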

    Multiplicative LSTM for sequence modelling

    We introduce multiplicative LSTM (mLSTM), a recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures. mLSTM is characterised by its ability to have a different recurrent transition function for each possible input, which we argue makes it more expressive for autoregressive density estimation. We demonstrate empirically that mLSTM outperforms standard LSTM and its deep variants on a range of character-level language modelling tasks. In this version of the paper, we regularise mLSTM to achieve 1.27 bits/char on text8 and 1.24 bits/char on the Hutter Prize dataset. We also apply a purely byte-level mLSTM to the WikiText-2 dataset to achieve a character-level entropy of 1.26 bits/char, corresponding to a word-level perplexity of 88.8, which is comparable to word-level LSTMs regularised in similar ways on the same task.
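    The defining step, as we understand it from the mLSTM literature (notation assumed here rather than quoted from this abstract), is an input-dependent intermediate state that replaces the previous hidden state in the standard LSTM gate computations:

        \begin{aligned}
        \mathbf{m}_t &= (\mathbf{W}_{mx}\,\mathbf{x}_t) \odot (\mathbf{W}_{mh}\,\mathbf{h}_{t-1}) \\
        \mathbf{i}_t &= \sigma\left(\mathbf{W}_{ix}\,\mathbf{x}_t + \mathbf{W}_{im}\,\mathbf{m}_t\right)
        \end{aligned}

    with the forget and output gates formed analogously from m_t. Because the element-wise product rescales h_{t-1} differently for every input x_t, the effective hidden-to-hidden transition changes with each input symbol, which is the per-input flexibility the abstract refers to.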

    Regularized Subspace Gaussian Mixture Models for Speech Recognition


    Are Socially Responsible Exchange-Traded Funds Paying Off in Performance?

    This study examines socially responsible (SR) exchange-traded funds (ETFs) by comparing their risk-adjusted performance with a matched group of conventional ETFs in the U.S. equity market. In contrast to prior studies that focus on actively managed mutual funds, we find that the risk-adjusted returns of SR ETFs are significantly lower than those of conventional ETFs during the 2005–2020 period. Such underperformance is observed only in non-crisis periods, not in economic crisis periods (i.e., the 2020 pandemic recession and the 2008 financial turmoil). We attribute the observed underperformance of SR ETFs during non-crisis periods to their limited diversification of unsystematic risks, resulting from the various negative or positive screens employed in the funds. We also find that net fund flows of SR ETFs are less sensitive to past negative performance than are conventional fund flows. Collectively, our findings suggest that, instead of seeking wealth maximization, socially conscious investors may choose SR ETFs to gain non-economic utility.

    Stream Processing in the Context of CTS

    The recent development of innovative technologies related to mobile computing, combined with smart city infrastructures, is generating massive, heterogeneous data and creating opportunities for novel applications in computational transportation science. These heterogeneous data sources provide streams of information that can be used to create smart cities. Knowledge of stream analysis is thus crucial, and requires collaboration between people working in logistics, city planning, transportation engineering, and data science. We provide a list of materials for a course on stream processing for computational transportation science. The objectives of the course are to motivate data stream and event processing, its model and its challenges; to impart basic knowledge about data stream processing systems; and to understand and analyse their application in the transportation domain.