2,666 research outputs found

    Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models

    Full text link
    Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov model (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond to context-dependent tied states or senones. The present work addresses some limitations of GMM-HMM senone alignments for DNN training. We hypothesize that the senone probabilities obtained from a DNN trained with binary labels can provide more accurate targets to learn better acoustic models. However, DNN outputs bear inaccuracies which are exhibited as high dimensional unstructured noise, whereas the informative components are structured and low-dimensional. We exploit principle component analysis (PCA) and sparse coding to characterize the senone subspaces. Enhanced probabilities obtained from low-rank and sparse reconstructions are used as soft-targets for DNN acoustic modeling, that also enables training with untranscribed data. Experiments conducted on AMI corpus shows 4.6% relative reduction in word error rate

    MLI: An API for Distributed Machine Learning

    Full text link
    MLI is an Application Programming Interface designed to address the challenges of building Machine Learn- ing algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity and highly competitive performance and scalability

    Microservices and Machine Learning Algorithms for Adaptive Green Buildings

    Get PDF
    In recent years, the use of services for Open Systems development has consolidated and strengthened. Advances in the Service Science and Engineering (SSE) community, promoted by the reinforcement of Web Services and Semantic Web technologies and the presence of new Cloud computing techniques, such as the proliferation of microservices solutions, have allowed software architects to experiment and develop new ways of building open and adaptable computer systems at runtime. Home automation, intelligent buildings, robotics, graphical user interfaces are some of the social atmosphere environments suitable in which to apply certain innovative trends. This paper presents a schema for the adaptation of Dynamic Computer Systems (DCS) using interdisciplinary techniques on model-driven engineering, service engineering and soft computing. The proposal manages an orchestrated microservices schema for adapting component-based software architectural systems at runtime. This schema has been developed as a three-layer adaptive transformation process that is supported on a rule-based decision-making service implemented by means of Machine Learning (ML) algorithms. The experimental development was implemented in the Solar Energy Research Center (CIESOL) applying the proposed microservices schema for adapting home architectural atmosphere systems on Green Buildings

    Load curve data cleansing and imputation via sparsity and low rank

    Full text link
    The smart grid vision is to build an intelligent power network with an unprecedented level of situational awareness and controllability over its services and infrastructure. This paper advocates statistical inference methods to robustify power monitoring tasks against the outlier effects owing to faulty readings and malicious attacks, as well as against missing data due to privacy concerns and communication errors. In this context, a novel load cleansing and imputation scheme is developed leveraging the low intrinsic-dimensionality of spatiotemporal load profiles and the sparse nature of "bad data.'' A robust estimator based on principal components pursuit (PCP) is adopted, which effects a twofold sparsity-promoting regularization through an â„“1\ell_1-norm of the outliers, and the nuclear norm of the nominal load profiles. Upon recasting the non-separable nuclear norm into a form amenable to decentralized optimization, a distributed (D-) PCP algorithm is developed to carry out the imputation and cleansing tasks using networked devices comprising the so-termed advanced metering infrastructure. If D-PCP converges and a qualification inequality is satisfied, the novel distributed estimator provably attains the performance of its centralized PCP counterpart, which has access to all networkwide data. Computer simulations and tests with real load curve data corroborate the convergence and effectiveness of the novel D-PCP algorithm.Comment: 8 figures, submitted to IEEE Transactions on Smart Grid - Special issue on "Optimization methods and algorithms applied to smart grid

    Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator

    Full text link
    The incorporation of biasing words obtained through contextual knowledge is of paramount importance in automatic speech recognition (ASR) applications. This paper proposes an innovative method for achieving end-to-end contextual ASR using graph neural network (GNN) encodings based on the tree-constrained pointer generator method. GNN node encodings facilitate lookahead for future word pieces in the process of ASR decoding at each tree node by incorporating information about all word pieces on the tree branches rooted from it. This results in a more precise prediction of the generation probability of the biasing words. The study explores three GNN encoding techniques, namely tree recursive neural networks, graph convolutional network (GCN), and GraphSAGE, along with different combinations of the complementary GCN and GraphSAGE structures. The performance of the systems was evaluated using the Librispeech and AMI corpus, following the visual-grounded contextual ASR pipeline. The findings indicate that using GNN encodings achieved consistent and significant reductions in word error rate (WER), particularly for words that are rare or have not been seen during the training process. Notably, the most effective combination of GNN encodings obtained more than 60% WER reduction for rare and unseen words compared to standard end-to-end systems.Comment: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processin
    • …
    corecore