131 research outputs found

    Role of sentiment classification in sentiment analysis: a survey

    Through a survey of the literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles published between 2015 and 2017 have been reviewed along six dimensions: sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexicon and corpus creation, and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation and concludes that substantial further research is still needed to achieve productive results.

    Feature analysis for discriminative confidence estimation in spoken term detection

    This is the author’s version of a work that was accepted for publication in Computer Speech & Language. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Speech & Language, 28(5), 2014, DOI: 10.1016/j.csl.2013.09.008. Discriminative confidence based on multi-layer perceptrons (MLPs) and multiple features has shown a significant advantage over the widely used lattice-based confidence in spoken term detection (STD). Although the MLP-based framework can handle features derived from a multitude of sources, choosing all possible features may lead to overly complex models and hence less generality. In this paper, we design an extensive set of features and analyze their contribution to STD individually and as a group. The main goal is to choose a small set of features that is sufficiently informative while keeping the model simple and generalizable. We employ two established models to conduct the analysis: one is linear regression, which targets the most relevant features, and the other is logistic linear regression, which targets the most discriminative features. We find that the most informative features are those derived from diverse sources (ASR decoding, duration and lexical properties), and that the two models deliver highly consistent feature rankings. STD experiments on both English and Spanish data demonstrate significant performance gains with the proposed feature sets. This work has been partially supported by project PriorSPEECH (TEC2009-14719-C02-01) from the Spanish Ministry of Science and Innovation and by project MAV2VICMR (S2009/TIC-1542) from the Community of Madrid
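    The discriminative confidence estimation the abstract describes can be sketched as a logistic combination of heterogeneous features. This is a minimal illustration, not the paper's model: the feature names, values, and weights below are invented stand-ins for the kinds of ASR-decoding, duration, and lexical features the authors analyze.

```python
import math

# Hypothetical feature values for one detected term occurrence; the
# paper's actual feature set spans ASR decoding, duration and lexical
# properties -- these names and numbers are illustrative only.
features = {
    "lattice_posterior": 0.72,   # ASR decoding feature
    "duration_ratio":    0.95,   # observed vs. expected duration
    "term_length":       6.0,    # lexical property (characters)
}

# Illustrative weights a logistic linear regression might learn.
weights = {
    "lattice_posterior": 3.0,
    "duration_ratio":    1.5,
    "term_length":       0.1,
}
bias = -3.5

def discriminative_confidence(feats, w, b):
    """Map a linear combination of features to a (0, 1) confidence."""
    z = b + sum(w[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

conf = discriminative_confidence(features, weights, bias)
```

    In the paper this role is filled by an MLP or logistic linear regression trained on held-out data; the sketch only shows why a small, diverse feature set can already yield a calibrated score.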

    Advances in deep learning methods for speech recognition and understanding

    This work presents several studies in the areas of speech recognition and understanding. Semantic speech understanding is an important sub-domain of the broader field of artificial intelligence. 
Speech processing has long interested researchers, since language is one of the defining characteristics of a human being. With the development of neural networks, the domain has seen rapid progress both in terms of accuracy and human perception. Another important milestone was achieved with the development of end-to-end approaches. Such approaches allow co-adaptation of all parts of the model, increasing performance as well as simplifying the training procedure. End-to-end models became feasible with the increasing amount of available data and computational resources, and most importantly with many novel architectural developments. Nevertheless, traditional (non-end-to-end) approaches are still relevant for speech processing because of challenging data in noisy environments, accented speech, and a wide variety of dialects. In the first work, we explore hybrid speech recognition in noisy environments. We propose to treat recognition under unseen noise conditions as a domain adaptation task. For this, we use the then-novel technique of adversarial domain adaptation. In a nutshell, this prior work proposed to train features in such a way that they are discriminative for the primary task but non-discriminative for the secondary task, which is constructed to be a domain recognition task. The features trained in this way are thus invariant to the domain at hand. In our work, we adopt this technique and modify it for the task of noisy speech recognition. In the second work, we develop a general method for regularizing generative recurrent networks. It is known that recurrent networks frequently have difficulty staying on the same track when generating long outputs. While it is possible to use bi-directional networks for better sequence aggregation in feature learning, this is not applicable in the generative case. 
We developed a way to improve the consistency of generating long sequences with recurrent networks. We propose a way to construct a model similar to a bi-directional network. The key insight is to use a soft L2 loss between the forward and the backward generative recurrent networks. We provide an experimental evaluation on a multitude of tasks and datasets, including speech recognition, image captioning, and language modeling. In the third paper, we investigate the possibility of developing an end-to-end intent recognizer for spoken language understanding. Semantic spoken language understanding is an important step towards developing a human-like artificial intelligence. We have seen that end-to-end approaches show high performance on tasks including machine translation and speech recognition. We draw inspiration from prior work to develop an end-to-end system for intent recognition.
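    The soft L2 regularizer between the forward and backward generative networks can be sketched as follows. This is an assumption-laden illustration: the hidden-state sequences are random stand-ins, and the time-reversal alignment of the backward network's states is one plausible reading of the described setup, not the thesis's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in hidden-state sequences from a forward and a backward
# generative recurrent network over the same target sequence
# (shape: time steps x hidden size); values are random placeholders.
T, H = 5, 8
forward_states = rng.normal(size=(T, H))
backward_states = rng.normal(size=(T, H))

def twin_l2_loss(fwd, bwd):
    """Soft L2 penalty pulling each forward state toward the
    time-aligned state of the backward network.  The backward net
    generates the sequence in reverse, so its states are flipped
    along the time axis before comparison."""
    aligned = bwd[::-1]                     # reverse time axis
    diff = fwd - aligned
    return float(np.mean(np.sum(diff ** 2, axis=1)))

loss = twin_l2_loss(forward_states, backward_states)
```

    In training, this scalar would be added (with a weight) to the usual generation loss, so both networks are encouraged to agree on a shared representation of the long sequence.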

    Building multi-domain conversational systems from single domain resources

    Current advances in the development of mobile and smart devices have generated a growing demand for natural human-machine interaction and favored the intelligent assistant metaphor, in which a single interface gives access to a wide range of functionalities and services. Conversational systems constitute an important enabling technology in this paradigm. However, they are usually designed to interact in semantically restricted domains in which users are offered a limited number of options and functionalities. The design of multi-domain systems implies that a single conversational system is able to assist the user in a variety of tasks. In this paper we propose an architecture for the development of multi-domain conversational systems that allows: (1) integrating available multi- and single-domain speech recognition and understanding modules, (2) combining available systems in the different domains implied, so that it is not necessary to generate new, expensive resources for the multi-domain system, and (3) achieving better domain recognition rates to select the appropriate interaction management strategies. We have evaluated our proposal by combining three systems in different domains to show that the proposed architecture can satisfactorily deal with multi-domain dialogs. (C) 2017 Elsevier B.V. All rights reserved. Work partially supported by projects MINECO TEC2012-37832-C02-01 and CICYT TEC2011-28626-C02-02
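    The routing role of the domain recognizer in such an architecture can be sketched very roughly. This is not the paper's method: in the real system, domain recognition builds on speech recognition and understanding modules, whereas here it is reduced to keyword voting over three invented toy domains purely to show the selection step.

```python
# Illustrative toy domains; the keyword sets are invented.
DOMAIN_KEYWORDS = {
    "weather": {"rain", "forecast", "temperature", "sunny"},
    "travel":  {"flight", "ticket", "hotel", "train"},
    "banking": {"account", "balance", "transfer", "loan"},
}

def recognize_domain(utterance):
    """Pick the single-domain subsystem whose keyword set overlaps
    most with the user's utterance (ties broken by dict order)."""
    tokens = set(utterance.lower().split())
    scores = {d: len(tokens & kw) for d, kw in DOMAIN_KEYWORDS.items()}
    return max(scores, key=scores.get)

domain = recognize_domain("what is the forecast for tomorrow, sunny or rain")
```

    Once the domain is chosen, the architecture hands the dialog to that domain's existing interaction manager, which is what lets single-domain resources be reused unchanged.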

    Machine Learning for Microcontroller-Class Hardware -- A Review

    The advancements in machine learning have opened a new opportunity to bring intelligence to low-end Internet-of-Things nodes such as microcontrollers. Conventional machine learning deployments have high memory and compute footprints, hindering their direct deployment on ultra-resource-constrained microcontrollers. This paper highlights the unique requirements of enabling onboard machine learning for microcontroller-class devices. Researchers use a specialized model development workflow for resource-limited applications to ensure the compute and latency budget is within the device limits while still maintaining the desired performance. We characterize a closed-loop, widely applicable workflow of machine learning model development for microcontroller-class devices and show that several classes of applications adopt a specific instance of it. We present both qualitative and numerical insights into different stages of model development by showcasing several use cases. Finally, we identify the open research challenges and unsolved questions demanding careful consideration moving forward. Comment: Accepted for publication in IEEE Sensors Journal
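    One recurring step in such a workflow is checking whether a candidate model's parameter memory fits the device budget before any further tuning. The sketch below is a back-of-the-envelope illustration with assumed numbers (a 256 KiB flash budget and a hypothetical 300k-parameter model), not figures from the review.

```python
# Rough memory-budget check: does a model's parameter storage fit a
# microcontroller's flash?  All sizes here are assumed example values.
def param_bytes(num_params, bits_per_weight):
    """Storage needed for the weights alone, ignoring activations."""
    return num_params * bits_per_weight // 8

FLASH_BUDGET = 256 * 1024          # e.g. 256 KiB of flash (assumed)
num_params = 300_000               # hypothetical small model

float32_size = param_bytes(num_params, 32)   # 1,200,000 B: too big
int8_size    = param_bytes(num_params, 8)    #   300,000 B: still too big
int4_size    = param_bytes(num_params, 4)    #   150,000 B: fits

fits = {bits: param_bytes(num_params, bits) <= FLASH_BUDGET
        for bits in (32, 8, 4)}
```

    In the closed-loop workflow, a failed check like this feeds back into the model design stage (smaller architecture, stronger quantization, pruning) rather than being patched at deployment time.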

    A machine learning taxonomic classifier for science publications

    Master's dissertation in Engineering and Management of Information Systems. The evolution of scientific production, associated with the growing cross-domain collaboration of knowledge and the increasing co-authorship of scientific works, remains supported by manual, highly subjective classification processes that are prone to misinterpretation. The very taxonomy on which this classification process is based is not consensual: governmental organizations resort to taxonomies that do not keep up with changes in scientific areas, while indexers and repositories seek to keep up with those changes. We find a reality distinct from what is expected, in which the domains under which scientific works are recorded can easily misrepresent the works themselves. The taxonomy applied today by governmental bodies, such as the one that regulates scientific production in Portugal, is insufficient and limiting, and promotes classification into areas that only approximate the desired ones, with great potential for error. An automatic classification process based on machine learning algorithms presents itself as a possible solution to the subjectivity problem in classification; while it does not solve the issue of taxonomy mismatch, this work demonstrates the possibility with proven results. In this work, we propose a classification taxonomy and develop a process based on machine learning algorithms to solve the classification problem. We also present a set of directions for future work towards an increasingly representative classification of the evolution of science, one that is not intended to be airtight, but flexible and perhaps increasingly based on phenomena and not just disciplines
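    The kind of supervised text classifier such a process relies on can be sketched in miniature. This is a generic multinomial naive Bayes toy, not the dissertation's actual pipeline: the two taxonomy areas and the tiny training corpus are invented for illustration.

```python
import math
from collections import Counter

# Invented two-class corpus of abstract-like snippets.
train = {
    "computer_science": ["neural network training data",
                         "machine learning model evaluation"],
    "biology":          ["gene expression cell protein",
                         "protein folding cell membrane"],
}

# Per-class word counts and the shared vocabulary.
counts = {c: Counter(w for doc in docs for w in doc.split())
          for c, docs in train.items()}
vocab = {w for cnt in counts.values() for w in cnt}

def classify(text):
    """Assign the class with the highest add-one-smoothed
    log-likelihood of the text's words."""
    words = text.split()
    def score(c):
        total = sum(counts[c].values())
        return sum(math.log((counts[c][w] + 1) / (total + len(vocab)))
                   for w in words)
    return max(counts, key=score)

label = classify("deep learning model for protein data")
```

    A production version would classify full abstracts into the proposed taxonomy with far richer features, but the core idea, learning per-area word statistics instead of relying on a human's subjective choice, is the same.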

    Quantifying the Performance of Explainability Algorithms

    Given the complexity of deep neural networks (DNNs), they have long been criticized for the lack of interpretability in their decision-making process. This 'black box' nature has been preventing the adoption of DNNs in life-critical tasks. In recent years, there has been a surge of interest around the concept of artificial intelligence explainability/interpretability (XAI), where the goal is to produce an interpretation for a decision made by a DNN algorithm. While many explainability algorithms have been proposed for peeking into the decision-making process of DNNs, there has been limited exploration into assessing the performance of explainability methods, with most evaluations centred around subjective human visual perception of the produced interpretations. In this study, we explore a more objective strategy for quantifying the performance of explainability algorithms on DNNs. More specifically, we propose two quantitative performance metrics: (i) Impact Score and (ii) Impact Coverage. Impact Score assesses the percentage of critical factors with either a strong confidence-reduction impact or a decision-shifting impact. Impact Coverage assesses the percentage overlap of adversarially impacted factors in the input. Furthermore, a comprehensive analysis using this approach was conducted on several explainability methods (LIME, SHAP, and Expected Gradients) across different task domains, such as visual perception, speech recognition and natural language processing (NLP). The empirical evidence suggests that there is significant room for improvement for all evaluated explainability methods. At the same time, the evidence also suggests that even the latest explainability methods cannot produce consistently better results across different task domains and test scenarios
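    An Impact-Score-style computation can be sketched as follows: perturb each factor the explainability method flags as critical and check for a strong confidence drop or a decision flip. The model here is a stand-in linear scorer, and the occlusion scheme, threshold, and data are all illustrative assumptions, not the paper's exact protocol.

```python
import math

def predict(x):
    """Toy binary model: sigmoid of the feature sum, thresholded
    at 0.5.  Stands in for the DNN under evaluation."""
    prob = 1.0 / (1.0 + math.exp(-sum(x)))
    return (1 if prob >= 0.5 else 0), prob

def impact_score(x, critical_idx, drop_threshold=0.2):
    """Fraction of flagged critical factors whose occlusion either
    flips the decision or reduces confidence by the threshold."""
    base_label, base_prob = predict(x)
    impacted = 0
    for i in critical_idx:
        perturbed = list(x)
        perturbed[i] = 0.0                  # occlude one factor
        label, prob = predict(perturbed)
        if label != base_label or base_prob - prob >= drop_threshold:
            impacted += 1
    return impacted / len(critical_idx)

x = [2.0, -0.5, 1.5, 0.1]                   # illustrative input
result = impact_score(x, critical_idx=[0, 2, 3])
```

    A higher value means the explanation's flagged factors really do drive the model's decision; Impact Coverage would additionally measure how well those flagged factors overlap with adversarially impacted regions of the input.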