4,355 research outputs found

    Knowledge Graph semantic enhancement of input data for improving AI

    Intelligent systems designed using machine learning algorithms require large amounts of labeled data. Background knowledge provides complementary, real-world factual information that can augment the limited labeled data used to train a machine learning algorithm. The term Knowledge Graph (KG) has come into vogue because, for many practical applications, it is convenient and useful to organize this background knowledge in the form of a graph. Recent academic research and deployed industrial intelligent systems have shown promising performance for machine learning algorithms that combine training data with a knowledge graph. In this article, we discuss the use of relevant KGs to enhance the input data for two applications that use machine learning -- recommendation and community detection. The KG improves both accuracy and explainability.
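
    As a rough, hypothetical illustration of the idea in this abstract (not the authors' system), the sketch below enriches item representations with facts from a toy knowledge graph and ranks recommendation candidates by the similarity of those KG-derived features. The KG contents, item names, and cosine scoring are all made-up assumptions.

```python
# A minimal sketch (not the authors' system): enrich sparse item data with
# facts pulled from a knowledge graph before scoring a recommender.
import numpy as np

# Hypothetical KG: item -> set of linked entities (e.g., genres, directors).
kg = {
    "matrix":    {"sci-fi", "action", "wachowski"},
    "inception": {"sci-fi", "thriller", "nolan"},
    "memento":   {"thriller", "nolan"},
}

entities = sorted({e for links in kg.values() for e in links})

def kg_features(item: str) -> np.ndarray:
    """Binary vector marking which KG entities the item is linked to."""
    links = kg.get(item, set())
    return np.array([1.0 if e in links else 0.0 for e in entities])

def recommend(liked: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Rank candidates by cosine similarity of their KG-derived features."""
    u = kg_features(liked)
    scores = []
    for c in candidates:
        v = kg_features(c)
        sim = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))
        scores.append((c, sim))
    return sorted(scores, key=lambda x: x[1], reverse=True)

print(recommend("inception", ["matrix", "memento"]))
```

    In practice the binary entity indicators would typically be replaced by learned KG embeddings, but the enrichment step, adding KG facts to the model's input, is the same.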

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
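
    The sketch below illustrates only the retrieval half of an SCR pipeline, under the assumption that the spoken documents have already been transcribed by an ASR system (the transcription step is omitted). The hypothetical transcripts are indexed and ranked with plain TF-IDF as a stand-in for the IR component the survey describes.

```python
# A minimal sketch of retrieval over ASR output: transcripts are assumed to
# exist already; they are ranked against a text query with plain TF-IDF.
import math
from collections import Counter

transcripts = {
    "ep01": "today we discuss neural networks and speech recognition",
    "ep02": "a conversation about cooking pasta at home",
    "ep03": "speech recognition errors and how retrieval can tolerate them",
}

def tokenize(text: str) -> list[str]:
    return text.lower().split()

# Document frequencies for the IDF term.
df = Counter()
for text in transcripts.values():
    df.update(set(tokenize(text)))

def tfidf_score(query: str, doc_id: str) -> float:
    """Sum of tf * idf over query terms that appear in the transcript."""
    tf = Counter(tokenize(transcripts[doc_id]))
    n_docs = len(transcripts)
    score = 0.0
    for term in tokenize(query):
        if term in tf:
            score += tf[term] * math.log(n_docs / df[term])
    return score

query = "speech recognition"
ranked = sorted(transcripts, key=lambda d: tfidf_score(query, d), reverse=True)
print(ranked)  # transcripts ordered by relevance to the spoken-content query
```

    Real SCR systems additionally have to cope with recognition errors, for example by indexing lattices or confusion networks rather than a single best transcript.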

    Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives

    Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains. Because adversarial training is a potentially crucial technique for developing the next generation of emotional AI systems, we herein provide a comprehensive overview of its application to affective computing and sentiment analysis. Various representative adversarial training algorithms, aimed at tackling the diverse challenges associated with emotional AI systems, are explained and discussed. Further, we highlight a range of potential future research directions. We expect that this overview will help facilitate the development of adversarial training for affective computing and sentiment analysis in both the academic and industrial communities.
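
    As one small, generic illustration (not taken from the paper), the sketch below applies FGSM-style input perturbations while training a toy bag-of-words sentiment classifier, so that the model is optimized on both clean and adversarially perturbed examples. The data, epsilon, and learning rate are arbitrary assumptions.

```python
# A minimal sketch of FGSM-style adversarial training for a toy sentiment
# classifier (logistic regression on synthetic features). Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 20))                     # toy "review" feature vectors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # toy sentiment labels

w, b = np.zeros(20), 0.0
lr, eps = 0.1, 0.05

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    p = sigmoid(X @ w + b)
    grad_x = (p - y)[:, None] * w        # d(loss)/d(input), per example
    X_adv = X + eps * np.sign(grad_x)    # FGSM: worst-case small perturbation

    # Update on clean and adversarial batches so the model is robust to both.
    for Xb in (X, X_adv):
        p = sigmoid(Xb @ w + b)
        w -= lr * (Xb.T @ (p - y) / len(y))
        b -= lr * float(np.mean(p - y))

acc = np.mean((sigmoid(X @ w + b) > 0.5) == y.astype(bool))
print(f"training accuracy after adversarial training: {acc:.2f}")
```

    The same recipe carries over to neural sentiment models, where the perturbation is usually applied to word embeddings rather than to raw feature vectors.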

    Data Augmentation Techniques for Natural Language Processing Using Deep Learning-based Generative Models

    Thesis (Ph.D.), Department of Computer Science and Engineering, College of Engineering, Graduate School of Seoul National University, February 2020. Advisor: Sang-goo Lee (์ด์ƒ๊ตฌ). Recent advances in the generation capability of deep learning models have spurred interest in utilizing deep generative models for unsupervised generative data augmentation (GDA). Generative data augmentation aims to improve the performance of a downstream machine learning model by augmenting the original dataset with samples generated from a deep latent variable model. This data augmentation approach is attractive to the natural language processing community because (1) there is a shortage of text augmentation techniques that require little supervision and (2) resource scarcity is prevalent. In this dissertation, we explore the feasibility of exploiting deep latent variable models for data augmentation on three NLP tasks: sentence classification, spoken language understanding (SLU) and dialogue state tracking (DST). These tasks represent NLP problems of varying complexity and properties -- SLU requires multi-task learning of text classification and sequence tagging, while DST requires the understanding of hierarchical and recurrent data structures. For each of the three tasks, we propose a task-specific latent variable model based on conditional, hierarchical and sequential variational autoencoders (VAE) for multi-modal joint modeling of linguistic features and the relevant annotations. We conduct extensive experiments to statistically justify our hypothesis that deep generative data augmentation is beneficial for all subject tasks. Our experiments show that deep generative data augmentation is effective for the selected tasks, supporting the idea that the technique can potentially be utilized for a wider range of NLP tasks. Ablation and qualitative studies reveal deeper insight into the underlying mechanisms of generative data augmentation. As a secondary contribution, we also shed light on the recurring posterior collapse phenomenon in autoregressive VAEs and propose novel techniques to mitigate it, which is crucial for the proper training of complex VAE models and enables them to synthesize better samples for data augmentation. In summary, this work demonstrates and analyzes the effectiveness of unsupervised generative data augmentation in NLP. Ultimately, our approach enables standardized adoption of generative data augmentation, which can be applied orthogonally to existing regularization techniques.
    Contents:
    1 Introduction
      1.1 Motivation
      1.2 Dissertation Overview
    2 Background and Related Work
      2.1 Deep Latent Variable Models
        2.1.1 Variational Autoencoder (VAE)
        2.1.2 Deep Generative Models and Text Generation
      2.2 Data Augmentation
        2.2.1 General Description
        2.2.2 Categorization of Data Augmentation
        2.2.3 Theoretical Explanations
      2.3 Summary
    3 Basic Task: Text Classification
      3.1 Introduction
      3.2 Our Approach
        3.2.1 Proposed Models
        3.2.2 Training with I-VAE
      3.3 Experiments
        3.3.1 Datasets
        3.3.2 Experimental Settings
        3.3.3 Implementation Details
        3.3.4 Data Augmentation Results
        3.3.5 Ablation Studies
        3.3.6 Qualitative Analysis
      3.4 Summary
    4 Multi-task Learning: Spoken Language Understanding
      4.1 Introduction
      4.2 Related Work
      4.3 Model Description
        4.3.1 Framework Formulation
        4.3.2 Joint Generative Model
      4.4 Experiments
        4.4.1 Datasets
        4.4.2 Experimental Settings
        4.4.3 Generative Data Augmentation Results
        4.4.4 Comparison to Other State-of-the-art Results
        4.4.5 Ablation Studies
      4.5 Summary
    5 Complex Data: Dialogue State Tracking
      5.1 Introduction
      5.2 Background and Related Work
        5.2.1 Task-oriented Dialogue
        5.2.2 Dialogue State Tracking
        5.2.3 Conversation Modeling
      5.3 Variational Hierarchical Dialogue Autoencoder (VHDA)
        5.3.1 Notations
        5.3.2 Variational Hierarchical Conversational RNN
        5.3.3 Proposed Model
        5.3.4 Posterior Collapse
      5.4 Experimental Results
        5.4.1 Experimental Settings
        5.4.2 Data Augmentation Results
        5.4.3 Intrinsic Evaluation - Language Evaluation
        5.4.4 Qualitative Results
      5.5 Summary
    6 Conclusion
      6.1 Summary
      6.2 Limitations
      6.3 Future Work