3 research outputs found

    NMF-based compositional models for audio source separation

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2017. 2. ๊น€๋‚จ์ˆ˜.Many classes of data can be represented by constructive combinations of parts. Most signal and data from nature have nonnegative values and can be explained and reconstructed by constructive models. By the constructive models, only the additive combination is allowed and it does not result in subtraction of parts. The compositional models include dictionary learning, exemplar-based approaches, and nonnegative matrix factorization (NMF). Compositional models are desirable in many areas including image or visual signal processing, text information processing, audio signal processing, and music information retrieval. In this dissertation, we choose NMF for compositional models and NMF-based target source separation is performed for the application. The target source separation is the extraction or reconstruction of the target signals in the mixture signals which consists with the target and interfering signals. The target source separation can be thought as blind source separation (BSS). BSS aims that the original unknown source signals are extracted without knowing or with very limited information. However, in these days, much of prior information is frequently utilized, and various approaches have been proposed for single channel source separation. NMF basically approximates a nonnegative data matrix V with a product of nonnegative basis and encoding matrices W and H, i.e., V WH. Since both W and H are nonnegative, NMF often leads to a part based representation of the data. The methods based on NMF have shown impressive results in single channel source separation The objective function of NMF is generally presented Euclidean distant, Kullback-Leibler divergence, and Itakura-saito divergence. Many optimization methods have been proposed and utilized, e.g., multiplicative update rule, projected gradient descent and NeNMF. However, NMF-based audio source separation has some issues as follows: non-uniqueness of the bases, a high dependence to the prior information, the overlapped subspace between target bases and interfering bases, a disregard of the encoding vectors from the training phase, and insucient analysis of sparse NMF. In this dissertation, we propose new approaches to resolve the above issues. In section 4, we propose a novel speech enhancement method that combines the statistical model-based enhancement scheme with the NMF-based gain function. For a better performance in time-varying noise environments, both the speech and noise bases of NMF are adapted simultaneously with the help of the estimated speech presence probability. In section 5, we propose a discriminative NMF (DNMF) algorithm which exploits the reconstruction error for the interfering signals as well as the target signal based on target bases. In section 6, we propose an approach to robust bases estimation in which an incremental strategy is adopted. Based on an analogy between clustering and NMF analysis, we incrementally estimate the NMF bases similar to the modied k-means and Linde-Buzo-Gray algorithms popular in the data clustering area. In Section 7, the distribution of the encoding vector is modeled as a multivariate exponential PDF (MVE) with a single scaling factor for each source. In Section 8, several sparse penalty terms for NMF are analyzed and compared in terms of signal to distortion ratio, sparseness of encoding vectors, reconstruction error, and entropy of basis vectors. The new objective function which applied sparse representation and discriminative NMF (DNMF) is also proposed.1 Introduction 1 1.1 Audio source separation 1 1.2 Speech enhancement 3 1.3 Measurements 4 1.4 Outline of the dissertation 6 2 Compositional model and NMF 9 2.1 Compositional model 9 2.2 NMF 14 2.2.1 Update rules: MuR, PGD 16 2.2.2 Modied NMF 20 3 NMF-based audio source separation and issues 23 3.1 NMF-based audio source separation 23 3.2 Problems of NMF in audio source separation 26 3.2.1 A high dependency to the prior knowledge 26 3.2.2 A overlapped subspace between the target and interfering basis matrices 28 3.2.3 A non-uniqueness of the bases 29 3.2.4 A prior knowledge of the encoding vectors 30 3.2.5 Sparse NMF for the source separation 32 4 Online bases update 33 4.1 Introduction 33 4.2 NMF-based speech enhancement using spectral gain function 36 4.3 Speech enhancement combining statistical model-based and NMFbased methods with the on-line bases update 38 4.3.1 On-line update of speech and noise bases 40 4.3.2 Determining maximum update rates 42 4.4 Experiment result 43 5 Discriminative NMF 47 5.1 Introduction 47 5.2 Discriminative NMF utilizing cross reconstruction error 48 5.2.1 DNMF using the reconstruction error of the other source 49 5.2.2 DNMF using the interference factors 50 5.3 Experiment result 52 6 Incremental approach for bases estimate 57 6.1 Introduction 57 6.2 Incremental approach based on modied k-means clustering and Linde-Buzo-Gray algorithm 59 6.2.1 Based on modied k-means clustering 59 6.2.2 LBG based incremental approach 62 6.3 Experiment result 63 6.3.1 Modied k-means clustering based approach 63 6.3.2 LBG based approach 66 7 Prior model of encoding vectors 77 7.1 Introduction 77 7.2 Prior model of encoding vectors based on multivariate exponential distribution 78 7.3 Experiment result 82 8 Conclusions 87 Bibliography 91 ๊ตญ๋ฌธ์ดˆ๋ก 105Docto

    Efficient Incremental Model Learning on Data Streams

    Get PDF
    abstract: With the development of modern technological infrastructures, such as social networks or the Internet of Things (IoT), data is being generated at a speed that is never before seen. Analyzing the content of this data helps us further understand underlying patterns and discover relationships among different subsets of data, enabling intelligent decision making. In this thesis, I first introduce the Low-rank, Win-dowed, Incremental Singular Value Decomposition (SVD) framework to inclemently maintain SVD factors over streaming data. Then, I present the Group Incremental Non-Negative Matrix Factorization framework to leverage redundancies in the data to speed up incremental processing. They primarily tackle the challenges of using factorization models in the scenarios with streaming textual data. In order to tackle the challenges in improving the effectiveness and efficiency of generative models in this streaming environment, I introduce the Incremental Dynamic Multiscale Topic Model framework, which identifies multi-scale patterns and their evolutions within streaming datasets. While the latent factor models assume the linear independence in the latent factors, the generative models assume the observation is generated from a set of latent variables with various distributions. Furthermore, some models may not be accessible or their underlying structures are too complex to understand, such as simulation ensembles, where there may be thousands of parameters with a huge parameter space, the only way to learn information from it is to execute real simulations. When performing knowledge discovery and decision making through data- and model-driven simulation ensembles, it is expensive to operate these ensembles continuously at large scale, due to the high computational. Consequently, given a relatively small simulation budget, it is desirable to identify a sparse ensemble that includes the most informative simulations, while still permitting effective exploration of the input parameter space. Therefore, I present Complexity-Guided Parameter Space Sampling framework, which is an intelligent, top-down sampling scheme to select the most salient simulation parameters to execute, given a limited computational budget. Moreover, I also present a Pivot-Guided Parameter Space Sampling framework, which incrementally maintains a diverse ensemble of models of the simulation ensemble space and uses a pivot guided mechanism for future sample selection.Dissertation/ThesisDoctoral Dissertation Computer Science 201
    corecore