9 research outputs found

    Dimensionality-dependent generalization bounds for k-dimensional coding schemes

    Full text link
    © 2016 Massachusetts Institute of Technology. The k-dimensional coding schemes refer to a collection of methods that represent data using a set of representative k-dimensional vectors; they include nonnegative matrix factorization, dictionary learning, sparse coding, k-means clustering, and vector quantization as special cases. Previous generalization bounds for the reconstruction error of k-dimensional coding schemes are mainly dimensionality-independent. A major advantage of such bounds is that they can be used to analyze the generalization error when data are mapped into an infinite- or high-dimensional feature space. However, many applications use finite-dimensional data features. Can we obtain dimensionality-dependent generalization bounds for k-dimensional coding schemes that are tighter than dimensionality-independent bounds when data lie in a finite-dimensional feature space? Yes. In this letter, we address this problem and derive a dimensionality-dependent generalization bound for k-dimensional coding schemes by bounding the covering number of the loss function class induced by the reconstruction error. The bound is of order O((mk ln(mkn)/n)^λn), where m is the dimension of the features, k is the number of columns in the linear implementation of the coding schemes, and n is the sample size; λn > 0.5 when n is finite and λn = 0.5 when n is infinite. We show that our bound can be tighter than previous results because it avoids inducing the worst-case upper bound on k of the loss function. The proposed generalization bound is also applied to some specific coding schemes to demonstrate that the dimensionality-dependent bound is an indispensable complement to dimensionality-independent generalization bounds.
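    In display form, the claimed rate reads as follows (a transcription of the abstract's notation, not the paper's exact theorem statement):

```latex
% Rate claimed in the abstract: m = feature dimension,
% k = number of columns in the linear implementation, n = sample size.
\[
  \mathcal{O}\!\left(\left(\frac{mk\,\ln(mkn)}{n}\right)^{\lambda_n}\right),
  \qquad \lambda_n > \tfrac{1}{2} \ \text{for finite } n,
  \qquad \lambda_n = \tfrac{1}{2} \ \text{when } n \text{ is infinite}.
\]
```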

    Ensemble deep learning: A review

    Get PDF
    Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architectures show better performance than shallow or traditional classification models. Deep ensemble learning models combine the advantages of deep learning models and ensemble learning so that the final model has better generalization performance. This paper reviews state-of-the-art deep ensemble models and thus serves as an extensive summary for researchers. The ensemble models are broadly categorised into bagging, boosting, and stacking ensembles; negative-correlation-based deep ensemble models; explicit/implicit ensembles; homogeneous/heterogeneous ensembles; decision fusion strategies; and unsupervised, semi-supervised, reinforcement learning, online/incremental, and multilabel-based deep ensemble models. Applications of deep ensemble models in different domains are also briefly discussed. Finally, we conclude with some future recommendations and research directions.
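    As a concrete reference point for the simplest fusion strategy such reviews cover, the sketch below implements unweighted soft voting, i.e., averaging the member models' class-probability outputs. The three "members" here are random stand-ins, not real trained networks:

```python
import numpy as np

def soft_vote(prob_list):
    """Unweighted soft voting: average the members' class-probability
    outputs, then take the argmax as the ensemble prediction."""
    avg = np.mean(np.stack(prob_list), axis=0)  # (n_samples, n_classes)
    return avg.argmax(axis=1)

# Illustrative stand-ins: three 'models' each emitting class probabilities
# for 4 samples over 3 classes.
rng = np.random.default_rng(0)
members = [rng.dirichlet(np.ones(3), size=4) for _ in range(3)]
print(soft_vote(members))  # ensemble labels for the 4 samples
```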

    ๋ฐ€๋„ํ‘œํ˜„ ํ•™์Šต ๋ฐฉ๋ฒ•๋ก ๊ณผ ๊ฐ์„ฑ๋ถ„์„, ๋„๋ฉ”์ธ ์ ์‘์—์˜ ์‘์šฉ

    Get PDF
    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, February 2018. Advisor: Jaewook Lee.
    As more and more raw data are created and accumulated, it becomes important to extract information from them. Machine learning and deep learning models have become the main tools for analyzing collected data in recent years, but their performance depends heavily on the data representation. Recent work on representation learning has shown that capturing the input density helps extract useful information from data; this dissertation therefore focuses on density-based representation learning. For high-dimensional data, the manifold assumption is a key concept in representation learning, because such data are in fact concentrated near a lower-dimensional high-density region (manifold). For unstructured data, conversion to numerical vectors is necessary before machine learning and deep learning models can be applied; for text data, distributed representation learning can effectively reflect the information in the input while producing continuous vectors for words and documents. This dissertation addresses several issues concerning the manifold of input data and the distributed representation of text data from the viewpoint of density-based representation learning.
    First, denoising autoencoders (DAE) are examined from the perspective of dynamical systems when the input density is defined as a distribution on a manifold. A dynamic projection system is constructed from the score function, which can be obtained directly from an autoencoder trained on Gaussian-convolved input data. Several analytical results for this system are derived and applied to develop a nonlinear projection algorithm that recognizes the high-density region and reduces the noise of corrupted inputs; the algorithm's effectiveness is verified through experiments on toy examples and real image benchmark datasets. A minimal sketch of the projection idea appears below.
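    The projection algorithm itself is not spelled out in this abstract; the sketch below only illustrates the principle it builds on: for a DAE trained on Gaussian-corrupted inputs with noise scale sigma, the reconstruction residual (r(x) - x) / sigma^2 approximates the score (the gradient of the log input density), so repeatedly stepping along r(x) - x moves a corrupted point toward the high-density region. The step size, stopping rule, and toy "DAE" are assumptions for illustration:

```python
import numpy as np

def project_to_manifold(x, reconstruct, step=0.5, n_iters=100, tol=1e-6):
    """Move a (possibly corrupted) point toward the high-density region by
    following the DAE reconstruction residual r(x) - x, which is
    proportional to the score of the smoothed input density."""
    x = np.asarray(x, dtype=float)
    for _ in range(n_iters):
        delta = reconstruct(x) - x   # points toward higher density
        x = x + step * delta
        if np.linalg.norm(delta) < tol:
            break
    return x

# Toy stand-in for a trained DAE: its 'reconstruction' pulls points onto
# the unit circle, a 1-D manifold embedded in 2-D.
dae = lambda x: x / np.linalg.norm(x)
print(project_to_manifold(np.array([2.0, 0.5]), dae))  # lands near the circle
```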
    Second, a support vector domain description (SVDD) model can estimate the input density from the trained kernel radius function under mild conditions on the margin and kernel parameters. A novel inductive ensemble clustering method is proposed in which kernel support matching is applied to a co-association matrix that aggregates arbitrary basic partitions, constructing a new similarity for the kernel radius function. Experimental results demonstrate that the method is effective with respect to clustering quality and robustly induces clusters for out-of-sample data. Low-density regularization methods for the DAE model are also developed by exploiting the energy of the trained kernel radius function; illustrative examples show that the regularization effectively pulls up the energy outside the support.
    Third, learning document representations is important when applying machine learning algorithms to sentiment analysis. Distributed representation learning models for words and documents, a family of neural language models, have been used successfully in many natural language processing (NLP) tasks, including sentiment analysis. However, because such models learn embeddings only with a context-based objective, it is hard for the embeddings to reflect the sentiment of texts. This problem is addressed by introducing a semi-supervised, sentiment-discriminative objective that uses partial sentiment information of documents. The method not only reflects the partial sentiment information but also preserves the local structure induced by the original distributed representation learning objectives, by considering sentiment relationships only between neighboring documents. On real-world datasets, the method is validated through sentiment visualization and classification tasks, achieving consistently superior performance to other representation methods on both the Amazon and Yelp datasets.
    Finally, NLP is one of the most important application areas for domain adaptation, because the properties of texts depend strongly on their corpus, and many domain adaptation methods for NLP are built on numerical representations of texts rather than on the raw textual input. A distributed representation learning method for documents and words is therefore developed for domain adaptation that addresses the support separation problem, wherein the supports of different domains are separable. The proposed method is based on negative sampling and learns document embeddings by assuming that the noise distribution depends on the domain; it divides into two cases according to whether the noise distribution of words also depends on the domain when training word embeddings (see the sketch after this paragraph). Experiments on Amazon reviews verify that the proposed methods outperform other representation methods in terms of visualization and proxy A-distance results, and sentiment classification tasks further validate the document embeddings, with consistently better results than other methods.
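    The abstract does not give the dissertation's exact objective, so the sketch below only illustrates the core idea as read from the text: standard negative sampling draws "noise" words from a single global unigram distribution (typically raised to the 0.75 power), whereas here the noise distribution is made domain-dependent. The vocabulary, counts, and function name are illustrative assumptions:

```python
import numpy as np

def sample_negatives(domain_id, domain_unigram, k=5, power=0.75):
    """Draw k negative word ids from a domain-dependent noise distribution.
    Vanilla negative sampling would use one global unigram**0.75 table;
    here each domain supplies its own word counts."""
    counts = domain_unigram[domain_id] ** power
    noise = counts / counts.sum()
    return np.random.choice(len(noise), size=k, p=noise)

# Illustrative: a 6-word vocabulary whose usage differs across two domains.
domain_unigram = np.array([[50., 30., 10., 5., 4., 1.],   # source domain
                           [5., 10., 30., 50., 4., 1.]])  # target domain
print(sample_negatives(0, domain_unigram))  # skewed toward source-domain words
```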
    In recent years, large amounts of data have become available that are high-dimensional or exist in text form, so representation learning is needed both to capture the manifold of high-dimensional data and to obtain numerical vectors of text that reflect useful information. The proposed algorithms help meet these requirements and can be applied to various data analytics tasks.
    Table of contents:
    1. Introduction
       1.1 Motivation of the Dissertation
       1.2 Aims of the Dissertation
       1.3 Organization of the Dissertation
    2. Stability Analysis of Denoising Autoencoder
       2.1 Chapter Overview
       2.2 Motivation for Using Dynamical System
       2.3 Stability Analysis of the Dynamical Projection System
       2.4 Nonlinear Projection Algorithm
       2.5 Experimental Results (2.5.1 Toy Examples; 2.5.2 Real Datasets)
       2.6 Chapter Summary
    3. Inductive Ensemble Clustering and Low-density Regularization with SVDD
       3.1 Chapter Overview
       3.2 Inductive Ensemble Clustering with Kernel Radius Function (3.2.1 Inductive Support Vector Ensemble Clustering; 3.2.2 Experimental Results)
       3.3 Low-density Regularization of Denoising Autoencoder with Kernel Radius Function (3.3.1 Necessity of Low-density Regularization; 3.3.2 Proposed Method; 3.3.3 Illustrative Experiments)
       3.4 Chapter Summary
    4. Semi-supervised Distributed Representation for Sentiment Analysis
       4.1 Chapter Overview
       4.2 Distributed Representations
       4.3 Proposed Method
       4.4 Experimental Results (4.4.1 Data Description; 4.4.2 Experimental Procedure; 4.4.3 Visualization; 4.4.4 Classification; 4.4.5 Parameter Analysis)
       4.5 Chapter Summary
    5. Domain-Adapted Distributed Representation
       5.1 Chapter Overview
       5.2 Representation Learning for Domain Adaptation
       5.3 Proposed Method
       5.4 Experimental Results (5.4.1 Data Description; 5.4.2 Experimental Design; 5.4.3 Visualization; 5.4.4 Sentiment Classification; 5.4.5 Application to Domain Adversarial Neural Network)
       5.5 Chapter Summary
    6. Conclusion
       6.1 Summary
       6.2 Future Work