4,239 research outputs found

    The Structure Transfer Machine Theory and Applications

    Get PDF
    Representation learning is a fundamental but challenging problem, especially when the distribution of data is unknown. We propose a new representation learning method, termed Structure Transfer Machine (STM), which enables feature learning process to converge at the representation expectation in a probabilistic way. We theoretically show that such an expected value of the representation (mean) is achievable if the manifold structure can be transferred from the data space to the feature space. The resulting structure regularization term, named manifold loss, is incorporated into the loss function of the typical deep learning pipeline. The STM architecture is constructed to enforce the learned deep representation to satisfy the intrinsic manifold structure from the data, which results in robust features that suit various application scenarios, such as digit recognition, image classification and object tracking. Compared to state-of-the-art CNN architectures, we achieve the better results on several commonly used benchmarks\footnote{The source code is available. https://github.com/stmstmstm/stm }

    Generalization Error in Deep Learning

    Get PDF
    Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still generally unclear what is the source of their generalization ability. Thus, an important question is what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results
    • …
    corecore