
    Training Echo State Networks with Regularization through Dimensionality Reduction

    In this paper we introduce a new framework to train an Echo State Network to predict real-valued time series. The method consists of projecting the output of the internal layer of the network onto a space of lower dimensionality before training the output layer to learn the target task. Notably, we enforce a regularization constraint that leads to better generalization capabilities. We evaluate the performance of our approach on several benchmark tests, using different techniques to train the readout of the network, and achieve superior predictive performance when using the proposed framework. Finally, we provide insight into the effectiveness of the implemented mechanics through a visualization of the trajectory in phase space, relying on the methodologies of nonlinear time-series analysis. By applying our method to well-known chaotic systems, we provide evidence that the lower-dimensional embedding retains the dynamical properties of the underlying system better than the full-dimensional internal states of the network.
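    The recipe described above (drive a fixed reservoir, project its states onto a lower-dimensional space, then fit a linear readout) can be sketched in a few lines. The sketch below assumes PCA as the projection and ridge regression as the readout, which are standard choices rather than details taken from the abstract; all sizes and names are illustrative.

```python
import numpy as np

# Minimal sketch: reservoir states -> low-dimensional projection -> linear readout.
# PCA and ridge regression are assumptions here, not necessarily the paper's choices.
rng = np.random.default_rng(0)

def run_reservoir(u, n_reservoir=500, spectral_radius=0.9, leak=1.0):
    """Drive a random reservoir with a 1-D input sequence u and return its states."""
    W_in = rng.uniform(-0.5, 0.5, size=n_reservoir)
    W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    x, states = np.zeros(n_reservoir), []
    for u_t in u:
        x = (1 - leak) * x + leak * np.tanh(W @ x + W_in * u_t)
        states.append(x.copy())
    return np.array(states)                      # shape (T, n_reservoir)

def fit_readout(states, targets, n_components=20, ridge=1e-6):
    """Project states onto their top principal components, then fit a ridge readout."""
    mean = states.mean(axis=0)
    X = states - mean
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:n_components].T                      # (n_reservoir, n_components)
    Z = X @ P                                    # low-dimensional embedding of the states
    W_out = np.linalg.solve(Z.T @ Z + ridge * np.eye(n_components), Z.T @ targets)
    return mean, P, W_out
```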

    Exact and efficient top-K inference for multi-target prediction by querying separable linear relational models

    Many complex multi-target prediction problems that concern large target spaces are characterised by a need for efficient prediction strategies that avoid computing predictions for all targets explicitly. Examples of such problems emerge in several subfields of machine learning, such as collaborative filtering, multi-label classification, dyadic prediction and biological network inference. In this article we analyse efficient and exact algorithms for computing the top-K predictions in the above problem settings, using a general class of models that we refer to as separable linear relational models. We show how to use those inference algorithms, which are modifications of well-known information retrieval methods, in a variety of machine learning settings. Furthermore, we study the possibility of scoring items incompletely, while still retaining an exact top-K retrieval. Experimental results in several application domains reveal that the so-called threshold algorithm is very scalable, often performing many orders of magnitude more efficiently than the naive approach.
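    For intuition, here is a minimal sketch of Fagin-style threshold-algorithm retrieval for a separable linear score s(q, j) = sum_f q[f] * V[j, f]. It assumes non-negative query weights so that per-feature sorted lists upper-bound the score of unseen items; the variable names (V, q, K) are illustrative and not taken from the article.

```python
import heapq
import numpy as np

def top_k_threshold(V, q, K):
    """Exact top-K items under s(q, j) = q @ V[j], without scoring every item."""
    n_items, n_feat = V.shape
    # Sorted lists per feature, by descending feature value (built once, offline).
    order = [np.argsort(-V[:, f]) for f in range(n_feat)]
    seen, heap = set(), []                        # heap holds (score, item) of current top-K
    for depth in range(n_items):
        # Sorted access: visit the item at this depth in every feature list.
        for f in range(n_feat):
            j = order[f][depth]
            if j not in seen:
                seen.add(j)
                s = float(q @ V[j])               # random access: full score of item j
                if len(heap) < K:
                    heapq.heappush(heap, (s, j))
                elif s > heap[0][0]:
                    heapq.heapreplace(heap, (s, j))
        # Threshold: the best score any still-unseen item could reach.
        tau = sum(q[f] * V[order[f][depth], f] for f in range(n_feat))
        if len(heap) == K and heap[0][0] >= tau:
            break                                 # exact top-K found early
    return sorted(heap, reverse=True)
```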

    A kernel-based framework for learning graded relations from data

    Driven by a large number of potential applications in areas like bioinformatics, information retrieval and social network analysis, the problem setting of inferring relations between pairs of data objects has recently been investigated quite intensively in the machine learning community. To date, current approaches typically consider datasets containing crisp relations, so that standard classification methods can be adopted. However, relations between objects like similarities and preferences are often expressed in a graded manner in real-world applications. A general kernel-based framework for learning relations from data is introduced here. It extends existing approaches because both crisp and graded relations are considered, and it unifies existing approaches because different types of graded relations can be modeled, including symmetric and reciprocal relations. This framework establishes important links between recent developments in fuzzy set theory and machine learning. Its usefulness is demonstrated through various experiments on synthetic and real-world data.
    Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
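    As a rough illustration of what a kernel over object pairs can look like, the sketch below builds symmetric and reciprocal variants from a base kernel and plugs them into kernel ridge regression on graded labels. The RBF base kernel, the specific variants, and the regularization value are assumptions for illustration, not the framework's exact construction.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """Base kernel between two sets of object feature vectors."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def pairwise_kernel(Xa, Xb, Ya, Yb, mode="symmetric"):
    """Kernel between pairs (Xa[i], Xb[i]) and (Ya[j], Yb[j])."""
    k1 = rbf(Xa, Ya) * rbf(Xb, Yb)
    k2 = rbf(Xa, Yb) * rbf(Xb, Ya)
    if mode == "symmetric":                 # invariant to swapping objects in a pair
        return 0.5 * (k1 + k2)
    if mode == "reciprocal":                # models the antisymmetric part of the relation
        return 0.5 * (k1 - k2)
    return k1                               # plain product (Kronecker) pairwise kernel

def fit_krr(K, y, lam=1e-2):
    """Kernel ridge regression on graded labels: alpha = (K + lam*I)^{-1} y."""
    return np.linalg.solve(K + lam * np.eye(len(y)), y)
```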

    A Deep Spatio-Temporal Fuzzy Neural Network for Passenger Demand Prediction

    In spite of its importance, passenger demand prediction is a highly challenging problem, because the demand is simultaneously influenced by complex interactions among many spatial and temporal factors as well as external factors such as weather. To address this problem, we propose a Spatio-TEmporal Fuzzy neural Network (STEF-Net) to accurately predict passenger demand by incorporating the complex interactions of all known important factors. We design an end-to-end learning framework with different neural networks modeling different factors. Specifically, we propose to capture spatio-temporal feature interactions via a convolutional long short-term memory network and to model external factors via a fuzzy neural network, which handles data uncertainty significantly better than deterministic methods. To preserve temporal relations when fusing the two networks and to emphasize discriminative spatio-temporal feature interactions, we employ a novel feature fusion method with a convolution operation and an attention layer. As far as we know, our work is the first to fuse a deep recurrent neural network and a fuzzy neural network to model complex spatio-temporal feature interactions with additional uncertain input features for predictive learning. Experiments on a large-scale real-world dataset show that our model achieves more than 10% improvement over the state-of-the-art approaches.
    Comment: https://epubs.siam.org/doi/abs/10.1137/1.9781611975673.1
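    Two of the ingredients mentioned above, a fuzzy layer built from Gaussian membership functions and a fusion of two feature maps via a convolution followed by an attention gate, can be sketched roughly as follows. This is not the STEF-Net architecture itself; the layer sizes and module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FuzzyLayer(nn.Module):
    """Fuzzify each input feature with n_rules Gaussian membership functions."""
    def __init__(self, in_dim, n_rules):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(n_rules, in_dim))
        self.sigma = nn.Parameter(torch.ones(n_rules, in_dim))

    def forward(self, x):                              # x: (batch, in_dim)
        d = (x[:, None, :] - self.mu) / self.sigma
        return torch.exp(-0.5 * d ** 2).flatten(1)     # (batch, n_rules * in_dim)

class FuseWithAttention(nn.Module):
    """Concatenate two feature maps, convolve, and gate with element-wise attention."""
    def __init__(self, c1, c2, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c1 + c2, c_out, kernel_size=3, padding=1)
        self.att = nn.Conv2d(c_out, c_out, kernel_size=1)

    def forward(self, a, b):                           # a: (B, c1, H, W), b: (B, c2, H, W)
        h = torch.relu(self.conv(torch.cat([a, b], dim=1)))
        return h * torch.sigmoid(self.att(h))          # attention-weighted fused features
```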

    A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression

    Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction or network inference problems. During the last decade kernel methods have played a dominant role in pairwise learning. They still obtain state-of-the-art predictive performance, but a theoretical analysis of their behavior has remained underexplored in the machine learning literature. In this work we review and unify existing kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent-task kernel ridge regression, two-step kernel ridge regression and a linear matrix filter arise naturally as special cases of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency and spectral filtering properties. Our theoretical results provide valuable insights for assessing the advantages and limitations of existing pairwise learning methods.
    Comment: arXiv admin note: text overlap with arXiv:1606.0427
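    Closed-form Kronecker kernel ridge regression solves are usually made tractable through per-domain eigendecompositions rather than by materializing the Kronecker Gram matrix. The sketch below illustrates that standard trick for a fully observed label matrix; the variable names and the complete-label assumption are illustrative rather than details taken from the paper.

```python
import numpy as np

def kronecker_krr(K_u, K_v, Y, lam=1e-2):
    """Solve (K_u (x) K_v + lam*I) vec(A) = vec(Y) via eigendecompositions.

    K_u: (m, m) kernel over row objects; K_v: (n, n) kernel over column objects;
    Y:   (m, n) fully observed label matrix. Returns dual coefficients A of shape (m, n).
    """
    su, Uu = np.linalg.eigh(K_u)            # K_u = Uu diag(su) Uu^T
    sv, Uv = np.linalg.eigh(K_v)            # K_v = Uv diag(sv) Uv^T
    Yt = Uu.T @ Y @ Uv                      # rotate labels into the joint eigenbasis
    filt = 1.0 / (np.outer(su, sv) + lam)   # spectral filter on eigenvalue products
    return Uu @ (filt * Yt) @ Uv.T          # rotate back: dual coefficients A

def predict(K_u_new, K_v_new, A):
    """Scores for new row/column objects given their kernels to the training objects."""
    return K_u_new @ A @ K_v_new.T
```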