9 research outputs found

    A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression

    Full text link
    Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction or network inference problems. During the last decade, kernel methods have played a dominant role in pairwise learning. They still obtain state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify existing kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression and a linear matrix filter arise naturally as special cases of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods. (Comment: arXiv admin note: text overlap with arXiv:1606.0427)
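The closed-form efficiency the abstract refers to comes from the eigendecompositions of the two object kernels: every eigenvalue of the Kronecker pairwise kernel is the product of one eigenvalue from each factor, so the full ridge solve over all pairs reduces to two small eigendecompositions plus a spectral filter. A minimal NumPy sketch under that reading (the function name and single regularizer `lam` are illustrative assumptions, not the paper's notation):

```python
import numpy as np

def kronecker_krr_fit_predict(G, K, Y, lam=1.0):
    """Kronecker kernel ridge regression in closed form.

    G: kernel matrix of the row objects (n x n)
    K: kernel matrix of the column objects (m x m)
    Y: label matrix over all pairs (n x m)
    """
    sg, U = np.linalg.eigh(G)    # eigendecompose each factor kernel
    sk, V = np.linalg.eigh(K)
    prod = np.outer(sg, sk)      # eigenvalues of the Kronecker kernel
    filt = prod / (prod + lam)   # ridge shrinkage per eigen-pair
    # Project labels onto the eigenbasis, shrink, project back.
    return U @ (filt * (U.T @ Y @ V)) @ V.T
```

As `lam -> 0` the filter tends to one and the fit interpolates Y; a large `lam` shrinks predictions toward zero, which is the squared-loss/spectral-filtering view the abstract alludes to.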

    A comparative study of pairwise learning methods based on Kernel ridge regression

    Get PDF
    Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as special cases of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.

    Supervised learning methods to predict species interactions based

    Get PDF
    Species interaction networks - pollination networks, host-phage networks, food webs and the like - are key tools to study community ecosystems. Biologists enjoy working with networks as they provide a sound mathematical description of a system and come equipped with a large toolkit to analyze various properties such as stability, diversity or dynamics. Species interaction networks can be obtained experimentally or by field observations. Modern techniques such as DNA barcoding and camera traps, coupled with large databases, contribute further to the popularity of networks in ecology. In practice, however, a collected network rarely contains all in situ interactions, as this would require an unfeasibly large sampling effort. Species distributions are also subject to changes, for example due to climate change, which leads to new potential interactions. It is of great importance to be able to predict such interactions, for example to anticipate the effect of exotic species in an ecosystem. In our work, we study how to use supervised machine learning tools to predict new species interactions. Based on an observed network, we learn a function that takes as inputs the description of two species (e.g. traits, phylogenetic similarity or a morphological description) and predicts whether these two species are likely to interact or not. This framework for pairwise learning is based on kernels, and similar methods have been highly successful for predicting molecular networks and for recommender systems, as used by companies such as Netflix and Amazon. We have shown that these methods can detect missing interactions in many different types of species interaction networks. A large focus of our work is on how the accuracy of these models can be estimated realistically. Our methods are available in an R package called xnet, making them easy to use for ecology researchers.
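The pairwise framework described above can be illustrated with a toy sketch: encode each species by a trait vector, build one kernel per species set, and fit ridge regression with the Kronecker (pairwise) kernel on the observed adjacency matrix. Everything below (trait matrices, the RBF kernel choice, the regularizer) is an illustrative assumption, not the xnet implementation:

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # RBF similarity between two trait matrices (rows = species).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def predict_interactions(Xa, Xb, Y, lam=0.1):
    """Smooth an observed interaction matrix Y over two trait kernels.

    Xa: traits of the first species set (na x d)
    Xb: traits of the second species set (nb x d)
    Y:  observed 0/1 adjacency matrix (na x nb)
    """
    Ka, Kb = rbf_kernel(Xa, Xa), rbf_kernel(Xb, Xb)
    # Kronecker product matches C-order raveling of Y: pair (i, j) -> i*nb + j.
    Kpair = np.kron(Ka, Kb)
    y = Y.ravel()
    alpha = np.linalg.solve(Kpair + lam * np.eye(len(y)), y)
    return (Kpair @ alpha).reshape(Y.shape)   # smoothed interaction scores
```

Ranking the scores of unobserved pairs then suggests which missing interactions are most plausible; the explicit Kronecker product is only viable for small networks, which is exactly why the closed-form shortcuts studied in the paper above matter.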

    Matrix completion and extrapolation via kernel regression

    Get PDF
    Matrix completion and extrapolation (MCEX) are dealt with here over reproducing kernel Hilbert spaces (RKHSs) in order to account for prior information present in the available data. Aiming at a faster and low-complexity solver, the task is formulated as a kernel ridge regression. The resultant MCEX algorithm can also afford online implementation, while the class of kernel functions also encompasses several existing approaches to MC with prior information. Numerical tests on synthetic and real datasets show that the novel approach performs faster than widespread methods such as alternating least squares (ALS) or stochastic gradient descent (SGD), and that the recovery error is reduced, especially when dealing with noisy data.
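One way to read the MCEX formulation is kernel ridge regression restricted to the observed entries, with row and column kernels carrying the prior information; missing entries are then extrapolated through the kernel. A naive dense sketch of that reading (the paper's actual solver is faster and supports online updates; the kernels here are placeholders):

```python
import numpy as np

def complete_matrix(Kr, Kc, M, mask, lam=0.1):
    """KRR fit on observed entries only, extrapolated to all entries.

    Kr: row-side kernel (n x n), Kc: column-side kernel (m x m)
    M:  data matrix (n x m), mask: boolean matrix of observed entries
    """
    n, m = M.shape
    obs = np.flatnonzero(mask.ravel())          # indices of observed entries
    Kpair = np.kron(Kr, Kc)                     # pairwise kernel on all entries
    Ko = Kpair[np.ix_(obs, obs)]                # observed-vs-observed block
    alpha = np.linalg.solve(Ko + lam * np.eye(len(obs)), M.ravel()[obs])
    # Extrapolate: every entry is a kernel combination of observed ones.
    return (Kpair[:, obs] @ alpha).reshape(n, m)
```

Choosing low-rank feature kernels here recovers classic low-rank matrix completion behavior, which is the sense in which the kernel class "encompasses several existing approaches to MC with prior information".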

    Cold-start problems in data-driven prediction of drug-drug interaction effects

    Get PDF
    Combining drugs, a phenomenon often referred to as polypharmacy, can induce additional adverse effects. The identification of adverse combinations is a key task in pharmacovigilance. In this context, in silico approaches based on machine learning are promising as they can learn from a limited number of combinations to predict for all. In this work, we identify various subtasks in predicting effects caused by drug–drug interaction. Predicting drug–drug interaction effects for drugs that already exist is very different from predicting outcomes for newly developed drugs, commonly called a cold-start problem. We propose suitable validation schemes for the different subtasks that emerge. These validation schemes are critical to correctly assess the performance. We develop a new model that obtains AUC-ROC = 0.843 for the hardest cold-start task up to AUC-ROC = 0.957 for the easiest one on the benchmark dataset of Zitnik et al. Finally, we illustrate how our predictions can be used to improve post-market surveillance systems or detect drug–drug interaction effects earlier during drug development.
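The distinction between the subtasks comes down to how drug pairs are split for validation. A schematic sketch of the two extremes, with hypothetical drug identifiers (the benchmark data and model are not reproduced here): a random-pair split, where both drugs of a test pair may also appear in training pairs, versus a cold-start split, where whole drugs are held out so every test pair involves an unseen drug:

```python
import random

def split_pairs(drugs, frac_test=0.25, seed=0):
    """Return (warm, cold) train/test splits over all unordered drug pairs."""
    rng = random.Random(seed)
    pairs = [(a, b) for i, a in enumerate(drugs) for b in drugs[i + 1:]]

    # Warm split: shuffle pairs; test drugs still occur in training pairs.
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    k = int(len(shuffled) * frac_test)
    warm = {"test": shuffled[:k], "train": shuffled[k:]}

    # Cold-start split: hold out entire drugs, so each test pair contains
    # at least one drug never seen during training.
    held = set(rng.sample(drugs, max(1, int(len(drugs) * frac_test))))
    cold = {"test": [p for p in pairs if set(p) & held],
            "train": [p for p in pairs if not (set(p) & held)]}
    return warm, cold
```

A model evaluated only on the warm split will look far better than it performs on genuinely new drugs, which is why the cold-start schemes are the ones that matter for newly developed compounds.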

    Biol Psychiatry Cogn Neurosci Neuroimaging

    Get PDF
    Background: Posttraumatic stress disorder (PTSD) is a debilitating disorder, and there is currently no accurate way to predict who develops it after trauma. Neurobiologically, individuals with chronic PTSD exhibit aberrant resting-state functional connectivity (rsFC) between the hippocampus and other brain regions (e.g., amygdala, prefrontal cortex, posterior cingulate), and these aberrations correlate with severity of illness. Prior small-scale research (n < 25) has also shown that hippocampal rsFC measured acutely after trauma is predictive of future severity using an ROI-based approach. While a promising biomarker, to date no study has employed a data-driven approach to test whole-brain hippocampal-FC patterns in forecasting the development of PTSD symptoms. Methods: Ninety-eight adults at risk of PTSD were recruited from the emergency department following traumatic injury and completed resting functional magnetic resonance imaging (rsfMRI; 8 min) within 1 month; 6 months later they completed the Clinician-Administered PTSD Scale (CAPS-5) for assessment of PTSD symptom severity. Whole-brain rsFC values with the bilateral hippocampi were extracted (CONN) and used in a machine learning kernel ridge regression analysis (PRoNTo); both a k-fold (k = 10) and a 70/30 testing-vs-training split approach were used for cross-validation (1,000 iterations to bootstrap confidence intervals for significance values). Results: Acute hippocampal rsFC significantly predicted CAPS-5 scores at 6 months (r = 0.30, p = 0.006; MSE = 120.58, p = 0.006; R2 = 0.09, p = 0.025). In post-hoc analyses, hippocampal rsFC remained significant after controlling for demographics, PTSD symptoms at baseline, and depression, anxiety, and stress severity at 6 months (B = 0.59, SE = 0.20, p = 0.003). Conclusions: Findings suggest functional connectivity of the hippocampus across the brain acutely after traumatic injury is associated with prospective PTSD symptom severity.
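The analysis pattern described (kernel ridge regression with k-fold cross-validation, scored by correlating out-of-fold predictions with the true outcome) can be sketched generically; the features, the linear kernel, and the regularizer below are illustrative stand-ins, not the PRoNTo configuration used in the study:

```python
import numpy as np

def krr_kfold_predict(K, y, lam=1.0, k=5):
    """Out-of-fold KRR predictions given a precomputed kernel matrix K."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    yhat = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        Kt = K[np.ix_(train, train)]
        # Fit dual weights on the training fold only.
        alpha = np.linalg.solve(Kt + lam * np.eye(len(train)), y[train])
        # Predict held-out subjects from their similarity to training subjects.
        yhat[test] = K[np.ix_(test, train)] @ alpha
    return yhat
```

The correlation between `yhat` and `y` (the r reported in the abstract) then measures predictive value on subjects the model never saw, with the bootstrap supplying significance.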

    StreaMRAK: a streaming multi-resolution adaptive kernel algorithm

    Get PDF
    Kernel ridge regression (KRR) is a popular scheme for non-linear non-parametric learning. However, existing implementations of KRR require that all the data be stored in main memory, which severely limits the use of KRR in contexts where data size far exceeds memory size. Such applications are increasingly common in data mining, bioinformatics, and control. A powerful paradigm for computing on data sets that are too large for memory is the streaming model of computation, where we process one data sample at a time, discarding each sample before moving on to the next one. In this paper, we propose StreaMRAK - a streaming version of KRR. StreaMRAK improves on existing KRR schemes by dividing the problem into several levels of resolution, which allows continual refinement of the predictions. The algorithm reduces the memory requirement by continuously and efficiently integrating new samples into the training model. With a novel sub-sampling scheme, StreaMRAK reduces memory and computational complexities by creating a sketch of the original data, where the sub-sampling density is adapted to the bandwidth of the kernel and the local dimensionality of the data. We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum. The results show that the proposed algorithm is fast and accurate.
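The core idea of keeping only a sketch of the stream can be illustrated with a heavily simplified stand-in: maintain a fixed budget of landmark samples, evicting at random when full, and fit ridge weights over the landmarks at prediction time. This is a Nystrom-style toy, without StreaMRAK's multi-resolution levels or bandwidth-adapted sub-sampling:

```python
import numpy as np

class StreamingKRRSketch:
    """Toy streaming KRR over a fixed-budget landmark sketch of the stream."""

    def __init__(self, budget=40, gamma=1.0, lam=1e-4, seed=0):
        self.budget, self.gamma, self.lam = budget, gamma, lam
        self.rng = np.random.default_rng(seed)
        self.X, self.y = [], []

    def _kernel(self, A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-self.gamma * d2)

    def partial_fit(self, x, y):
        # One sample at a time; evict a random landmark when over budget,
        # so memory stays O(budget) regardless of stream length.
        if len(self.X) >= self.budget:
            i = int(self.rng.integers(len(self.X)))
            self.X.pop(i)
            self.y.pop(i)
        self.X.append(np.asarray(x, dtype=float))
        self.y.append(float(y))

    def predict(self, Xq):
        # Refit ridge weights over the current landmarks (toy: O(budget^3)).
        X, y = np.asarray(self.X), np.asarray(self.y)
        K = self._kernel(X, X)
        alpha = np.linalg.solve(K + self.lam * np.eye(len(y)), y)
        return self._kernel(np.asarray(Xq), X) @ alpha
```

Random eviction keeps the sketch roughly uniform over the input; StreaMRAK's contribution is to make this density adaptive and multi-resolution, so the sketch is denser where the kernel bandwidth and local dimensionality demand it.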

    A comparative study of pairwise learning methods based on kernel ridge regression

    No full text
    Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as special cases of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.