An efficient algorithm for learning to rank from preference graphs
In this paper, we introduce a framework for regularized least-squares (RLS) type ranking cost functions and propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. We show that RankRLS has many computational advantages over ranking algorithms based on minimizing other types of cost, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning. Circumstances under which these computational benefits make RankRLS preferable to RankSVM are considered. We evaluate RankRLS on four different types of ranking tasks using RankSVM and standard RLS regression as baselines. RankRLS outperforms standard RLS regression and performs very similarly to RankSVM, while having several computational benefits over RankSVM.
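As a rough illustration of the cost family involved, the sketch below fits a linear scorer by minimizing a pairwise least-squares ranking objective written through the Laplacian of the preference graph. It is a minimal sketch of an RLS-type pairwise cost, not the authors' RankRLS implementation; the toy data, regularization value, and function names are illustrative assumptions.

```python
import numpy as np

def fit_pairwise_rls(X, y, pairs, lam=1.0):
    """Fit a linear scorer w by minimizing an RLS-type pairwise ranking cost:
        sum_{(i,j) in pairs} ((w.x_i - w.x_j) - (y_i - y_j))^2 + lam * ||w||^2
    Writing the pairwise cost with the Laplacian L of the preference graph
    gives the closed form w = (X^T L X + lam I)^{-1} X^T L y."""
    n, d = X.shape
    L = np.zeros((n, n))
    for i, j in pairs:                      # graph Laplacian of the preference pairs
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    A = X.T @ L @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ (L @ y))

# toy usage: rank five points generated by a noisy linear utility
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=5)
pairs = [(i, j) for i in range(5) for j in range(5) if i != j]
w = fit_pairwise_rls(X, y, pairs, lam=0.1)
print(np.argsort(-X @ w))                   # predicted ranking, best first
```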
Matrix representations, linear transformations, and kernels for disambiguation in natural language
In the application of machine learning methods to natural language inputs, the words and their positions in the input text are among the most important features. In this article, we introduce a framework based on a word-position matrix representation of text, linear feature transformations of the word-position matrices, and kernel functions constructed from the transformations. We consider two categories of transformations, one based on word similarities and the other on positional similarities, which can be applied simultaneously in the framework in an elegant way. We show how word and positional similarities obtained by applying previously proposed techniques, such as latent semantic analysis, can be incorporated as transformations in the framework. We also introduce novel ways to determine word and positional similarities. We further present efficient algorithms for computing kernel functions incorporating the transformations on the word-position matrices, and, more importantly, introduce a highly efficient method for prediction. The framework is particularly suitable for natural language disambiguation tasks, where the aim is to select for a single word a particular property from a set of candidates based on the word's context. We demonstrate the applicability of the framework to tasks of this type using context-sensitive spelling error correction on the Reuters News corpus as a model problem.
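To make the representation concrete, here is a minimal sketch of a word-position matrix and of a kernel obtained by applying a word-similarity transformation and a positional-smoothing transformation before taking a Frobenius inner product. The vocabulary, the identity word-similarity matrix, and the exponential positional decay are illustrative assumptions, not the transformations proposed in the article.

```python
import numpy as np

def wordpos_matrix(tokens, vocab, length):
    """Binary word-position matrix A: A[w, p] = 1 iff word w occurs at position p."""
    A = np.zeros((len(vocab), length))
    for p, tok in enumerate(tokens[:length]):
        if tok in vocab:
            A[vocab[tok], p] = 1.0
    return A

def wordpos_kernel(A, B, S, P):
    """Kernel between two texts via transformed matrices:
       k(A, B) = <S A P, S B P>_Frobenius."""
    return float(np.sum((S @ A @ P) * (S @ B @ P)))

# toy usage with illustrative (hypothetical) transformation matrices
vocab = {"the": 0, "cat": 1, "sat": 2, "mat": 3}
length = 4
S = np.eye(len(vocab))                              # word-similarity transform (identity = exact match)
dists = np.abs(np.subtract.outer(np.arange(length), np.arange(length)))
P = np.exp(-dists.astype(float))                    # positional smoothing, decaying with distance

A = wordpos_matrix("the cat sat".split(), vocab, length)
B = wordpos_matrix("the cat mat".split(), vocab, length)
print(wordpos_kernel(A, B, S, P))
```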
Tournament Leave-pair-out Cross-validation for Receiver Operating Characteristic (ROC) Analysis
Receiver operating characteristic (ROC) analysis is widely used for evaluating diagnostic systems. Recent studies have shown that estimating the area under the ROC curve (AUC) with standard cross-validation methods suffers from a large bias. Leave-pair-out (LPO) cross-validation has been shown to correct this bias. However, while LPO produces an almost unbiased estimate of AUC, it does not provide the ranking of the data needed for plotting and analyzing the ROC curve. In this study, we propose a new method called tournament leave-pair-out (TLPO) cross-validation. This method extends LPO by creating a tournament from pair comparisons to produce a ranking of the data. TLPO preserves the advantage of LPO for estimating AUC, while also allowing ROC analysis. Using both synthetic and real-world data, we show that TLPO is as reliable as LPO for AUC estimation, and we confirm the bias of leave-one-out cross-validation on low-dimensional data.
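The following is a minimal sketch of the plain LPO AUC estimate described above: every positive-negative pair is held out in turn, the model is refit on the remaining data, and the pair counts as correct if the held-out positive scores above the held-out negative. TLPO would additionally aggregate the pair outcomes into a tournament to rank all examples; that step is omitted here. The logistic regression model and the synthetic data are illustrative choices.

```python
import numpy as np
from itertools import product
from sklearn.linear_model import LogisticRegression

def lpo_auc(X, y, make_model=lambda: LogisticRegression(max_iter=1000)):
    """Leave-pair-out AUC: for every (positive, negative) pair, hold the pair out,
    train on the remaining data, and count the pair as correct if the positive
    example receives a higher score than the negative one."""
    pos = np.where(y == 1)[0]
    neg = np.where(y == 0)[0]
    correct, total = 0.0, 0
    for i, j in product(pos, neg):
        mask = np.ones(len(y), dtype=bool)
        mask[[i, j]] = False
        model = make_model().fit(X[mask], y[mask])
        s = model.decision_function(X[[i, j]])
        correct += float(s[0] > s[1])
        total += 1
    return correct / total

# toy usage on synthetic two-class data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (15, 4)), rng.normal(-1.0, 1.0, (15, 4))])
y = np.array([1] * 15 + [0] * 15)
print(lpo_auc(X, y))
```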
All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning
Background
Automated extraction of protein-protein interactions (PPI) is an important and widely studied task in biomedical text mining. We propose a graph kernel based approach for this task. In contrast to earlier approaches to PPI extraction, the introduced all-paths graph kernel is able to exploit full, general dependency graphs representing sentence structure.
Results
We evaluate the proposed method on five publicly available PPI corpora, providing the most comprehensive evaluation done for a machine learning based PPI-extraction system. We additionally perform a detailed evaluation of the effects of training and testing on different resources, providing insight into the challenges involved in applying a system beyond the data it was trained on. Our method achieves state-of-the-art performance with respect to comparable evaluations, with an F-score of 56.4 and an AUC of 84.8 on the AImed corpus.
Conclusion
We show that the graph kernel approach performs at a state-of-the-art level in PPI extraction, and we note a possible extension to the task of extracting complex interactions. The cross-corpus results provide further insight into how the learning generalizes beyond individual corpora. Further, we identify several pitfalls that can make evaluations of PPI-extraction systems incomparable, or even invalid. These include incorrect cross-validation strategies and problems related to comparing F-score results achieved on different evaluation resources. Recommendations for avoiding these pitfalls are provided.
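For intuition, the sketch below computes a generic walk-based kernel on the direct product of two labeled graphs, summing the weights of all common walks via a geometric series. This is an illustrative stand-in for kernels of the all-paths family, not the exact kernel defined in the paper; the dependency-graph construction, the edge weighting, and the decay parameter lam are assumptions.

```python
import numpy as np

def product_walk_kernel(A1, labels1, A2, labels2, lam=0.1):
    """Walk-based graph kernel on the direct product graph: nodes are pairs of
    identically labeled nodes from the two graphs, and the kernel sums the
    weights of all common walks via the geometric series (I - lam * A_x)^{-1}."""
    pairs = [(i, j) for i in range(len(labels1)) for j in range(len(labels2))
             if labels1[i] == labels2[j]]
    if not pairs:
        return 0.0
    n = len(pairs)
    Ax = np.zeros((n, n))
    for a, (i, j) in enumerate(pairs):
        for b, (k, l) in enumerate(pairs):
            Ax[a, b] = A1[i, k] * A2[j, l]        # edge present in both graphs
    M = np.linalg.inv(np.eye(n) - lam * Ax)       # converges when lam * spectral_radius(Ax) < 1
    return float(M.sum())

# toy usage: two labeled path graphs a-b-c and a-b-d
A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float); labels1 = ["a", "b", "c"]
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float); labels2 = ["a", "b", "d"]
print(product_walk_kernel(A1, labels1, A2, labels2, lam=0.1))
```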
Efficient cross-validation for kernelized least-squares regression with sparse basis expansions
We propose an efficient algorithm for calculating hold-out and cross-validation (CV) estimates for sparse regularized least-squares predictors. Holding out H data points with our method requires O(min(H^2 n, H n^2)) time, provided that a predictor with n basis vectors has already been trained. In addition to holding out training examples, some of the basis vectors used to train the sparse regularized least-squares predictor on the whole training set can also be removed from the basis vector set used in the hold-out computation. In our experiments, we demonstrate the speed improvements provided by our algorithm in practice, and we empirically show the benefits of removing some of the basis vectors during the CV rounds.
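As background for the kind of shortcut being generalized here, the classical closed-form leave-one-out residuals for kernel RLS can be obtained from a single fit, without retraining, using the diagonal of the hat matrix. The sketch below shows that basic identity only; it does not implement the paper's sparse-basis or multi-point hold-out algorithm, and the toy data are illustrative.

```python
import numpy as np

def rls_loo_residuals(K, y, lam=1.0):
    """Exact leave-one-out residuals for kernel RLS without retraining:
       hat matrix H = K (K + lam I)^{-1}, LOO residual_i = (y_i - yhat_i) / (1 - H_ii)."""
    n = len(y)
    H = K @ np.linalg.inv(K + lam * np.eye(n))
    yhat = H @ y
    return (y - yhat) / (1.0 - np.diag(H))

# toy usage with a linear kernel
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, 0.0, -1.0]) + 0.1 * rng.normal(size=20)
K = X @ X.T
print(np.mean(rls_loo_residuals(K, y, lam=0.5) ** 2))   # leave-one-out mean squared error
```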
Estimating the prediction performance of spatial models via spatial k-fold cross validation
In machine learning, one often assumes that the data are independent when evaluating model performance. However, this rarely holds in practice. Geographic information datasets are an example where data points have stronger dependencies on each other the closer they are geographically. This phenomenon, known as spatial autocorrelation (SAC), causes standard cross-validation (CV) methods to produce optimistically biased prediction performance estimates for spatial models, which can result in increased costs and accidents in practical applications. To overcome this problem, we propose a modified version of the CV method called spatial k-fold cross-validation (SKCV), which provides a useful estimate of model prediction performance without the optimistic bias due to SAC. We test SKCV on three real-world cases involving open natural data, showing that the estimates produced by ordinary CV are up to 40% more optimistic than those of SKCV. Both regression and classification cases are considered in our experiments. In addition, we show how the SKCV method can be applied as a criterion for selecting the data sampling density for a new research area.
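A minimal sketch of the idea, assuming 2-D point coordinates and a fixed dead-zone radius (both illustrative): perform an ordinary k-fold split, then drop from each training fold every point lying within the radius of any test point, so that spatially autocorrelated neighbors cannot leak information into the performance estimate.

```python
import numpy as np
from sklearn.model_selection import KFold

def skcv_indices(coords, n_splits=5, radius=100.0, seed=0):
    """Spatial k-fold CV: standard k-fold split, but every training point that
    lies within `radius` of any test point is removed from the training fold
    (a spatial dead zone around the test set)."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in kf.split(coords):
        d = np.linalg.norm(coords[train_idx, None, :] - coords[None, test_idx, :], axis=-1)
        keep = d.min(axis=1) > radius            # keep only training points far from all test points
        yield train_idx[keep], test_idx

# toy usage: random 2-D coordinates in meters
rng = np.random.default_rng(0)
coords = rng.uniform(0, 1000, size=(200, 2))
for tr, te in skcv_indices(coords, n_splits=5, radius=100.0):
    print(len(tr), len(te))                      # training folds shrink as the dead zone removes neighbors
```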
Predictability of boreal forest soil bearing capacity by machine learning
In forest harvesting, terrain trafficability is the key parameter for route planning. Advance knowledge of the soil bearing capacity is crucial for heavy machinery operations. Peatland areas in particular can cause severe problems for harvesting operations and can result in increased costs. In addition to avoiding potential damage to the soil, route planning must also take into consideration root damage to the remaining trees. In this paper, we study the predictability of boreal soil load-bearing capacity using remote sensing data and field measurement data. We conduct our research using both linear and nonlinear machine learning methods. With the best prediction method, ridge regression, the results are promising, with a C-index above 0.68 at prediction ranges of up to 200 m from the closest point with known bearing capacity, the baseline value being 0.5. Load-bearing classification of the soil reached 76% accuracy up to 60 m using a multilayer perceptron. The results indicate that there is potential for production applications and a great need for automatic real-time sensing in order to produce applicable predictions.
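For reference, the C-index used as the evaluation measure above can be computed as the fraction of comparable pairs whose predicted ordering agrees with the true ordering, so 0.5 corresponds to random predictions. The sketch below is a straightforward quadratic-time implementation and is not tied to the paper's data or models.

```python
import numpy as np

def c_index(y_true, y_pred):
    """Concordance index: fraction of comparable pairs (differing true values)
    whose predicted ordering matches the true ordering; prediction ties count as half."""
    correct, total = 0.0, 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue                          # pair is not comparable
            total += 1
            if (y_pred[i] - y_pred[j]) * (y_true[i] - y_true[j]) > 0:
                correct += 1.0
            elif y_pred[i] == y_pred[j]:
                correct += 0.5
    return correct / total if total else 0.5

# toy usage: a perfectly ordered prediction gives 1.0, a reversed one gives 0.0
print(c_index(np.array([1.0, 2.0, 3.0]), np.array([0.1, 0.5, 0.9])))
print(c_index(np.array([1.0, 2.0, 3.0]), np.array([0.9, 0.5, 0.1])))
```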
Learning valued relations from data
Driven by a large number of potential applications in areas like bioinformatics, information retrieval, and social network analysis, the problem setting of inferring relations between pairs of data objects has recently been investigated quite intensively in the machine learning community. To this end, current approaches typically consider datasets containing crisp relations, so that standard classification methods can be adopted. However, relations between objects, such as similarities and preferences, are in many real-world applications expressed in a graded manner. A general kernel-based framework for learning relations from data is introduced here. It extends existing approaches because both crisp and valued relations are considered, and it unifies existing approaches because different types of valued relations can be modeled, including symmetric and reciprocal relations. In this way, the framework establishes important links between recent developments in fuzzy set theory and machine learning. Its usefulness is demonstrated on a case study in document retrieval.
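As one concrete instance of how symmetric and reciprocal relations can be encoded at the kernel level, the sketch below shows the standard symmetric and antisymmetric (reciprocal) combinations of a base kernel over object pairs. These are illustrative constructions in the spirit of such a framework; the paper's exact parameterization may differ, and the RBF base kernel and toy objects are assumptions.

```python
import numpy as np

def symmetric_pair_kernel(k_ac, k_bd, k_ad, k_bc):
    """Pairwise kernel suited to symmetric relations (Q(a,b) = Q(b,a)):
       K((a,b),(c,d)) = k(a,c)k(b,d) + k(a,d)k(b,c)."""
    return k_ac * k_bd + k_ad * k_bc

def reciprocal_pair_kernel(k_ac, k_bd, k_ad, k_bc):
    """Pairwise kernel suited to reciprocal relations (e.g. preferences, Q(a,b) = 1 - Q(b,a)):
       K((a,b),(c,d)) = k(a,c)k(b,d) - k(a,d)k(b,c)."""
    return k_ac * k_bd - k_ad * k_bc

# toy usage: base RBF kernel values between four illustrative objects
def rbf(x, z, gamma=0.5):
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

a, b, c, d = (np.array(v, dtype=float) for v in ([0, 0], [1, 0], [0, 1], [1, 1]))
print(symmetric_pair_kernel(rbf(a, c), rbf(b, d), rbf(a, d), rbf(b, c)))
print(reciprocal_pair_kernel(rbf(a, c), rbf(b, d), rbf(a, d), rbf(b, c)))
```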
Multi-Target Prediction: A Unifying View on Problems and Methods
Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse type. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research.
Leveraging multi-way interactions for systematic prediction of pre-clinical drug combination effects
- …