An evolutionary strategy with machine learning for learning to rank in information retrieval
Learning to Rank (LTR) is one of the problems in Information Retrieval (IR) that nowadays attracts attention from researchers. The LTR problem refers to ranking the retrieved documents for users in search engines, question answering and product recommendation systems. There are a number of LTR approaches based on machine learning and computational intelligence techniques. Most existing LTR methods have limitations, such as being too slow, not being very effective, or requiring large amounts of computer memory to operate. This paper proposes an LTR method that combines a (1+1)-Evolutionary Strategy with machine learning. Three variants of the method are investigated: ES-Rank, IESR-Rank and IESVM-Rank. They differ in the mechanism used to initialize the chromosome for the evolutionary process. ES-Rank simply sets all genes in the initial chromosome to the same value, IESR-Rank uses linear regression, and IESVM-Rank uses a support vector machine for the initialization process. Experimental results from comparing the proposed method to fourteen other approaches from the literature show that IESR-Rank achieves the overall best performance. Ten problem instances are used here, obtained from the MSLR-WEB10K, LETOR 3 and LETOR 4 datasets. Performance is measured at the top-10 query-document pairs retrieved, using five metrics: Mean Average Precision (MAP), Root Mean Square Error (RMSE), Precision (P@10), Reciprocal Rank (RR@10) and Normalized Discounted Cumulative Gain (NDCG@10). The contribution of this paper is an effective and efficient LTR method combining a listwise evolutionary technique with pointwise and pairwise machine learning techniques.
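A minimal sketch of the kind of (1+1)-Evolutionary Strategy this abstract describes, applied to a linear ranking function. The uniform initialisation mirrors the ES-Rank variant; the pairwise-accuracy fitness and Gaussian mutation here are illustrative assumptions, not the paper's exact design.

```python
import random

def es_rank_sketch(features, relevance, generations=100, sigma=0.1, seed=0):
    """(1+1)-Evolutionary Strategy for a linear ranking function.

    Hypothetical sketch: the chromosome is one weight per feature;
    the fitness is a simple pairwise-accuracy score, which stands in
    for the evaluation measure used by the actual ES-Rank method.
    """
    rng = random.Random(seed)
    n = len(features[0])
    parent = [1.0] * n  # ES-Rank-style: all genes set to the same value

    def fitness(w):
        # Fraction of document pairs ordered consistently with the labels.
        scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in features]
        good = total = 0
        for i in range(len(scores)):
            for j in range(i + 1, len(scores)):
                if relevance[i] == relevance[j]:
                    continue
                total += 1
                if (scores[i] - scores[j]) * (relevance[i] - relevance[j]) > 0:
                    good += 1
        return good / total if total else 0.0

    best = fitness(parent)
    for _ in range(generations):
        child = [w + rng.gauss(0, sigma) for w in parent]  # mutate every gene
        f = fitness(child)
        if f >= best:  # (1+1) selection: child replaces parent if no worse
            parent, best = child, f
    return parent, best
```

Because the (1+1) scheme keeps the better of parent and child at every generation, the fitness is monotonically non-decreasing over the run.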
Local search: A guide for the information retrieval practitioner
There are a number of combinatorial optimisation problems in information retrieval (IR) in which the use of local search methods is worthwhile. The purpose of this paper is to show how local search can be used to solve some well-known tasks in IR, to show how previous research in the field has been piecemeal, lacking in structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to IR problems. We provide a query-based taxonomy for analysing the use of local search in IR tasks, and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper offers a guide to the pitfalls and problems facing IR practitioners who wish to use local search in their research, and gives practical advice on the use of such methods. The query-based taxonomy is a novel structure which the IR practitioner can use to examine the use of local search in IR.
An experimental comparison of a genetic algorithm and a hill-climber for term selection
Purpose – The term selection problem for selecting query terms in information filtering and routing has been investigated using hill-climbers of various kinds, largely through the Okapi experiments in the TREC series of conferences. Although these are simple deterministic approaches which examine the effect of changing the weight of one term at a time, they have been shown to improve the retrieval effectiveness of filtering queries in these TREC experiments. Hill-climbers are, however, likely to get trapped in local optima, and the use of more sophisticated local search techniques that attempt to break out of these optima is worth investigating for this problem. To this end, we apply a genetic algorithm (GA) to the same problem.
Design/Methodology/Approach – We use a standard TREC test collection from the TREC-8 filtering track, recording mean average precision and recall measures to allow comparison between the hill-climber and GA algorithms. We also vary elements of the GA, such as the probability of a word being included, the probability of mutation, and the population size, in order to measure the effect of these variables. Different strategies such as Elitist and Non-Elitist methods are used, as well as Roulette Wheel and Rank selection schemes.
Findings – The results of tests suggest that both techniques are, on average, better than the baseline, but the implemented GA does not match the overall performance of a hill-climber. The Rank selection algorithm does better on average than the Roulette Wheel algorithm. There is no evidence in this study that varying word inclusion probability, mutation probability or the Elitist method makes much difference to the overall results. Small population sizes do not appear to be as effective as larger population sizes.
Research limitations/implications – The evidence provided here suggests that being stuck in a local optimum for the term selection optimization problem does not appear to be detrimental to the overall success of the hill-climber. Term rank order appears to provide extra useful information which hill-climbers can exploit efficiently and effectively to narrow the search space.
Originality/Value – The paper represents the first attempt to compare hill-climbers with GAs on a problem of this type.
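The deterministic one-term-at-a-time hill-climber this abstract compares against can be sketched as a simple local search over term-inclusion vectors. The `evaluate` callback stands in for a retrieval effectiveness measure (such as mean average precision on a test collection), which is assumed rather than implemented; the randomised neighbour order is also an illustrative choice.

```python
import random

def term_selection_hill_climb(terms, evaluate, steps=200, seed=0):
    """Greedy hill-climber over term-inclusion bit vectors.

    Hypothetical sketch of the one-term-at-a-time search described
    above: flip one term's inclusion, keep the move only if the
    retrieval effectiveness score improves, otherwise revert.
    """
    rng = random.Random(seed)
    state = [False] * len(terms)      # start with no terms selected
    best = evaluate(state)
    for _ in range(steps):
        i = rng.randrange(len(terms))  # pick one term to toggle
        state[i] = not state[i]
        score = evaluate(state)
        if score > best:
            best = score               # keep the improving move
        else:
            state[i] = not state[i]    # revert: trapped unless a flip helps
    return [t for t, keep in zip(terms, state) if keep], best
```

The revert-on-failure step is exactly what makes the climber prone to local optima: it can never accept a temporarily worse state, which is the limitation the GA in this paper was meant to address.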
Pairwise meta-rules for better meta-learning-based algorithm ranking
In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed new techniques can significantly improve the overall performance of meta-learning for algorithm ranking. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset.
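The one-against-one idea can be illustrated with a small helper that turns per-learner performance figures on a dataset into pairwise comparison features. The feature names and the simple greater-than rule are assumptions for illustration, not the paper's exact rule definitions.

```python
def pairwise_meta_features(perf):
    """One-against-one meta-features from base-learner performance figures.

    Hypothetical sketch of the pairwise idea: for every pair of base
    learners, emit one boolean feature recording which learner
    performed better on the dataset in question.
    """
    names = sorted(perf)  # fixed order so feature names are stable
    feats = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            feats[f"{a}_beats_{b}"] = perf[a] > perf[b]
    return feats
```

A meta-learner such as the ART Forests described above would then consume these pairwise features, one row per dataset, to predict an algorithm ranking for unseen datasets.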
ES-Rank: evolution strategy learning to rank approach
Learning to Rank (LTR) is one of the current problems in Information Retrieval (IR) that attracts attention from researchers. The LTR problem is mainly about ranking the retrieved documents for users in search engines, question answering and product recommendation systems. There are a number of LTR approaches from the areas of machine learning and computational intelligence. Most approaches have the limitation of being too slow or not being very effective. This paper investigates the application of evolutionary computation, specifically a (1+1) Evolutionary Strategy called ES-Rank, to tackle the LTR problem. Experimental results from comparing the proposed method to fourteen other approaches from the literature show that ES-Rank achieves the overall best performance. Three datasets (MQ2007, MQ2008 and MSLR-WEB10K) from the LETOR benchmark collection and two performance metrics, Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG) at the top-10 query-document pairs retrieved, were used in the experiments. The contribution of this paper is an effective and efficient method for the LTR problem.
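NDCG@10, the metric used to evaluate both ES-Rank papers above, can be computed per query as follows. This uses the common exponential gain 2^rel − 1 with a log2 position discount; the exact gain/discount variant used by the LETOR evaluation scripts is an assumption here.

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query's ranked list of graded relevance labels.

    Sketch of the metric named in the abstracts above: discounted
    cumulative gain of the ranking, normalised by the DCG of the
    ideal (relevance-sorted) ordering.
    """
    def dcg(rels):
        # Gain 2^rel - 1, discounted by log2(rank + 1), over the top k.
        return sum((2 ** r - 1) / math.log2(i + 2)
                   for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

A perfectly ordered list scores 1.0; any misordering of documents with different labels scores strictly less, which is what makes the metric sensitive to the top of the ranking.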