9 research outputs found

    Learning to Rank: Online Learning, Statistical Theory and Applications.

    Full text link
    Learning to rank is a supervised machine learning problem, where the output space is the special structured space of emph{permutations}. Learning to rank has diverse application areas, spanning information retrieval, recommendation systems, computational biology and others. In this dissertation, we make contributions to some of the exciting directions of research in learning to rank. In the first part, we extend the classic, online perceptron algorithm for classification to learning to rank, giving a loss bound which is reminiscent of Novikoff's famous convergence theorem for classification. In the second part, we give strategies for learning ranking functions in an online setting, with a novel, feedback model, where feedback is restricted to labels of top ranked items. The second part of our work is divided into two sub-parts; one without side information and one with side information. In the third part, we provide novel generalization error bounds for algorithms applied to various Lipschitz and/or smooth ranking surrogates. In the last part, we apply ranking losses to learn policies for personalized advertisement recommendations, partially overcoming the problem of click sparsity. We conduct experiments on various simulated and commercial datasets, comparing our strategies with baseline strategies for online learning to rank and personalized advertisement recommendation.PhDStatisticsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/133334/1/sougata_1.pd

    Kernel-Based Ranking. Methods for Learning and Performance Estimation

    Get PDF
    Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.Siirretty Doriast

    Generalization Analysis of Listwise Learning-to-Rank Algorithms

    No full text
    This paper presents a theoretical framework for ranking, and demonstrates how to perform generalization analysis of listwise ranking algorithms using the framework. Many learning-to-rank algorithms have been proposed in recent years. Among them, the listwise approach has shown higher empirical ranking performance when compared to the other approaches. However, there is no theoretical study on the listwise approach as far as we know. In this paper, we propose a theoretical framework for ranking, which can naturally describe various listwise learningto-rank algorithms. With this framework, we prove a theorem which gives a generalization bound of a listwise ranking algorithm, on the basis of Rademacher Average of the class of compound functions. The compound functions take listwise loss functions as outer functions and ranking models as inner functions. We then compute the Rademacher Averages for existing listwise algorithms of ListMLE, ListNet, and RankCosine. We also discuss the tightness of the bounds in different situations with regard to the list length and transformation function

    Transfer learning for information retrieval

    Get PDF
    The lack of relevance labels is increasingly challenging and presents a bottleneck in the training of reliable learning-to-rank (L2R) models. Obtaining relevance labels using human judgment is expensive and even impossible in some scenarios. Previous research has studied different approaches to solving the problem including generating relevance labels by crowdsourcing and active learning. Recent studies have started to find ways to reuse knowledge from a related collection to help the ranking in a new collection. However, the effectiveness of a ranking function trained in one collection may be degraded when used in another collection due to the generalization issues of machine learning. Transfer learning involves a set of algorithms that are used to train or adapt a model for a target collection without sucient training labels by transferring knowledge from a related source collection with abundant labels. Transfer learning can also be applied to L2R to help train ranking functions for a new task by reusing data from a related collection while minimizing the generalization gap. Some attempts have been made to apply transfer learning techniques on L2R tasks. This thesis investigates different approaches to transfer learning methods for L2R, which are called transfer ranking. However, most of the existing studies on transfer ranking have been focused on the scenario when there are a small but not sucient number of relevance labels. The field of transfer ranking with no target collection labels is still relatively undeveloped. Moreover, the main reason why a transfer ranking solution is needed is that a ranking function trained in the source collection cannot generalize to the target collection, due to the differences in the data distribution of the two collections. However, the effect of the data distribution differences on ranking model generalization has not been examined in detail. The focus of this study is the scenario when there are no relevance labels from the new collection (the target collection), but where a related collection (the target collection) has an abundant amount of training data and labels. In this thesis, we first demonstrate the generalization gap of different L2R algorithms when the distribution of the source and target collections are different in multiple ways, and we then develop alternative solutions to tackling the problem, which includes instance weighting algorithms and self-labeling methods. Instance weighting algorithms estimate weights for each training query in the source collection according to the target query distribution and use the weighted objective function to optimize a ranking function for the target collection. The results on different test collections suggest that instance weighting methods, including existing approaches, are not reliable. The self-labeling methods use other approaches to generate imputed relevance labels for queries in the target collection, which look to transfer the ranking knowledge to the target collection by transferring the label knowledge. The algorithms were tested on various transferring scenarios and showed significant effectiveness and consistency. We thus demonstrate that the performance of self-labeling methods can be further improved with a minimal number of calibration labels from the target collection. The algorithms and knowledge developed in this thesis can help solve generic ranking knowledge transfer problems under different scenarios

    Adquisici贸n y representaci贸n del conocimiento mediante procesamiento del lenguaje natural

    Get PDF
    [Resumen] Este trabajo introduce un marco para la recuperaci贸n de informaci贸n combinando el procesamiento del lenguaje natural y conocimiento de un dominio, abordando la totalidad del proceso de creaci贸n, gesti贸n e interrogaci贸n de una colecci贸n documental. La perspectiva empleada integra autom谩ticamente conocimiento ling眉铆stico en un modelo formal de representaci贸n sem谩ntica, directamente manejable por el sistema. Ello permite la construcci贸n de algoritmos que simplifican las tareas de mantenimiento, proporcionan un acceso m谩s flexible al usuario no especializado, y eliminan componentes subjetivas que lleven a comportamientos dif铆cilmente predecibles. La adquisici贸n de conocimientos ling眉铆sticos parte de un an谩lisis de dependencias basado en un formalismo gramatical suavemente dependiente del contexto. Conjugamos de este modo eficacia computacional y potencia expresiva. La interpretaci贸n formal de la sem谩ntica descansa en la noci贸n de grafo conceptual, sirviendo de base para la representaci贸n de la colecci贸n y para las consultas que la interrogan. En este contexto, la propuesta resuelve la generaci贸n autom谩tica de estas representaciones a partir del conocimiento ling眉铆stico adquirido de los textos y constituyen el punto de partida para su indexaci贸n. Luego, se utilizan operaciones sobre grafos as铆 como el principio de proyecci贸n y generalizaci贸n para calcular y ordenar las respuestas, de tal manera que se considere la imprecisi贸n intr铆nseca y el car谩cter incompleto de la recuperaci贸n. Adem谩s, el aspecto visual de los grafos permiten la construcci贸n de interfaces de usuario amigables, conciliando precisi贸n e intuici贸n en su gesti贸n. En este punto, la propuesta tambi茅n engloba un marco de pruebas formales.[Resumo] Este traballo introduce un marco para a recuperaci贸n de informaci贸n combinando procesamento da linguaxe natural e o co帽ecemento dun dominio, abordando a totalidade do proceso de creaci贸n, xesti贸n e interrogaci贸n dunha colecci贸n documental. A perspectiva empregada integra autom谩ticamente co帽ecementos ling眉铆sticos nun modelo formal de representaci贸n sem谩ntica, directamente manexable polo sistema. Isto permite a construci贸n de algoritmos que simplifican as tarefas de mantemento, proporcionan un acceso m谩is flexible ao usuario non especializado, e eliminan compo帽entes subxectivos que levan a comportamentos dif铆cilmente predicibles. A adquisici贸n de co帽ecementos ling眉铆sticos parte duhna an谩lise de dependencias basada nun formalismo gramatical suavemente dependente do contexto. Conxugamos deste modo eficacia computacional e potencia expresiva. A interpretaci贸n formal da sem谩ntica descansa na noci贸n de grafo conceptual, servindo de base para a representaci贸n da colecci贸n e para as consultas que a interrogan. Neste contexto, a proposta resolve a xeraci贸n autom谩tica destas representaci贸ns a partires do co帽ecemento ling眉铆stico adquirido dos textos e constit煤e o punto de partida para a s煤a indexaci贸n. Logo, empr茅ganse operaci贸ns sobre grafos as铆 como o principio de proxecci贸n e xeneralizaci贸n para calcular e ordenar as respostas, de tal maneira que se considere a imprecisi贸n intr铆nseca e o car谩cter incompleto da recuperaci贸n. Adem谩is, o aspecto visual dos grafos permiten a construci贸n de interfaces de usuario amigables, conciliando precisi贸n e intuici贸n na s煤a xesti贸n. Neste punto, a proposta tam茅n engloba un marco de probas formais.[Abstract] This thesis introduces a framework for information retrieval combining natural language processing and a domain knowledge, dealing with the whole process of creation, management and interrogation of a documental collection. The perspective used integrates automatically linguistic knowledge in a formal model of semantic representation directly manageable by the system. This allows the construction of algorithms that simplify maintenance tasks, provide more flexible access to non-specialist user, and eliminate subjective components that lead to hardly predictable behavior. The linguistic knowledge adquisition starts from a dependency parse based on a midly context-sensitive grammatical formalism. In this way, we combine computational efficiency and expressive power. The formal interpretation of the semantics is based on the notion of conceptual graph, providing a basis for the representation of the collection and for queries that interrogate. In this context, the proposal addresses the automatic generation of these representations from linguistic knowledge acquired from texts and constitute the starting point for indexing. Then operations on graphs are used and the principle of projection and generalization to calculate and manage replies, so that is considered the inherent inaccuracy and incompleteness of the recovery. In addition, the visual aspect of graphs allow the construction of user-friendly interfaces, balancing precision and intuition in management. At this point, the proposal also includes a framework for formal testing
    corecore