17 research outputs found

    From Query-By-Keyword to Query-By-Example: LinkedIn Talent Search Approach

    One key challenge in talent search is to translate complex criteria of a hiring position into a search query, while it is relatively easy for a searcher to list examples of suitable candidates for a given position. To improve search efficiency, we propose the next generation of talent search at LinkedIn, also referred to as Search By Ideal Candidates. In this system, a searcher provides one or several ideal candidates as the input to hire for a given position. The system then generates a query based on the ideal candidates and uses it to retrieve and rank results. Shifting from the traditional Query-By-Keyword to this new Query-By-Example system poses a number of challenges: How to generate a query that best describes the candidates? When moving to a completely different paradigm, how does one leverage previous product logs to learn ranking models and/or evaluate the new system with no existing usage logs? Finally, given the different nature between the two search paradigms, the ranking features typically used for Query-By-Keyword systems might not be optimal for Query-By-Example. This paper describes our approach to solving these challenges. We present experimental results confirming the effectiveness of the proposed solution, particularly on query building and search ranking tasks. As of writing this paper, the new system has been available to all LinkedIn members.

    RankMerging: A supervised learning-to-rank framework to predict links in large social networks

    Uncovering unknown or missing links in social networks is a difficult task because of their sparsity and because links may represent different types of relationships, characterized by different structural patterns. In this paper, we define a simple yet efficient supervised learning-to-rank framework, called RankMerging, which aims at combining information provided by various unsupervised rankings. We illustrate our method on three different kinds of social networks and show that it substantially improves the performances of unsupervised metrics of ranking. We also compare it to other combination strategies based on standard methods. Finally, we explore various aspects of RankMerging, such as feature selection and parameter estimation, and discuss its area of relevance: the prediction of an adjustable number of links on large networks. Comment: 43 pages, published in Machine Learning Journal

    Runtime Optimizations for Tree-Based Machine Learning Models

    Tree-based models have proven to be an effective solution for web ranking as well as other machine learning problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, specifically using gradient-boosted regression trees for learning to rank. Although exceedingly simple conceptually, most implementations of tree-based models do not efficiently utilize modern superscalar processors. By laying out data structures in memory in a more cache-conscious fashion, removing branches from the execution flow using a technique called predication, and micro-batching predictions using a technique called vectorization, we are able to better exploit modern processor architectures. Experiments on synthetic data and on three standard learning-to-rank datasets show that our approach is significantly faster than standard implementations.
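
    The predication and micro-batching ideas named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the flat-array layout, the tiny depth-2 tree, and the names DEPTH, FEATURE, THRESHOLD, LEAF_VALUE and predict_batch are all assumptions made for illustration. Pure Python gains no speed from this form; the sketch only shows the control flow that a compiled, cache-conscious version would use.

```python
# Hypothetical layout (an assumption, not taken from the paper): a complete
# binary tree of fixed depth stored in flat arrays in breadth-first order,
# so the children of internal node i are 2*i + 1 and 2*i + 2.
DEPTH = 2
FEATURE = [0, 1, 1]                # feature index tested at each internal node
THRESHOLD = [0.5, 0.3, 0.7]        # split threshold at each internal node
LEAF_VALUE = [1.0, 2.0, 3.0, 4.0]  # outputs of the four leaves, left to right


def predict_batch(rows):
    """Score a micro-batch of feature vectors without per-node if/else."""
    nodes = [0] * len(rows)  # every row starts at the root
    for _ in range(DEPTH):
        # Advance the whole batch one level at a time (micro-batching).
        for k, x in enumerate(rows):
            i = nodes[k]
            # Predication: the comparison result (False/True, i.e. 0/1) is
            # added to the child index instead of choosing a branch.
            nodes[k] = 2 * i + 1 + (x[FEATURE[i]] > THRESHOLD[i])
    first_leaf = 2 ** DEPTH - 1  # index of the leftmost leaf
    return [LEAF_VALUE[i - first_leaf] for i in nodes]
```

    Because the tree is complete and the depth is fixed, every row executes the same sequence of operations, which is what makes the inner loop a candidate for SIMD vectorization in a lower-level language.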

    QuickRank: a C++ Suite of Learning to Rank Algorithms

    Ranking is a central task of many Information Retrieval (IR) problems, particularly challenging in the case of large-scale Web collections where it involves effectiveness requirements and efficiency constraints that are not common to other ranking-based applications. This paper describes QuickRank, a C++ suite of efficient and effective Learning to Rank (LtR) algorithms that allows high-quality ranking functions to be devised from possibly huge training datasets. QuickRank is a project with a double goal: i) answering the industrial need of Tiscali S.p.A. for a flexible and scalable LtR solution for learning ranking models from huge training datasets; ii) providing the IR research community with a flexible, extensible and efficient LtR framework to design LtR solutions and fairly compare the performance of different algorithms and ranking models. This paper presents our choices in designing QuickRank and reports some preliminary use experiences.

    RankMerging: a supervised learning-to-rank framework to predict links in large social networks

    Uncovering unknown or missing links in social networks is a difficult task because of their sparsity and because links may represent different types of relationships, characterized by different structural patterns. In this paper, we define a simple yet efficient supervised learning-to-rank framework, called RankMerging, which aims at combining information provided by various unsupervised rankings. We illustrate our method on three different kinds of social networks and show that it substantially improves the performances of unsupervised methods of ranking as well as standard supervised combination strategies. We also describe various properties of RankMerging, such as its computational complexity and its robustness to feature selection and parameter estimation, and discuss its area of relevance: the prediction of an adjustable number of links on large networks.

    Living analytics methods for the social web

    [no abstract]

    The scoring of poses in protein-protein docking: current capabilities and future directions

    BACKGROUND: Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling. RESULTS: We present an evaluation of 115 scoring functions on an unbound docking decoy benchmark covering 118 complexes for which a near-native solution can be found, yielding top 10 success rates of up to 58%. Hierarchical clustering is performed, so as to group together functions which identify near-natives in similar subsets of complexes. Three set theoretic approaches are used to identify pairs of scoring functions capable of correctly scoring different complexes. This shows that functions in different clusters capture different aspects of binding and are likely to work together synergistically. CONCLUSIONS: All functions designed specifically for docking perform well, indicating that functions are transferable between sampling methods. We also identify promising methods from the field of homology modelling. Further, differential success rates by docking difficulty and solution quality suggest a need for flexibility-dependent scoring. Investigating pairs of scoring functions, the set theoretic measures identify known scoring strategies as well as a number of novel approaches, indicating promising augmentations of traditional scoring methods. Such augmentation and parameter combination strategies are discussed in the context of the learning-to-rank paradigm.