
    Are Algorithms Directly Optimizing IR Measures Really Direct?

    Abstract: In information retrieval (IR), the objective of the ranking problem is to construct and return a ranked list of documents that satisfies the user's information need, with respect to the user's query, as well as possible. To evaluate the quality of the returned ranking, performance measures such as Normalized Discounted Cumulative Gain (NDCG) and Mean Average Precision (MAP) are adopted. Many learning to rank algorithms, which automatically learn a ranking function by optimizing specially designed objective functions, have been proposed to solve the ranking problem. Intuitively, the IR performance measures themselves are the ideal objective functions to optimize when learning a ranking function. However, measures such as NDCG and MAP are non-smooth and non-differentiable with respect to the ranking function's parameters. Most existing learning to rank algorithms are therefore designed to optimize objective functions that are only loosely related to the IR performance measures. As a result, such algorithms may achieve only sub-optimal values of the IR performance measures even when they optimize their adopted objective functions very well. It is therefore highly desirable for learning to rank algorithms to directly, or approximately directly, optimize IR performance measures. To tackle this challenge, several approaches, such as SoftRank [1] and SVM-MAP [2], have been proposed. Although these algorithms achieve good empirical performance, some questions remain unanswered: a) can a ranking function learned by direct optimization of IR performance measures still perform well on unseen queries with respect to the optimized measures? b) how directly are IR performance measures optimized by the proposed approaches? In this report, we attempt to answer these questions. We first point out that, under some conditions, a ranking function learned by direct optimization of IR performance measures can also perform well on unseen queries with respect to the optimized measures. Then, to study how directly IR performance measures are optimized by previous approaches, we propose a directness evaluation metric. Based on this metric, SoftRank is analyzed and the corresponding results are presented.
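
    The non-differentiability at issue here is easy to see: measures like NDCG depend on the model's scores only through the induced sort order, so they are piecewise constant in the scores. A minimal NumPy sketch (using the common 2^rel − 1 gain and log2 discount, one of several NDCG variants) illustrates this:

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted Cumulative Gain over the top-k positions."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(rank + 1), ranks start at 1
    return np.sum((2.0 ** rel - 1.0) / discounts)

def ndcg_at_k(scores, relevances, k=10):
    """NDCG@k of the ranking induced by `scores`.

    Scores enter only through argsort, so NDCG is piecewise constant
    (hence non-differentiable) in the ranking function's output.
    """
    order = np.argsort(scores)[::-1]          # sort documents by descending score
    ranked_rel = np.asarray(relevances)[order]
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(ranked_rel, k) / ideal if ideal > 0 else 0.0

# Small perturbations of the scores leave NDCG unchanged until two
# documents swap positions, at which point it jumps discontinuously.
rel = [3, 2, 0, 1]
print(ndcg_at_k([0.9, 0.8, 0.3, 0.20], rel))
print(ndcg_at_k([0.9, 0.8, 0.3, 0.21], rel))  # same ranking, same NDCG
```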

    The most representative composite rank ordering of multi-attribute objects by the particle swarm optimization

    Rank-ordering of individuals or objects on multiple criteria has many important practical applications. A reasonably representative composite rank ordering of multi-attribute objects/individuals or multi-dimensional points is often obtained by Principal Component Analysis, although much inferior but computationally convenient methods are also frequently used. However, such a rank ordering, even one based on Principal Component Analysis, may not be optimal, as several numerical examples demonstrate. To address this problem, Ordinal Principal Component Analysis was suggested some time back. However, that approach cannot deal with various alternative schemes of rank ordering, mainly due to its dependence on solution by constrained integer programming. In this paper we propose an alternative method of solution, namely Particle Swarm Optimization. A FORTRAN computer program to solve the problem is also provided. The suggested method is notably versatile and can accommodate various schemes of rank ordering, norms, and types or measures of correlation. Its versatility and its capability to obtain the most representative composite rank ordering of multi-attribute objects or multi-dimensional points are demonstrated by several numerical examples. We also find that rank ordering based on maximization of the sum of absolute values of the correlation coefficients of the composite rank scores with their constituent variables is robust, but it may have multiple optimal solutions; in solving one problem it thus gives rise to another. The overall ranking of objects by the maximin correlation principle performs better if the composite rank scores are obtained by direct optimization with respect to the individual ranking scores.

    Keywords: rank ordering; standard; modified; competition; fractional; dense; ordinal; principal component; integer programming; repulsive particle swarm; maximin; absolute; correlation; FORTRAN; program
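
    As a rough illustration of the idea (not the authors' FORTRAN implementation, and using canonical global-best PSO rather than their repulsive variant), the following Python sketch searches for weights whose composite score maximizes the sum of absolute rank correlations with the constituent variables:

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

rng = np.random.default_rng(0)

def fitness(w, X):
    """Sum of |rank correlations| between composite scores X @ w and each attribute."""
    composite = rankdata(X @ w)
    return sum(abs(spearmanr(composite, X[:, j])[0]) for j in range(X.shape[1]))

def pso_maximize(X, n_particles=30, iters=200, inertia=0.7, c1=1.5, c2=1.5):
    """Canonical global-best particle swarm over the weight vector."""
    dim = X.shape[1]
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([fitness(p, X) for p in pos])
    gbest = pbest[pbest_val.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos += vel
        vals = np.array([fitness(p, X) for p in pos])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmax()].copy()
    return gbest, pbest_val.max()

X = rng.random((50, 5))            # 50 objects scored on 5 attributes
w, score = pso_maximize(X)
print("weights:", w, "fitness:", score)
```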

    Netter: re-ranking gene network inference predictions using structural network properties

    Background: Many algorithms have been developed to infer the topology of gene regulatory networks from gene expression data. These methods typically produce a ranking of links between genes with associated confidence scores, after which a threshold is chosen to produce the inferred topology. However, the structural properties of the predicted network often do not resemble those typical of a gene regulatory network, as most algorithms only take into account connections found in the data and do not include known graph properties in their inference process. This lowers the prediction accuracy of these methods and limits their usability in practice. Results: We propose Netter, a post-processing algorithm applicable to any confidence ranking of regulatory interactions obtained from a network inference method, which can use, inter alia, graphlets and several graph-invariant properties to re-rank the links into a more accurate prediction. To demonstrate the potential of our approach, we re-rank the predictions of six different state-of-the-art algorithms using three simple network properties as optimization criteria, and show that Netter improves the predictions made on both artificially generated data and the DREAM4 and DREAM5 benchmarks. Additionally, the DREAM5 E. coli community prediction inferred from real expression data is further improved. Furthermore, Netter compares favorably to other post-processing algorithms and is not restricted to correlation-like predictions. Lastly, we demonstrate that the performance increase is robust over a wide range of parameter settings. Netter is available at http://bioinformatics.intec.ugent.be. Conclusions: Network inference from high-throughput data is a long-standing challenge. In this work, we present Netter, which can further refine network predictions based on a set of user-defined graph properties. Netter is a flexible system that can be applied in unison with any method producing a ranking from omics data. It can be tailored to specific prior knowledge by expert users but can also be applied in general use cases. In conclusion, we believe that Netter is a valuable second step in the network inference process to further increase the quality of predictions.
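
    Netter's actual optimization is more elaborate, but the flavor of such re-ranking can be sketched as simulated annealing over the link ordering, trading a structural criterion of the top-k network against drift from the original ranking. The structural cost below (out-degree concentration in a few hubs) is an illustrative stand-in, not one of Netter's criteria:

```python
import math
import random
import networkx as nx

def structural_cost(order, k=100):
    """Toy structural criterion: reward top-k networks whose out-degree mass
    is concentrated in a few hubs, as expected for regulatory networks."""
    g = nx.DiGraph()
    g.add_edges_from(order[:k])
    degrees = sorted((d for _, d in g.out_degree()), reverse=True)
    total = sum(degrees)
    return -(sum(degrees[:5]) / total) if total else 0.0

def total_cost(order, original_pos, lam=1e-6):
    """Structural cost plus a penalty for drifting from the original ranking."""
    drift = sum(abs(i - original_pos[link]) for i, link in enumerate(order))
    return structural_cost(order) + lam * drift

def rerank(links, iters=5000, temp0=0.1):
    """Simulated annealing over the link ordering."""
    original_pos = {link: i for i, link in enumerate(links)}
    order = list(links)
    cost = total_cost(order, original_pos)
    for step in range(iters):
        i, j = random.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]          # propose a swap
        new_cost = total_cost(order, original_pos)
        t = max(temp0 * (1 - step / iters), 1e-9)        # cooling schedule
        if new_cost < cost or random.random() < math.exp(-(new_cost - cost) / t):
            cost = new_cost                              # accept
        else:
            order[i], order[j] = order[j], order[i]      # revert the swap
    return order

genes = [f"g{i}" for i in range(15)]
candidates = [(a, b) for a in genes for b in genes if a != b]
links = random.sample(candidates, 200)   # a confidence-ordered link ranking
reranked = rerank(links)
```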

    Opinion-Based Centrality in Multiplex Networks: A Convex Optimization Approach

    Most people simultaneously belong to several distinct social networks, in which their relations can differ. They hold opinions about certain topics, which they share and spread across these networks, and they are in turn influenced by the opinions of others. In this paper, we build on this observation to propose a new nodal centrality measure for multiplex networks. Our measure, called opinion centrality, is based on a stochastic model representing opinion propagation dynamics in such a network. We formulate an optimization problem consisting in maximizing the opinion of the whole network when controlling an external influence able to affect each node individually. We derive a closed-form solution of this problem and use it to define our centrality measure: the more a node is worth investing external influence in, the more central it is. We perform an empirical study of the proposed centrality on a toy network as well as a collection of real-world networks. Our measure is generally negatively correlated with existing multiplex centrality measures and, in line with its definition, highlights different types of nodes.
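
    The paper's multiplex model and closed form are not reproduced here, but the single-layer intuition can be sketched with a DeGroot-style propagation model (an assumption made purely for illustration): a node's centrality is the marginal gain in total steady-state opinion per unit of external influence invested in it.

```python
import numpy as np

def opinion_centrality(W, alpha=0.85):
    """Illustrative, single-layer stand-in for the paper's measure.

    Assumed opinion dynamics (DeGroot-style, not the authors' exact model):
        x* = alpha * W x* + (1 - alpha) * u
    so  x* = (1 - alpha) (I - alpha W)^{-1} u,
    and the gradient of total opinion sum(x*) w.r.t. the external input u
    is the vector of column sums of (1 - alpha)(I - alpha W)^{-1},
    a Katz-like score."""
    n = W.shape[0]
    M = (1 - alpha) * np.linalg.inv(np.eye(n) - alpha * W)
    return M.sum(axis=0)   # d(total opinion) / d(u_i) for each node i

# Row-stochastic influence matrix of a 4-node toy network.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
print(opinion_centrality(W))
```

    For a multiplex network one would aggregate the layer-specific influence matrices before (or while) solving; how that aggregation is done is exactly where the paper's model departs from this sketch.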

    End-to-End Cross-Modality Retrieval with CCA Projections and Pairwise Ranking Loss

    Cross-modality retrieval encompasses retrieval tasks where the fetched items are of a different type than the search query, e.g., retrieving pictures relevant to a given text query. The state-of-the-art approach to cross-modality retrieval relies on learning a joint embedding space of the two modalities, in which items from either modality are retrieved using nearest-neighbor search. In this work, we introduce a neural network layer based on Canonical Correlation Analysis (CCA) that learns better embedding spaces by analytically computing projections that maximize correlation. In contrast to previous approaches, the CCA Layer (CCAL) allows us to combine existing objectives for embedding space learning, such as pairwise ranking losses, with the optimal projections of CCA. We show the effectiveness of our approach for cross-modality retrieval in three different scenarios (text-to-image, audio-to-sheet-music, and zero-shot retrieval), surpassing both Deep CCA and a multi-view network using freely learned projections optimized by a pairwise ranking loss, especially when little training data is available. The code for all three methods is released at https://github.com/CPJKU/cca_layer.

    Comment: Preliminary version of a paper published in the International Journal of Multimedia Information Retrieval.
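
    For concreteness, a pairwise ranking loss of the kind that can be combined with the CCA projections can be sketched as a cosine-similarity hinge over matching and mismatching cross-modal pairs (a generic formulation of this loss family, not necessarily the paper's exact loss):

```python
import numpy as np

def pairwise_ranking_loss(queries, targets, margin=0.7):
    """Hinge-based pairwise (triplet) ranking loss over a batch of paired
    cross-modal embeddings: matching pairs should out-score mismatches
    by at least `margin` in cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    t = targets / np.linalg.norm(targets, axis=1, keepdims=True)
    sims = q @ t.T                      # sims[i, j] = cos(query_i, target_j)
    pos = np.diag(sims)                 # matching pairs sit on the diagonal
    # Hinge over all mismatched pairs, in both retrieval directions.
    loss_qt = np.maximum(0, margin - pos[:, None] + sims)   # query -> target
    loss_tq = np.maximum(0, margin - pos[None, :] + sims)   # target -> query
    np.fill_diagonal(loss_qt, 0)
    np.fill_diagonal(loss_tq, 0)
    return (loss_qt.sum() + loss_tq.sum()) / len(q)

batch_q = np.random.randn(8, 32)   # e.g., text embeddings
batch_t = np.random.randn(8, 32)   # e.g., image embeddings (paired by index)
print(pairwise_ranking_loss(batch_q, batch_t))
```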

    A recommender system for process discovery

    Over the last decade, several algorithms for process discovery and process conformance have been proposed. Still, it is well accepted that no algorithm dominates in either of these two disciplines, and it is therefore often difficult to apply them successfully. Most of these algorithms require close-to-expert knowledge to be applied satisfactorily. In this paper, we present a recommender system that uses portfolio-based algorithm selection strategies to address two problems: finding the best discovery algorithm for the data at hand, and bridging the gap between general users and process mining algorithms. Experiments performed with the developed tool demonstrate the usefulness of the approach on a variety of instances.
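
    Portfolio-based algorithm selection of this kind typically reduces to supervised learning over features of the event log. The sketch below, with hypothetical features and labels, shows the shape of such a recommender; it is not the paper's actual tool or feature set:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical log features: number of distinct activities, mean trace
# length, trace-variant ratio (the real feature set is a design choice).
X_train = np.array([[12, 30.5, 0.4],
                    [45, 12.1, 0.9],
                    [ 8, 55.0, 0.2]])
# Label = the discovery algorithm that scored best on each training log.
y_train = ["inductive_miner", "heuristics_miner", "alpha_miner"]

recommender = RandomForestClassifier(n_estimators=100, random_state=0)
recommender.fit(X_train, y_train)

new_log_features = [[20, 25.0, 0.5]]
print(recommender.predict(new_log_features))  # recommended discovery algorithm
```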

    Identification of functionally related enzymes by learning-to-rank methods

    Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually starts from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures, or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar, while information about the biological function of annotated database enzymes is ignored. In this work we show that rankings of this kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes, in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We use the Enzyme Commission (EC) classification hierarchy to obtain annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes.
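
    One standard way to cast such a retrieval problem as learning to rank is the RankSVM-style pairwise reduction sketched below. For brevity it uses a linear model and random stand-in features, whereas the paper works with kernel methods over active-cleft similarities:

```python
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, y):
    """RankSVM-style reduction: turn 'enzyme i should rank above enzyme j'
    constraints into binary classification on difference vectors."""
    diffs, labels = [], []
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] > y[j]:
                diffs.append(X[i] - X[j]); labels.append(1)
                diffs.append(X[j] - X[i]); labels.append(-1)
    return np.array(diffs), np.array(labels)

rng = np.random.default_rng(0)
X = rng.random((40, 16))            # similarity-based features per database enzyme
y = rng.integers(0, 3, 40)          # graded relevance, e.g. depth of shared EC prefix
Xp, yp = pairwise_transform(X, y)

ranker = LinearSVC(C=1.0, max_iter=10000).fit(Xp, yp)
scores = X @ ranker.coef_.ravel()   # rank database enzymes by this score
print(np.argsort(scores)[::-1][:5]) # top-5 retrieved enzymes
```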

    Ranking Alternatives on the Basis of a Dominance Intensity Measure

    The additive multi-attribute utility model is widely used within Multi-Attribute Utility Theory (MAUT), but it demands all the information describing the decision-making situation. These information requirements can be far too strict in many practical situations, so incomplete information about input parameters has been incorporated into the decision-making process. We propose an approach based on a dominance intensity measure to deal with such situations. The approach builds on the dominance values between pairs of alternatives, which can be computed by linear programming. These dominance values are transformed into dominance intensities, from which a dominance intensity measure is derived. The measure is used to analyze the robustness of a ranking of technologies for the disposition of surplus weapons-grade plutonium by the US Department of Energy, and is compared with other dominance measuring methods.
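
    Computing a dominance value by linear programming can be made concrete. In this literature the dominance of alternative a over b is commonly taken as the minimum of u(a) − u(b) over the feasible weight set of the additive model; a non-negative optimum means a dominates b. A small SciPy sketch with hypothetical weight ranges:

```python
import numpy as np
from scipy.optimize import linprog

def dominance_value(u_a, u_b, weight_bounds):
    """Minimize the additive-utility difference u(a) - u(b) = w @ (u_a - u_b)
    over weight vectors w within the given bounds with sum(w) == 1."""
    c = np.asarray(u_a) - np.asarray(u_b)        # linprog minimizes c @ w
    res = linprog(c,
                  A_eq=[np.ones(len(c))], b_eq=[1.0],
                  bounds=weight_bounds, method="highs")
    return res.fun

# Utilities of two alternatives on three attributes, with imprecise weights.
u_a, u_b = [0.8, 0.4, 0.6], [0.5, 0.7, 0.5]
weight_bounds = [(0.2, 0.5), (0.2, 0.5), (0.1, 0.4)]   # hypothetical ranges
print(dominance_value(u_a, u_b, weight_bounds))        # < 0: a does not dominate b
```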

    Hashing as Tie-Aware Learning to Rank

    Hashing, or learning binary embeddings of data, is frequently used in nearest neighbor retrieval. In this paper, we develop learning to rank formulations for hashing, aimed at directly optimizing ranking-based evaluation metrics such as Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG). We first observe that the integer-valued Hamming distance often leads to tied rankings, and propose to use tie-aware versions of AP and NDCG to evaluate hashing for retrieval. Then, to optimize tie-aware ranking metrics, we derive their continuous relaxations and perform gradient-based optimization with deep neural networks. Our results establish a new state of the art for image retrieval by Hamming ranking on common benchmarks.

    Comment: 15 pages, 3 figures. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
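
    The tie problem is easy to reproduce: many items share the same integer Hamming distance to a query, so AP depends on how those ties are broken. The paper derives tie-aware metrics in closed form; the Monte Carlo sketch below merely illustrates the quantity as the expectation of AP over random tie-breaking:

```python
import numpy as np

def average_precision(ranking, relevant):
    """Standard AP of a ranked list of item ids."""
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / max(1, len(relevant))

def tie_aware_ap(hamming_dists, relevant, n_samples=1000, seed=0):
    """Estimate the expected AP over random tie-breaking by shuffling the
    items before a stable sort on distance (Monte Carlo stand-in for the
    paper's closed-form tie-aware AP)."""
    rng = np.random.default_rng(seed)
    items = np.arange(len(hamming_dists))
    aps = []
    for _ in range(n_samples):
        perm = rng.permutation(items)                       # random tie order
        order = perm[np.argsort(hamming_dists[perm], kind="stable")]
        aps.append(average_precision(order.tolist(), relevant))
    return float(np.mean(aps))

dists = np.array([1, 1, 2, 2, 2, 3])     # integer Hamming distances to a query
print(tie_aware_ap(dists, relevant={0, 3}))
```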