Active Sampling for Large-scale Information Retrieval Evaluation
Evaluation is crucial in Information Retrieval. The development of models, tools, and methods has benefited significantly from the availability of reusable test collections formed through a standardized and thoroughly tested methodology, known as the Cranfield paradigm. Constructing these collections requires obtaining relevance judgments for a pool of documents retrieved by the systems participating in an evaluation task, and thus involves immense human labor. To alleviate this effort, different methods for constructing collections have been proposed in the literature, falling under two broad categories: (a) sampling, and (b) active selection of documents. The former devises a smart sampling strategy, choosing only a subset of documents to be assessed and inferring evaluation measures on the basis of the obtained sample; the sampling distribution is fixed at the beginning of the process. The latter recognizes that the systems contributing documents to be judged vary in quality, and actively selects documents from good systems; the quality of the systems is re-estimated each time a new document is judged. In this paper we seek to solve the problem of large-scale retrieval evaluation by combining the two approaches. We devise an active sampling method that avoids the bias of active selection methods towards good systems, and at the same time reduces the variance of current sampling approaches, by placing a distribution over systems that varies as judgments become available. We validate the proposed method on TREC data and demonstrate its advantages over past approaches.
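The loop described above could be sketched roughly as follows. This is a hypothetical, much-simplified toy, not the authors' exact algorithm: the function name, the smoothed quality estimate, and the uniform within-pool draw are all illustrative assumptions. The key ingredients from the abstract are there, though: a distribution over systems that is updated as judgments arrive, and recorded inclusion probabilities that would allow unbiased (Horvitz-Thompson-style) estimation of evaluation measures from the sample.

```python
import random

def active_sample(pools, true_relevance, budget, seed=0):
    """Toy active-sampling loop (illustrative simplification).

    pools: dict system -> list of doc ids it retrieved
    true_relevance: dict doc -> 0/1 (stands in for a human assessor)
    Maintains a distribution over systems proportional to their estimated
    quality, samples the next document via that distribution, and updates
    the quality estimates after every judgment.
    Returns judged docs and their inclusion probabilities at selection time.
    """
    rng = random.Random(seed)
    quality = {s: 1.0 for s in pools}      # prior: all systems equally good
    judged = {}                            # doc -> relevance judgment
    inclusion = {}                         # doc -> sampling probability
    for _ in range(budget):
        # distribution over systems, proportional to estimated quality
        total = sum(quality.values())
        probs = {s: q / total for s, q in quality.items()}
        # draw a system from that distribution, then an unjudged doc
        # uniformly from its pool
        s = rng.choices(list(probs), weights=list(probs.values()))[0]
        unjudged = [d for d in pools[s] if d not in judged]
        if not unjudged:
            continue
        d = rng.choice(unjudged)
        inclusion[d] = probs[s] / len(unjudged)
        judged[d] = true_relevance[d]      # the "assessor" provides the label
        # re-estimate quality of s: smoothed fraction of its judged docs
        # that turned out relevant
        js = [judged[x] for x in pools[s] if x in judged]
        quality[s] = (1 + sum(js)) / (1 + len(js))
    return judged, inclusion
```

Because the distribution over systems shifts toward systems whose judged documents are relevant, but every system retains nonzero probability, the sketch captures the abstract's middle ground between fixed-distribution sampling and greedy active selection.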
Adversarial Personalized Ranking for Recommendation
Item recommendation is a personalized ranking task. To this end, many recommender systems optimize models with pairwise ranking objectives, such as Bayesian Personalized Ranking (BPR). Using Matrix Factorization (MF) --- the most widely used model in recommendation --- as a demonstration, we show that optimizing it with BPR leads to a recommender model that is not robust. In particular, we find that the resulting model is highly vulnerable to adversarial perturbations of its model parameters, which implies possibly large generalization error.
To enhance the robustness of a recommender model, and thus improve its generalization performance, we propose a new optimization framework, namely Adversarial Personalized Ranking (APR). In short, APR enhances the pairwise ranking method BPR by performing adversarial training. It can be interpreted as playing a minimax game, where the minimization of the BPR objective function simultaneously defends against an adversary that adds perturbations to the model parameters in order to maximize the BPR objective function. To illustrate how APR works, we implement it on MF by adding adversarial perturbations to the embedding vectors of users and items. Extensive experiments on three public real-world datasets demonstrate the effectiveness of APR: by optimizing MF with APR, it outperforms BPR with a relative improvement of 11.2% on average and achieves state-of-the-art performance for item recommendation. Our implementation is available at: https://github.com/hexiangnan/adversarial_personalized_ranking
Comment: SIGIR 201
Ranking and Retrieval under Semantic Relevance
This thesis presents a series of conceptual and empirical developments on the ranking and retrieval of candidates under semantic relevance. Part I of the thesis introduces the concept of uncertainty in various semantic tasks (such as recognizing textual entailment) in natural language processing, and the machine learning techniques commonly employed to model these semantic phenomena. A unified view of ranking and retrieval will be presented, and the trade-off between model expressiveness, performance, and scalability in model design will be discussed.
Part II of the thesis focuses on applying these ranking and retrieval techniques to text: Chapter 3 examines the feasibility of ranking hypotheses, given a premise, by a human's subjective probability that each hypothesis holds, effectively extending the traditional categorical task of natural language inference. Chapter 4 focuses on detecting situation frames for documents using ranking methods. We then extend the notion of ranking to retrieval, and develop both sparse (Chapter 5) and dense (Chapter 6) vector-based methods to facilitate scalable retrieval of potential answer paragraphs in question answering.
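The dense retrieval setting in Part II can be illustrated with a minimal sketch, under the assumption that queries and passages have already been encoded into fixed-size vectors (the function name and the plain matrix product are illustrative; real systems use learned encoders and approximate nearest-neighbour indexes such as FAISS for scalability):

```python
import numpy as np

def dense_retrieve(query_vec, passage_matrix, k=3):
    """Rank passages by inner product with the query embedding and
    return the indices and scores of the top-k passages."""
    scores = passage_matrix @ query_vec   # one similarity score per passage
    top = np.argsort(-scores)[:k]         # highest-scoring passages first
    return top, scores[top]
```

The design trade-off the thesis discusses is visible even here: exact scoring of every passage is simple and expressive but linear in corpus size, which is what motivates sparse indexes and approximate dense indexes for large-scale retrieval.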
Part III turns the focus to mentions and entities in text, while continuing the theme of ranking and retrieval: Chapter 7 discusses the ranking of the fine-grained types that an entity mention could belong to, leading to state-of-the-art performance on hierarchical multi-label fine-grained entity typing. Chapter 8 extends the semantic relation of coreference to a cross-document setting, enabling models to retrieve from a large corpus, rather than from a single document, when resolving coreferent entity mentions.