600 research outputs found

    Finding Influential Training Samples for Gradient Boosted Decision Trees

    We address the problem of finding influential training samples for a particular case of tree-ensemble models such as Random Forests (RF) or Gradient Boosted Decision Trees (GBDT). A natural way of formalizing this problem is to study how the model's predictions change upon leave-one-out retraining, leaving out each individual training sample in turn. Recent work has shown that, for parametric models, this analysis can be conducted in a computationally efficient way. We propose several ways of extending this framework to non-parametric GBDT ensembles under the assumption that tree structures remain fixed. Furthermore, we introduce a general scheme for obtaining further approximations to our method that balance the trade-off between performance and computational complexity. We evaluate our approaches in various experimental setups and use-case scenarios, demonstrating both the quality of our approach to finding influential training samples relative to the baselines and its computational efficiency.
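    The central assumption above, that tree structures stay fixed under leave-one-out retraining, means that removing a training sample only shifts the values of the leaves that sample falls into. Below is a minimal sketch of that idea for a single squared-loss regression tree; all names are illustrative, not the paper's implementation. Each leaf value is the mean of its samples' residuals, so dropping sample i moves the value from the mean over n samples to the mean over the remaining n - 1.

    import numpy as np

    def loo_leaf_influence(leaf_ids, residuals, learning_rate=0.1):
        # Sketch only: assumes squared loss, a single tree, fixed structure.
        # leaf_ids[i] is the leaf that training sample i falls into;
        # residuals[i] is its current gradient (y_i - prediction_i).
        influence = np.zeros(len(residuals))
        for leaf in np.unique(leaf_ids):
            idx = np.where(leaf_ids == leaf)[0]
            total, n = residuals[idx].sum(), len(idx)
            leaf_value = total / n  # original leaf value: mean residual
            for i in idx:
                # Leaf value after refitting without sample i.
                loo_value = (total - residuals[i]) / (n - 1) if n > 1 else 0.0
                # Influence proxy: how much sample i shifts its own leaf.
                influence[i] = learning_rate * abs(loo_value - leaf_value)
        return influence

    For a full GBDT ensemble one would repeat this per tree and accumulate the prediction changes, which is where the performance/complexity trade-off mentioned above comes in.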

    Learning Semantic Query Suggestions


    Language modeling approaches to blog post and feed finding

    Ernsting, B.J.; Weerkamp, W.; de Rijke, M. In the opinion-finding task we examined the differences in performance between Indri and our mixture model, and the influence of external expansion and document priors on opinion finding. Results show that an out-of-the-box Indri implementation outperforms our mixture model, and that external expansion on a news corpus is very beneficial; opinion finding can be further improved using either lexicons or the number of comments as document priors. Our approach to the feed distillation task is based on aggregating post-level scores to obtain a feed-level ranking, integrating time-based and persistence aspects into the retrieval model. After correcting bugs in our post-score aggregation module, we found that time-based retrieval improves results only marginally, while persistence-based ranking yields substantial improvements under the right circumstances.
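    One way to picture the feed distillation approach described above, aggregating post-level scores with a time-based component, is the following sketch. The exponential half-life decay and all identifiers here are assumptions for illustration, not the authors' actual retrieval model.

    import math
    from collections import defaultdict

    def rank_feeds(posts, half_life_days=30.0):
        # Sketch: aggregate per-post retrieval scores into a feed-level
        # ranking, down-weighting older posts via exponential time decay.
        feed_scores = defaultdict(float)
        for feed_id, score, age_days in posts:
            decay = math.exp(-math.log(2.0) * age_days / half_life_days)
            feed_scores[feed_id] += score * decay
        return sorted(feed_scores.items(), key=lambda kv: kv[1], reverse=True)

    # Hypothetical usage: tuples of (feed id, post score, post age in days).
    posts = [("feedA", 2.1, 3), ("feedA", 1.4, 40), ("feedB", 3.0, 90)]
    print(rank_feeds(posts))

    A persistence-oriented variant would instead reward feeds that score consistently over time rather than through a few recent spikes.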