Query variation performance prediction for systematic reviews
When conducting systematic reviews, medical researchers deliberate heavily over the final query to pose to the information retrieval system. Given the possible query variations that they could construct, selecting the best-performing query is difficult. This motivates a new type of query performance prediction (QPP) task where the challenge is to estimate the performance of a set of query variations given a particular topic. Query variations are the reductions, expansions and modifications of a given seed query under the hypothesis that there exist some variations (either generated from permutations or hand-crafted) which will improve retrieval effectiveness over the original query. We use the CLEF 2017 TAR Collection to evaluate sixteen pre- and post-retrieval predictors for the task of Query Variation Performance Prediction (QVPP). Our findings show that the IDF-based QPPs exhibit the strongest correlations with performance. However, when using QPPs to select the best query, little improvement over the original query can be obtained, despite the fact that there are query variations which perform significantly better. Our findings highlight the difficulty in identifying effective queries within the context of this new task, and motivate further research to develop more accurate methods to help systematic review researchers in the query selection process.
Machine Learning, Quantum Mechanics, and Chemical Compound Space
We review recent studies dealing with the generation of machine learning
models of molecular and solid properties. The models are trained and validated
using standard quantum chemistry results obtained for organic molecules and
materials selected from chemical space at random.
Fertility and its Meaning: Evidence from Search Behavior
Fertility choices are linked to the different preferences and constraints of
individuals and couples, and vary importantly by socio-economic status, as well as
by cultural and institutional context. The meaning of childbearing and
child-rearing, therefore, differs between individuals and across groups. In
this paper, we combine data from Google Correlate and Google Trends for the
U.S. with ground truth data from the American Community Survey to derive new
insights into fertility and its meaning. First, we show that Google Correlate
can be used to illustrate socio-economic differences in the circumstances
around pregnancy and birth: e.g., searches for "flying while pregnant" are
linked to high income fertility, and "paternity test" are linked to non-marital
fertility. Second, we combine several search queries to build predictive models
of regional variation in fertility, explaining about 75% of the variance.
Third, we explore if aggregated web search data can also be used to model
fertility trends.
Comment: This is a preprint of a short paper accepted at ICWSM'17. Please cite
that version instead.