Enhanced information retrieval using domain-specific recommender models
The objective of an information retrieval (IR) system is to retrieve relevant items that meet a user’s information need. There is currently significant interest in personalized IR, which seeks to improve IR effectiveness by incorporating a model of the user’s interests. However, in some situations there may be no opportunity to learn about the interests of a specific user on a certain topic. In our work, we propose an approach that combines a recommender algorithm with IR methods to improve retrieval in domains where the system has had no opportunity to learn about the user’s knowledge, because the user has not previously entered a query in that domain. We use search data from previous users interested in the same topic to build a recommender model for this topic. When a user enters a query on a topic that is new to them, an appropriate recommender model is selected and used to predict a ranking of items which the user may find interesting, based on the behaviour of previous users with similar queries. The recommender output is integrated with a standard IR method in a weighted linear combination to provide a final result for the user. Experiments using the INEX 2009 data collection with a simulated recommender training set show that our approach can improve on a baseline IR system.
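The weighted linear combination mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting, the score normalisation, or any function names, so everything below (`fuse_scores`, the `alpha` weight, min-max normalisation, the tie-breaking rule) is an assumption for illustration and not the authors' implementation.

```python
def fuse_scores(ir_scores, rec_scores, alpha=0.5):
    """Rank documents by alpha * IR score + (1 - alpha) * recommender score.

    ir_scores and rec_scores map document ids to raw scores. Both the
    min-max normalisation and the alpha weight are illustrative assumptions.
    """
    def normalise(scores):
        # Rescale raw scores to [0, 1] so the two sources are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: (s - lo) / span if span else 0.0 for d, s in scores.items()}

    ir_n = normalise(ir_scores)
    rec_n = normalise(rec_scores)
    # A document missing from one source contributes 0 from that source.
    docs = set(ir_n) | set(rec_n)
    fused = {
        d: alpha * ir_n.get(d, 0.0) + (1 - alpha) * rec_n.get(d, 0.0)
        for d in docs
    }
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    ir = {"d1": 2.0, "d2": 1.0, "d3": 0.0}
    rec = {"d2": 1.0, "d3": 0.0, "d4": 0.5}
    for doc, score in fuse_scores(ir, rec, alpha=0.5):
        print(doc, round(score, 3))
```

Setting `alpha=1.0` in this sketch reduces the ranking to the IR scores alone, which is one way to compare a fused run against a baseline.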
Breaking the paradigm: Dr Insight empowers signature-free, enhanced drug repurposing
Motivation: Transcriptome-based computational drug repurposing has attracted considerable interest by bringing about faster and more cost-effective drug discovery. Nevertheless, key limitations of the current drug connectivity-mapping paradigm have long been overlooked, including the lack of effective means to determine optimal query gene signatures. Results: The novel approach Dr Insight implements a frame-breaking statistical model for the ‘hand-shake’ between disease and drug data. The genome-wide screening of concordantly expressed genes (CEGs) eliminates the need for subjective selection of query signatures and also yields a better proxy for potential disease-specific drug targets. Extensive comparisons on simulated and real cancer datasets have validated the superior performance of Dr Insight over several popular drug-repurposing methods in detecting known cancer drugs and drug–target interactions. A proof-of-concept trial using the TCGA breast cancer dataset demonstrates the application of Dr Insight for a comprehensive analysis, from redirection of drug therapies to a systematic construction of disease-specific drug–target networks.
Simulated evaluation of faceted browsing based on feature selection
In this paper we explore the limitations of facet-based browsing, which uses sub-needs of an information need for querying and organising the search process in video retrieval. The underlying assumption is that search effectiveness will be enhanced if such an approach is employed for interactive video retrieval using textual and visual features. We explore the performance bounds of a faceted system by carrying out a simulated user evaluation on TRECVid data sets, and also on the logs of a prior user experiment with the system. We first present a methodology to reduce the dimensionality of features by selecting the most important ones. Then, we discuss the simulated evaluation strategies employed in our evaluation and the effect of using both textual and visual features. Facets created by users are simulated by clustering video shots using textual and visual features. The experimental results of our study demonstrate that the faceted browser can potentially improve search effectiveness.
CEDR: Contextualized Embeddings for Document Ranking
Although considerable attention has been given to neural ranking
architectures recently, far less attention has been paid to the term
representations that are used as input to these models. In this work, we
investigate how two pretrained contextualized language models (ELMo and BERT)
can be utilized for ad-hoc document ranking. Through experiments on TREC
benchmarks, we find that several existing neural ranking architectures can
benefit from the additional context provided by contextualized language models.
Furthermore, we propose a joint approach that incorporates BERT's
classification vector into existing neural models and show that it outperforms
state-of-the-art ad-hoc ranking baselines. We call this joint approach CEDR
(Contextualized Embeddings for Document Ranking). We also address practical
challenges in using these models for ranking, including the maximum input
length imposed by BERT and runtime performance impacts of contextualized
language models.
Comment: Appeared in SIGIR 2019, 4 pages
Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach
A significant number of search queries originate from some real-world
information need or task. In order to improve the search experience of end
users, it is important to have accurate representations of tasks. As a result,
a significant amount of research has been devoted to extracting proper
representations of tasks in order to enable search systems to help users
complete their tasks, as well as to provide end users with better query
suggestions, better recommendations, satisfaction prediction, and
improved personalization in terms of tasks. Most existing task extraction
methodologies focus on representing tasks as flat structures. However, tasks
often tend to have multiple subtasks associated with them and a more
naturalistic representation of tasks would be in terms of a hierarchy, where
each task can be composed of multiple (sub)tasks. To this end, we propose an
efficient Bayesian nonparametric model for extracting hierarchies of such tasks
& subtasks. We evaluate our method on real-world query log data through both
quantitative and crowdsourced experiments and highlight the importance
of considering task/subtask hierarchies.
Comment: 10 pages. Accepted at SIGIR 2017 as a full paper
A Centralized Mechanism to Make Predictions Based on Data From Multiple WSNs
In this work, we present a method that exploits a scenario with
inter-Wireless Sensor Networks (WSNs) information exchange by making
predictions and adapting the workload of a WSN according to their outcomes. We
show the feasibility of an approach that intelligently utilizes information
produced by other WSNs that may or may not belong to the same administrative
domain. To illustrate how the predictions using data from external WSNs can be
utilized, a specific use-case is considered, where the operation of a WSN
measuring relative humidity is optimized using the data obtained from a WSN
measuring temperature. Based on a dedicated performance score, the simulation
results show that this new approach can find the optimal operating point
associated with the trade-off between energy consumption and quality of
measurements. Moreover, we outline the additional challenges that need to be
overcome, and draw conclusions to guide future work in this field.
Comment: 10 pages, simulation results and figures. Published i
Reply With: Proactive Recommendation of Email Attachments
Email responses often contain items, such as a file or a hyperlink to an
external document, that are attached to or included inline in the body of the
message. Analysis of an enterprise email corpus reveals that 35% of the time
when users include these items as part of their response, the attachable item
is already present in their inbox or sent folder. A modern email client can
proactively retrieve relevant attachable items from the user's past emails
based on the context of the current conversation, and recommend them for
inclusion, to reduce the time and effort involved in composing the response. In
this paper, we propose a weakly supervised learning framework for recommending
attachable items to the user. As email search systems are commonly available,
we constrain the recommendation task to formulating effective search queries
from the context of the conversations. The query is submitted to an existing IR
system to retrieve relevant items for attachment. We also present a novel
strategy for generating labels from an email corpus, without the need for
manual annotations, that can be used to train and evaluate the query
formulation model. In addition, we describe a deep convolutional neural network
that demonstrates satisfactory performance on this query formulation task when
evaluated on the publicly available Avocado dataset and a proprietary dataset
of internal emails obtained through an employee participation program.
Comment: CIKM2017. Proceedings of the 26th ACM International Conference on Information and Knowledge Management. 201