245 research outputs found
Investigating Retrieval Method Selection with Axiomatic Features
We consider algorithm selection in the context of ad-hoc information retrieval. Given a query and a pair of retrieval methods, we propose a meta-learner that predicts how to combine the methods' relevance scores into an overall relevance score. Inspired by neural models' different properties with regard to IR axioms, these predictions are based on features that quantify axiom-related properties of the query and its top ranked documents. We conduct an evaluation on TREC Web Track data and find that the meta-learner often significantly improves over the individual methods. Finally, we conduct feature and query weight analyses to investigate the meta-learner's behavior
Real Time Web Search Framework for Performing Efficient Retrieval of Data
With the rapidly growing amount of information on the internet, real-time system is one of the key strategies to cope with the information overload and to help users in finding highly relevant information. Real-time events and domain-specific information are important knowledge base references on the Web that frequently accessed by millions of users. Real-time system is a vital to product and a technique must resolve the context of challenges to be more reliable, e.g. short data life-cycles, heterogeneous user interests, strict time constraints, and context-dependent article relevance. Since real-time data have only a short time to live, real-time models have to be continuously adapted, ensuring that real-time data are always up-to-date. The focal point of this manuscript is for designing a real-time web search approach that aggregates several web search algorithms at query time to tune search results for relevancy. We learn a context-aware delegation algorithm that allows choosing the best real-time algorithms for each query request. The evaluation showed that the proposed approach outperforms the traditional models, in which it allows us to adapt the specific properties of the considered real-time resources. In the experiments, we found that it is highly relevant for most recently searched queries, consistent in its performance, and resilient to the drawbacks faced by other algorithms
Context-Based Quotation Recommendation
While composing a new document, anything from a news article to an email or
essay, authors often utilize direct quotes from a variety of sources. Although
an author may know what point they would like to make, selecting an appropriate
quote for the specific context may be time-consuming and difficult. We
therefore propose a novel context-aware quote recommendation system which
utilizes the content an author has already written to generate a ranked list of
quotable paragraphs and spans of tokens from a given source document.
We approach quote recommendation as a variant of open-domain question
answering and adapt the state-of-the-art BERT-based methods from open-QA to our
task. We conduct experiments on a collection of speech transcripts and
associated news articles, evaluating models' paragraph ranking and span
prediction performances. Our experiments confirm the strong performance of
BERT-based methods on this task, which outperform bag-of-words and neural
ranking baselines by more than 30% relative across all ranking metrics.
Qualitative analyses show the difficulty of the paragraph and span
recommendation tasks and confirm the quotability of the best BERT model's
predictions, even if they are not the true selected quotes from the original
news articles.Comment: 12 pages, 3 figure
Aggregated search: a new information retrieval paradigm
International audienceTraditional search engines return ranked lists of search results. It is up to the user to scroll this list, scan within different documents and assemble information that fulfill his/her information need. Aggregated search represents a new class of approaches where the information is not only retrieved but also assembled. This is the current evolution in Web search, where diverse content (images, videos, ...) and relational content (similar entities, features) are included in search results. In this survey, we propose a simple analysis framework for aggregated search and an overview of existing work. We start with related work in related domains such as federated search, natural language generation and question answering. Then we focus on more recent trends namely cross vertical aggregated search and relational aggregated search which are already present in current Web search
Using Semantic-Based User Profile Modeling for Context-Aware Personalised Place Recommendations
Place Recommendation Systems (PRS's) are used to recommend places to visit to World Wide Web users. Existing PRS's are still limited by several problems, some of which are the problem of recommending similar set of places to different users (Lack of Personalization) and no diversity in the set of recommended items (Content Overspecialization). One of the main objectives in the PRS's or Contextual suggestion systems is to fill the semantic gap among the queries and suggestions and going beyond keywords matching. To address these issues, in this study we attempt to build a personalized context-aware place recommender system using semantic-based user profile modeling to address the limitations of current user profile building techniques and to improve the retrieval performance of personalized place recommender system. This approach consists of building a place ontology based on the Open Directory Project (ODP), a hierarchical ontology scheme for organizing websites. We model a semantic user profile from the place concepts extracted from place ontology and weighted according to their semantic relatedness to user interests. The semantic user profile is then exploited to devise a personalized recommendation by re-ranking process of initial search results for improving retrieval performance. We evaluate this approach on dataset obtained using Google Paces API. Results show that our proposed approach significantly improves the retrieval performance compare to classic keyword-based place recommendation model
Doctor of Philosophy
dissertationPublic health surveillance systems are crucial for the timely detection and response to public health threats. Since the terrorist attacks of September 11, 2001, and the release of anthrax in the following month, there has been a heightened interest in public health surveillance. The years immediately following these attacks were met with increased awareness and funding from the federal government which has significantly strengthened the United States surveillance capabilities; however, despite these improvements, there are substantial challenges faced by today's public health surveillance systems. Problems with the current surveillance systems include: a) lack of leveraging unstructured public health data for surveillance purposes; and b) lack of information integration and the ability to leverage resources, applications or other surveillance efforts due to systems being built on a centralized model. This research addresses these problems by focusing on the development and evaluation of new informatics methods to improve the public health surveillance. To address the problems above, we first identified a current public surveillance workflow which is affected by the problems described and has the opportunity for enhancement through current informatics techniques. The 122 Mortality Surveillance for Pneumonia and Influenza was chosen as the primary use case for this dissertation work. The second step involved demonstrating the feasibility of using unstructured public health data, in this case death certificates. For this we created and evaluated a pipeline iv composed of a detection rule and natural language processor, for the coding of death certificates and the identification of pneumonia and influenza cases. The second problem was addressed by presenting the rationale of creating a federated model by leveraging grid technology concepts and tools for the sharing and epidemiological analyses of public health data. As a case study of this approach, a secured virtual organization was created where users are able to access two grid data services, using death certificates from the Utah Department of Health, and two analytical grid services, MetaMap and R. A scientific workflow was created using the published services to replicate the mortality surveillance workflow. To validate these approaches, and provide proofs-of-concepts, a series of real-world scenarios were conducted
- …