Search CORE

217 research outputs found

Search Agent Model: a Conceptual Framework for Search by Algorithms and Agent Systems

Author: Dalton Jeff
Foley John
Publication date: 19/08/2018

No abstract available

Enlighten

Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

Author: Dalton Jeff
Li Zhenghua
Lin Jimmy
Mishne Gilad
Sharma Aneesh
Publication date: 27/10/2012

We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based analytics stack, but was later replaced because it did not meet the latency requirements necessary to generate meaningful real-time results. The second implementation, which is the system deployed in production, is a custom in-memory processing engine specifically designed for the task. This experience taught us that the current typical usage of Hadoop as a "big data" platform, while great for experimentation, is not well suited to low-latency processing, and points the way to future work on data analytics platforms that can handle "big" as well as "fast" data

arXiv.org e-Print Archive

CiteSeerX

Review of Erosion and Sedimentation Control Programs in the Piscataqua Region

Author: Clifford Jeff
Dalton Cayce
Hickey Ken
Piscataqua Region Estuaries Partnership
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 31/03/2010

The Piscataqua Region Estuaries Partnership (PREP) seeks to minimize adverse impacts to water resources associated with construction site development activities. In order to achieve this goal, PREP must understand the strengths and weaknesses of existing erosion and sedimentation control (E&SC) programs in the 52 municipalities of the PREP watershed (Figure 1-1). A detailed understanding of the existing E&SC programs will enable PREP and other stakeholders to identify and implement actions to improve E&SC programs and minimize adverse impacts. This report provides a review and assessment of existing erosion and sedimentation control programs and a set of recommendations for improving these programs. Our approach in conducting the review was to obtain available federal, state and municipal programs data and to interview people who work with E&SC programs on a daily basis, including state, municipal, construction contractor and site inspector staff. A statement of the problem, an introduction to applicable regulations, and a description of our project approach are provided below

UNH Scholars' Repository

Generative and Pseudo-Relevant Feedback for Sparse, Dense and Learned Sparse Retrieval

Author: Chatterjee Shubham
Dalton Jeff
Mackie Iain
Publication date: 22/10/2023

Pseudo-relevance feedback (PRF) is a classical approach to address lexical mismatch by enriching the query using first-pass retrieval. Moreover, recent work on generative-relevance feedback (GRF) shows that query expansion models using text generated from large language models can improve sparse retrieval without depending on first-pass retrieval effectiveness. This work extends GRF to dense and learned sparse retrieval paradigms with experiments over six standard document ranking benchmarks. We find that GRF improves over comparable PRF techniques by around 10% on both precision and recall-oriented measures. Nonetheless, query analysis shows that GRF and PRF have contrasting benefits, with GRF providing external context not present in first-pass retrieval, whereas PRF grounds the query to the information contained within the target corpus. Thus, we propose combining generative and pseudo-relevance feedback ranking signals to achieve the benefits of both feedback classes, which significantly increases recall over PRF methods on 95% of experiments

Edinburgh Research Explorer

DREQ: Document Re-Ranking Using Entity-based Query Understanding

Author: Chatterjee Shubham
Dalton Jeff
Mackie Iain
Publication date: 11/01/2024

While entity-oriented neural IR models have advanced significantly, they often overlook a key nuance: the varying degrees of influence individual entities within a document have on its overall relevance. Addressing this gap, we present DREQ, an entity-oriented dense document re-ranking model. Uniquely, we emphasize the query-relevant entities within a document's representation while simultaneously attenuating the less relevant ones, thus obtaining a query-specific entity-centric document representation. We then combine this entity-centric document representation with the text-centric representation of the document to obtain a "hybrid" representation of the document. We learn a relevance score for the document using this hybrid representation. Using four large-scale benchmarks, we show that DREQ outperforms state-of-the-art neural and non-neural re-ranking methods, highlighting the effectiveness of our entity-oriented representation approach. Comment: To be presented as a full paper at ECIR 2024 in Glasgow, U

arXiv.org e-Print Archive

DREQ: Document Re-Ranking Using Entity-based Query Understanding

Author: Chatterjee Shubham
Dalton Jeff
Mackie Iain
Publication date: 20/03/2024

While entity-oriented neural IR models have advanced significantly, they often overlook a key nuance: the varying degrees of influence individual entities within a document have on its overall relevance. Addressing this gap, we present DREQ, an entity-oriented dense document re-ranking model. Uniquely, we emphasize the query-relevant entities within a document’s representation while simultaneously attenuating the less relevant ones, thus obtaining a query-specific entity-centric document representation. We then combine this entity-centric document representation with the text-centric representation of the document to obtain a “hybrid” representation of the document. We learn a relevance score for the document using this hybrid representation. Using four large-scale benchmarks, we show that DREQ outperforms state-of-the-art neural and non-neural re-ranking methods, highlighting the effectiveness of our entity-oriented representation approach

Edinburgh Research Explorer

Local and global query expansion for hierarchical complex topics

Author: Allan James
Dalton Jeff
Dietz Laura
Naseri Shahrzad
Publication venue: Springer Science and Business Media LLC
Publication date: 07/04/2019

In this work we study local and global methods for query expansion for multifaceted complex topics. We study word-based and entity-based expansion methods and extend these approaches to complex topics using fine-grained expansion on different elements of the hierarchical query structure. For a source of hierarchical complex topics we use the TREC Complex Answer Retrieval (CAR) benchmark data collection. We find that leveraging the hierarchical topic structure is needed for both local and global expansion methods to be effective. Further, the results demonstrate that entity-based expansion methods show significant gains over word-based models alone, with local feedback providing the largest improvement. The results on the CAR paragraph retrieval task demonstrate that expansion models that incorporate both the hierarchical query structure and entity-based expansion result in a greater than 20% improvement over word-based expansion approaches

Crossref

Enlighten

Reliable & Resilient: The Value of Our Existing Coal Fleet - An Assessment of Measures to Improve Reliability & Efficiency While Reducing Emissions

Author: Carter Doug
Cichanowicz J. Edward
Dalton Stu
Wallace Jeff
Publication venue: The Research Repository @ WVU
Publication date: 01/05/2014

The Research Repository @ WVU (West Virginia University)