
7 research outputs found


Window based Enterprise Expert Search

Author: Lu W.
MacFarlane A.
Robertson S.
Zhao L.
Publication date: 01/01/2007

This is the first year that the City University Centre of Interactive System Research (CISR) has participated in the Expert Search Task. In this paper, we describe an expert search experiment based on window-based techniques: we build a profile for each expert from the text surrounding the expert’s name and email address in the documents. We then use traditional IR techniques to search and rank experts. Our experiment is run on Okapi, with BM25 as the ranking model. Results show that the parameter b does affect retrieval effectiveness, and that a smaller value of b produces better results.
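To make the idea in this abstract concrete, here is a minimal Python sketch of window-based profiling followed by BM25 ranking. It is not the authors' Okapi setup: the function names, the window width, the single-token name matching, the particular BM25 idf variant, and the default values of `k1` and `b` are all assumptions chosen for illustration. The parameter `b` controls how strongly a profile's length is normalised against the average profile length.

```python
import math
import re
from collections import Counter


def window_profile(docs, name, width=5):
    """Collect tokens within `width` tokens of each mention of `name`.

    Assumption: `name` is a single lowercase token; real systems would
    also match full names and email addresses.
    """
    profile = []
    for doc in docs:
        tokens = re.findall(r"\w+", doc.lower())
        for i, tok in enumerate(tokens):
            if tok == name:
                profile.extend(tokens[max(0, i - width):i])
                profile.extend(tokens[i + 1:i + 1 + width])
    return profile


def bm25_scores(query, profiles, k1=1.2, b=0.75):
    """Score each expert profile (a list of tokens) against the query."""
    n = len(profiles)
    # Average profile length; fall back to 1.0 if every profile is empty.
    avgdl = (sum(len(p) for p in profiles.values()) / n) or 1.0
    # Document frequency: in how many profiles each term occurs.
    df = Counter()
    for p in profiles.values():
        df.update(set(p))
    scores = {}
    for expert, p in profiles.items():
        tf = Counter(p)
        # Length normalisation: b=0 disables it, b=1 applies it fully.
        norm = 1 - b + b * len(p) / avgdl
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores[expert] = score
    return scores
```

With `b=0`, two profiles that contain a query term equally often receive the same score regardless of their length; raising `b` increasingly favours the shorter profile. This is the kind of sensitivity to `b` the abstract reports, although the sketch says nothing about which value works best on real data.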

City Research Online

Integrating multiple windows and document features for expert finding

Author: Aho
Amati
Balog
Balog
Balog
Brin
Burgess
Campbell
Cao
Charikar
Chen
Chu-Carroll
Ciravegna
Conrad
Craswell
Craswell
Craven
de Vries
Etzioni
Fang
Fu
Fu
Hu
Kang
Kolla
Macdonald
Macdonald
Maybury
Metzler
Nenadic
Page
Petkova
Petkova
Petkova
Robertson
Salton
Salton
Silverstein
Soboroff
Song
Vechtomova
Westerveld
Yao
Yimam-Seid
Zhao
Zhu
Publication venue: 'Wiley'
Publication date: 01/01/2009

Expert finding is a key task in enterprise search and has recently attracted considerable attention from both the research and industry communities. Given a search topic, a prominent existing approach is to apply an information retrieval (IR) system to retrieve top-ranking documents, which are then used to derive associations between experts and the search topic based on co-occurrences. However, we argue that expert finding is more sensitive to multiple levels of associations and document features that current expert finding systems insufficiently address, including (a) multiple levels of associations between experts and search topics, (b) document internal structure, and (c) document authority. We propose a novel approach that integrates the above-mentioned three aspects as well as a query expansion technique in a two-stage model for expert finding. A systematic evaluation is conducted on TREC collections to test the performance of our approach as well as the effects of multiple windows, document features, and query expansion. These experimental results show that query expansion can dramatically improve expert finding performance with statistical significance. For three well-known IR models with or without query expansion, document internal structures help improve a single window-based approach but without statistical significance, while our novel multiple window-based approach can significantly improve the performance of a single window-based approach both with and without document internal structures.

CiteSeerX

Crossref

Open Access Institutional Repository at Robert Gordon University

Open Research Online (The Open University)

Entity finding in a document collection using adaptive window sizes

Author: Alarfaj Fawaz
Publication date: 01/01/2016

Traditional search engines work by returning a list of documents in response to queries. However, such engines are often inadequate when the information need of the user involves entities. This issue has led to the development of entity search, which, unlike normal web search, aims to return not documents but the names of people, products, organisations, etc. Some of the most successful methods for identifying relevant entities were built around the idea of a proximity search. In this thesis, we present an adaptive, well-founded, general-purpose entity finding model. In contrast to the work of other researchers, where the size of the targeted part of the document (i.e., the window size) is fixed across the collection, our method uses a number of document features to calculate an adaptive window size for each document in the collection. We construct a new entity finding test collection called the ESSEX test collection for use in evaluating our method. This collection represents a university setting, as the data was collected from the publicly accessible webpages of the University of Essex. We test our method on five different datasets: the W3C Dataset, CERC Dataset, UvT/TU Datasets, ESSEX dataset and the ClueWeb09 entity finding collection. Our method provides a considerable improvement over various baseline models on all of these datasets. We also find that the document features considered for the calculation of the window size have differing impacts on the performance of the search. These impacts depend on the structure of the documents and the document language. As users may have a variety of search requirements, we show that our method is adaptable to different applications, environments, types of named entities and document collections.

University of Essex Research Repository

Evaluating Information Retrieval and Access Tasks

Publication venue: 'Springer Science and Business Media LLC'

This open access book summarizes the first two decades of the NII Testbeds and Community for Information access Research (NTCIR). NTCIR is a series of evaluation forums run by a global team of researchers and hosted by the National Institute of Informatics (NII), Japan. The book is unique in that it discusses not just what was done at NTCIR, but also how it was done and the impact it has achieved. For example, in some chapters the reader sees the early seeds of what eventually grew to be the search engines that provide access to content on the World Wide Web, today’s smartphones that can tailor what they show to the needs of their owners, and the smart speakers that enrich our lives at home and on the move. We also get glimpses into how new search engines can be built for mathematical formulae, or for the digital record of a lived human life. Key to the success of the NTCIR endeavor was early recognition that information access research is an empirical discipline and that evaluation therefore lay at the core of the enterprise. Evaluation is thus at the heart of each chapter in this book. The chapters show, for example, how the recognition that some documents are more important than others has shaped thinking about evaluation design. The thirty-three contributors to this volume speak for the many hundreds of researchers from dozens of countries around the world who together shaped NTCIR as organizers and participants. This book is suitable for researchers, practitioners, and students: anyone who wants to learn about past and present evaluation efforts in information retrieval, information access, and natural language processing, as well as those who want to participate in an evaluation task or even to design and organize one.

OAPEN Library

Expert Finding in Disparate Environments

Author: D'Amore Raymond
Publication date: 01/03/2008

Providing knowledge workers with access to experts and communities-of-practice is central to expertise sharing, and crucial to effective organizational performance, adaptation, and even survival. However, in complex work environments, it is difficult to know who knows what across heterogeneous groups, disparate locations, and asynchronous work. As such, where expert finding has traditionally been a manual operation, there is increasing interest in policy and technical infrastructure that makes work visible and supports automated tools for locating expertise. Expert finding is a multidisciplinary problem that cross-cuts knowledge management, organizational analysis, and information retrieval. Recently, a number of expert finders have emerged; however, many tools are limited in that they are extensions of traditional information retrieval systems and exploit artifact information primarily. This thesis explores a new class of expert finders that use organizational context as a basis for assessing expertise and for conferring trust in the system. The hypothesis here is that expertise can be inferred through assessments of work behavior and work derivatives (e.g., artifacts). The Expert Locator, developed within a live organizational environment, is a model-based prototype that exploits organizational work context. The system associates expertise ratings with experts’ signaling behavior and is extensible so that signaling behavior from multiple activity space contexts can be fused into aggregate retrieval scores. Post-retrieval analysis supports evidence review and personal network browsing, aiding users in both detection and selection. During operational evaluation, the prototype generated high-precision searches across a range of topics, and was sensitive to organizational role, ranking true experts (i.e., authorities) higher than brokers providing referrals. Precision increased with the number of activity spaces used in the model, but varied across queries. The highest performing queries are characterized by high specificity terms and low organizational diffusion amongst retrieved experts; essentially, the highest rated experts are situated within organizational niches.

White Rose E-theses Online

Understanding and modeling users of modern search engines

Author: Chuklin A.
Publication date: 01/01/2017

International Migration, Integration and Social Cohesion online publications

THUIR at TREC 2005: Enterprise Track

Author: Min Zhang
Shaoping Ma
Wei Yu
Yiqun Liu
Yize Li
Yupeng Fu Wei

The IR group of Tsinghua University participated in the expert finding task of the TREC 2005 Enterprise Track this year. We developed a novel method, called document reorganization, to solve the problem of locating experts for given query topics. This method collects and combines related information from different media formats to assemble a document that describes an expert candidate. The method proves both effective and efficient for the expert finding task. Our submitted run (THUENT0505) achieved the best performance among all participants on the MAP evaluation metric. The reorganized documents are also significantly smaller in size than the original corpus.
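For readers unfamiliar with the MAP metric mentioned in this abstract, the following Python sketch shows the standard definition of mean average precision over a set of topics. The function names and the input layout (one ranked list plus one set of relevant items per topic) are assumptions made for illustration; this is not the official TREC evaluation tooling.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            total += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over topics.

    `runs` is a list of (ranked_list, relevant_set) pairs, one per topic.
    """
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

Because each relevant item contributes the precision at the rank where it appears, a run that places true experts near the top of the list scores higher than one that retrieves the same experts further down.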

CiteSeerX