3,779 research outputs found

University of Twente at the TREC 2007 Enterprise Track: modeling relevance propagation for the expert search task

Author: Hiemstra Djoerd
Rode Henning
Serdyukov Pavel
Publication venue: National Institute of Standards and Technology (NIST)
Publication date: 01/01/2007

This paper describes several approaches which we used for the expert search task of the TREC 2007 Enterprise track. We studied several methods of relevance propagation from documents to related candidate experts. Instead of one-step propagation from documents to directly related candidates, used by many systems in the previous years, we do not limit the relevance flow and disseminate it further through mutual documents-candidates connections. We model relevance propagation using random walk principles, or in formal terms, discrete Markov processes. We experiment with infinite and finite number of propagation steps. We also demonstrate how additional information, namely hyperlinks among documents, organizational structure of the enterprise and relevance feedback may be utilized by the presented techniques.

Radboud Repository

University of Twente Research Information
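
The abstract above describes relevance propagation as a random walk over document-candidate connections with a finite number of steps. The sketch below is only an illustration of that general idea, not the authors' model: the function name `propagate_relevance`, the uniform transition probabilities, and the toy data are all assumptions made for this example.

```python
def propagate_relevance(doc_scores, doc_to_candidates, steps=1):
    """Spread document relevance to candidates over a bipartite graph.

    Hypothetical sketch: each step moves probability mass from documents
    to their candidates, and (between steps) back from candidates to
    their documents, always splitting the mass uniformly over links.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")

    # Reverse adjacency: candidate -> documents mentioning the candidate.
    cand_to_docs = {}
    for doc, cands in doc_to_candidates.items():
        for cand in cands:
            cand_to_docs.setdefault(cand, []).append(doc)

    # Normalise the initial retrieval scores into a distribution.
    total = sum(doc_scores.values())
    doc_mass = {d: s / total for d, s in doc_scores.items()}

    cand_mass = {}
    for step in range(steps):
        # Documents -> candidates.
        cand_mass = {c: 0.0 for c in cand_to_docs}
        for doc, mass in doc_mass.items():
            cands = doc_to_candidates.get(doc, [])
            for cand in cands:
                cand_mass[cand] += mass / len(cands)
        if step == steps - 1:
            break
        # Candidates -> documents, preparing the next step.
        doc_mass = {d: 0.0 for d in doc_to_candidates}
        for cand, mass in cand_mass.items():
            docs = cand_to_docs[cand]
            for doc in docs:
                doc_mass[doc] += mass / len(docs)
    return cand_mass


# Toy example: d1 is the more relevant document and mentions only alice.
scores = {"d1": 3.0, "d2": 1.0}
links = {"d1": ["alice"], "d2": ["alice", "bob"]}
one_step = propagate_relevance(scores, links, steps=1)
two_step = propagate_relevance(scores, links, steps=2)
```

With `steps=1` this reduces to the one-step propagation the abstract contrasts itself with; larger values let relevance flow further through shared documents.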

Being Omnipresent To Be Almighty: The Importance of The Global Web Evidence for Organizational Expert Finding

Author: Hiemstra D.
Serdyukov P.
Publication venue: Amsterdam University Press
Publication date: 01/01/2008

Modern expert finding algorithms are developed under the assumption that all possible expertise evidence for a person is concentrated in a company that currently employs the person. The evidence that can be acquired outside of an enterprise is traditionally unnoticed. At the same time, the Web is full of personal information which is sufficiently detailed to judge a person's skills and knowledge. In this work, we review various sources of expertise evidence outside of an organization and experiment with rankings built on the data acquired from six different sources, accessible through APIs of two major web search engines. We show that these rankings and their combinations are often more realistic and of higher quality than rankings built on organizational data only.

Radboud Repository

University of Twente Research Information

Unsupervised, Efficient and Semantic Expertise Retrieval

Author: Bailey P.
Balog K.
Cao Y.
Craswell N.
Craswell N.
Davenport T. H.
Glorot X.
Hinton G. E.
Kiros R.
Maybury M. T.
Mikolov T.
Mikolov T.
Mnih A.
Mnih A.
Moreira C.
Rumelhart D.
Shaw J. A.
Sorg P.
Vapnik V.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-the-art unsupervised statistical vector space and probabilistic generative approaches. Our proposed log-linear model achieves the retrieval performance levels of state-of-the-art document-centric methods with the low inference cost of so-called profile-centric approaches. It yields a statistically significant improved ranking over vector space and generative models in most cases, matching the performance of supervised methods on various benchmarks. That is, by using solely text we can do as well as methods that work with external evidence and/or relevance feedback. A contrastive analysis of rankings produced by discriminative and generative approaches shows that they have complementary strengths due to the ability of the unsupervised discriminative model to perform semantic matching.

Comment: WWW2016, Proceedings of the 25th International Conference on World Wide Web. 201

arXiv.org e-Print Archive

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Integrating multiple document features in language models for expert finding

Author: Huang Xiangji
Rüger Stefan
Song Dawei
Zhu Jianhan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

We argue that expert finding is sensitive to multiple document features in an organizational intranet. These document features include multiple levels of associations between experts and a query topic from sentence, paragraph, up to document levels, document authority information such as the PageRank, indegree, and URL length of documents, and internal document structures that indicate the experts' relationship with the content of documents. Our assumption is that expert finding can largely benefit from the incorporation of these document features. However, existing language modeling approaches for expert finding have not sufficiently taken into account these document features. We propose a novel language modeling approach, which integrates multiple document features, for expert finding. Our experiments on two large-scale TREC Enterprise Track datasets, i.e., the W3C and CSIRO datasets, demonstrate that the natures of the two organizational intranets and two types of expert finding tasks, i.e., key contact finding for CSIRO and knowledgeable person finding for W3C, influence the effectiveness of different document features. Our work provides insights into which document features work for certain types of expert finding tasks, and helps design expert finding strategies that are effective for different scenarios. Our main contribution is to develop an effective formal method for modeling multiple document features in expert finding, and conduct a systematic investigation of their effects. It is worth noting that our novel approach achieves better results in terms of MAP than previous language model based approaches and the best automatic runs in both the TREC2006 and TREC2007 expert search tasks, respectively.

Open Research Online (The Open University)

Design Patterns for Fusion-Based Object Retrieval

Author: C Macdonald
H Fang
M Shokouhi
W Weerkamp
Publication venue: Springer Science and Business Media LLC
Publication date: 29/08/2017

We address the task of ranking objects (such as people, blogs, or verticals) that, unlike documents, do not have direct term-based representations. To be able to match them against keyword queries, evidence needs to be amassed from documents that are associated with the given object. We present two design patterns, i.e., general reusable retrieval strategies, which are able to encompass most existing approaches from the past. One strategy combines evidence on the term level (early fusion), while the other does it on the document level (late fusion). We demonstrate the generality of these patterns by applying them to three different object retrieval tasks: expert finding, blog distillation, and vertical ranking.

Comment: Proceedings of the 39th European conference on Advances in Information Retrieval (ECIR '17), 201

arXiv.org e-Print Archive

Crossref
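
The abstract above distinguishes early fusion (combining evidence at the term level) from late fusion (combining it at the document level). The following minimal sketch illustrates that distinction under strong simplifying assumptions: the scoring function is a plain count of query-term occurrences, and the names `term_score`, `early_fusion`, `late_fusion`, and `top_k` are hypothetical, not taken from the paper.

```python
from collections import Counter


def term_score(query_terms, term_counts):
    """Toy relevance score: total occurrences of the query terms."""
    return sum(term_counts[t] for t in query_terms)


def early_fusion(query_terms, docs, associations):
    """Term-level fusion: build one pseudo-document per object, then score it."""
    scores = {}
    for obj, doc_ids in associations.items():
        profile = Counter()
        for doc_id in doc_ids:
            profile.update(docs[doc_id])  # pool the terms of associated docs
        scores[obj] = term_score(query_terms, profile)
    return scores


def late_fusion(query_terms, docs, associations, top_k=2):
    """Document-level fusion: rank documents first, then sum scores per object."""
    doc_scores = {d: term_score(query_terms, Counter(tokens))
                  for d, tokens in docs.items()}
    ranked = sorted(doc_scores, key=doc_scores.get, reverse=True)
    retrieved = set(ranked[:top_k])  # only top-ranked documents contribute
    return {obj: sum(doc_scores[d] for d in doc_ids if d in retrieved)
            for obj, doc_ids in associations.items()}


# Toy collection: token lists per document and object-document associations.
docs = {
    "d1": ["expert", "search", "expert"],
    "d2": ["search"],
    "d3": ["blog"],
}
associations = {"alice": ["d1", "d2"], "bob": ["d2", "d3"]}
query = ["expert", "search"]

early = early_fusion(query, docs, associations)
late_top1 = late_fusion(query, docs, associations, top_k=1)
late_top2 = late_fusion(query, docs, associations, top_k=2)
```

With this additive toy scorer the two strategies agree once every relevant document is retrieved. They diverge when the document ranking is cut off, which is one place a real retrieval model would make the choice of pattern matter.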

Quantitative modelling of the human–Earth System: a new kind of science?

Author: Brede Markus
Finnigan John
Grigg Nicola
Publication venue: Australian Academy of Science
Publication date: 21/02/2013

The five grand challenges set out for Earth System Science by the International Council for Science in 2010 require a true fusion of social science, economics and natural science, a fusion that has not yet been achieved. In this paper we propose that constructing quantitative models of the dynamics of the human–Earth system can serve as a catalyst for this fusion. We confront well-known objections to modelling societal dynamics by drawing lessons from the development of natural science over the last four centuries and applying them to social and economic science. First, we pose three questions that require real integration of the three fields of science. They concern the coupling of physical planetary boundaries via social processes; the extension of the concept of planetary boundaries to the human–Earth System; and the possibly self-defeating nature of the United Nations' Millennium Development Goals. Second, we ask whether there are regularities or ‘attractors’ in the human–Earth System analogous to those that prompted the search for laws of nature. We nominate some candidates and discuss why we should observe them given that human actors with foresight and intentionality play a fundamental role in the human–Earth System. We conclude that, at sufficiently large time and space scales, social processes are predictable in some sense. Third, we canvass some essential mathematical techniques that this research fusion must incorporate, and we ask what kind of data would be needed to validate or falsify our models. Finally, we briefly review the state of the art in quantitative modelling of the human–Earth System today and highlight a gap between so-called integrated assessment models applied at regional and global scale, which could be filled by a new scale of model.

Southampton (e-Prints Soton)

Modeling document features for expert finding

Author: Huang Xiangji
Rüger Stefan
Song Dawei
Zhu Jianhan
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2008

We argue that expert finding is sensitive to multiple document features in an organization, and therefore, can benefit from the incorporation of these document features. We propose a unified language model, which integrates multiple document features, namely, multiple levels of associations, PageRank, indegree, internal document structure, and URL length. Our experiments on two TREC Enterprise Track collections, i.e., the W3C and CSIRO datasets, demonstrate that the natures of the two organizational intranets and two types of expert finding tasks, i.e., key contact finding for CSIRO and knowledgeable person finding for W3C, influence the effectiveness of different document features. Our work provides insights into which document features work for certain types of expert finding tasks, and helps design expert finding strategies that are effective for different scenarios.

CiteSeerX

Crossref

Open Research Online (The Open University)

Enhancing Content-And-Structure Information Retrieval using a Native XML Database

Author: Pehcevski Jovan
Thom James A.
Vercoustre Anne-Marie
Publication venue
Publication date: 01/01/2004

Three approaches to content-and-structure XML retrieval are analysed in this paper: first by using Zettair, a full-text information retrieval system; second by using eXist, a native XML database, and third by using a hybrid XML retrieval system that uses eXist to produce the final answers from likely relevant articles retrieved by Zettair. INEX 2003 content-and-structure topics can be classified in two categories: the first retrieving full articles as final answers, and the second retrieving more specific elements within articles as final answers. We show that for both topic categories our initial hybrid system improves the retrieval effectiveness of a native XML database. For ranking the final answer elements, we propose and evaluate a novel retrieval model that utilises the structural relationships between the answer elements of a native XML database and retrieves Coherent Retrieval Elements. The final results of our experiments show that when the XML retrieval task focusses on highly relevant elements our hybrid XML retrieval system with the Coherent Retrieval Elements module is 1.8 times more effective than Zettair and 3 times more effective than eXist, and yields an effective content-and-structure XML retrieval.

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Sensor Search Techniques for Sensing as a Service Architecture for The Internet of Things

Author: Christen Peter
Compton Michael
Georgakopoulos Dimitrios
Liu Chi Harold
Perera Charith
Zaslavsky Arkady
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 13/09/2013

The Internet of Things (IoT) is part of the Internet of the future and will comprise billions of intelligent communicating "things" or Internet Connected Objects (ICO) which will have sensing, actuating, and data processing capabilities. Each ICO will have one or more embedded sensors that will capture potentially enormous amounts of data. The sensors and related data streams can be clustered physically or virtually, which raises the challenge of searching and selecting the right sensors for a query in an efficient and effective way. This paper proposes a context-aware sensor search, selection and ranking model, called CASSARAM, to address the challenge of efficiently selecting a subset of relevant sensors out of a large set of sensors with similar functionality and capabilities. CASSARAM takes into account user preferences and considers a broad range of sensor characteristics, such as reliability, accuracy, location, battery life, and many more. The paper highlights the importance of sensor search, selection and ranking for the IoT, identifies important characteristics of both sensors and data capture processes, and discusses how semantic and quantitative reasoning can be combined together. This work also addresses challenges such as efficient distributed sensor search and relational-expression based filtering. CASSARAM testing and performance evaluation results are presented and discussed.

Comment: IEEE sensors Journal, 2013. arXiv admin note: text overlap with arXiv:1303.244

arXiv.org e-Print Archive

Deakin Research Online

Online Research @ Cardiff

RMIT Research Repository

The Australian National University