15,611 research outputs found

Entity Ranking on Graphs: Studies on Expert Finding

Author: Hiemstra D.
Rode H.
Serdyukov P.
Zaragoza H.
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2007

Today's web search engines try to offer services for finding various information in addition to simple web pages, like showing locations or answering simple fact queries. Understanding the association of named entities and documents is one of the key steps towards such semantic search tasks. This paper addresses the ranking of entities and models it in a graph-based relevance propagation framework. In particular, we study the problem of expert finding as an example of an entity ranking task. Entity containment graphs are introduced that represent the relationship between text fragments on the one hand and their contained entities on the other hand. The paper shows how these graphs can be used to propagate relevance information from the pre-ranked text fragments to their entities. We use this propagation framework to model existing approaches to expert finding based on the entity's indegree and extend them by recursive relevance propagation based on a probabilistic random walk over the entity containment graphs. Experiments on the TREC expert search task compare the retrieval performance of the different graph and propagation models.

CiteSeerX

Radboud Repository

University of Twente Research Information

Symbiosis between the TRECVid benchmark and video libraries at the Netherlands Institute for Sound and Vision

Author: AF Smeaton
Alan F. Smeaton
B Huurnink
CGM Snoek
CV Thornley
D. Tjondronegoro
H.-T. Pu
Johan Oomen
L. Hollink
M Hertzum
Paul Over
S Shatford
Wessel Kraaij
Y Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Audiovisual archives are investing in large-scale digitisation efforts of their analogue holdings and, in parallel, ingesting an ever-increasing amount of born-digital files in their digital storage facilities. Digitisation opens up new access paradigms and boosts re-use of audiovisual content. Query-log analyses show the shortcomings of manual annotation; therefore, archives are complementing these annotations by developing novel search engines that automatically extract information from both the audio and the visual tracks. Over the past few years, the TRECVid benchmark has developed a novel relationship with the Netherlands Institute for Sound and Vision (NISV) which goes beyond the NISV just providing data and use cases to TRECVid. Prototype and demonstrator systems developed as part of TRECVid are set to become a key driver in improving the quality of search engines at the NISV and will ultimately help other audiovisual archives to offer more efficient and more fine-grained access to their collections. This paper reports the experiences of NISV in leveraging the activities of the TRECVid benchmark.

Crossref

Irish Universities

DCU Online Research Access Service

Radboud Repository

Sound and Vision Publications

Learning to Rank Academic Experts in the DBLP Dataset

Author: Calado Pável
Martins Bruno
Moreira Catarina
Publication venue: Wiley
Publication date: 01/01/2015

Expert finding is an information retrieval task that is concerned with the search for the most knowledgeable people with respect to a specific topic, and the search is based on documents that describe people's activities. The task involves taking a user query as input and returning a list of people who are sorted by their level of expertise with respect to the user query. Despite recent interest in the area, the current state-of-the-art techniques lack principled approaches for optimally combining different sources of evidence. This article proposes two frameworks for combining multiple estimators of expertise. These estimators are derived from textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the experts. More specifically, this article explores the use of supervised learning-to-rank methods, as well as rank aggregation approaches, for combining all of the estimators of expertise. Several supervised learning algorithms, which are representative of the pointwise, pairwise and listwise approaches, were tested, and various state-of-the-art data fusion techniques were also explored for the rank aggregation framework. Experiments that were performed on a dataset of academic publications from the Computer Science domain attest to the adequacy of the proposed approaches. Comment: Expert Systems, 2013. arXiv admin note: text overlap with arXiv:1302.041

arXiv.org e-Print Archive

Queensland University of Technology ePrints Archive

Integrating multiple document features in language models for expert finding

Author: Huang Xiangji
Rüger Stefan
Song Dawei
Zhu Jianhan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

We argue that expert finding is sensitive to multiple document features in an organizational intranet. These document features include multiple levels of associations between experts and a query topic from sentence, paragraph, up to document levels, document authority information such as the PageRank, indegree, and URL length of documents, and internal document structures that indicate the experts' relationship with the content of documents. Our assumption is that expert finding can largely benefit from the incorporation of these document features. However, existing language modeling approaches for expert finding have not sufficiently taken into account these document features. We propose a novel language modeling approach, which integrates multiple document features, for expert finding. Our experiments on two large-scale TREC Enterprise Track datasets, i.e., the W3C and CSIRO datasets, demonstrate that the natures of the two organizational intranets and two types of expert finding tasks, i.e., key contact finding for CSIRO and knowledgeable person finding for W3C, influence the effectiveness of different document features. Our work provides insights into which document features work for certain types of expert finding tasks, and helps design expert finding strategies that are effective for different scenarios. Our main contribution is to develop an effective formal method for modeling multiple document features in expert finding, and conduct a systematic investigation of their effects. It is worth noting that our novel approach achieves better results in terms of MAP than previous language-model-based approaches and the best automatic runs in both the TREC2006 and TREC2007 expert search tasks, respectively.

Open Research Online (The Open University)

Where Are My Intelligent Assistant's Mistakes? A Systematic Testing Approach

Author: A. Blackwell
A. Glass
B. Lim
H. Raghavan
J. Rowan
J. Shen
J. Talbot
J. Tullio
M. Burnett
M. Fisher
M. Klann
O. Raz
P. Frankl
R. Abraham
R. Baeza-Yates
R. Miller
T. Hastie
T. Kulesza
V. Grigoreanu
Publication date: 01/01/2011

Intelligent assistants are handling increasingly critical tasks, but until now, end users have had no way to systematically assess where their assistants make mistakes. For some intelligent assistants, this is a serious problem: if the assistant is doing work that is important, such as assisting with qualitative research or monitoring an elderly parent’s safety, the user may pay a high cost for unnoticed mistakes. This paper addresses the problem with WYSIWYT/ML (What You See Is What You Test for Machine Learning), a human/computer partnership that enables end users to systematically test intelligent assistants. Our empirical evaluation shows that WYSIWYT/ML helped end users find assistants’ mistakes significantly more effectively than ad hoc testing. Not only did it allow users to assess an assistant’s work on an average of 117 predictions in only 10 minutes, it also scaled to a much larger data set, assessing an assistant’s work on 623 out of 1,448 predictions using only the users’ original 10 minutes’ testing effort

City Research Online

Crossref

Enlighten

POLIS: a probabilistic summarisation logic for structured documents

Author: Forst Jan Frederik
Publication date: 01/01/2009

PhD thesis. As the availability of structured documents, formatted in markup languages such as SGML, RDF, or XML, increases, retrieval systems increasingly focus on the retrieval of document-elements, rather than entire documents. Additionally, abstraction layers in the form of formalised retrieval logics have allowed developers to include search facilities into numerous applications, without needing detailed knowledge of retrieval models. Although automatic document summarisation has been recognised as a useful tool for reducing the workload of information system users, very few such abstraction layers have been developed for the task of automatic document summarisation. This thesis describes the development of an abstraction logic for summarisation, called POLIS, which provides users (such as developers or knowledge engineers) with high-level access to summarisation facilities. Furthermore, POLIS allows users to exploit the hierarchical information provided by structured documents. The development of POLIS is carried out in a step-by-step way. We start by defining a series of probabilistic summarisation models, which provide weights to document-elements at a user-selected level. These summarisation models are those accessible through POLIS. The formal definition of POLIS is performed in three steps. We start by providing a syntax for POLIS, through which users/knowledge engineers interact with the logic. This is followed by a definition of the logic's semantics. Finally, we provide details of an implementation of POLIS. The final chapters of this dissertation are concerned with the evaluation of POLIS, which is conducted in two stages. Firstly, we evaluate the performance of the summarisation models by applying POLIS to two test collections, the DUC AQUAINT corpus, and the INEX IEEE corpus. This is followed by application scenarios for POLIS, in which we discuss how POLIS can be used in specific IR tasks.

CiteSeerX

Queen Mary Research Online

Realizing the Technical Advantages of Star Transformation

Author: Darling Karen L.
Publication venue: ePublications at Regis University
Publication date: 14/12/2009

Data warehousing and business intelligence go hand in hand; each gives the other purpose for development, maintenance and improvement. Both have evolved over a few decades and build upon initial development. Management initiatives further drive the need and complexity of business intelligence, while in turn expanding the end user community so that business change, results and strategy are affected at the business unit level. The literature, including a recent business intelligence user survey, demonstrates that query performance is the most significant issue encountered. Oracle's data warehouse 10g.2 is examined with improvements to query optimization via best practice through Star Transformation. Star Transformation is a star schema query rewrite and join back through a hash join, which provides extensive query performance improvement. Most data warehouses exist as normalized or in 3rd normal form (3NF), while star schemas in a denormalized warehouse are not the norm. Changes in the database environment must be implemented, along with agreement from business leadership and alignment of business objectives with a Star Transformation project. Often, so much change, shifting priorities and lack of understanding about query optimization benefits can stifle a project. Critical to the success of gaining support and financial backing is the official plan and demonstration of return on investment documentation. Query optimization is highly complex. Both the technological and business entities should prioritize goals and consider the benefits of improved query response time, realizing the technical advantages of Star Transformation.

ePublications at Regis University

Workshop on Novel Methodologies for Evaluation in Information Retrieval : Workshop held at European Conference on Information Retrieval - ECIR 2008, Glasgow, United Kingdom, 30 March 2008

Publication date: 01/01/2008

White Rose Research Online

Beyond Personalization: Research Directions in Multistakeholder Recommendation

Author: Abdollahpouri Himan
Adomavicius Gediminas
Burke Robin
Guy Ido
Jannach Dietmar
Kamishima Toshihiro
Krasnodebski Jan
Pizzato Luiz
Publication venue: Springer Science and Business Media LLC
Publication date: 15/12/2019

Recommender systems are personalized information access applications; they are ubiquitous in today's online environment, and effective at finding items that meet user needs and tastes. As the reach of recommender systems has extended, it has become apparent that the single-minded focus on the user common to academic research has obscured other important aspects of recommendation outcomes. Properties such as fairness, balance, profitability, and reciprocity are not captured by typical metrics for recommender system evaluation. The concept of multistakeholder recommendation has emerged as a unifying framework for describing and understanding recommendation settings where the end user is not the sole focus. This article describes the origins of multistakeholder recommendation, and the landscape of system designs. It provides illustrative examples of current research, as well as outlining open questions and research directions for the field. Comment: 64 page

arXiv.org e-Print Archive

CU Scholar Institutional Repository

Greening information management: final report

Author: MacDonald A.
McCulloch E.
McDonald D.
Publication venue: University of Strathclyde
Publication date: 01/01/2009

As the recent JISC report on the ‘greening’ of ICT in education [1] highlights, the increasing reliance on ICT to underpin the business functions of higher education institutions has a heavy environmental impact, due mainly to the consumption of electricity to run computers and to cool data centres. While work is already under way to investigate how more energy-efficient ICT can be introduced, to date there has been much less focus on the potential environmental benefits to be accrued from reducing the demand ‘at source’ through better data and information management. JISC thus commissioned the University of Strathclyde to undertake a study to gather evidence that establishes the efficacy of using information management options as components of Green ICT strategies within UK Higher Education environments, and to highlight existing practices which have the potential for wider replication.

University of Strathclyde Institutional Repository