55,198 research outputs found

    University of Twente at the TREC 2008 Enterprise Track: using the Global Web as an expertise evidence source

    Get PDF
    This paper describes the details of our participation in expert search task of the TREC 2007 Enterprise track.\ud This is the fourth (and the last) year of TREC 2007 Enterprise Track and the second year the University of Twente (Database group) submitted runs for the expert nding task. In the methods that were used to produce these runs, we mostly rely on the predicting potential of those expertise evidence sources that are publicly available on the Global Web, but not hosted at the website of the organization under study (CSIRO). This paper describes the follow-up studies\ud complimentary to our recent research [8] that demonstrated how taking the web factor seriously signicantly improves the performance of expert nding in the enterprise

    Using the Global Web as an Expertise Evidence Source

    Get PDF
    This paper describes the details of our participation in expert search task of the TREC 2007 Enterprise track. The presented study demonstrates the predicting potential of the expertise evidence that can be found outside of the organization. We discovered that combining the ranking built solely on the Enterprise data with the Global Web based ranking may produce significant increases in performance. However, our main goal was to explore whether this result can be further improved by using various quality measures to distinguish among web result items. While, indeed, it was beneficial to use some of these measures, especially those measuring relevance of URL strings and titles, it stayed unclear whether they are decisively important

    Being Omnipresent To Be Almighty: The Importance of The Global Web Evidence for Organizational Expert Finding

    Get PDF
    Modern expert nding algorithms are developed under the assumption that all possible expertise evidence for a person is concentrated in a company that currently employs the person. The evidence that can be acquired outside of an enterprise is traditionally unnoticed. At the same time, the Web is full of personal information which is sufficiently detailed to judge about a person's skills and knowledge. In this work, we review various sources of expertise evidence out-side of an organization and experiment with rankings built on the data acquired from six dierent sources, accessible through APIs of two major web search engines. We show that these rankings and their combinations are often more realistic and of higher quality than rankings built on organizational data only

    Structural Regularities in Text-based Entity Vector Spaces

    Get PDF
    Entity retrieval is the task of finding entities such as people or products in response to a query, based solely on the textual documents they are associated with. Recent semantic entity retrieval algorithms represent queries and experts in finite-dimensional vector spaces, where both are constructed from text sequences. We investigate entity vector spaces and the degree to which they capture structural regularities. Such vector spaces are constructed in an unsupervised manner without explicit information about structural aspects. For concreteness, we address these questions for a specific type of entity: experts in the context of expert finding. We discover how clusterings of experts correspond to committees in organizations, the ability of expert representations to encode the co-author graph, and the degree to which they encode academic rank. We compare latent, continuous representations created using methods based on distributional semantics (LSI), topic models (LDA) and neural networks (word2vec, doc2vec, SERT). Vector spaces created using neural methods, such as doc2vec and SERT, systematically perform better at clustering than LSI, LDA and word2vec. When it comes to encoding entity relations, SERT performs best.Comment: ICTIR2017. Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval. 201

    Investigating people: a qualitative analysis of the search behaviours of open-source intelligence analysts

    Get PDF
    The Internet and the World Wide Web have become integral parts of the lives of many modern individuals, enabling almost instantaneous communication, sharing and broadcasting of thoughts, feelings and opinions. Much of this information is publicly facing, and as such, it can be utilised in a multitude of online investigations, ranging from employee vetting and credit checking to counter-terrorism and fraud prevention/detection. However, the search needs and behaviours of these investigators are not well documented in the literature. In order to address this gap, an in-depth qualitative study was carried out in cooperation with a leading investigation company. The research contribution is an initial identification of Open-Source Intelligence investigator search behaviours, the procedures and practices that they undertake, along with an overview of the difficulties and challenges that they encounter as part of their domain. This lays the foundation for future research in to the varied domain of Open-Source Intelligence gathering

    Utilising semantic technologies for intelligent indexing and retrieval of digital images

    Get PDF
    The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they in principle rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this paper we present a semantically-enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matchmaking the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as the exploitation of lexical databases for explicit semantic-based query expansion
    • …
    corecore