16,473 research outputs found

    Investigating people: a qualitative analysis of the search behaviours of open-source intelligence analysts

    Get PDF
    The Internet and the World Wide Web have become integral parts of the lives of many modern individuals, enabling almost instantaneous communication, sharing and broadcasting of thoughts, feelings and opinions. Much of this information is publicly facing, and as such, it can be utilised in a multitude of online investigations, ranging from employee vetting and credit checking to counter-terrorism and fraud prevention/detection. However, the search needs and behaviours of these investigators are not well documented in the literature. In order to address this gap, an in-depth qualitative study was carried out in cooperation with a leading investigation company. The research contribution is an initial identification of Open-Source Intelligence investigator search behaviours, the procedures and practices that they undertake, along with an overview of the difficulties and challenges that they encounter as part of their domain. This lays the foundation for future research in to the varied domain of Open-Source Intelligence gathering

    PRESY: A Context Based Query Reformulation Tool for Information Retrieval on the Web

    Full text link
    Problem Statement: The huge number of information on the web as well as the growth of new inexperienced users creates new challenges for information retrieval. It has become increasingly difficult for these users to find relevant documents that satisfy their individual needs. Certainly the current search engines (such as Google, Bing and Yahoo) offer an efficient way to browse the web content. However, the result quality is highly based on uses queries which need to be more precise to find relevant documents. This task still complicated for the majority of inept users who cannot express their needs with significant words in the query. For that reason, we believe that a reformulation of the initial user's query can be a good alternative to improve the information selectivity. This study proposes a novel approach and presents a prototype system called PRESY (Profile-based REformulation SYstem) for information retrieval on the web. Approach: It uses an incremental approach to categorize users by constructing a contextual base. The latter is composed of two types of context (static and dynamic) obtained using the users' profiles. The architecture proposed was implemented using .Net environment to perform queries reformulating tests. Results: The experiments gives at the end of this article show that the precision of the returned content is effectively improved. The tests were performed with the most popular searching engine (i.e. Google, Bind and Yahoo) selected in particular for their high selectivity. Among the given results, we found that query reformulation improve the first three results by 10.7% and 11.7% of the next seven returned elements. So as we can see the reformulation of users' initial queries improves the pertinence of returned content.Comment: 8 page

    Conventions and mutual expectations — understanding sources for web genres

    Get PDF
    Genres can be understood in many different ways. They are often perceived as a primarily sociological construction, or, alternatively, as a stylostatistically observable objective characteristic of texts. The latter view is more common in the research field of information and language technology. These two views can be quite compatible and can inform each other; this present investigation discusses knowledge sources for studying genre variation and change by observing reader and author behaviour rather than performing analyses on the information objects themselves

    Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface.

    Get PDF
    Objective: UK primary care databases, which contain diagnostic, demographic and prescribing information for millions of patients geographically representative of the UK, represent a significant resource for health services and clinical research. They can be used to identify patients with a specified disease or condition (phenotyping) and to investigate patterns of diagnosis and symptoms. Currently, extracting such information manually is time-consuming and requires considerable expertise. In order to exploit more fully the potential of these large and complex databases, our interdisciplinary team developed generic methods allowing access to different types of user. Materials and methods: Using the Clinical Practice Research Datalink database, we have developed an online user-focused system (TrialViz), which enables users interactively to select suitable medical general practices based on two criteria: suitability of the patient base for the intended study (phenotyping) and measures of data quality. Results: An end-to-end system, underpinned by an innovative search algorithm, allows the user to extract information in near real-time via an intuitive query interface and to explore this information using interactive visualization tools. A usability evaluation of this system produced positive results. Discussion: We present the challenges and results in the development of TrialViz and our plans for its extension for wider applications of clinical research. Conclusions: Our fast search algorithms and simple query algorithms represent a significant advance for users of clinical research databases

    QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

    Full text link
    In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with the suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic as well as language conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of "social agony" to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real world CQAs such as Yahoo! Answers and Stack Overflow data demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs.Comment: Accepted in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US

    Ontology-Based Recommendation of Editorial Products

    Get PDF
    Major academic publishers need to be able to analyse their vast catalogue of products and select the best items to be marketed in scientific venues. This is a complex exercise that requires characterising with a high precision the topics of thousands of books and matching them with the interests of the relevant communities. In Springer Nature, this task has been traditionally handled manually by publishing editors. However, the rapid growth in the number of scientific publications and the dynamic nature of the Computer Science landscape has made this solution increasingly inefficient. We have addressed this issue by creating Smart Book Recommender (SBR), an ontology-based recommender system developed by The Open University (OU) in collaboration with Springer Nature, which supports their Computer Science editorial team in selecting the products to market at specific venues. SBR recommends books, journals, and conference proceedings relevant to a conference by taking advantage of a semantically enhanced representation of about 27K editorial products. This is based on the Computer Science Ontology, a very large-scale, automatically generated taxonomy of research areas. SBR also allows users to investigate why a certain publication was suggested by the system. It does so by means of an interactive graph view that displays the topic taxonomy of the recommended editorial product and compares it with the topic-centric characterization of the input conference. An evaluation carried out with seven Springer Nature editors and seven OU researchers has confirmed the effectiveness of the solution
    • …
    corecore