Investigating people: a qualitative analysis of the search behaviours of open-source intelligence analysts
The Internet and the World Wide Web have become integral parts of the lives of many modern individuals, enabling almost instantaneous communication, sharing and broadcasting of thoughts, feelings and opinions. Much of this information is publicly facing, and as such, it can be utilised in a multitude of online investigations, ranging from employee vetting and credit checking to counter-terrorism and fraud prevention/detection. However, the search needs and behaviours of these investigators are not well documented in the literature. In order to address this gap, an in-depth qualitative study was carried out in cooperation with a leading investigation company. The research contribution is an initial identification of Open-Source Intelligence investigator search behaviours, the procedures and practices that they undertake, along with an overview of the difficulties and challenges that they encounter as part of their domain. This lays the foundation for future research into the varied domain of Open-Source Intelligence gathering.
PRESY: A Context Based Query Reformulation Tool for Information Retrieval on the Web
Problem Statement: The huge amount of information on the web, together with the
growing number of inexperienced users, creates new challenges for information
retrieval. It has become increasingly difficult for these users to find
relevant documents that satisfy their individual needs. Certainly the current
search engines (such as Google, Bing and Yahoo) offer an efficient way to
browse the web content. However, result quality depends heavily on users'
queries, which need to be more precise to find relevant documents. This task
is still complicated for the majority of inexperienced users, who cannot express
their needs with significant words in the query. For that reason, we believe that a
reformulation of the initial user's query can be a good alternative to improve
the information selectivity. This study proposes a novel approach and presents
a prototype system called PRESY (Profile-based REformulation SYstem) for
information retrieval on the web. Approach: It uses an incremental approach to
categorize users by constructing a contextual base. The latter is composed of
two types of context (static and dynamic) obtained using the users' profiles.
The proposed architecture was implemented in the .NET environment to run
query reformulation tests. Results: The experiments given at the end of this
article show that the precision of the returned content is effectively
improved. The tests were performed with the most popular search engines (i.e.
Google, Bing and Yahoo), selected in particular for their high selectivity.
Among the given results, we found that query reformulation improves the first
three results by 10.7% and the next seven returned elements by 11.7%. So, as we
can see, the reformulation of users' initial queries improves the pertinence of
returned content.
Comment: 8 pages
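As a rough illustration of the idea in the abstract above, the following Python sketch expands a query with terms from a user profile and measures precision over the first k results. It is not the PRESY implementation, which the abstract does not detail: the `UserProfile` fields, the `reformulate` function, the choice to try dynamic terms before static ones, and the `max_extra` limit are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Hypothetical structure: the abstract only states that the context has
    # a static part and a dynamic part obtained from the user's profile.
    static_terms: list[str] = field(default_factory=list)
    dynamic_terms: list[str] = field(default_factory=list)


def reformulate(query: str, profile: UserProfile, max_extra: int = 2) -> str:
    """Append up to max_extra profile terms that the query does not already contain."""
    seen = {word.lower() for word in query.split()}
    extra = []
    # Trying dynamic context before static context is an assumption.
    for term in profile.dynamic_terms + profile.static_terms:
        if len(extra) >= max_extra:
            break
        if term.lower() not in seen:
            extra.append(term)
            seen.add(term.lower())
    return " ".join([query.strip(), *extra])


def precision_at_k(relevant: list[bool], k: int) -> float:
    """Fraction of the first k results that were judged relevant."""
    top = relevant[:k]
    return sum(top) / len(top) if top else 0.0
```

Comparing `precision_at_k` for the first three results before and after calling `reformulate` is one simple way to check whether the added profile terms help, in the spirit of the evaluation the abstract reports.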
Conventions and mutual expectations — understanding sources for web genres
Genres can be understood in many different ways. They are often perceived as a primarily sociological construction, or, alternatively, as a stylostatistically observable objective characteristic of texts. The latter view is more common in the research field of information and language technology. These two views can be quite compatible and can inform each other; this present investigation discusses knowledge sources for studying genre variation and change by observing reader and author behaviour rather than performing analyses on the information objects themselves.
Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface.
Objective: UK primary care databases, which contain diagnostic, demographic and prescribing information for millions of patients geographically representative of the UK, represent a significant resource for health services and clinical research. They can be used to identify patients with a specified disease or condition (phenotyping) and to investigate patterns of diagnosis and symptoms. Currently, extracting such information manually is time-consuming and requires considerable expertise. In order to exploit more fully the potential of these large and complex databases, our interdisciplinary team developed generic methods allowing access to different types of user.
Materials and methods: Using the Clinical Practice Research Datalink database, we have developed an online user-focused system (TrialViz), which enables users interactively to select suitable medical general practices based on two criteria: suitability of the patient base for the intended study (phenotyping) and measures of data quality.
Results: An end-to-end system, underpinned by an innovative search algorithm, allows the user to extract information in near real-time via an intuitive query interface and to explore this information using interactive visualization tools. A usability evaluation of this system produced positive results.
Discussion: We present the challenges and results in the development of TrialViz and our plans for its extension for wider applications of clinical research.
Conclusions: Our fast search algorithms and simple query algorithms represent a significant advance for users of clinical research databases.
QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites
In this paper, we present a framework for Question Difficulty and Expertise
Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo!
Answers and Stack Overflow, which tackles a fundamental challenge in
crowdsourcing: how to appropriately route and assign questions to users with
the suitable expertise. This problem domain has been the subject of much
research and includes both language-agnostic and language-conscious
solutions. We bring to bear a key language-agnostic insight: that users gain
expertise and therefore tend to ask as well as answer more difficult questions
over time. We use this insight within the popular competition (directed) graph
model to estimate question difficulty and user expertise by identifying key
hierarchical structure within said model. An important and novel contribution
here is the application of "social agony" to this problem domain. Difficulty
levels of newly posted questions (the cold-start problem) are estimated by
using our QDEE framework and additional textual features. We also propose a
model to route newly posted questions to appropriate users based on the
difficulty level of the question and the expertise of the user. Extensive
experiments on real world CQAs such as Yahoo! Answers and Stack Overflow data
demonstrate the improved efficacy of our approach over contemporary
state-of-the-art models. The QDEE framework also allows us to characterize user
expertise in novel ways by identifying interesting patterns and roles played by
different users in such CQAs.
Comment: Accepted in the Proceedings of the 12th International AAAI Conference
on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US
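The competition graph and the agony measure named in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the QDEE implementation: the edge convention (asker loses to question, question loses to best answerer, other answerers lose to best answerer) is one construction used in competition-based difficulty estimation and is assumed here, the thread dictionary keys are hypothetical, and the agony of an edge is taken as max(rank[loser] - rank[winner] + 1, 0). The sketch only scores a given ranking; it does not search for the ranking that minimises agony.

```python
def build_competition_edges(threads):
    """Return directed (loser, winner) edges from a list of Q&A threads.

    Assumed convention: the asker loses to the question, the question loses
    to the best answerer, and every other answerer loses to the best answerer.
    """
    edges = []
    for thread in threads:
        question = "q:" + thread["question"]
        best = "u:" + thread["best_answerer"]
        edges.append(("u:" + thread["asker"], question))
        edges.append((question, best))
        for other in thread.get("other_answerers", []):
            edges.append(("u:" + other, best))
    return edges


def total_agony(edges, rank):
    """Sum the agony of every edge under the given node ranking.

    An edge from a lower-ranked loser to a higher-ranked winner costs nothing;
    any other edge costs rank[loser] - rank[winner] + 1.
    """
    return sum(max(rank[loser] - rank[winner] + 1, 0) for loser, winner in edges)
```

A ranking with low total agony places most winners strictly above the nodes they beat, which is the kind of hierarchy that can then be read as difficulty levels for questions and expertise levels for users.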
Ontology-Based Recommendation of Editorial Products
Major academic publishers need to be able to analyse their vast catalogue of products and select the best items to be marketed in scientific venues. This is a complex exercise that requires characterising with high precision the topics of thousands of books and matching them with the interests of the relevant communities. In Springer Nature, this task has been traditionally handled manually by publishing editors. However, the rapid growth in the number of scientific publications and the dynamic nature of the Computer Science landscape have made this solution increasingly inefficient. We have addressed this issue by creating Smart Book Recommender (SBR), an ontology-based recommender system developed by The Open University (OU) in collaboration with Springer Nature, which supports their Computer Science editorial team in selecting the products to market at specific venues. SBR recommends books, journals, and conference proceedings relevant to a conference by taking advantage of a semantically enhanced representation of about 27K editorial products. This is based on the Computer Science Ontology, a very large-scale, automatically generated taxonomy of research areas. SBR also allows users to investigate why a certain publication was suggested by the system. It does so by means of an interactive graph view that displays the topic taxonomy of the recommended editorial product and compares it with the topic-centric characterization of the input conference. An evaluation carried out with seven Springer Nature editors and seven OU researchers has confirmed the effectiveness of the solution.