How users assess web pages for information-seeking
In this paper, we investigate the criteria used by online searchers when assessing the relevance of web pages for information-seeking tasks. Twenty-four participants were given three tasks each, and indicated the features of web pages which they employed when deciding about the usefulness of the pages in relation to the tasks. These tasks were presented within the context of a simulated work-task situation. We investigated the relative utility of features identified by participants (web page content, structure and quality), and how the importance of these features is affected by the type of information-seeking task performed and the stage of the search. The results of this study provide a set of criteria used by searchers to decide about the utility of web pages for different types of tasks. Such criteria can have implications for the design of systems that use or recommend web pages.
Meeting of the MINDS: an information retrieval research agenda
Since its inception in the late 1950s, the field of Information Retrieval (IR) has developed tools that help people find, organize, and analyze information. The key early influences on the field are well-known. Among them are H. P. Luhn's pioneering work, the development of the vector space retrieval model by Salton and his students, Cleverdon's development of the Cranfield experimental methodology, Spärck Jones' development of idf, and a series of probabilistic retrieval models by Robertson and Croft. Until the development of the World Wide Web (Web), IR was of greatest interest to professional information analysts such as librarians, intelligence analysts, the legal community, and the pharmaceutical industry.
You can't see what you can't see: Experimental evidence for how much relevant information may be missed due to Google's Web search personalisation
The influence of Web search personalisation on professional knowledge work is
an understudied area. Here we investigate how public sector officials
self-assess their dependency on the Google Web search engine, whether they are
aware of the potential impact of algorithmic biases on their ability to
retrieve all relevant information, and how much relevant information may
actually be missed due to Web search personalisation. We find that the majority
of participants in our experimental study are neither aware that there is a
potential problem nor do they have a strategy to mitigate the risk of missing
relevant information when performing online searches. Most significantly, we
provide empirical evidence that up to 20% of relevant information may be missed
due to Web search personalisation. This work has significant implications for
Web research by public sector professionals, who should be provided with
training about the potential algorithmic biases that may affect their judgments
and decision making, as well as clear guidelines on how to minimise the risk of
missing relevant information. Comment: paper submitted to the 11th Intl. Conf. on Social Informatics;
revision corrects error in interpretation of parameter Psi/p in RBO resulting
from discrepancy between the documentation of the implementation in R
(https://rdrr.io/bioc/gespeR/man/rbo.html) and the original definition
(https://dl.acm.org/citation.cfm?id=1852106) as per 20/05/201
Why People Search for Images using Web Search Engines
What are the intents or goals behind human interactions with image search
engines? Knowing why people search for images is of major concern to Web image
search engines because user satisfaction may vary as intent varies. Previous
analyses of image search behavior have mostly been query-based, focusing on
what images people search for, rather than intent-based, that is, why people
search for images. To date, there is no thorough investigation of how different
image search intents affect users' search behavior.
In this paper, we address the following questions: (1) Why do people search
for images in text-based Web image search systems? (2) How does image search
behavior change with user intent? (3) Can we predict user intent effectively
from interactions during the early stages of a search session? To this end, we
conduct both a lab-based user study and a commercial search log analysis.
We show that user intents in image search can be grouped into three classes:
Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals
different user behavior patterns under these three intents, such as first click
time, query reformulation, dwell time and mouse movement on the result page.
Based on user interaction features during the early stages of an image search
session, that is, before mouse scroll, we develop an intent classifier that is
able to achieve promising results for classifying intents into our three intent
classes. Given that all features can be obtained online and unobtrusively, the
predicted intents can provide guidance for choosing ranking methods immediately
after scrolling.
Ordinary Search Engine Users Carrying Out Complex Search Tasks
Web search engines have become the dominant tools for finding information on
the Internet. Due to their popularity, users apply them to a wide range of
search needs, from simple look-ups to rather complex information tasks. This
paper presents the results of a study to investigate the characteristics of
these complex information needs in the context of Web search engines. The aim
of the study is to find out more about (1) what makes complex search tasks
distinct from simple tasks and if it is possible to find simple measures for
describing their complexity, (2) if search success for a task can be predicted
by means of unique measures, and (3) if successful searchers show a different
behavior than unsuccessful ones. The study includes 60 people who carried out a
set of 12 search tasks with current commercial search engines. Their behavior
was logged with the Search-Logger tool. The results confirm that complex tasks
show significantly different characteristics than simple tasks. Yet it seems to
be difficult to distinguish successful from unsuccessful search behaviors. Good
searchers can be differentiated from bad searchers by means of measurable
parameters. The implications of these findings for search engine vendors are
discussed. Comment: 60 page
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we have identified and analyzed gaps within European research effort during our second year.
In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and
socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown
of a generic multimedia search engine, and secondly, representative use-case descriptions with the related discussion of
requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as
National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal
challenges.
Comprehensive characterization of an open source document search engine
This work performs a thorough characterization and analysis of the open source Lucene search library. The article describes in detail the architecture, functionality, and micro-architectural behavior of the search engine, and investigates prominent online document search research issues. In particular, we study how intra-server index partitioning affects the response time and throughput, explore the potential use of low power servers for document search, and examine the sources of performance degradation and the causes of tail latencies. Some of our main conclusions are the following: (a) intra-server index partitioning can reduce tail latencies but with diminishing benefits as incoming query traffic increases, (b) low power servers given enough partitioning can provide the same average and tail response times as conventional high performance servers, (c) index search is a CPU-intensive cache-friendly application, and (d) C-states are the main culprits for performance degradation in document search.