Evaluating the retrieval effectiveness of Web search engines using a representative query sample
Search engine retrieval effectiveness studies are usually small-scale, using
only limited query samples. Furthermore, queries are selected by the
researchers. We address these issues by taking a random representative sample
of 1,000 informational and 1,000 navigational queries from a major German
search engine and comparing Google's and Bing's results based on this sample.
Jurors were found through crowdsourcing, and data was collected using specialised
software, the Relevance Assessment Tool (RAT). We found that while Google
outperforms Bing in both query types, the difference in the performance for
informational queries was rather low. However, for navigational queries, Google
found the correct answer in 95.3 per cent of cases whereas Bing only found the
correct answer 76.6 per cent of the time. We conclude that search engine
performance on navigational queries is of great importance, as users in this
case can clearly identify queries that have returned correct results. So,
performance on this query type may contribute to explaining user satisfaction
with search engines
New perspectives on Web search engine research
Purpose – The purpose of this chapter is to give an overview of the context of Web search and search engine-related research, as well as to introduce the reader to the sections and chapters of the book. Methodology/approach – We review literature dealing with various aspects of search engines, with special emphasis on emerging areas of Web searching, search engine evaluation going beyond traditional methods, and new perspectives on Web searching. Findings – The approaches to studying Web search engines are manifold. Given the importance of Web search engines for knowledge acquisition, research from different perspectives needs to be integrated into a more cohesive perspective. Research limitations/implications – The chapter suggests a basis for research in the field and also introduces further research directions. Originality/value of paper – The chapter gives a concise overview of the topics dealt with in the book and also shows directions for researchers interested in Web search engines
A three-year study on the freshness of Web search engine databases
This paper deals with one aspect of the index quality of search engines: index freshness. The purpose is to analyse the update strategies of the major Web search engines Google, Yahoo, and MSN/Live.com. We conducted a test of the
updates of 40 daily updated pages and 30 irregularly updated pages, respectively. We used data from a time span of six weeks in the years 2005, 2006, and 2007. We found that the best search engine in terms of up-to-dateness changes over the years and that none of the engines has an ideal solution for index freshness. Frequency distributions for the pages’ ages are skewed, which means that search engines do differentiate between often- and seldom-updated pages. This is confirmed by the difference between the average ages of daily updated pages and our control group of pages. Indexing patterns are often irregular, and there seems to be no clear policy regarding when to revisit Web pages. A major problem identified in our research is the delay in making crawled pages available for searching, which differs from one engine to another
Ordinary Search Engine Users Carrying Out Complex Search Tasks
Web search engines have become the dominant tools for finding information on
the Internet. Due to their popularity, users apply them to a wide range of
search needs, from simple look-ups to rather complex information tasks. This
paper presents the results of a study to investigate the characteristics of
these complex information needs in the context of Web search engines. The aim
of the study is to find out more about (1) what makes complex search tasks
distinct from simple tasks and if it is possible to find simple measures for
describing their complexity, (2) if search success for a task can be predicted
by means of unique measures, and (3) if successful searchers show a different
behavior than unsuccessful ones. The study includes 60 people who carried out a
set of 12 search tasks with current commercial search engines. Their behavior
was logged with the Search-Logger tool. The results confirm that complex tasks
show significantly different characteristics than simple tasks. Yet it seems to
be difficult to distinguish successful from unsuccessful search behaviors. Good
searchers can be differentiated from bad searchers by means of measurable
parameters. The implications of these findings for search engine vendors are
discussed. Comment: 60 page
The Freshness of Web search engines’ databases
This study measures the frequency with which search engines update their indices. To this end, 38 websites that are updated on a daily basis were analysed within a time-span of six weeks. The analysed search engines were Google, Yahoo and MSN. We find that Google performs best overall with the most pages updated on a daily basis, but only MSN is able to update all pages within a time-span of less than 20 days. Both other engines have outliers that are considerably older. In terms of indexing patterns, we find different approaches at the different engines: While MSN shows clear update patterns, Google shows some outliers and the update process of the Yahoo index seems to be quite chaotic. Implications are that the quality of different search engine indices varies and that more than one engine should be used when searching for current content
Multimedia Chinese Web Search Engines: A Survey
The objective of this paper is to explore the state of multimedia search functionality on major general and dedicated Web search engines in the Chinese language. The authors studied: a) how many Chinese Web search engines presently make use of multimedia searching, and b) the type of multimedia search functionality available. Specifically, the following were examined: a) multimedia features - features allowing multimedia search; and b) extent of personalization - the extent to which a search engine Web site allows users to control multimedia search. Overall, Chinese Web search engines offer limited multimedia searching functionality. The significance of the study is based on two factors: a) little research has been conducted on Chinese Web search engines, and b) the instrument used in the study and the results obtained by this research could help users, Web designers, and Web search engine developers. By and large, general Web search engines support more multimedia features than specialized ones
Why People Search for Images using Web Search Engines
What are the intents or goals behind human interactions with image search
engines? Knowing why people search for images is of major concern to Web image
search engines because user satisfaction may vary as intent varies. Previous
analyses of image search behavior have mostly been query-based, focusing on
what images people search for, rather than intent-based, that is, why people
search for images. To date, there is no thorough investigation of how different
image search intents affect users' search behavior.
In this paper, we address the following questions: (1) Why do people search
for images in text-based Web image search systems? (2) How does image search
behavior change with user intent? (3) Can we predict user intent effectively
from interactions during the early stages of a search session? To this end, we
conduct both a lab-based user study and a commercial search log analysis.
We show that user intents in image search can be grouped into three classes:
Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals
different user behavior patterns under these three intents, such as first click
time, query reformulation, dwell time and mouse movement on the result page.
Based on user interaction features during the early stages of an image search
session, that is, before mouse scroll, we develop an intent classifier that is
able to achieve promising results for classifying intents into our three intent
classes. Given that all features can be obtained online and unobtrusively, the
predicted intents can provide guidance for choosing ranking methods immediately
after scrolling
What Users See – Structures in Search Engine Results Pages
This paper investigates the composition of search engine results pages. We define what elements the most
popular web search engines use on their results pages (e.g., organic results, advertisements, shortcuts) and to
which degree they are used for popular vs. rare queries. To this end, we send 500 queries of both types to the
major search engines Google, Yahoo, Live.com and Ask. We count how often the different elements are used by
the individual engines. In total, our study is based on 42,758 elements. Findings include that search engines use
quite different approaches to results pages composition and therefore, the user gets to see quite different results
sets depending on the search engine and search query used. Organic results still play the major role in the results
pages, but different shortcuts are of some importance, too. Regarding the frequency of certain hosts within the
results sets, we find that all search engines show Wikipedia results quite often, while other hosts shown depend
on the search engine used. Both Google and Yahoo prefer results from their own offerings (such as YouTube or
Yahoo Answers). Since we used the .com interfaces of the search engines, results may not be valid for other
country-specific interfaces
Meeting of the MINDS: an information retrieval research agenda
Since its inception in the late 1950s, the field of Information Retrieval (IR) has developed tools that help people find, organize, and analyze information. The key early influences on the field are well-known. Among them are H. P. Luhn's pioneering work, the development of the vector space retrieval model by Salton and his students, Cleverdon's development of the Cranfield experimental methodology, Spärck Jones' development of idf, and a series of probabilistic retrieval models by Robertson and Croft. Until the development of the World Wide Web (Web), IR was of greatest interest to professional information analysts such as librarians, intelligence analysts, the legal community, and the pharmaceutical industry