7 research outputs found

    Evaluating the retrieval effectiveness of Web search engines using a representative query sample

    Search engine retrieval effectiveness studies are usually small-scale, using only limited query samples. Furthermore, queries are selected by the researchers. We address these issues by taking a random representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google's and Bing's results based on this sample. Jurors were found through crowdsourcing, and data was collected using specialised software, the Relevance Assessment Tool (RAT). We found that while Google outperforms Bing in both query types, the difference in performance for informational queries was rather small. However, for navigational queries, Google found the correct answer in 95.3 per cent of cases whereas Bing only found the correct answer 76.6 per cent of the time. We conclude that search engine performance on navigational queries is of great importance, as users in this case can clearly identify queries that have returned correct results. So, performance on this query type may contribute to explaining user satisfaction with search engines.

    Does it matter which search engine is used? A user study using post-task relevance judgments

    The objective of this research was to find out how the two search engines Google and Bing perform when users work freely on pre-defined tasks and judge the relevance of the results immediately after finishing their search session. In a user study, 64 participants conducted two search tasks each, and then judged the results on the following: (1) the quality of the results they selected in their search sessions, (2) the quality of the results they were presented with in their search sessions (but which they did not click on), and (3) the quality of the results from the competing search engine for their queries (which they did not see in their search session). We found that users heavily relied on Google, that Google produced more relevant results than Bing, that users were well able to select relevant results from the results lists, and that users judged the relevance of results lower when they regarded a task as difficult and did not find the correct information.

    Simple questions for complex matters? An enquiry into Swedish Google search queries on wind power

    Renewable energy sources have emerged as a current subject matter in Sweden amidst discussions regarding energy costs, climate change and development of energy production. This study explores how Google Search is used for seeking information about wind power and how utilised search queries contribute to the understanding of this energy source. Adopting a practice theoretical perspective, the study explores search queries as doings and sayings, and understands search engines as an established part of everyday routinised information-seeking activities. Data collection was carried out in a trace ethnographic vein through the automatic retrieval of search queries enacted between November 2021 and October 2022. Through a digital methods approach, the search queries were analysed and visualised according to their prevalence and character string composition. A qualitative, multiple coding approach was moreover used for the identification and interpretation of themes and subthemes. The results show that geographical locations, wind power functions and small wind turbines comprise the most prominent subthemes of the search queries. This is also replicated in the search term frequencies, providing further insights into queries related to wind turbines’ efficiency as well as subthemes of advantages and disadvantages. Moreover, the study shows the tendency to phrase search queries as simple questions for complex matters, with nuances being lost in the pursuit of austere, uncomplicated answers. Altogether, the results contribute to a wider understanding of how environmental information seeking is conducted today.

    Invisible Search and Online Search Engines

    Invisible Search and Online Search Engines considers the use of search engines in contemporary everyday life and the challenges this poses for media and information literacy. Looking for mediated information is mostly done online and arbitrated by the various tools and devices that people carry with them on a daily basis. Because of this, search engines have a significant impact on the structure of our lives, and personal and public memories. Haider and Sundin consider what this means for society, whilst also uniting research on information retrieval with research on how people actually look for and encounter information. Search engines are now one of society’s key infrastructures for knowing and becoming informed. While their use is dispersed across myriads of social practices, where they have acquired close to naturalised positions, they are commercially and technically centralised. Arguing that search, searching, and search engines have become so widely used that we have stopped noticing them, Haider and Sundin consider what it means to be so reliant on this all-encompassing and increasingly invisible information infrastructure. Invisible Search and Online Search Engines is the first book to approach search and search engines from a perspective that combines insights from the technical expertise of information science research with a social science and humanities approach. As such, the book should be essential reading for academics, researchers, and students working on and studying information science, library and information science (LIS), media studies, journalism, digital cultures, and educational sciences.

    Building query-based relevance sets without human intervention

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. Test collections are the standard framework used in the evaluation of an information retrieval system and the comparison between different systems. A text test collection consists of a set of documents, a set of topics, and a set of relevance assessments, which is a list indicating the relevance of each document to each topic. Traditionally, forming the relevance assessments is done manually by human judges. But in large-scale environments, such as the web, examining each retrieved document to determine its relevance is not possible. In the past there have been several studies that aimed to reduce the human effort required in building these assessments, which are referred to as qrels (query-based relevance sets). Some research has also been done to completely automate the process of generating the qrels. In this thesis, we present different methodologies that lead to producing the qrels automatically without any human intervention. A first method is based on keyphrase (KP) extraction from documents presumed relevant; a second method uses machine learning classifiers, namely Naïve Bayes and Support Vector Machines. The experiments were conducted on the TREC-6, TREC-7 and TREC-8 test collections. The use of machine learning classifiers produced qrels resulting in information retrieval system rankings which were better correlated with those produced by TREC human assessments than any of the automatic techniques proposed in the literature. In order to produce a test collection which could discriminate between the best performing systems, an enhancement to the machine learning technique was made that used a small number of real or actual qrels as training sets for the classifiers. These actual relevant documents were selected by Losada et al.’s (2016) pooling technique. This modification led to an improvement in the overall system rankings and enabled discrimination between the best systems with only a little human effort. We also used the bpref-10 and infAP measures for evaluating the systems and comparing the rankings, since they are more robust in incomplete judgment environments. We applied our new techniques to the French and Finnish test collections from CLEF2003 in order to confirm their reproducibility on non-English languages, and we achieved high correlations, as seen for English.
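    The comparison described in this abstract (system rankings under automatic qrels versus human qrels) can be illustrated with a minimal sketch. Several things here are assumptions, not taken from the thesis: the abstract does not name its correlation coefficient, so Kendall's tau is used as a common choice; precision at k stands in for the bpref-10 and infAP measures; and the system names, document ids and both qrels sets are hypothetical toy data.

```python
from itertools import combinations


def precision_at_k(ranked_docs, relevant, k):
    """Fraction of the top-k retrieved documents that the qrels mark relevant."""
    return sum(1 for doc in ranked_docs[:k] if doc in relevant) / k


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} dicts over the same systems.

    Pairs tied in either scoring count as neither concordant nor discordant.
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy data: one topic, three systems, two sets of qrels.
runs = {
    "sysA": ["d1", "d2", "d3", "d9"],
    "sysB": ["d1", "d7", "d2", "d8"],
    "sysC": ["d6", "d7", "d8", "d1"],
}
human_qrels = {"d1", "d2", "d3"}      # assumed manual judgments
automatic_qrels = {"d1", "d2", "d5"}  # assumed classifier output

# Score every system under each set of qrels, then compare the two orderings.
human_scores = {s: precision_at_k(r, human_qrels, 4) for s, r in runs.items()}
auto_scores = {s: precision_at_k(r, automatic_qrels, 4) for s, r in runs.items()}

tau = kendall_tau(human_scores, auto_scores)
print(human_scores, auto_scores, round(tau, 3))
```

    A tau close to 1 means the automatic qrels order the systems almost as the human judgments do. In this toy case the automatic qrels cannot separate sysA from sysB, which lowers the coefficient.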