3,424 research outputs found

    Search engine coverage bias: Evidence and possible causes

    Commercial search engines are now playing an increasingly important role in Web information dissemination and access. Of particular interest to business and national governments is whether the big engines have coverage biased towards the US or other countries. In our study we tested for national biases in three major search engines and found significant differences in their coverage of commercial Web sites. The US sites were much better covered than the others in the study: sites from China, Taiwan and Singapore. We then examined the possible technical causes of the differences and found that the language of a site does not affect its coverage by search engines. However, the visibility of a site, measured by the number of links to it, affects its chance to be covered by search engines. We conclude that the coverage bias does exist but this is due not to deliberate choices of the search engines but occurs as a natural result of cumulative advantage effects of US sites on the Web. Nevertheless, the bias remains a cause for international concern. © 2003 Elsevier Ltd. All rights reserved

    Peer to Peer Information Retrieval: An Overview

    Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom

    Academic Libraries in Transition: Current Trends, Future Prospects

    Academic libraries are in transition because of changes in the context of higher education. Changes in the world of information are even more radical: the displacement of paper, the primacy of the search engine, the emergence of the digital lifestyle, and innovative patterns of scholarly communication. Decreasing reliance on local collections is transforming the library as a physical destination. Traditional measures of library success have begun to be replaced. Given the superiority of other information professionals’ data management skills, the role of academic librarians will shift toward the enablement of learning. This environment of upheaval will pose both opportunities and challenges for academic librarians

    Information Retrieval and Query Ranking of Unstructured Data in Dataspace using Vector Space Model

    A vast amount of data is available on the web in the form of web pages, in the cloud, or in the repositories of organizations. Companies, enterprises and other organizations store all of their data digitally; these data may be text, streamed data, images, Facebook data, Twitter data, videos and other documents available digitally on the Internet, relating to areas such as manufacturing, engineering and medicine, and are collectively called a Dataspace. The data available over the Internet may be structured, unstructured or without any format. The storage mechanism differs for each organization, but searching and retrieval should be easy from the user's point of view: users should be able to find relevant and accurate information efficiently, so a proper model, search engine or interface is needed for finding the information. Retrieving information from the Internet and from large databases is difficult and time-consuming, especially if the information is unstructured. Several algorithms and techniques have been developed in the areas of data mining and information retrieval, yet retrieving data from large databases continues to be problematic. In this paper, the Vector Space Model (VSM) technique of information retrieval is used. In the VSM, documents and queries are represented as vectors whose dimensions are the terms used to build the index that represents the unstructured data. VSM is widely used for retrieving documents and data because of its simplicity and because it works efficiently on a large number of datasets.
VSM is based on term weighting of document vectors in three steps: (1) the documents are indexed so that relevant data can be retrieved; (2) the indexed terms are weighted so that the appropriate documents can be retrieved for the end user; and (3) the similarity between the documents and the end user's query is measured in order to rank the documents by relevance. The cosine measure is often used. We found that it is easier to retrieve data or information based on similarity measures, which yields a better and more efficient technique or model for information retrieval
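
The three steps named in this abstract (indexing, term weighting, cosine ranking) can be illustrated with a minimal sketch. The abstract does not say which tokenizer or weighting scheme the paper uses, so this example assumes whitespace tokenization and a common TF-IDF variant; all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper may differ.
    return text.lower().split()


def build_index(docs):
    # Step 1: index each document as a bag of term frequencies.
    return [Counter(tokenize(doc)) for doc in docs]


def idf_weights(index):
    # Step 2a: inverse document frequency, log(N / df) + 1 (one common variant).
    n_docs = len(index)
    doc_freq = Counter()
    for term_freqs in index:
        doc_freq.update(term_freqs.keys())
    return {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}


def weight(term_freqs, idf):
    # Step 2b: TF-IDF weight per term; terms missing from the index are dropped.
    return {t: tf * idf[t] for t, tf in term_freqs.items() if t in idf}


def cosine(u, v):
    # Step 3: cosine of the angle between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    # Returns (score, document position) pairs, best match first.
    index = build_index(docs)
    idf = idf_weights(index)
    doc_vectors = [weight(tf, idf) for tf in index]
    query_vector = weight(Counter(tokenize(query)), idf)
    scores = [(cosine(query_vector, d), i) for i, d in enumerate(doc_vectors)]
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    sample_docs = [
        "vector space model for document retrieval",
        "peer to peer file sharing networks",
        "academic libraries in transition",
    ]
    for score, position in rank("file sharing", sample_docs):
        print(f"{score:.3f}  {sample_docs[position]}")
```

Keeping the vectors as sparse dictionaries avoids allocating one dimension per vocabulary term, which is the usual choice when most documents contain only a small fraction of the indexed terms.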

    Legal research in a changing information environment

    Since the advent of the latest constitutional dispensation in South Africa, legal researchers have been presented with new opportunities for research into constitutional issues, development and the relationship between constitutional law and other fields. This article investigates how information technology applications can support the legal research process and what the benefits of technology are likely to be to legal research. Furthermore, it investigates the changes and the impact that electronic resources and the digital information environment might have on legal research. This entails a study of the unique characteristics of digital legal research and of the challenges that legal researchers face in a changing information environment

    Scaling up search engine audits: Practical insights for algorithm auditing

    Algorithm audits have increased in recent years due to a growing need to independently assess the performance of automatically curated services that process, filter and rank the large and dynamic amount of information available on the Internet. Among several methodologies to perform such audits, virtual agents stand out because they offer the ability to perform systematic experiments, simulating human behaviour without the associated costs of recruiting participants. Motivated by the importance of research transparency and replicability of results, this article focuses on the challenges of such an approach. It provides methodological details, recommendations, lessons learned and limitations based on our experience of setting up experiments for eight search engines (including main, news, image and video sections) with hundreds of virtual agents placed in different regions. We demonstrate the successful performance of our research infrastructure across multiple data collections, with diverse experimental designs, and point to different changes and strategies that improve the quality of the method. We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time, and we hope that this article can serve as a basis for further research in this area

    Economic consumption model revisited: infaq model based on al-Shaybani's levels of al-Kasb

    This study attempts to investigate the economic ideas of al-Imām Muḥammad Ibn al-Ḥasan al-Shaybānī (1986), focusing on his levels of al-Kasb. The study uses al-Shaybānī’s levels of al-Kasb to develop a theoretical Infāq model that integrates the material, spiritual, moral, social and legal dimensions. Thus the Infāq model is broader than the concept of consumption in modern economics. It also has some advantages over the Islamic consumption models developed by contemporary Muslim economists. The model identifies some major implications in terms of basic needs fulfillment, social Infāq and distributive justice. The primary features of this model are its simplicity and comprehensiveness. It is easy to understand, yet it incorporates the individual, social, material, spiritual, moral and legal dimensions into the individual’s spending decisions and behavior. The model is more realistic in understanding human behavior. It is growth friendly and instills the spirit of cooperation and social responsibility at the individual and social levels. It is suggested that future research further fine-tune the model with rigorous analysis. JEL Classification: A13, A31, B11, D01, D10. Key words: Al-Kasb, Consumption, Infāq, Model buildin

    Overview of the CLEF 2018 Consumer Health Search Task

    This paper details the collection, systems and evaluation methods used in the CLEF 2018 eHealth Evaluation Lab, Consumer Health Search (CHS) task (Task 3). This task investigates the effectiveness of search engines in providing access to medical information present on the Web for people who have little or no medical knowledge. The task aims to foster advances in the development of search technologies for Consumer Health Search by providing resources and evaluation methods to test and validate search systems. Built upon the 2013-17 series of CLEF eHealth Information Retrieval tasks, the 2018 task considers both mono- and multilingual retrieval, embracing the Text REtrieval Conference (TREC)-style evaluation process with a shared collection of documents and queries, the contribution of runs from participants and the subsequent formation of relevance assessments and evaluation of the participants' submissions. For this year, the CHS task uses a new Web corpus and a new set of queries compared to the previous years. The new corpus consists of Web pages acquired from the CommonCrawl and the new set of queries consists of 50 queries issued by the general public to the Health on the Net (HON) search services. We then manually translated the 50 queries to French, German, and Czech; and obtained English query variations of the 50 original queries. A total of 7 teams from 7 different countries participated in the 2018 CHS task: CUNI (Czech Republic), IMS Unipd (Italy), MIRACL (Tunisia), QUT (Australia), SINAI (Spain), UB-Botswana (Botswana), and UEvora (Portugal)