6,821 research outputs found

    Web log file analysis: backlinks and queries

    As has been described elsewhere, web log files are a useful source of information about visitor site use, navigation behaviour, and, to some extent, demographics. But log files can also reveal the existence of both web pages and search engine queries that are sources of new visitors. This study extracts such information from a single web log file and uses it to illustrate its value, not only to the site owner but also to those interested in investigating the online behaviour of web users.

    Structured Metadata for Direct Resource Location: A Case Study

    This paper proposes that for scientific and technical information resources, a well-structured and high-quality metadata record contains enough information to find that resource on the Internet, and as a consequence, no additional human labour is needed to create or maintain any links. Research was performed by creating a control group of records from the Online Catalogue of the Food and Agriculture Organization of the United Nations and searching them in various ways in Google and Metacrawler. Based on the results, this method was revised and used on the larger AGRIS database. Results showed that the method is not only successful but also highly useful for searching citations. A user interface is suggested, and changes to current cataloguing rules are discussed.

    Exploring the academic invisible web

    Purpose: To provide a critical review of Bergman's 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far. Design/methodology/approach: Discussion of measures and calculations, estimation based on informetric laws. Literature review on approaches for uncovering information from the Invisible Web. Findings: Bergman's size estimate of the Invisible Web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimate is given. Research limitations/implications: The precision of our estimate is limited due to a small sample size and lack of reliable data. Practical implications: We can show that no single library alone will be able to index the Academic Invisible Web. We suggest collaboration to accomplish this task. Originality/value: Provides library managers and those interested in developing academic search engines with data on the size and attributes of the Academic Invisible Web. Comment: 13 pages, 3 figures.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    Optimising metadata to make high-value content more accessible to Google users

    Purpose: This paper shows how information in digital collections that have been catalogued using high-quality metadata can be retrieved more easily by users of search engines such as Google. Methodology/approach: The research and proposals described arose from an investigation into the observed phenomenon that pages from the Glasgow Digital Library (gdl.cdlr.strath.ac.uk) were regularly appearing near the top of Google search results shortly after publication, without any deliberate effort to achieve this. The reasons for this phenomenon are now well understood and are described in the second part of the paper. The first part provides context with a review of the impact of Google and a summary of recent initiatives by commercial publishers to make their content more visible to search engines. Findings/practical implications: The literature research provides firm evidence of a trend amongst publishers to ensure that their online content is indexed by Google, in recognition of its popularity with Internet users. The practical research demonstrates how search engine accessibility can be compatible with the use of established collection management principles and high-quality metadata. Originality/value: The concept of data shoogling is introduced, involving some simple techniques for metadata optimisation. Details of its practical application are given, to illustrate how those working in academic, cultural and public-sector organisations could make their digital collections more easily accessible via search engines, without compromising any existing standards and practices.

    The Online Computer Library Center's Open WorldCat Program

    This article describes the Online Computer Library Center's (OCLC) Open WorldCat program. WorldCat is a worldwide union catalog created and maintained collectively by more than 9,000 member institutions. Open WorldCat seeks to make library collections and services visible and available through popular search engines such as Yahoo! and Google and other heavily used sites on the open Web. In this capacity, Open WorldCat provides an important central connection between the shared information of the library network and the Web. The article describes the history and rationale of the project; explains how Open WorldCat works for information seekers, participating libraries, and partners; and reports on what OCLC has learned from the program to date.