19 research outputs found

    Using Search Engine Technology to Improve Library Catalogs

    This chapter outlines how search engine technology can be used in online public access library catalogs (OPACs) to improve users’ experiences and to identify users’ intentions, and how sophisticated ranking criteria can be applied to the online library catalog. A review of the literature and of current OPAC developments forms the basis of recommendations on how to improve OPACs. The findings are that the major shortcomings of current OPACs are that they are not sufficiently user-centered and that their results presentations lack sophistication. Further, these shortcomings are not addressed in current 2.0 developments. It is argued that OPAC development should be made search-centered before additional features are applied. While the recommendations on ranking functionality and the use of user intentions are only conceptual and have not yet been applied to a library catalog, practitioners will find recommendations for developing better OPACs in this chapter. In short, readers will find a systematic view of how search engines’ strengths can be applied to improving libraries’ online catalogs.

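    Cross-Language Information Retrieval und automatische Sacherschließung in Suchmaschinen am Beispiel der Bielefeld Academic Search Engine (BASE) [Cross-language information retrieval and automatic subject indexing in search engines, using the Bielefeld Academic Search Engine (BASE) as an example]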

    Pieper D. Cross-Language Information Retrieval und automatische Sacherschließung in Suchmaschinen am Beispiel der Bielefeld Academic Search Engine (BASE). In: 97. Deutscher Bibliothekartag. Mannheim; 2008

    Effective tool for exploring web: An Evaluation of Search engines

    Search engines need to be evaluated in order to check their retrieval performance and to differentiate them from one another. Evaluation assesses an engine’s ability to retrieve relevant results and to rank them, and it can be carried out in two ways. In human-based methods, assessors judge the relevance of the returned results manually, which is time-consuming and expensive. In automatic methods, techniques such as retrieval measures are used to assess the performance of search engines.
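    The abstract above does not name the retrieval measures it has in mind. As a minimal sketch, assuming precision at k and average precision as two commonly used measures, the following Python computes both from a list of relevance judgements. The function names and the sample judgements are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked results that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Positions beyond the end of the list count as non-relevant.
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[bool]) -> float:
    """Mean of precision values taken at each relevant result.

    Assumption: normalised by the number of relevant results that were
    actually retrieved, not by all relevant documents in the collection.
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical judgements for one query, in ranked order (True = relevant).
judgements = [True, False, True, True, False]

p_at_3 = precision_at_k(judgements, 3)
avg_prec = average_precision(judgements)
```

    The judgements themselves still have to come from somewhere: in a human-based evaluation they are assigned by assessors, whereas an automatic method would derive them without manual assessment.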

    What Users See – Structures in Search Engine Results Pages

    This paper investigates the composition of search engine results pages. We define what elements the most popular web search engines use on their results pages (e.g., organic results, advertisements, shortcuts) and to what degree they are used for popular vs. rare queries. To this end, we sent 500 queries of both types to the major search engines Google, Yahoo, Live.com and Ask. We counted how often the different elements are used by the individual engines. In total, our study is based on 42,758 elements. Findings include that search engines use quite different approaches to results page composition and that, therefore, the user gets to see quite different result sets depending on the search engine and search query used. Organic results still play the major role in the results pages, but different shortcuts are of some importance, too. Regarding the frequency of certain hosts within the result sets, we find that all search engines show Wikipedia results quite often, while other hosts shown depend on the search engine used. Both Google and Yahoo prefer results from their own offerings (such as YouTube or Yahoo Answers). Since we used the .com interfaces of the search engines, results may not be valid for other country-specific interfaces.

    Can we use Google Scholar to identify highly-cited documents?

    The main objective of this paper is to empirically test whether the identification of highly-cited documents through Google Scholar is feasible and reliable. To this end, we carried out a longitudinal analysis (1950 to 2013), running a generic query (filtered only by year of publication) to minimise the effects of academic search engine optimisation. This gave us a final sample of 64,000 documents (1,000 per year). The strong correlation between a document’s citations and its position in the search results (r = -0.67) led us to conclude that Google Scholar is able to identify highly-cited papers effectively. This, combined with Google Scholar’s unique coverage (no restrictions on document type and source), makes the academic search engine an invaluable tool for bibliometric research relating to the identification of the most influential scientific documents. We find evidence, however, that Google Scholar ranks those documents whose language (or geographical web domain) matches the user’s interface language higher than could be expected based on citations. Nonetheless, this language effect and other factors related to Google Scholar’s operation, i.e. the proper identification of versions and the date of publication, only have an incidental impact. They do not compromise the ability of Google Scholar to identify the highly-cited papers.
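    The reported coefficient is negative because position 1 is the top of the result list, so highly cited documents tend to have small position numbers. The abstract does not say which correlation coefficient was used. As a minimal sketch, assuming a plain Pearson correlation, the following Python shows the calculation on invented data. The function name `pearson_r` and the sample figures are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Invented example: position in the result list and citation count.
positions = [1, 2, 3, 4, 5, 6]
citations = [950, 700, 720, 400, 410, 150]

r = pearson_r(positions, citations)  # negative: later positions, fewer citations
```

    A rank-based coefficient such as Spearman's would be a reasonable alternative here, since search positions are ordinal.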

    Using Search Engine Technology to Improve Library Catalogs

    This chapter outlines how search engine technology can be used in online public access library catalogs (OPACs) to improve users’ experiences and to identify users’ intentions, and how sophisticated ranking criteria can be applied to the online library catalog. A review of the literature and of current OPAC developments forms the basis of recommendations on how to improve OPACs. The findings are that the major shortcomings of current OPACs are that they are not sufficiently user-centered and that their results presentations lack sophistication. Further, these shortcomings are not addressed in current 2.0 developments. It is argued that OPAC development should be made search-centered before additional features are applied. While the recommendations on ranking functionality and the use of user intentions are only conceptual and have not yet been applied to a library catalog, practitioners will find recommendations for developing better OPACs in this chapter. In short, readers will find a systematic view of how search engines’ strengths can be applied to improving libraries’ online catalogs.

    Comment: Search engines, online catalogs, ranking, information seeking behavior, query type

    Influence of language and file type on the web visibility of top European universities

    Purpose: The purpose of this paper is to detect whether both file type (a set of rich and web files) and language (English, Spanish, German, French and Italian) influence the web visibility of European universities. Design/methodology/approach: A webometrics analysis of the top 200 European universities (as ranked in the Ranking Web of World Universities) was carried out by a manual query for each official URL identified by using the Google search engine (April 2012). A correlation analysis between visibility and file format page count is offered according to language. Finally, a prediction of visibility is shown by using the SMOreg function. Findings: The results indicate that Spanish and English are the languages that correlate most highly with web visibility. This correlation becomes greater, though still moderate, when only PDF files are considered. Research limitations/implications: The results are limited due to the low correlation between overall page count and visibility. The lack of an accurate search engine that would assist in link counting procedures makes this process difficult. Originality/value: The observed increase in correlation, although moderate, when analysing PDF files (in English and Spanish) is considered to be meaningful. This may indirectly confirm that specific file formats and languages generate different web visibility behaviour on European university web sites.

    Orduña Malea, E.; Ortega, JL.; Aguillo, IF. (2014). Influence of language and file type on the web visibility of top European universities. Aslib Journal of Information Management. 66(1):96-116. doi:10.1108/AJIM-02-2013-0018