19 research outputs found
Using Search Engine Technology to Improve Library Catalogs
This chapter outlines how search engine technology can be used in online public access library
catalogs (OPACs) to help improve users' experiences, to identify users' intentions, and to indicate
how it can be applied in the library context, along with how sophisticated ranking criteria can be
applied to the online library catalog. A review of the literature and current OPAC developments
form the basis of recommendations on how to improve OPACs. Findings were that the major
shortcomings of current OPACs are that they are not sufficiently user-centered and that their results
presentations lack sophistication. Further, these shortcomings are not addressed in current 2.0
developments. It is argued that OPAC development should be made search-centered before
additional features are applied. While the recommendations on ranking functionality and the use of
user intentions are only conceptual and not yet applied to a library catalogue, practitioners will find
recommendations for developing better OPACs in this chapter. In short, readers will find a
systematic view on how the search engines' strengths can be applied to improving libraries' online
catalogs.
Cross-Language Information Retrieval und automatische Sacherschließung in Suchmaschinen am Beispiel der Bielefeld Academic Search Engine (BASE) [Cross-language information retrieval and automatic subject indexing in search engines, using the Bielefeld Academic Search Engine (BASE) as an example]
Pieper D. Cross-Language Information Retrieval und automatische Sacherschließung in Suchmaschinen am Beispiel der Bielefeld Academic Search Engine (BASE). In: 97. Deutscher Bibliothekartag. Mannheim; 2008.
Effective tool for exploring web: An Evaluation of Search engines
Evaluating search engines is necessary to check their retrieval performance and to differentiate them from one another. A search engine's ability to retrieve and rank relevant results can be assessed in two ways: human-based methods, in which evaluators manually judge the relevance of the returned results, an approach that is time-consuming and expensive; and automatic methods, which apply techniques such as retrieval measures to assess search engine performance.
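The abstract does not name specific retrieval measures. As an illustration only, the following minimal sketch computes two common ones, precision at k and average precision, from a list of relevance judgments. The function names and the sample judgments are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """Fraction of the top-k results judged relevant.

    If fewer than k results were returned, the missing ranks
    count as non-relevant (the divisor stays k).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevant[:k]) / k


def average_precision(relevant: Sequence[bool]) -> float:
    """Mean of precision@i over the ranks i that hold a relevant result."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical judgments for one query: True means the result at that rank is relevant.
judgments = [True, False, True, True, False]
print(precision_at_k(judgments, 5))
print(average_precision(judgments))
```

Measures like these can be computed automatically once relevance judgments exist, but producing those judgments is itself the manual step the abstract describes as costly.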
What Users See – Structures in Search Engine Results Pages
This paper investigates the composition of search engine results pages. We define what elements the most
popular web search engines use on their results pages (e.g., organic results, advertisements, shortcuts) and to
which degree they are used for popular vs. rare queries. To this end, we send 500 queries of both types to the
major search engines Google, Yahoo, Live.com and Ask. We count how often the different elements are used by
the individual engines. In total, our study is based on 42,758 elements. Findings include that search engines use
quite different approaches to results pages composition and therefore, the user gets to see quite different results
sets depending on the search engine and search query used. Organic results still play the major role in the results
pages, but different shortcuts are of some importance, too. Regarding the frequency of certain hosts within the
results sets, we find that all search engines show Wikipedia results quite often, while other hosts shown depend
on the search engine used. Both Google and Yahoo prefer results from their own offerings (such as YouTube or
Yahoo Answers). Since we used the .com interfaces of the search engines, results may not be valid for other
country-specific interfaces.
Can we use Google Scholar to identify highly-cited documents?
The main objective of this paper is to empirically test whether the identification of highly-cited documents through Google Scholar is feasible and reliable. To this end, we carried out a longitudinal analysis (1950 to 2013), running a generic query (filtered only by year of publication) to minimise the effects of academic search engine optimisation. This gave us a final sample of 64,000 documents (1,000 per year). The strong correlation between a document's citations and its position in the search results (r = -0.67) led us to conclude that Google Scholar is able to identify highly-cited papers effectively.
This, combined with Google Scholar's unique coverage (no restrictions on document type and source), makes the academic search engine an invaluable tool for bibliometric research relating to the identification of the most influential scientific documents. We find evidence, however, that Google Scholar ranks those documents whose language (or geographical web domain) matches the user's interface language higher than could be expected based on citations. Nonetheless, this language effect and other factors related to Google Scholar's operation, i.e. the proper identification of versions and the date of publication, only have an incidental impact. They do not compromise the ability of Google Scholar to identify the highly-cited papers.
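To make the reported correlation concrete, here is a minimal sketch of how a coefficient between result position and citation count could be computed. The abstract does not say which coefficient was used, so Pearson's r is an assumption here. The function name and the data are hypothetical and are not the study's sample.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up example: rank position in the result list vs. citation count.
positions = [1, 2, 3, 4, 5, 6]
citations = [900, 950, 400, 420, 100, 150]
r = pearson_r(positions, citations)
print(round(r, 2))
```

A negative coefficient means that documents nearer the top of the list (smaller position numbers) tend to have more citations, which is the direction of the relationship the abstract reports.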
Using Search Engine Technology to Improve Library Catalogs
This chapter outlines how search engine technology can be used in online
public access library catalogs (OPACs) to help improve users' experiences, to
identify users' intentions, and to indicate how it can be applied in the library
context, along with how sophisticated ranking criteria can be applied to the
online library catalog. A review of the literature and current OPAC
developments form the basis of recommendations on how to improve OPACs.
Findings were that the major shortcomings of current OPACs are that they are
not sufficiently user-centered and that their results presentations lack
sophistication. Further, these shortcomings are not addressed in current 2.0
developments. It is argued that OPAC development should be made search-centered
before additional features are applied. While the recommendations on ranking
functionality and the use of user intentions are only conceptual and not yet
applied to a library catalogue, practitioners will find recommendations for
developing better OPACs in this chapter. In short, readers will find a
systematic view on how the search engines' strengths can be applied to improving
libraries' online catalogs.
Comment: Search engines, online catalogs, ranking, information seeking
behavior, query type
Influence of language and file type on the web visibility of top European universities
Purpose: The purpose of this paper is to detect whether both file type (a set of rich and web files) and language (English, Spanish, German, French and Italian) influence the web visibility of European universities.
Design/methodology/approach: A webometrics analysis of the top 200 European universities (as ranked in the Ranking Web of World Universities) was carried out by a manual query for each official URL identified by using the Google search engine (April 2012). A correlation analysis between visibility and file format page count is offered according to language. Finally, a prediction of visibility is shown by using the SMOreg function.
Findings: The results indicate that Spanish and English are the languages that correlate most highly with web visibility. This correlation becomes greater, though still moderate, when considering only PDF files.
Research limitations/implications: The results are limited due to the low correlation between overall page count and visibility. The lack of an accurate search engine that would assist in link counting procedures makes this process difficult.
Originality/value: The observed increase in correlation, although moderate, when analysing PDF files (in English and Spanish) is considered meaningful. This may indirectly confirm that specific file formats and languages generate different web visibility behaviour on European university web sites.
Orduña Malea, E.; Ortega, JL.; Aguillo, IF. (2014). Influence of language and file type on the web visibility of top European universities. Aslib Journal of Information Management. 66(1):96-116. doi:10.1108/AJIM-02-2013-0018