Citation Analysis: A Comparison of Google Scholar, Scopus, and Web of Science
When faculty members are evaluated, they are judged in part by the impact and quality of their scholarly publications. While all academic institutions look to publication counts and venues as well as the subjective opinions of peers, many hiring, tenure, and promotion committees also rely on citation analysis to obtain a more objective assessment of an author's work. Consequently, faculty members try to identify as many citations to their published works as possible to provide a comprehensive assessment of their publication impact on the scholarly and professional communities. The Institute for Scientific Information's (ISI) citation databases, which are widely used as a starting point if not the only source for locating citations, have several limitations that may leave gaps in the coverage of citations to an author's work. This paper presents a case study comparing citations found in Scopus and Google Scholar with those found in Web of Science (the portal used to search the three ISI citation databases) for items published by two full-time Library and Information Science faculty members. In addition, the paper presents a brief overview of a prototype system called CiteSearch, which analyzes combined data from multiple citation databases to produce citation-based quality evaluation measures.
Quantifying Quality: Research Performance Evaluation in Korean Universities
Research performance evaluation in Korean universities follows strict guidelines that specify scoring systems for publication venue categories and formulas for co-authorship credit allocation. To find out how the standards differ across universities and how they differ from bibliometric research evaluation measures, this study analyzed 25 standards from major Korean universities, along with rankings produced by applying those standards and bibliometric measures such as publication and citation counts, normalized impact score, and h-index to the publication data of 195 tenure-track professors of library and information science departments in 35 Korean universities. The study also introduced a novel impact score normalization method to refine the methodology of prior studies. The results showed the university standards to be mostly similar to one another but quite different from citation-driven measures, which suggests the standards are not quite successful in quantifying the quality of research as originally intended.
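To make one of the bibliometric measures mentioned above concrete, the h-index can be computed directly from a list of per-publication citation counts. The sketch below is a generic, minimal implementation for illustration only; it is not the scoring code used in the study.

```python
def h_index(citation_counts):
    """h-index: the largest h such that the author has h papers
    with at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # at least `rank` papers have >= `rank` citations
        else:
            break
    return h

# Example: five papers with these citation counts yield an h-index of 3
# (three papers have at least three citations each).
print(h_index([10, 8, 5, 2, 1]))  # -> 3
```

The same counts feed the other measures in the study: total citation count is their sum, and a normalized impact score divides each paper's count by a field- or venue-level baseline before aggregating.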
WIDIT in TREC-2007 Blog Track: Combining Lexicon-based Methods to Detect Opinionated Blogs
In TREC-2007, Indiana University's WIDIT Lab participated in the Blog track's opinion task and the polarity subtask. For the opinion task, whose goal is to "uncover the public sentiment towards a given entity/target", we focused on combining multiple sources of evidence to detect opinionated blog postings. Since detecting opinionated blogs on a given topic (i.e., entity/target) involves not only retrieving topically relevant blogs but also identifying those that contain opinions about the target, our approach to the opinion finding task consisted of first applying traditional IR methods to retrieve on-topic blogs and then boosting the ranks of opinionated blogs based on combined opinion scores generated by multiple opinion detection methods. The key idea underlying our opinion detection method is to rely on a variety of complementary sources of evidence rather than trying to optimize a single approach. This fusion approach to opinionated blog detection is motivated by our past experience, which suggested that no single approach, whether lexicon-based or classifier-driven, is well suited for the blog opinion retrieval task. To accomplish the polarity subtask, which requires classification of the retrieved blogs into positive or negative orientation, our opinion detection module was extended to generate polarity scores to be used for polarity determination.
WIDIT in TREC-2006 Blog track
The Web Information Discovery Integrated Tool (WIDIT) Laboratory at the Indiana University School of Library and Information Science participated in the Blog track's opinion task in TREC-2006. The goal of the opinion task is to "uncover the public sentiment towards a given entity/target", which involves not only retrieving topically relevant blogs but also identifying those that contain opinions about the target. To further complicate the matter, the blog test collection contains a considerable amount of noise, such as blogs with non-English content and non-blog content (e.g., advertisements, navigational text), which may misdirect retrieval systems.
Based on our hypothesis that noise reduction (e.g., exclusion of non-English blogs and navigational text) will improve both on-topic and opinion retrieval performance, we explored various noise reduction approaches that can effectively eliminate the noise in blog data without inadvertently excluding valid content. After creating two separate indexes (with and without noise) to assess the noise reduction effect, we tackled the opinion blog retrieval task by breaking it down into two sequential subtasks: on-topic retrieval followed by opinion classification. Our opinion retrieval approach was to first apply traditional IR methods to retrieve on-topic blogs, and then boost the ranks of opinionated blogs based on opinion scores generated by opinion assessment methods. Our opinion module consists of the Opinion Term Module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently only in opinion blogs); the Rare Term Module, which uses uncommon/rare terms (e.g., "sooo good") for opinion classification; the IU Module, which uses IU (I and you) collocations; and the Adjective-Verb Module, which uses computational linguistics' distributional similarity approach to learn subjective language from training data.
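The retrieve-then-boost strategy described in these abstracts can be sketched generically: retrieve documents by topical relevance, then re-rank by blending the retrieval score with a weighted combination of scores from several opinion-assessment modules. The function signature, weights, and linear interpolation formula below are illustrative assumptions, not the actual WIDIT implementation.

```python
def boost_by_opinion(ranked_docs, opinion_scorers, weights, alpha=0.5):
    """Re-rank topically retrieved documents by combining the original
    retrieval score with a weighted average of opinion-module scores.

    ranked_docs:     list of (doc_id, retrieval_score) pairs
    opinion_scorers: functions doc_id -> opinion score in [0, 1],
                     standing in for modules like the Opinion Term,
                     Rare Term, IU, and Adjective-Verb modules
    weights:         one weight per scorer
    alpha:           interpolation weight given to the retrieval score
    """
    boosted = []
    for doc_id, ret_score in ranked_docs:
        opinion = sum(w * f(doc_id) for f, w in zip(opinion_scorers, weights))
        opinion /= sum(weights)  # normalize to a weighted average
        boosted.append((doc_id, alpha * ret_score + (1 - alpha) * opinion))
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)

# Toy example: doc "a" ranks below "b" on topic alone, but its high
# opinion score boosts it to the top after fusion.
docs = [("b", 0.9), ("a", 0.8)]
scorer = lambda d: {"a": 1.0, "b": 0.0}[d]
print(boost_by_opinion(docs, [scorer], [1.0]))
```

Tuning `alpha` and the per-module weights on training data corresponds to the "dynamic tuning" of fusion parameters that these papers emphasize over optimizing any single module.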
Fusion Approach to Finding Opinions in Blogosphere
In this paper, we describe a fusion approach to finding opinions about a given target in blog postings. We tackled the opinion blog retrieval task by breaking it down into two sequential subtasks: on-topic retrieval followed by opinion classification. Our opinion retrieval approach was to first apply traditional IR methods to retrieve on-topic blogs, and then boost the ranks of opinionated blogs using combined opinion scores generated by four opinion assessment methods. Our opinion module consists of the Opinion Term Module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently only in opinion blogs); the Rare Term Module, which uses uncommon/rare terms (e.g., "sooo good") for opinion classification; the IU Module, which uses IU (I and you) collocations; and the Adjective-Verb Module, which uses computational linguistics' distributional similarity approach to learn subjective language from training data.
This paper was presented by the author(s) at the International Conference on Weblogs and Social Media on March 27, 2007, in Boulder, Colorado, U.S.A. It has also been published as: Yang, K., Yu, N., Valerio, A., Zhang, H., & Ke, W. (2007). Fusion approach to finding opinionated blogs. Proceedings of the American Society for Information Science and Technology, 44(1), 1–14. doi: 10.1002/meet.1450440254
Keywords: Opinion Identification, Method Fusion, Rank-boosting, Dynamic Tuning
Fusion Approach to Finding Opinionated Blogs
In this paper, we describe a fusion approach to finding opinionated blog postings. Our approach to opinion blog retrieval consisted of first applying traditional IR methods to retrieve on-topic blogs and then boosting the ranks of opinionated blogs based on combined opinion scores generated by multiple assessment methods. Our opinion module is composed of the Opinion Term Module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently in opinion blogs); the Rare Term Module, which uses uncommon/rare terms (e.g., "sooo good") for opinion classification; the IU Module, which uses IU (I and you) collocations; and the Adjective-Verb Module, which uses computational linguistics' distributional similarity approach to learn subjective language from training data.
WIDIT in TREC-2005 HARD, Robust, and SPAM tracks
The Web Information Discovery Tool (WIDIT) Laboratory at the Indiana University School of Library and Information Science participated in the HARD, Robust, and SPAM tracks in TREC-2005. The basic approach of WIDIT is to combine multiple methods as well as to leverage multiple sources of evidence. Our main strategies for the tracks were query expansion and fusion optimization for the HARD and Robust tracks, and a combination of probabilistic, rule-based, pattern-based, and blacklist email filters for the SPAM track.
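The combination of heterogeneous spam filters can be illustrated with a simple majority vote over independent filter decisions. The toy filters and the voting rule below are a hypothetical sketch of the general idea, not WIDIT's actual fusion method.

```python
def combine_filters(message, filters):
    """Flag a message as spam if a majority of independent filters
    (each returning True for spam) agree."""
    votes = sum(1 for f in filters if f(message))
    return votes > len(filters) / 2

# Toy filters standing in for pattern-based, blacklist, and
# rule-based components; a probabilistic classifier could be
# added as another function with the same interface.
blacklist = {"spammer@example.com"}
filters = [
    lambda m: "free money" in m["body"].lower(),  # pattern-based
    lambda m: m["sender"] in blacklist,           # blacklist
    lambda m: m["body"].isupper(),                # rule-based heuristic
]
msg = {"sender": "spammer@example.com", "body": "FREE MONEY NOW"}
print(combine_filters(msg, filters))  # -> True
```

Voting is only the simplest fusion rule; weighted score combination, as in the opinion-retrieval work above, is a natural refinement.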
Impact of Data Sources on Citation Counts and Rankings of LIS Faculty: Web of Science vs. Scopus and Google Scholar
The Institute for Scientific Information's (ISI) citation databases have been used for decades as a starting point, and often as the only tools, for locating citations and/or conducting citation analyses. ISI databases (or Web of Science [WoS]), however, may no longer be sufficient because new databases and tools that allow citation searching are now available. Using citations to the work of 25 library and information science faculty members as a case study, this paper examines the effects of using Scopus and Google Scholar (GS) on the citation counts and rankings of scholars as measured by WoS. Overall, more than 10,000 citing and purportedly citing documents were examined. Results show that Scopus significantly alters the relative ranking of those scholars that appear in the middle of the rankings and that GS stands out in its coverage of conference proceedings as well as international, non-English-language journals. The use of Scopus and GS, in addition to WoS, helps reveal a more accurate and comprehensive picture of the scholarly impact of authors. WoS data took about 100 hours of collecting and processing time, Scopus consumed 200 hours, and GS a grueling 3,000 hours.
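The gain from supplementing WoS with Scopus and GS comes down to de-duplicating the union of citing documents found in each database. The sketch below illustrates that merge step with hypothetical document identifiers; real citation records would first need matching on metadata (title, year, authors) rather than a shared ID.

```python
def union_citations(per_database_citations):
    """Merge citing-document sets from several databases,
    de-duplicating by a shared document identifier."""
    merged = set()
    for citing_ids in per_database_citations.values():
        merged |= set(citing_ids)
    return merged

# Hypothetical citing-document IDs for one author's work.
sources = {
    "WoS":    ["d1", "d2", "d3"],
    "Scopus": ["d2", "d3", "d4"],  # d4 missed by WoS
    "GS":     ["d3", "d5"],        # d5 missed by both
}
unique = union_citations(sources)
print(len(unique))  # -> 5 unique citing documents vs. 3 from WoS alone
```

The matching step, not the union itself, is where the hours reported above go: GS records in particular lack clean identifiers, which is consistent with its 3,000-hour processing cost.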