Identifying Citation Sentiment and its Influence while Indexing Scientific Papers
Sentiment analysis has proven to be a popular research area for analyzing social media texts, newspaper articles, and product reviews. However, sentiment analysis of citation instances is a relatively unexplored area of research. For scientific papers, it is often assumed that the sentiment associated with citation instances is inherently positive. This assumption is due to the hedged nature of sentiment in citations, which is difficult to identify and classify. As a result, most of the existing indexes focus only on the frequency of citation. In this paper, we highlight the importance of considering the sentiment of citation while preparing ranking indexes for scientific literature. We perform automatic sentiment classification of citation instances on the ACL Anthology collection of papers. Next, we use the sentiment score in addition to the frequency of citation to build a ranking index for this collection of scientific papers. By using various baselines, we highlight the impact of our index on the ACL Anthology collection of papers. Our research contributes toward building a more sentiment-sensitive ranking index that better underlines the influence and usefulness of research papers.
Measuring academic influence: Not all citations are equal
The importance of a research article is routinely measured by counting how
many times it has been cited. However, treating all citations with equal weight
ignores the wide variety of functions that citations perform. We want to
automatically identify the subset of references in a bibliography that have a
central academic influence on the citing paper. For this purpose, we examine
the effectiveness of a variety of features for determining the academic
influence of a citation. By asking authors to identify the key references in
their own work, we created a data set in which citations were labeled according
to their academic influence. Using automatic feature selection with supervised
machine learning, we found a model for predicting academic influence that
achieves good performance on this data set using only four features. The best
features, among those we evaluated, were those based on the number of times a
reference is mentioned in the body of a citing paper. The performance of these
features inspired us to design an influence-primed h-index (the hip-index).
Unlike the conventional h-index, it weights citations by how many times a
reference is mentioned. According to our experiments, the hip-index is a better
indicator of researcher performance than the conventional h-index.
Incidental or influential? - A decade of using text-mining for citation function classification.
This work looks in depth at several studies that have attempted to automate the process of citation importance classification based on the publications' full text. We offer a comparison of their individual similarities, strengths and weaknesses. We analyse a range of features that have been previously used in this task. Our experimental results confirm that the number of in-text references is highly predictive of influence. Contrary to the work of Valenzuela et al. (2015), we find abstract similarity to be one of the most predictive features. Overall, we show that many of the features previously described in the literature have either been reported as not particularly predictive, cannot be reproduced based on their existing descriptions, or should not be used due to their reliance on external changing evidence. Additionally, we find significant variance in the results provided by the PDF extraction tools used in the pre-processing stages of citation extraction. This has a direct and significant impact on the classification features that rely on this extraction process. Consequently, we discuss challenges and potential improvements in the classification pipeline, provide a critical review of the performance of individual features and address the importance of constructing a large-scale gold-standard reference dataset.
Characterizing Interdisciplinarity of Researchers and Research Topics Using Web Search Engines
Researchers' networks have been subject to active modeling and analysis.
Earlier literature mostly focused on citation or co-authorship networks
reconstructed from annotated scientific publication databases, which have
several limitations. Recently, general-purpose web search engines have also
been utilized to collect information about social networks. Here we
reconstructed, using web search engines, a network representing the relatedness
of researchers to their peers as well as to various research topics.
Relatedness between researchers and research topics was characterized by
visibility boost, that is, the increase in a researcher's visibility gained by focusing on a
particular topic. It was observed that researchers who had high visibility
boosts by the same research topic tended to be close to each other in their
network. We calculated correlations between visibility boosts by research
topics and researchers' interdisciplinarity at individual level (diversity of
topics related to the researcher) and at social level (his/her centrality in
the researchers' network). We found that visibility boosts by certain research
topics were positively correlated with researchers' individual-level
interdisciplinarity despite their negative correlations with the general
popularity of researchers. It was also found that visibility boosts by
network-related topics had positive correlations with researchers' social-level
interdisciplinarity. Research topics' correlations with researchers'
individual- and social-level interdisciplinarities were found to be nearly
independent from each other. These findings suggest that the notion of
"interdisciplinarity" of a researcher should be understood as a
multi-dimensional concept that should be evaluated using multiple assessment
means.
Comment: 20 pages, 7 figures. Accepted for publication in PLoS ONE
The linguistic patterns and rhetorical structure of citation context : an approach using n-grams
Using the full-text corpus of more than 75,000 research articles published by seven PLOS journals, this paper
proposes a natural language processing approach for identifying the function of citations. Citation contexts are
assigned based on the frequency of n-gram co-occurrences located near the citations. Results show that the most
frequent linguistic patterns found in the citation contexts of papers vary according to their location in the IMRaD
structure of scientific articles. The presence of negative citations is also dependent on this structure. This
methodology offers new perspectives to locate these discursive forms according to the rhetorical structure of
scientific articles, and will lead to a better understanding of the use of citations in scientific articles.
Topic Modelling Methodology: Its Use in Information Systems and Other Managerial Disciplines
Over the last decade, quantitative text mining approaches to content analysis have gained increasing traction within information systems research, and related fields, such as business administration. Recently, topic models, which are supposed to provide their user with an overview of themes being discussed in documents, have gained popularity. However, while convenient tools for the creation of this model class exist, the evaluation of topic models poses significant challenges to their users. In this research, we investigate how questions of model validity and trustworthiness of presented analyses are addressed across disciplines. We accomplish this by providing a structured review of methodological approaches across the Financial Times 50 journal ranking. We identify 59 methodological research papers, 24 implementations of topic models, as well as 33 research papers using topic models in Information Systems (IS) research, and 29 papers using such models in other managerial disciplines. Results indicate a need for model implementations usable by a wider audience, as well as the need for more implementations of model validation techniques, and the need for a discussion about the theoretical foundations of topic-modelling-based research.