Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017)
The large scale of scholarly publications poses a challenge for scholars in
information seeking and sensemaking. Bibliometrics, information retrieval (IR),
text mining and NLP techniques could help in these search and look-up
activities, but are not yet widely used. This workshop is intended to stimulate
IR researchers and digital library professionals to elaborate on new approaches
in natural language processing, information retrieval, scientometrics, text
mining and recommendation techniques that can advance the state-of-the-art in
scholarly document understanding, analysis, and retrieval at scale. The BIRNDL
workshop at SIGIR 2017 will incorporate an invited talk, paper sessions and the
third edition of the Computational Linguistics (CL) Scientific Summarization
Shared Task.
Comment: 2 pages, workshop paper accepted at the SIGIR 201
Leveraging full-text article exploration for citation analysis
Scientific articles often include in-text citations quoting from external sources. When the cited source is an article, the citation context can be analyzed by exploring the article's full text. To quickly access the key information, researchers are often interested in identifying the sections of the cited article that are most pertinent to the text surrounding the citation in the citing article. This paper first performs a data-driven analysis of the correlation between the textual content of the sections of the cited article and the text snippet where the citation is placed. The results of the correlation analysis show that the title and abstract of the cited article are likely to include content highly similar to the citing snippet. However, the subsequent sections of the paper often include cited text snippets as well. Hence, there is a need to understand the extent to which exploring the full text of the cited article would help to gain insights into the citing snippet, considering also that full-text access could be restricted. To this end, we then propose a classification approach that automatically predicts whether the cited snippets in the full text of the paper contain a significant amount of new content beyond the abstract and title. The proposed approach could support researchers in leveraging full-text article exploration for citation analysis. The experiments conducted on real scientific articles show promising results: the classifier has a 90% chance of correctly distinguishing between cases that warrant full-text exploration and cases where the title and abstract suffice.
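The section-level comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors use, so the bag-of-words cosine similarity below is an assumption chosen for simplicity. The function names (`tokenize`, `cosine_similarity`, `rank_sections`) and the sample texts are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    freq_a, freq_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    # Only terms present in both texts contribute to the dot product.
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_sections(citing_snippet, cited_sections):
    """Rank the sections of a cited article by similarity to the citing snippet.

    cited_sections maps a section name (e.g. "abstract") to its text.
    Returns (section name, score) pairs, most similar first.
    """
    scores = {
        name: cosine_similarity(citing_snippet, text)
        for name, text in cited_sections.items()
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical data, for illustration only.
snippet = "full text exploration supports citation analysis"
sections = {
    "abstract": "We study citation analysis based on full text exploration.",
    "methods": "A classifier is trained on labelled article pairs.",
}
for name, score in rank_sections(snippet, sections):
    print(f"{name}: {score:.2f}")
```

A ranking like this only shows which sections resemble the citing snippet. Deciding whether the full text adds significant content beyond the title and abstract is the separate classification step that the abstract describes, and it is not reproduced here.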