Editorial for the First Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics
The workshop "Mining Scientific Papers: Computational Linguistics and
Bibliometrics" (CLBib 2015), co-located with the 15th International Society of
Scientometrics and Informetrics Conference (ISSI 2015), brought together
researchers in Bibliometrics and Computational Linguistics in order to study
the ways Bibliometrics can benefit from large-scale text analytics and sense
mining of scientific papers, thus exploring the interdisciplinarity of
Bibliometrics and Natural Language Processing (NLP). The goals of the workshop
were to answer questions like: How can we enhance author network analysis and
Bibliometrics using data obtained by text analytics? What insights can NLP
provide on the structure of scientific writing, on citation networks, and on
in-text citation analysis? This workshop is a first step to foster
reflection on this interdisciplinarity and on the benefits that the two
disciplines, Bibliometrics and Natural Language Processing, can derive from it.
Comment: 4 pages, Workshop on Mining Scientific Papers: Computational
Linguistics and Bibliometrics at ISSI 201
The Closer the Better: Similarity of Publication Pairs at Different Co-Citation Levels
We investigate the similarities of pairs of articles which are co-cited at
the different co-citation levels of the journal, article, section, paragraph,
sentence and bracket. Our results indicate that textual similarity,
intellectual overlap (shared references), author overlap (shared authors),
and proximity in publication time all rise monotonically as the co-citation level
gets lower (from journal to bracket). While the main gain in similarity happens
when moving from journal to article co-citation, all level changes entail an
increase in similarity, especially section to paragraph and paragraph to
sentence/bracket levels. We compare results from four journals over the years
2010-2015: Cell, the European Journal of Operational Research, Physics Letters
B and Research Policy, with consistent general outcomes and some interesting
differences. Our findings motivate the use of granular co-citation information
as defined by meaningful units of text, with implications for, among others,
the elaboration of maps of science and the retrieval of scholarly literature.
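As a rough illustration of the co-citation levels and the shared-reference overlap described in this abstract, here is a minimal Python sketch. It is not the authors' code: the position tuple `(section, paragraph, sentence, bracket)`, the function names, and the choice of Jaccard overlap as the measure of shared references are all assumptions made for the example.

```python
# Hypothetical sketch: names and the Jaccard measure are assumptions,
# not taken from the paper.
LEVELS = ["article", "section", "paragraph", "sentence", "bracket"]


def finest_cocitation_level(pos_a, pos_b):
    """Return the finest level at which two in-text citations co-occur.

    Each position is a tuple (section, paragraph, sentence, bracket)
    locating a citation inside the same citing article.
    """
    level = "article"  # both citations are in the same article by assumption
    for name, a, b in zip(LEVELS[1:], pos_a, pos_b):
        if a != b:
            break  # the positions diverge at this level
        level = name
    return level


def jaccard(refs_a, refs_b):
    """Share of references two publications have in common."""
    a, b = set(refs_a), set(refs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two citations in the same sentence but in different brackets.
print(finest_cocitation_level((1, 2, 3, 0), (1, 2, 3, 1)))
# Overlap between two hypothetical reference lists.
print(jaccard({"r1", "r2", "r3"}, {"r2", "r3", "r4"}))
```

Averaging such an overlap score over all pairs observed at each level would give one way to compare levels, under the assumptions stated above.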
WikiM: Metapaths based Wikification of Scientific Abstracts
In order to disseminate the exponentially growing body of knowledge produced in
the form of scientific publications, it would be best to design mechanisms that
connect it with an already existing rich repository of concepts -- Wikipedia.
Not only does this make scientific reading simple and easy (by connecting the
concepts used in scientific articles to their Wikipedia explanations), but it
also improves the overall quality of the article. In this
paper, we present a novel metapath based method, WikiM, to efficiently wikify
scientific abstracts -- a topic that has been rarely investigated in the
literature. One of the prime motivations for this work comes from the
observation that wikified abstracts of scientific documents help a reader
decide better, in comparison to plain abstracts, whether (s)he would be
interested in reading the full article. We perform mention extraction mostly
through traditional tf-idf measures coupled with a set of smart filters. The
entity linking relies heavily on the rich citation and author publication
networks. Our observation is that various metapaths defined over these networks
can significantly enhance the overall performance of the system. For mention
extraction and entity linking, we outperform most of the competing
state-of-the-art techniques by a large margin, arriving at precision values of
72.42% and 73.8% respectively over a dataset from the ACL Anthology Network. In
order to establish the robustness of our scheme, we wikify three other datasets
and get precision values of 63.41%-94.03% and 67.67%-73.29% respectively for
the mention extraction and the entity linking phases.
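To make the tf-idf step mentioned in this abstract concrete, the following is a minimal Python sketch of ranking candidate mentions in one abstract by tf-idf against a small corpus. It is an illustration only: the function names, the unsmoothed `log(N / df)` weighting, and the toy corpus are assumptions, and the paper's "smart filters" and metapath-based entity linking are not reproduced here.

```python
# Hypothetical sketch: function names, weighting scheme, and data are
# assumptions for illustration, not the WikiM implementation.
import math
from collections import Counter


def tfidf_scores(abstract_tokens, corpus):
    """Score each term of one tokenized abstract against a tokenized corpus."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))  # count each term once per document
    term_freq = Counter(abstract_tokens)
    total = len(abstract_tokens)
    return {
        term: (count / total) * math.log(n_docs / max(doc_freq[term], 1))
        for term, count in term_freq.items()
    }


def top_mentions(abstract_tokens, corpus, k=2):
    """Return the k highest-scoring terms as candidate mentions."""
    scores = tfidf_scores(abstract_tokens, corpus)
    return sorted(scores, key=scores.get, reverse=True)[:k]


corpus = [
    ["the", "parsing", "the", "model"],
    ["the", "model", "citation"],
    ["the", "network"],
]
print(top_mentions(corpus[0], corpus))
```

Terms that occur in every abstract, such as "the" in the toy corpus, receive a score of zero, so rarer terms rise to the top of the candidate list.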