14 research outputs found

    Beyond Citations: Measuring Novel Scientific Ideas and their Impact in Publication Text

    New scientific ideas fuel economic progress, yet identifying and measuring them remains challenging. In this paper, we use natural language processing to identify the origin and impact of new scientific ideas in the text of scientific publications. To validate the new techniques and their improvement over traditional citation-based metrics, we first leverage Nobel Prize papers that likely pioneered new scientific ideas with a major impact on scientific progress. Second, we use literature review papers, which typically summarize existing knowledge rather than pioneer new scientific ideas. Finally, we demonstrate that papers introducing new scientific ideas are more likely to become highly cited by both publications and patents. We provide open access to code and data for all scientific papers up to December 2020.

    How to do research on the societal impact of research? Studies from a semantic perspective

    We review recent work from our research lab that has applied novel text mining techniques to the issue of research impact assessment. The techniques are Semantic Hypergraphs and lexicon-based Named Entity Recognition. Using these techniques, we address two distinct open issues in research impact assessment: the epistemological and logical status of impact assessment, and the construction of quantitative indicators. © 2021 18th International Conference on Scientometrics and Informetrics, ISSI 2021. All rights reserved.
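
    Lexicon-based Named Entity Recognition, one of the two techniques mentioned, can be sketched as dictionary matching: a curated lexicon maps surface forms to entity labels, and any lexicon entry found in the text is tagged. The lexicon entries and labels below are invented for illustration; the authors' actual lexicon is not shown in the abstract.

```python
# Illustrative lexicon: surface form -> entity label (not the authors' lexicon).
LEXICON = {
    "policy": "IMPACT_DOMAIN",
    "public health": "IMPACT_DOMAIN",
    "patent": "INNOVATION_OUTPUT",
}

def tag_entities(text):
    """Return (surface form, label) pairs for every lexicon entry found in the text."""
    lowered = text.lower()
    return [(term, label) for term, label in LEXICON.items() if term in lowered]

print(tag_entities("The project informed public health policy."))
```

    A production system would add tokenization and longest-match resolution so that "public health" is not double-counted via its substring matches, but the dictionary-lookup core is the same.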

    Text Mining for Innovation Measurement

    Innovation is widely recognized as a driver of economic growth and competitiveness. However, measuring innovation remains a challenge, due in part to the difficulty of identifying and quantifying its many different manifestations. Natural Language Processing offers a potential solution to this problem by providing a means of automatically extracting information on innovation from large volumes of unstructured text data. However, the analysis of unstructured text data is subject to many methodological challenges that are not yet fully resolved. This work explores the use of Text Mining for Innovation Measurement by developing methods to show that traditional data sources (patents and publications) and novel ones (such as social media, job descriptions, or standards) can be used to identify and quantify phenomena across the main innovation stages (generation, diffusion, and maturity). The findings of this work suggest that Text Mining can be seen as a useful tool for Innovation Measurement and has the potential to provide valuable insights into the innovation process for managers, academics, and policy makers.

    Detecting interdisciplinarity in top-class research via topic models

    In the twenty-first century, innovations arise from problem-oriented research, whose approach crosses traditional faculties and disciplines. This suggests that research is increasingly moving toward more interdisciplinary endeavours. The aim of this work is to understand the contextual factors that favour the development of interdisciplinary research. To achieve this objective, rather than relying on bibliometric-based measurements, as done in the recent literature, we detect interdisciplinarity by analysing the textual content of a sample of top-class European research production. In particular, we focus on a dataset composed of the summary, final report, and publications of all research projects funded by the European Research Council (ERC), and we use the machine learning technique of Topic Modelling to extract research topics and evaluate interdisciplinarity. We first build a formal model of academic research scenarios that takes into account the contextual factors contributing to the development of interdisciplinary research. After linking, for each academic institution awarded an ERC project, the interdisciplinarity measurements obtained via Topic Modelling with contextual variables capturing institutional characteristics, we are able to test a series of hypotheses about the role played by each factor in shaping interdisciplinary research.
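
    Once a topic model assigns each project a distribution over topics, one common way to score interdisciplinarity (a sketch, not necessarily the measure used in this work) is the Shannon entropy of that distribution: a project concentrated in one topic scores low, one spread across many topics scores high. The topic distributions below are invented, and the LDA fitting step itself is elided.

```python
import math

def topic_entropy(dist):
    """Shannon entropy of a project's topic distribution; higher values
    mean the text is spread over more topics, i.e. more interdisciplinary."""
    return -sum(p * math.log(p) for p in dist if p > 0)

mono = [0.97, 0.01, 0.01, 0.01]   # project concentrated in a single topic
inter = [0.25, 0.25, 0.25, 0.25]  # project spread evenly across four topics
print(topic_entropy(mono), topic_entropy(inter))
```

    The contextual analysis described above would then regress such scores on institutional variables to test which factors favour interdisciplinarity.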

    Enhancing Industry 4.0 standards interoperability via knowledge graphs with natural language processing

    Industry 4.0 (I4.0) has brought several challenges related to the need to acquire and integrate large amounts of data from multiple sources into an automated manufacturing system. Establishing interoperability is crucial to meeting these challenges, and standards development and adoption play a key role in achieving it. Academics and industrial stakeholders must therefore join forces to develop methods that enhance interoperability and mitigate possible conflicts between standards. The aim of this paper is to propose an approach that enhances interoperability between standards through the combined use of Natural Language Processing (NLP) and Knowledge Graphs (KGs). In particular, the proposed method is based on measuring semantic similarity among the textual content of standards documents belonging to different standardization frameworks. The present study contributes to research and practice in three ways. First, it fills research gaps concerning the synergy of NLP, KGs, and I4.0. Second, it provides an automatic method that improves the process of creating, curating, and enriching a KG. Third, it provides qualitative and quantitative evidence of Semantic Interoperability Conflicts (SICs). The experimental results of applying the proposed method to the I4.0 Standards Knowledge Graph (I40KG) show that different standards still struggle to use a shared language, and that there is a strong difference between the views of I4.0 proposed by the two main standardization frameworks (RAMI and IIRA). By automatically enriching the I40KG with a solid experimental approach, we pave the way for actionable knowledge extracted from the PDFs and made available in the I40KG.
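
    The semantic-similarity step can be illustrated with the simplest possible baseline: cosine similarity between bag-of-words vectors of two document fragments. The two text snippets below are invented stand-ins for RAMI and IIRA standards text, and the paper's actual method would use richer semantic representations than raw term counts.

```python
import math
from collections import Counter

# Invented fragments standing in for text from two standards documents.
rami = "asset administration shell communication layer semantics"
iira = "functional viewpoint communication layer crosscutting semantics"

def cosine(a, b):
    """Cosine similarity between two bag-of-words term-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

print(round(cosine(rami, iira), 2))
```

    Document pairs whose similarity stays low despite covering the same I4.0 concern are exactly the cases the abstract describes as evidence that the frameworks lack a shared language.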

    Publication text: code, data, and new measures

    This Zenodo page describes data collection, processing, and the different open access data files related to the text of scientific publications from Microsoft Academic Graph (MAG) (now OpenAlex). If you use the code or data, please cite the following paper:

    Arts S, Melluso N, Veugelers R (2023). Beyond Citations: Measuring Novel Scientific Ideas and their Impact in Publication Text. https://doi.org/10.48550/arXiv.2309.16437