17 research outputs found

    Does deep learning help topic extraction? A kernel k-means clustering method with word embedding

    Full text link
    © 2018 All rights reserved. Topic extraction poses challenges for the bibliometric community, and its performance still depends on human intervention and the application area. This paper proposes a novel kernel k-means clustering method, incorporating a word embedding model, to effectively extract topics from bibliometric data. Experimental comparisons with four clustering baselines (i.e., k-means, fuzzy c-means, principal component analysis, and topic models) on two bibliometric datasets demonstrate the method's effectiveness both across a relatively broad range of disciplines and within a given domain. An empirical study of bibliometric topic extraction from articles published in three top-tier bibliometric journals between 2000 and 2017, supported by expert knowledge-based evaluations, provides supplemental evidence of the method's ability to extract topics. This empirical analysis also reveals both overlapping and diverse research interests among the three journals, insights that would benefit journal publishers, editorial boards, and research communities.
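    The abstract does not give the paper's exact kernel or clustering details; as a minimal sketch of the general technique, kernel k-means run directly on a kernel matrix over (stand-in, randomly generated) term-embedding vectors with an RBF kernel might look like:

    ```python
    import numpy as np

    def rbf_kernel(X, gamma=0.5):
        # K[i, j] = exp(-gamma * ||x_i - x_j||^2)
        sq = (X ** 2).sum(axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
        return np.exp(-gamma * np.maximum(d2, 0.0))

    def kernel_kmeans(K, k, n_iter=100, seed=0):
        # feature-space distances computed from the kernel matrix only:
        # d(x, C)^2 = K(x,x) - 2*mean_j K(x,j) + mean_ij K(i,j) over members of C
        rng = np.random.default_rng(seed)
        n = K.shape[0]
        labels = rng.integers(0, k, size=n)
        for _ in range(n_iter):
            dist = np.full((n, k), np.inf)
            for c in range(k):
                members = np.flatnonzero(labels == c)
                m = members.size
                if m == 0:
                    continue  # empty cluster attracts no points this round
                within = K[np.ix_(members, members)].sum() / (m * m)
                cross = K[:, members].sum(axis=1) / m
                dist[:, c] = np.diag(K) - 2.0 * cross + within
            new_labels = dist.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break
            labels = new_labels
        return labels

    # stand-in for word-embedding vectors of terms (the paper's real input)
    rng = np.random.default_rng(1)
    emb = np.vstack([rng.normal(0, 0.3, (20, 5)), rng.normal(3, 0.3, (20, 5))])
    labels = kernel_kmeans(rbf_kernel(emb), k=2)
    ```

    The kernel trick lets the assignment step run without ever materializing feature-space coordinates, which is what distinguishes this from plain k-means on the embeddings.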

    Characterizing the potential of being emerging generic technologies: A Bi-Layer Network Analytics-based Prediction Method

    Full text link
    © 2019 17th International Conference on Scientometrics and Informetrics, ISSI 2019 - Proceedings. All rights reserved. Despite the extensive use of bibliometrics in profiling technological landscapes and identifying emerging topics, how to predict potential technological change remains unclear. This paper proposes a bi-layer network analytics-based prediction method to characterize the potential of technologies to become emerging generic technologies. First, drawing on the innovation literature, three technological characteristics are defined and quantified by topological indicators from network analytics. A link prediction approach is then applied to reconstruct the network with weighted missing links; this reconstruction also changes the related technological characteristics. Finally, comparing the two ranked lists of terms helps identify potential emerging generic technologies. A case study on predicting emerging generic technologies in information science demonstrates the feasibility and reliability of the proposed method.
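    The abstract does not say which link prediction score the paper uses; as a hedged illustration of the reconstruction step, here is the common resource allocation index (sum of 1/degree over common neighbours) scoring missing links in a toy term co-occurrence network with hypothetical terms:

    ```python
    from collections import defaultdict

    # toy term co-occurrence network (hypothetical terms and links)
    edges = [("neural network", "deep learning"), ("deep learning", "text mining"),
             ("deep learning", "topic model"), ("text mining", "topic model"),
             ("topic model", "bibliometrics"), ("bibliometrics", "citation analysis")]

    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def resource_allocation(u, v):
        # RA index: sum of 1/degree over the common neighbours of u and v
        return sum(1.0 / len(adj[z]) for z in adj[u] & adj[v])

    # score every missing link, then reconstruct the network with the top ones
    nodes = sorted(adj)
    missing = [(u, v, resource_allocation(u, v))
               for i, u in enumerate(nodes) for v in nodes[i + 1:]
               if v not in adj[u]]
    missing.sort(key=lambda t: -t[2])
    for u, v, w in missing[:3]:
        if w > 0:
            adj[u].add(v)
            adj[v].add(u)  # topological indicators (e.g. degree) now change rank
    ```

    Comparing term rankings by a topological indicator before and after this reconstruction is the kind of comparison the method describes, though the paper's actual indicators and weighting scheme may differ.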

    VERDAD Y VALIDEZ DEL CONOCIMIENTO DESDE LA TEORÍA DE SEÑUELOS POR SUPUESTAS PREMISAS FALSAS

    Get PDF
    The aim of the study was to describe the truth and validity of knowledge from the theory of decoys by supposedly false premises. The study was carried out between January and February 2022 using the ScienceDirect database (Elsevier) and the search equation, without filters: truth of knowledge. The corresponding articles were selected through non-probabilistic convenience sampling: 1st) period 2019-2021, and 2nd) reviews. Then, systematic random probabilistic sampling with a step of 5 in the arithmetic progression was carried out, selecting 35 articles (10% of the total) for the conceptual analysis of truth and validity. The study considered the recognition of hypothesis testing at a near-maximum probability level (0.01), valued in the process of searching for truth and the existence of validity, and the test of two decoys of false premises (P1*, P2* and P1**, P2**), since their logical-reasoning analysis determines them to be true. It is concluded that the truth and validity of knowledge can rest on accepting or rejecting the premises, but one criterion of judgment for the testing decision is to indicate any determination in which rejection is considered from the interpretive logic itself, between what is selected to be proven and the premises thought to be "supposedly false."
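    The abstract's systematic sampling (every 5th record in the arithmetic progression) is simple to sketch. The corpus size below is a stand-in chosen so that a step of 5 yields the abstract's 35 selected articles; the record names are hypothetical:

    ```python
    import random

    def systematic_sample(records, step=5, seed=2022):
        # every `step`-th record after a random start in [0, step)
        rng = random.Random(seed)
        start = rng.randrange(step)
        return records[start::step]

    # stand-in corpus sized so a step of 5 selects 35 articles (1 in 5)
    corpus = [f"article_{i:03d}" for i in range(175)]
    sample = systematic_sample(corpus)
    ```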

    Semi-automated extraction of research topics and trends from NCI funding in radiological sciences from 2000-2020

    Full text link
    Investigators, funders, and the public want to know the topics and trends in publicly funded research, but current manual categorization efforts are limited in scale and depth. We developed a semi-automated approach to extract and name research topics, and applied it to $1.9B of NCI funding in the radiological sciences over 21 years to determine micro- and macro-scale research topics and funding trends. Our method relies on sequential clustering of existing biomedical word embeddings, naming by subject matter experts, and visualization to discover trends at a macroscopic scale above individual topics. We present results using 15 and 60 cluster topics, where we found that a 2D projection of grant embeddings reveals two dominant axes: physics-biology and therapeutic-diagnostic. For our dataset, we found that funding for therapeutics- and physics-based research has outpaced diagnostics- and biology-based research, respectively. We hope these results may (1) give funders insight into the appropriateness of their funding allocation, (2) assist investigators in contextualizing their work and exploring neighboring research domains, and (3) allow the public to review where their tax dollars are being allocated. Comment: Presented at the American Society for Radiation Oncology annual meeting in 2021 (doi: 10.1016/j.ijrobp.2021.07.263) and the Practical Big Data Workshop 202
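    The abstract does not specify how the 2D projection of grant embeddings was produced; as a minimal sketch assuming a PCA-style projection, with random vectors standing in for the real grant embeddings:

    ```python
    import numpy as np

    def pca_2d(X):
        # center, then project rows of X onto the top-2 principal components;
        # SVD returns components ordered by explained variance
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:2].T

    rng = np.random.default_rng(0)
    grant_emb = rng.normal(size=(50, 100))  # stand-in for grant embeddings
    coords = pca_2d(grant_emb)  # 2D map; each axis is a dominant direction
    ```

    On real grant embeddings, inspecting which grants sit at the extremes of each axis is how one would interpret the axes (e.g. physics-biology, therapeutic-diagnostic in the paper's finding).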

    Hierarchical topic tree: A hybrid model comprising network analysis and density peak search

    Full text link
    Topic hierarchies can help researchers develop a quick and concise understanding of the main themes and concepts in a field of interest. This is especially useful for newcomers to a field or those with a passing need for basic knowledge of a research landscape. Yet, despite a plethora of studies on hierarchical topic identification, there is still no model comprehensive or adaptive enough to extract the topics from a corpus, deal with the concepts shared by multiple topics, arrange the topics in a hierarchy, and give each topic an appropriate name. Hence, this paper presents a one-stop framework for generating fully-conceptualized hierarchical topic trees. First, we generate a co-occurrence network based on key terms extracted from a corpus of documents. Then a density peak search algorithm is developed and applied to identify the core topic terms, which are subsequently used as topic labels. An overlapping community allocation algorithm follows to detect topics and possible overlaps between them. Lastly, the density peak search and overlapping community allocation algorithms run recursively to structure the topics into a hierarchical tree. The feasibility, reliability, and extensibility of the proposed framework are demonstrated through a case study on the field of computer science.
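    Density peak search (in the Rodriguez-Laio sense) picks out points that have both high local density and a large distance to any denser point; the paper's network variant will differ in detail, but the core idea can be sketched on toy 2D "term" positions, where each cluster is a dense core surrounded by satellites:

    ```python
    import numpy as np

    def ring(cx, cy, n, r):
        # n points evenly spaced on a circle of radius r around (cx, cy)
        ang = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
        return np.column_stack([cx + r * np.cos(ang), cy + r * np.sin(ang)])

    def density_peaks(X, dc, n_peaks):
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        # local density rho: neighbours within cutoff dc (excluding self)
        rho = (d < dc).sum(axis=1) - 1
        n = len(X)
        delta = np.empty(n)
        for i in range(n):
            higher = np.flatnonzero(rho > rho[i])
            # delta: distance to the nearest denser point (max distance for
            # the global density maximum, which has no denser point)
            delta[i] = d[i, higher].min() if higher.size else d[i].max()
        gamma = rho * delta  # peak score: dense AND far from denser points
        return np.argsort(gamma)[::-1][:n_peaks]

    # two toy "topic" clusters: a dense core term surrounded by satellite terms
    X = np.vstack([[[0.0, 0.0]], ring(0.0, 0.0, 8, 0.4),
                   [[10.0, 10.0]], ring(10.0, 10.0, 6, 0.4)])
    peaks = density_peaks(X, dc=0.5, n_peaks=2)  # indices of the two cores
    ```

    The two returned indices are the cluster cores (rows 0 and 9), which is exactly the role "core topic terms" play as topic labels in the framework.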

    Verdad y validez del conocimiento. Premisas para la consultoría administrativa.

    Get PDF
    The study aims to synthesize arguments about truth and scientific knowledge, drawing on theories and notes from different authors, in order to describe the process of validating knowledge as a research result. Through three search engines (ScienceDirect, Scopus, and SciELO), a detailed exploration was carried out with the search equation, in Spanish and English: "truth", "cognitive validity", "scientific knowledge", and "administrative consulting". Twenty-nine articles from 28 scientific journals were reviewed for the conceptual analysis of the keywords, contrasting the relationships among them in order to construct premises for the instrumentation and implementation of administrative consulting. Starting from empirical knowledge, an analysis is carried out for the construction of scientific knowledge, defining the superior qualities between them. The relationship between knowledge and business consulting management is confirmed, constituting premises for knowledge management, since the referential analyses studied constitute contributions that allow the administrative consultant to contribute to strategic planning and design within the integral management of organizational processes. The study concludes with supposed premises to accept or reject, based on the need for different disciplines to interact in order to validate knowledge as an original source for administration.

    Data for: Does deep learning help topic extraction? A kernel k-means clustering method with word embedding

    No full text
    The 4770 dataset includes 4770 articles from the Web of Science database, covering 10 disciplines such as artificial intelligence, business, history, and chemistry. The 577 dataset includes 577 proposals granted by the National Science Foundation of the United States; all 577 proposals are within the area of computer science but in different subareas of computer science. The 6767 dataset includes 6767 articles published in the Journal of the Association for Information Science and Technology, the Journal of Informetrics, and Scientometrics from 2000 to 2016. No labels are given for this dataset. THIS DATASET IS ARCHIVED AT DANS/EASY, BUT NOT ACCESSIBLE HERE. TO VIEW A LIST OF FILES AND ACCESS THE FILES IN THIS DATASET, CLICK ON THE DOI LINK ABOVE.

    Dynamic network analytics for recommending scientific collaborators

    Full text link
    Collaboration is one of the most important contributors to scientific advancement and a crucial aspect of an academic's career. However, the explosion in academic publications has, for some time, been making it more challenging to find suitable research partners. Recommendation approaches that help academics find potential collaborators are not new, but the existing methods operate on static data, which can render many suggestions less useful or out of date. The approach presented in this paper simulates a dynamic network from static data to gain further insight into the changing research interests, activities, and co-authorships of scholars in a field, all of which can improve the quality of the recommendations produced. Following a detailed explanation of the entire framework, from data collection through to recommendation modelling, we provide a case study on the field of information science to demonstrate the reliability of the proposed method. The results provide empirical insights to support decision-making by related stakeholders, e.g., scientific funding agencies, research institutions, and individual researchers in the field.
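    The paper's recommendation model is not detailed in the abstract; as a hedged sketch of the underlying idea of time-sliced co-authorship networks built from static records, with toy papers, hypothetical author names, and a simple common-co-author score standing in for the real model:

    ```python
    from itertools import combinations
    from collections import Counter, defaultdict

    # toy static records: (year, author list); names are hypothetical
    papers = [(2019, ["Ann", "Bob"]), (2020, ["Ann", "Cia"]),
              (2020, ["Bob", "Cia"]), (2021, ["Ann", "Bob", "Dan"])]

    def snapshot(papers, upto):
        # weighted co-authorship edges from all papers up to a given year,
        # so the network can be "replayed" year by year from static data
        w = Counter()
        for year, authors in papers:
            if year <= upto:
                for u, v in combinations(sorted(authors), 2):
                    w[(u, v)] += 1
        return w

    def neighbours(w):
        nbr = defaultdict(set)
        for u, v in w:
            nbr[u].add(v)
            nbr[v].add(u)
        return nbr

    def recommend(w, author, top=3):
        # rank non-neighbours by number of common co-authors (a simple
        # stand-in for the paper's dynamic-network scoring)
        nbr = neighbours(w)
        scores = {c: len(nbr[author] & nbr[c])
                  for c in nbr if c != author and c not in nbr[author]}
        return sorted((c for c in scores if scores[c] > 0),
                      key=lambda c: -scores[c])[:top]
    ```

    Comparing `recommend(snapshot(papers, y), a)` across successive years shows how suggestions change as the network evolves, which is the benefit the dynamic view offers over a single static network.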