3 research outputs found

    Predicting the Increase in h-Index for Research Journals Using Cost-Sensitive Selective Naive Bayes Classifiers

    The machine learning community is interested not only in maximizing classification accuracy but also in minimizing the distance between the actual and the predicted class. Approaches such as cost-sensitive learning have been proposed to address this problem. In this paper, we propose two greedy wrapper forward cost-sensitive selective naive Bayes approaches. Both readjust the probability thresholds of each class to select the class with the minimum expected cost. The first algorithm (CS-SNB-Accuracy) considers adding each variable to the model and measures the performance of the resulting model on the training data. The variable that most improves the accuracy, that is, the percentage of instances whose readjusted class matches the actual class, is permanently added to the model. In contrast, the second algorithm (CS-SNB-Cost) adds the variables that reduce the misclassification cost, that is, the distance between the readjusted class and the actual class. We have tested our algorithms on the prediction of bibliometric indices. Given the popularity of the h-index, we built several prediction models to forecast the annual increase of the h-index for Neurosciences journals over a four-year time horizon. Results show that our approaches, particularly CS-SNB-Accuracy, achieved higher accuracy than the analyzed cost-sensitive classifiers and Bayesian classifiers. Furthermore, CS-SNB-Cost always achieved a lower average cost than all analyzed cost-sensitive and cost-insensitive classifiers. These cost-sensitive selective naive Bayes approaches outperform the selective naive Bayes in terms of accuracy and average cost, so the cost-sensitive learning approach could also be applied to other probabilistic classification approaches.
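The two ideas in this abstract, choosing the class with the minimum expected cost and greedily adding variables in a wrapper loop, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the absolute-distance cost matrix, and the generic `score` callback are all assumptions made for illustration. In a setting like the one described, `score` would train a naive Bayes model on the chosen variables and return either its accuracy (as in CS-SNB-Accuracy) or its negated average cost (as in CS-SNB-Cost).

```python
def distance_cost(n_classes):
    """Assumed cost matrix for ordered classes: cost[i][j] = |i - j|."""
    return [[abs(i - j) for j in range(n_classes)] for i in range(n_classes)]


def min_expected_cost_class(posteriors, cost):
    """Return the class index with the lowest expected cost.

    posteriors[j] is P(class j | x); cost[i][j] is the cost of
    predicting class i when the true class is j.
    """
    n = len(posteriors)
    expected = [
        sum(cost[i][j] * posteriors[j] for j in range(n)) for i in range(n)
    ]
    return min(range(n), key=expected.__getitem__)


def greedy_forward_selection(candidates, score):
    """Greedy wrapper forward selection (hypothetical helper).

    score(variables) returns a number to maximize. At each step the
    variable that most improves the score is added permanently; the
    search stops when no remaining variable improves it.
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        top_score, top_var = max(
            ((score(selected + [v]), v) for v in remaining),
            key=lambda pair: pair[0],
        )
        if top_score <= best:
            break
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected


# With these posteriors the most probable class is 0, but the
# distance-based costs make the middle class the cheapest decision.
posteriors = [0.4, 0.25, 0.35]
chosen = min_expected_cost_class(posteriors, distance_cost(3))
```

In this example the expected costs are 0.95, 0.75, and 1.05 for classes 0, 1, and 2, so the decision moves from the most probable class to its neighbor. That shift is what readjusting the class thresholds amounts to under a distance-based cost.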

    Genetic algorithms and Gaussian Bayesian networks to uncover the predictive core set of bibliometric indices

    The diversity of bibliometric indices today poses the challenge of exploiting the relationships among them. Our research uncovers the best core set of relevant indices for predicting other bibliometric indices. An added difficulty is selecting the role of each variable, that is, which bibliometric indices are predictive variables and which are response variables. This results in a novel multi-output regression problem where the role of each variable (predictor or response) is unknown beforehand. We use Gaussian Bayesian networks to solve this problem and discover multivariate relationships among bibliometric indices. These networks are learnt by a genetic algorithm that searches for the models that best predict bibliometric data. Results show that the optimal induced Gaussian Bayesian networks corroborate previously known relationships between several indices, but also suggest new, previously unreported interactions. An extended analysis of the best model illustrates that a set of 12 bibliometric indices can be accurately predicted using only a smaller predictive core subset composed of citations, g-index, q2-index, and hr-index. This research is performed using bibliometric data on Spanish full professors in the computer science area.

    Scientific Productivity and Visibility of Civil-Servant Faculty at Spanish Public Universities in the Area of Computer Technologies

    This report describes and analyzes, in a structured way, the international scientific productivity and visibility of the faculty (Catedrático de Universidad (CU), Titular de Universidad (TU), Catedrático de Escuela Universitaria (CEU), and Titular de Escuela Universitaria (TEU)) of Spanish public universities assigned to the knowledge areas of Computer Architecture and Technology (ATC), Computer Science and Artificial Intelligence (CCIA), and Computer Languages and Systems (LSI), identifying both their strengths and their weaknesses. The analysis is carried out at the national level as well as by autonomous community, university, knowledge area, and professional category. The result is a global and detailed view of the current situation in the area of Computer Technologies.